I'd like to localize an iOS and Android app for over 25 different languages using a custom font. The problem is no new fonts cover that type of ground. What's the current best practice for this problem in app development?
I've only come up with the following 2 solutions, but I'm unsure whether either is feasible or a good idea.
1.) Hire a font designer to create a massive custom font in at least 3 different styles (regular, bold, italic). But that could be extremely costly, considering the app license for some single-weight Simplified Chinese fonts is 5k alone.
2.) Use a custom font that covers about 10 languages thanks to Latin characters (e.g. Proxima Nova) and then similar-looking fonts for unsupported languages.
It seems to me the current best practice is to use a custom font that covers the bulk of Latin-based languages and let all unsupported languages fall back to the device fonts. But I've experienced problems there as well, particularly with localizing dynamic third-party data from Facebook Connect. If I'm in America and my friend in China has Chinese characters in their username, a custom font outputs little square glyphs instead of falling back on the device's Chinese character set.
In any case, both solutions add quite a bit of file size to the app, which itself could be a deal-breaker. For solution 2 I've also considered using static images instead of embedding additional fonts, but that also presents a problem in localizing dynamic third-party data and creates a ton of work if the app should ever need updating.
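One way to avoid the square glyphs is to split each string into runs and apply the custom font only to the runs it can cover, leaving the rest to the system font. Below is a minimal sketch in plain Java. It assumes, purely for illustration, that the custom font covers Latin text plus common punctuation and digits; the class and method names are hypothetical, and a real app would check its own font's coverage instead of hard-coding scripts.

```java
import java.util.ArrayList;
import java.util.List;

final class ScriptRuns {
    /** A run of consecutive characters that can share one font. */
    static final class Run {
        final String text;
        final boolean customFont; // true: use the custom font, false: system fallback
        Run(String text, boolean customFont) {
            this.text = text;
            this.customFont = customFont;
        }
    }

    /** Assumption: the custom font covers Latin plus script-neutral characters. */
    static boolean coveredByCustomFont(int codePoint) {
        Character.UnicodeScript script = Character.UnicodeScript.of(codePoint);
        return script == Character.UnicodeScript.LATIN
                || script == Character.UnicodeScript.COMMON
                || script == Character.UnicodeScript.INHERITED;
    }

    /** Splits text into maximal runs that are either fully covered or not covered. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean currentCovered = true;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean covered = coveredByCustomFont(cp);
            if (current.length() > 0 && covered != currentCovered) {
                runs.add(new Run(current.toString(), currentCovered));
                current.setLength(0);
            }
            currentCovered = covered;
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            runs.add(new Run(current.toString(), currentCovered));
        }
        return runs;
    }
}
```

Each run can then be styled separately, for example with a per-run typeface span on Android or an attributed string on iOS, so a Chinese username inside otherwise Latin text is drawn by the device font.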
Any suggestions would be greatly appreciated. Thanks.
Related
After trying several available eng.traineddata files with an Android app that employs Tesseract, I have less than stellar accuracy. Since my application will be using just a few fonts (font sizes, bold and regular), I thought I could get much better accuracy by building my own data. An example of the kind of thing (an 8.5x11 inch paper) that users will be taking a picture of is here:
I have looked at jTessBoxEditor, but wondered if that was an appropriate path to investigate. And if so, I was unsure how to proceed with respect to a starting point, or to try from scratch. The font (which looks like Times New Roman) is very common, and didn't want to re-invent the wheel. I also wondered about how to treat the font on the two different color backgrounds.
Also, I wondered if I could just print-out ABC... abc... 123... in Times New Roman font and get that into a custom eng.traineddata file. If I understand correctly, you want the 'cleanest' data (i.e. no 'bad examples' of letters) in the source material used to train your system. But it would seem as if there would be a tutorial or procedure defined for how to build trained data for a specific font. If there is, it's been eluding me.
I would consider using machine learning, and so that you don't have to build it all yourself, look at TensorFlow Mobile, a version of TensorFlow for mobile devices. For help with character recognition you can look at this article.
To train any neural net a set of training data along with correct outputs must be provided. In this case this will be a set of 128x64 images along with the expected output.
This will help you easily implement a solution to recognize the characters, and by going with this approach you can extend to more fonts if you desire by just doing more training.
I have been tasked with creating a new third-party Android keyboard that supports customized emoji (my own icons) loaded from assets.
I want to implement a soft keyboard with my own emoji icons, either without using Unicode at all or with my own custom Unicode code points.
Questions:
If I create a custom emoji from some string of characters that does not map to the standard set of emoji, and text this message to a friend using the customized app/keyboard, what shows up on their device: the plain ASCII character string, or the image?
I have read about two ways to add an image to a TextView:
Html.ImageGetter
Spannable Image (String consisting of image)
Which one should I prefer?
Is there any way to display (send) the customized emoji on the recipient's device without them downloading the app/keyboard?
Is it possible to send text with an image (emoji) to other apps like Facebook and Skype, and in regular messaging?
I need suggestions.
In simple words
I simply want to send my custom emoji icons to other apps, as this app does, either without using Unicode or with my own custom Unicode code points.
Thanks.
To answer the first part of your question: by definition, emoji are encoded characters. They are a part of Unicode. See here:
http://emojipedia.org/unicode-8/
There are many references to this if you look. You will also discover that for a long time Apple and Google used two different sets. They are now merged, but then Android manufacturers and carriers have added their own emoji "versions."
Changing the keyboard to have custom images will not change the data that is transmitted to the other device. So, to answer the next part of your question: what shows up on their device is whatever ASCII or Unicode characters were transmitted, not what the sender saw while typing.
In other words, to answer the next part of your question, generally speaking there is no way to send custom characters to another device without the recipient having your app. A keyboard would not suffice because apps do the job of displaying text and images. Unless an app knows that you are the provider of the image, it will display the data the only way it knows how. So a custom keyboard won't even display custom emoji on your own device, unless you are also using your own app.
I said "generally not possible" because here are your options:
You can join the Unicode Consortium (http://unicode.org/) and submit your emoji images for approval to go into a future version of Unicode. There are future emoji already in the works, FYI. That will likely take several years, by the way, and it's unlikely they will approve commercially biased images. However, the Unicode codespace has room for just over 1.1 million code points (U+0000 to U+10FFFF) and is hardly even close to being full. Even then, the Android team would need to adopt the new characters and include them in a future release, as they did with smileys and the current set of emoji.
You build your own app with your own emoji and get people on both sides of the communication to download it, like everyone else does. IMO, this is not ideal for anyone but the developer of the app. Still, I applaud the ones that people enjoy for their work and success. That industry is fickle, and it is difficult to really gain a presence in it.
I'm a part of sdmmllc.com - and we're trying to develop a messaging "platform" exactly for situations like yours. We want to allow messaging apps to "discover" other messaging apps, incorporate features like custom emoji, without the user getting confused or having to download tons of apps. This is similar to plug-ins in web browsers. Our developers love us, our users love us, but it's a slow process.
Develop a competing platform. (And good luck with that - no one really seems to be getting the concept, except the few developers we have, and the hundreds of users that download our app every day and love our idea and platform... but there's no money in it so far...)
You can only use the Unicode characters that are already supported. You cannot add your own for generic use; that is not possible. But you can use your own codes within your app and between copies of your app.
In short, it is not possible to create your own Unicode characters, but you can do it app to app. On both ends you have to store those codes in a database and match them against incoming messages to find the image to show.
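The app-to-app approach described above can be sketched as follows. This is only an illustration under stated assumptions: the `:name:` shortcode syntax, the `ShortcodeParser` class, and the asset paths are all hypothetical, and the lookup table stands in for whatever database both ends of the conversation share. A recipient without the app simply sees the plain shortcode text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShortcodeParser {
    // Hypothetical wire format: a custom emoji travels as plain text like ":party_cat:".
    private static final Pattern SHORTCODE = Pattern.compile(":([a-z0-9_]+):");

    /**
     * Splits a received message into pieces tagged "text:" or "image:".
     * Shortcodes missing from the table are left in place as ordinary text.
     */
    static List<String> parse(String message, Map<String, String> assetsByCode) {
        List<String> pieces = new ArrayList<>();
        Matcher m = SHORTCODE.matcher(message);
        int last = 0;
        while (m.find()) {
            String asset = assetsByCode.get(m.group(1));
            if (asset == null) {
                continue; // unknown code: keep it as plain text
            }
            if (m.start() > last) {
                pieces.add("text:" + message.substring(last, m.start()));
            }
            pieces.add("image:" + asset);
            last = m.end();
        }
        if (last < message.length()) {
            pieces.add("text:" + message.substring(last));
        }
        return pieces;
    }
}
```

On Android, the receiving app would turn each "image:" piece into an image span (or similar) when building the text for display; that rendering step is left out here because it depends on the app.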
Background
TextView always had issues with RTL (Right-To-Left) languages. Since I know only how to read Hebrew (in addition to English), I will talk about its issues:
Text alignment (and I'm not talking about gravity). As an RTL language, Hebrew lays out words from right to left, the opposite of English.
To demonstrate how annoying it is, imagine that instead of showing "Hello world." you usually get ".Hello world". This could be easily fixed if you had a single sentence, but it's harder when there are multiple punctuation characters.
Vowel positions. Hebrew doesn't require vowels in order to read text, but sometimes it's very hard to read without them (especially the Bible). For vowels, Hebrew has what is called "NIKUD", which are like dots inside the letters. The problem in Android was that they were usually positioned in the wrong location.
To demonstrate how annoying it is, imagine that instead of showing "Hello world." you usually get ".eHlol owrld". Even if you try to fix it (always putting the vowel one character after the current one), the position within the letter wasn't correct (imagine that the "e" in "Hello" appeared above the "H", for example).
Only in version 4.2 (read here, under "Native RTL support") did Google fix all of the Hebrew-related issues (or at least it seems so).
The problem
The problems with Hebrew have caused each Israeli carrier and each custom ROM maker to come up with its own fix for the different issues, which makes it practically impossible to handle RTL text consistently on pre-4.2 devices.
Things can get even more frustrating when the text includes both Hebrew and English letters.
What I've tried
I've read many websites talking about those problems, and I've tried many variants of the solutions, but none has solved the problem on all devices:
Some suggest putting the character '\u200F' (or '\u202D') at the start, the end, or both ends of the text.
Some suggest using the Html.fromHtml() method and putting something special there.
Some even suggest using a WebView instead (and maybe calling WebSettings.setDefaultTextEncodingName()).
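For reference, the first suggestion (wrapping text in the right-to-left mark U+200F) can be sketched in plain Java as below. The `RtlMarks` class and its method names are hypothetical. The sketch only shows how the marks are added; whether they actually fix rendering on a given pre-4.2 device is exactly the part that varies.

```java
final class RtlMarks {
    /** U+200F RIGHT-TO-LEFT MARK: invisible, but counts as a strong RTL character. */
    static final char RLM = '\u200F';

    /** True if the text contains at least one strongly right-to-left character. */
    static boolean hasStrongRtl(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            byte dir = Character.getDirectionality(cp);
            if (dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT
                    || dir == Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    /** Surrounds RTL text with RLM so leading/trailing punctuation resolves as RTL. */
    static String wrap(String text) {
        return hasStrongRtl(text) ? RLM + text + RLM : text;
    }
}
```

Pure English strings pass through unchanged, so the helper can be applied to every string before it is handed to a TextView.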
The question
Is there a definite solution for this problem?
I would assume the best thing is that because Android 4.2 solves this, and Android is open source, we should have its TextView imported into a library that we can use, but Google hasn't provided such a library yet.
Sadly, I don't think there's a good solution ("good" meaning "does the job and is readily available"). For our own Android apps that support Hebrew, we use a custom rendering mechanism that we developed over many years. The rendering mechanism does everything: proprietary fonts; bidirectional (bidi) analysis; glyph placement; line break analysis; text flow; etc. Some of the problems trying to use native Android text handling capabilities (especially pre-4.2) are:
Really crappy fonts. However, you can package third-party fonts like DejaVu that are pretty good. The right font can do wonders with positioning of nekudot (and te'amim [1], if you need that). (I agree with you about how important correct pointing placement is; reading Hebrew text with misplaced nekudot is like reading a screen-full of CAPTCHAs.)
Buggy bidi analysis. What makes it worse is that the bugs seem to be different for different versions of Android. Modifying the text to include strategically placed bidi formatting codes (RTL mark; LTR mark; etc.) can overcome many of these bugs (see the discussion here, which isn't specific to Android). However, it's a nuisance to do this and, because of the inconsistencies among Android versions, it is difficult to predict in advance what help the framework is going to need.
No (or poorly thought out) framework-level awareness of right-to-left issues. For instance, good luck getting the scroll bar to display on the left side of a Hebrew TextView. For our apps, we had to build an entire scroll-bar system just to get this to work how we wanted. (Good thing Android is open source!)
Poor line and word break analysis. At least one early version of Android on which we tested thought that each nikud mark was a word boundary. When it comes to line breaks, the system often doesn't know how to handle Hebrew punctuation like maqaf, gershayim, or sof pasuk.
Some of the newer Unicode characters (like HOLAM HASER FOR VAV—U+05BA—new to Unicode 5.0) are not recognized as Hebrew script by the system.
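When deciding where bidi formatting codes are needed, it can help to see what a reference implementation of the Unicode bidirectional algorithm makes of a string. The sketch below uses the standard `java.text.Bidi` class; the `BidiRuns` wrapper is a hypothetical helper. It reports what the algorithm says the runs should be, not what a buggy platform renderer will actually draw.

```java
import java.text.Bidi;
import java.util.ArrayList;
import java.util.List;

final class BidiRuns {
    /**
     * Returns one {start, limit, level} triple per directional run.
     * Even levels are left-to-right, odd levels are right-to-left.
     * The base direction is taken from the first strong character,
     * defaulting to left-to-right if there is none.
     */
    static List<int[]> runs(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < bidi.getRunCount(); i++) {
            result.add(new int[] {
                bidi.getRunStart(i), bidi.getRunLimit(i), bidi.getRunLevel(i)
            });
        }
        return result;
    }
}
```

Comparing these runs with what a device actually displays is one way to spot where an RTL or LTR mark might be needed.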
My recommendation is that, unless you are prepared to build a top-to-bottom text handling system yourself, you give up on high-quality text display on pre-4.2 versions of Android, particularly if you need to support nekudot and te'amim. Also, plan to use the techniques I mentioned in the first two points above.
[1] Biblical cantillation marks
As of August 2013, Android has posted API documentation for a bidirectional formatter (BidiFormatter) which might suit your needs. It is contained in the Android Support v4 library, which I believe should run on versions prior to Android 4.2.
Refer to:
http://developer.android.com/reference/android/support/v4/text/BidiFormatter.html
I am trying to determine the best approach on Android for supporting multiple languages. I understand how resource folders work, and how they get selected when the activity loads and/or has a configuration change. I have also seen a technique of creating a new locale, assigning it as the default, and broadcasting a config change. This works. But I get the impression from this thread (https://groups.google.com/forum/?fromgroups=#!topic/android-developers/_ZGOTHwzl-w) and the answers from the Google framework team that this way of doing things is not recommended or supported. So my questions are:
What is the recommended way to support multi languages on the fly without sending the user to the OS menus for language selection?
Same question for keyboard input.
Finally, I see on my Motorola Xoom when I ask the Locale class for supported languages an impressive list. For instance, ja-JP, which I've tested and seen allows me to display Japanese chars. However there is no SIP for this language on the device. Can I download new keyboards to my platform in these cases? It just seems odd to me that the platform would support displaying many more languages than it could input.
Just let the system do the work.
A user who has selected a language and a keyboard in the system settings will expect your app to follow the same settings.
As far as I know, there's no better approach than strings.xml files in the different values folders.
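To illustrate the idea of letting the locale pick the strings, here is a simplified sketch of locale fallback in plain Java. It is not Android's actual resource-matching algorithm, which considers many more qualifiers; the `LocalizedStrings` class and the tag-keyed tables are hypothetical stand-ins for values/, values-ja/ and similar folders.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LocalizedStrings {
    // Outer key: a locale tag such as "ja-JP" or "ja", or "" for the default table.
    private final Map<String, Map<String, String>> tables = new HashMap<>();

    void put(String localeTag, String key, String value) {
        tables.computeIfAbsent(localeTag, tag -> new HashMap<>()).put(key, value);
    }

    /** Looks up a string, trying language-region, then language, then the default. */
    String get(Locale locale, String key) {
        String[] candidates = { locale.toLanguageTag(), locale.getLanguage(), "" };
        for (String tag : candidates) {
            Map<String, String> table = tables.get(tag);
            if (table != null && table.containsKey(key)) {
                return table.get(key);
            }
        }
        return key; // nothing found: fall back to the key itself
    }
}
```

The point of the design is that the app never asks the user for a language; it just passes the current system locale and takes whatever the most specific table provides.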
I am planning to build an Android application that supports a non-Latin language with a complex script. Unicode support is already there in Android, but some Unicode text rendering has issues that make such languages look untidy.
The main idea is to identify the language and apply rules based on the identified language. So the steps will be:
Define rules and store them
Identify language
Apply rules
Some languages have NZWJ (non zero width joining) rules that behave differently depending on position: left, right, both left and right, top, both top and left, etc.
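The three steps above can be sketched as a small rule registry. This is a rough illustration under assumptions: it identifies the dominant Unicode script rather than the language (several languages can share one script), and the `ScriptRules` class and the shape of a "rule" (a function from string to string) are hypothetical. Real shaping rules would be far more involved.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class ScriptRules {
    // Step 1: define rules and store them, keyed by script.
    private final Map<Character.UnicodeScript, UnaryOperator<String>> rules =
            new EnumMap<>(Character.UnicodeScript.class);

    void define(Character.UnicodeScript script, UnaryOperator<String> rule) {
        rules.put(script, rule);
    }

    // Step 2: identify the script that most code points belong to.
    static Character.UnicodeScript dominantScript(String text) {
        Map<Character.UnicodeScript, Integer> counts =
                new EnumMap<>(Character.UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            // Skip characters that do not point to a specific script.
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                counts.merge(script, 1, Integer::sum);
            }
        });
        Character.UnicodeScript best = Character.UnicodeScript.COMMON;
        int bestCount = 0;
        for (Map.Entry<Character.UnicodeScript, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    // Step 3: apply the rule registered for that script, if any.
    String apply(String text) {
        UnaryOperator<String> rule = rules.get(dominantScript(text));
        return rule == null ? text : rule.apply(text);
    }
}
```

Text with no registered rule is returned unchanged, so scripts can be added one at a time.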
My questions are:
Are there proper, documented resources for getting this done (a good textbook to refer to, etc.)?
What will the effects be in the browser and other applications that use the same language, and how can this be applied to those applications?
What changes have to be made to fonts, or what standards have to be followed?
Thanks.
A thesis written for a master's course is available at this link. The project was done to support a non-Latin language, using the Devanagari script as a model.
Most parts of the question are answered in the thesis.