Android: Few characters missing from server text for Internationalization

I am working on an Android app that supports a few locales other than English (such as PTBr). I get the text from the server in PTBr (Brazilian Portuguese) and display it as-is in a TextView. I have also set my phone's locale to PTBr, but I still don't see the complete text: a few characters are missing.
I am stressing "server data" because I don't have the text locally, so I can't put it in the res string folders.
Thanks!

Related

Google Play Store Description Size & Color [duplicate]

I've made an Android application that is available on Google Play. Now I want to add some more formatting to my app description (e.g. indents, links, lists). But I cannot find any website that lists the possible formatting, and the Google Help pages don't cover this subject either. There are a lot of different formats and I don't really know which one to use (e.g. HTML or wiki formatting).
I could test it with trial and error, but that would take some time, because Google Play only refreshes after 2-3 hours. And while I'm testing, my app description would be rather ugly if the wrong format was used.
tl;dr Is there a list of all possible formatting I could use in the app description for Google Play?

Experimentally, I've discovered that you can provide:
Single line breaks are ignored; double line breaks open a new paragraph.
Single line breaks can be enforced by ending a line with two spaces (similar to Markdown).
A limited set of HTML tags (optionally nested), specifically:
<b>…</b> for boldface,
<i>…</i> for italics,
<u>…</u> for underline,
<br /> to enforce a single line break,
I could not find any way to get strikethrough working (neither HTML nor Markdown style).
A fully-formatted URL such as http://google.com; this appears as a hyperlink.
(Beware that trying to use an HTML <a> tag for a custom description does not work and breaks the formatting.)
HTML character entities are supported, such as &rarr; (→), &trade; (™) and &reg; (®); consult this W3 reference for the exhaustive list.
UTF-8 encoded characters are supported, such as é, €, £, ‘, ’, ★ and ☆.
Indentation isn't strictly possible, but using a bullet and em space character looks reasonable (&#8226;&#8195; yields "• ").
Emoji are also supported (though on the website their appearance depends on the user's OS & browser).
Special notes concerning only Google Play app:
Some HTML tags only work in the app:
<blockquote>…</blockquote> to indent a paragraph of text,
<small>…</small> for slightly smaller text,
<big>…</big> for slightly larger text,
<sup>…</sup> and <sub>…</sub> for super- and subscripts.
<font color="#a32345">…</font> for setting font colors in HEX code.
Some symbols do not appear correctly, such as ‣.
All these notes also apply to the app's "What's New" section.
Special notes concerning only Google Play website:
All HTML formatting appears as plain text in the website's "What's New" section (i.e. users will see the HTML source).
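
As a rough aid for the notes above, here is a minimal Java sketch that flags tags outside the small set that worked in both the app and the website (b, i, u, br). The class and method names are made up for illustration, and the allowed set comes only from the experiments described in this answer, not from any official Google list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DescriptionTagCheck {
    // Assumption: only these tags rendered in both the Play app and the website.
    private static final Set<String> CROSS_PLATFORM_TAGS =
            new HashSet<>(Arrays.asList("b", "i", "u", "br"));

    // Matches opening, closing and self-closing tags and captures the tag name.
    private static final Pattern TAG =
            Pattern.compile("</?\\s*([A-Za-z][A-Za-z0-9]*)[^>]*>");

    /** Returns the sorted names of tags that are not in the cross-platform set. */
    static Set<String> unsupportedTags(String description) {
        Set<String> found = new TreeSet<>();
        Matcher m = TAG.matcher(description);
        while (m.find()) {
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (!CROSS_PLATFORM_TAGS.contains(name)) {
                found.add(name);
            }
        }
        return found;
    }
}
```

Calling `unsupportedTags("<b>New</b><br /><a href=\"x\">site</a>")` reports `a`. App-only tags such as `font` or `blockquote` are reported too, since they are not in the cross-platform set.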

Currently (July 2015), HTML escape sequences (&bull; &#8226;) do not work in the browser version of the Play Store; they're displayed as text. The Play Store app, however, handles them as expected.
So, if you're after the Unicode bullet point in your app/update description [that's what's got you here, most likely], just copy-paste the bullet character
•
PS You can also use a Unicode input combo to get the character
Linux: Ctrl+Shift+u, 2022, then Enter or Space
Mac: Hold ⌥ 2022 release ⌥
Windows: Hold Alt 2022 release Alt
Mac and Windows require some setup, read on Wikipedia
PPS If you're feeling creative, here's a good link with more copypastable symbols, but don't go too crazy, nobody likes clutter in what they read.
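
If the description text is generated from code rather than typed by hand, the same character can be written by its code point. A small Java sketch, assuming a hypothetical helper class called BulletLines; note that the 2022 in the key combos above is hexadecimal, which is 8226 in decimal:

```java
import java.util.ArrayList;
import java.util.List;

class BulletLines {
    // U+2022 BULLET, built from its hexadecimal code point.
    static final String BULLET = new String(Character.toChars(0x2022));
    // U+2003 EM SPACE, used as a visual indent after the bullet.
    static final String EM_SPACE = "\u2003";

    /** Prefixes every item with a bullet and an em space, one item per line. */
    static String bulleted(List<String> items) {
        List<String> lines = new ArrayList<>();
        for (String item : items) {
            lines.add(BULLET + EM_SPACE + item);
        }
        return String.join("\n", lines);
    }
}
```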

As a matter of fact, HTML character entities also work: http://www.w3.org/TR/html4/sgml/entities.html.
It lets you insert special characters like bullets '&bull;' (•), '&trade;' (™), ... the HTML way.
Note that you can also (and probably should) type special characters directly in the form fields if you can enter international characters.
=> one consideration here is whether or not you care about third-party sites that collect data on your app from Google Play: some might simply take it as HTML content, others might insert it in a native application that just understands plain Unicode...
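
To illustrate the plain-Unicode side of that trade-off, here is a minimal Java sketch that turns numeric entities (such as &#8226; or &#x2022;) into the literal characters before the text is submitted. The class name is hypothetical, and named entities such as &bull; are deliberately left untouched because decoding them needs a lookup table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumericEntities {
    // Decimal (&#8226;) or hexadecimal (&#x2022;) numeric character references.
    private static final Pattern NUMERIC =
            Pattern.compile("&#([xX][0-9A-Fa-f]{1,6}|[0-9]{1,7});");

    /** Replaces numeric character references with the characters they denote. */
    static String decode(String text) {
        Matcher m = NUMERIC.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = m.group(1);
            boolean hex = body.charAt(0) == 'x' || body.charAt(0) == 'X';
            int codePoint = hex
                    ? Integer.parseInt(body.substring(1), 16)
                    : Integer.parseInt(body);
            // Leave the reference as-is if it is not a valid code point.
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```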

This is not a bullet, but you can consider it, since there is nothing like a big dot.
I used the symbols below in the description and they work fine.
⚫ Black Circle
🌑 New Moon
🌕 Full Moon
💠 Diamond With a Dot
🔸 Small Orange Diamond
⚙ Gear
🏴 Black Flag
🏳 White Flag
▶ Play Button
⏩ Fast-Forward Button
⭕ Heavy Large Circle
✴ Eight-Pointed Star
◼ Black Medium Square
◽ White Medium-Small Square
◾ Black Medium-Small Square
⬛ Black Large Square
You just need to copy and paste them into the description.

Currently (June 2016), typing in the link as http://www.example.com will only produce plain text.
You can now, however, put in an HTML anchor:
<a href="http://www.example.com">My Example Site</a>

Title, Short Description and Developer Name
HTML formatting is not supported in these fields, but you can include UTF-8 symbols and Emoji: ✓☆👍
Full Description and What’s New:
For the Long Description and What’s New Section, there is a wider variety of HTML codes you can apply to format and structure your text. However, they will look slightly different in Google Play Store app and web.
Here is a table with codes that you can use for formatting Description and What’s New fields for your app on Google Play (originally appeared on ASO Stack blog):
You can also refer to this:
https://thetool.io/2020/html-emoji-google-play

Include emojis; copy and paste them to the description:
http://getemoji.com

<br> seems to be the best and only way that currently works on the app version to create a new line break. I have tried it successfully in a review, as well as unsuccessfully tried all other Unicode/HTML newline-related characters that the Wikipedia page for newlines would tell me.
I used <br> with | immediately on either side, using no closing tag, and it magically created a single line break without revealing the source or screwing anything up.
TLDR: <br> lets you successfully utilize single line breaks in Google Play app -- unlike everything else I tried (a lot).
P.S. I have no clue how to make the thing show source instead of being used as source. !^( Now I do, and I know it works on both the desktop and mobile sites. !!
Additionally, upon searching for how to make it show the source, I stumbled upon this. <del></del>

How to limit the use of certain character sets

I hope this question isn't going to be down-flagged for not showing any actual code, but that's the core of the situation: I simply have no clue where to start, even after trying several combinations of keywords on both Google and here on SO.
My client suddenly decided that half of the Android app I'm developing for him has to be in Chinese. I have changed the database so that some fields accept Simplified Chinese characters, and now I need to make sure that my client (living in Holland) uses only those characters in one particular EditText field in the app. (More database fields now allow only Simplified Chinese, but their values come from a dropdown list in the app, so I don't need to worry about wrong characters there.)
So how would one make sure that only Simplified Chinese is used in an EditText field?

Here is a project in Ruby that attempts to detect whether characters are Traditional Chinese, Simplified Chinese, or Japanese (maybe others?): https://github.com/jpatokal/script_detector
This detection is based on the Unihan Database, in which there is a file called Unihan_Variants.txt. (Download zip file containing this text file here.)
Conceivably, you could parse the txt file into a lookup table and check the unicode value as the text is entered during onTextChanged() for your EditText. However, the readme on the project linked above states: "It is important to understand that this requires long sections of text to work reliably, since a single character or even several characters may be valid Japanese, traditional Chinese and simplified Chinese simultaneously." So, weeding out characters on an individual basis might prove difficult.
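
A rough sketch of that lookup-table idea in plain Java. Everything here is an assumption for illustration: the class name, the reading of Unihan_Variants.txt as tab-separated lines of the form `U+570B<TAB>kSimplifiedVariant<TAB>U+56FD`, and the rule that a character counts as "traditional only" when it lists a simplified variant other than itself. Characters shared by both writing systems pass the check, which is exactly the ambiguity the readme warns about. On Android, `Character.UnicodeScript` needs a reasonably recent API level, and the check would be called from the EditText's text-changed callback.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SimplifiedHanChecker {
    // Code points that have a distinct simplified form (assumed traditional-only).
    private final Set<Integer> traditionalOnly = new HashSet<>();

    /** Builds the lookup table from lines in the assumed Unihan_Variants.txt layout. */
    SimplifiedHanChecker(List<String> unihanVariantLines) {
        for (String line : unihanVariantLines) {
            if (line.isEmpty() || line.startsWith("#")) continue;
            String[] fields = line.split("\t");
            if (fields.length < 3 || !fields[1].equals("kSimplifiedVariant")) continue;
            int source = parseCodePoint(fields[0]);
            boolean mapsToItself = false;
            for (String target : fields[2].trim().split("\\s+")) {
                if (parseCodePoint(target) == source) mapsToItself = true;
            }
            if (!mapsToItself) traditionalOnly.add(source);
        }
    }

    // "U+570B" -> 0x570B; anything after '<' (source annotations) is ignored.
    private static int parseCodePoint(String token) {
        int annotation = token.indexOf('<');
        if (annotation >= 0) token = token.substring(0, annotation);
        return Integer.parseInt(token.substring(2), 16);
    }

    /** True if every code point is a Han character without a distinct simplified form. */
    boolean isAcceptable(CharSequence text) {
        return text.codePoints().allMatch(cp ->
                Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN
                        && !traditionalOnly.contains(cp));
    }
}
```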

What is the purpose of "[Developer] Accented English" (zz-ZZ) in Android?

In Android KitKat, if I choose Settings > Language & Input > Language, the first choice I am offered is [Developer] Accented English. This replaces each Roman letter with an accented version. You can find a list of all the character mappings here. (It helps if you can read French).
What is the purpose of this setting? Is it just to show how characters can be mapped to other characters? Or can it be used productively (to create specific phonemes in text-to-speech output, for example)?

It's a technique called 'Pseudolocalization', and it's used to help test that an app is handling aspects of localization correctly.
The idea is that instead of waiting for an app's string resources to be translated into other languages - which could take some time - a "fake" pseudo-language is used instead. If the app behaves well against this fake translation, then chances are it will perform well with actual translations. There are different variations of pseudolocalization out there, but most tend to do some of the following:
Add brackets [ ... ] or other delimiters around the string: this makes it easier to ensure that strings are not getting clipped at either end.
Replace regular characters with accented characters: if you see a string without accented characters, then that's a sign that it might be hardcoded instead of being treated as a localizable resource. (In the past, this was also used to ensure that apps could handle non-ASCII characters correctly and didn't lose data in code page translation, though this is less of an issue now that modern platforms support Unicode.)
Add padding to the string: this is to simulate languages such as German, which often have longer translations for the corresponding English string. If the padded string gets truncated instead of wrapping or flowing, then the German string will likely do the same.
Add known-to-be-tricky characters to act as 'canaries': on some platforms, symbols from specific parts of the Unicode range may be added to ensure that they are handled or supported properly. For example, a Chinese character might be added to ensure that Chinese fonts are supported: if this ends up showing as an empty square, then that would indicate a problem. Other common 'canary' characters include code points from outside the BMP, or combining characters.
One advantage of using pseudolocalization over actual translation is that the testing can be performed by someone who does not understand the target language: "[Àççôûñţ Šéţţîñĝš___]" still visually appears similar to the original English text "Account Settings". If you try using it with a screen reader such as TalkBack, or otherwise send pseudolocalized text to text-to-speech, you'll likely get nonsense, since it will try to treat the accented characters as actual accented characters.
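
To make the transformations concrete, here is a toy pseudolocalizer in Java covering three of them: delimiters, accented replacements and padding. It is only a sketch: the class name, the tiny letter mapping and the one-underscore-per-five-characters padding rule are assumptions chosen for the example and are not taken from Android's zz-ZZ implementation.

```java
class Pseudolocalizer {
    // Plain letters and their accented stand-ins, position by position.
    // Only the letters needed for the example are mapped.
    private static final String PLAIN = "AcountSeigs";
    private static final String ACCENTED =
            "\u00C0\u00E7\u00F4\u00FB\u00F1\u0163\u0160\u00E9\u00EE\u011D\u0161";

    /** Wraps the text in brackets, accents known letters and pads with underscores. */
    static String pseudolocalize(String text) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int index = PLAIN.indexOf(c);
            // Unmapped characters (spaces, digits, other letters) are kept as-is.
            sb.append(index >= 0 ? ACCENTED.charAt(index) : c);
        }
        // Padding simulates longer translations: one underscore per five characters.
        for (int i = 0; i < text.length() / 5; i++) {
            sb.append('_');
        }
        return sb.append(']').toString();
    }
}
```

With this mapping, `pseudolocalize("Account Settings")` yields the same shape as the example above: accented letters, three underscores of padding, all inside square brackets.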

Punjabi/Telugu/Tamil on Android using unicode UTF-8

I am trying to display text in Indian regional languages on an Android app.
I've set up all the localization folders, even though I want only one language for my app (say Punjabi).
In my strings.xml I have tried putting Hindi characters and Chinese characters and these are displayed correctly on the emulator. But when I put in Punjabi characters nothing shows up on the emulator.
Any reason for this? Can I overcome this problem?
I have the option of using a .ttf file in the assets folder for punjabi font. But that is not what I want to do because it does not give me complete control over the contents being displayed. Each .ttf behaves differently.
Any help is truly appreciated.

There is no support for local Indian languages on Android as yet. UTF-8 characters that fall outside the acceptable range for Android are therefore ignored, which is why a blank is displayed.

Android: Find out which font file is appropriate for the characters I want to display

I am maintaining an Android app that people use to display strings in various exotic languages like Tibetan or old Greek. Because Android devices come with very few fonts, users can put font files on the SD card, and the app will use them.
QUESTION: Given a string, how can I automatically decide which font file is the most appropriate, so that this string appears without characters being replaced with squares/boxes?
Notes:
Each string is in one language.
Strings are displayed in a WebView.
Custom fonts work, the only problem is deciding which font file to use.
Instead of a single font, it could provide a list of fonts that are acceptable for that string.
Unnecessary context, for the curious: I am trying to develop this feature:
http://code.google.com/p/ankidroid/issues/detail?id=779
UPDATE: I ended up creating the Antisquare Open Source library based on Mostafa's idea.
It has a getSuitableFonts method which is blazingly fast.

Android by itself does not provide enough for such a task. Loading and rendering fonts in Android happens in Skia, which is written in C. Skia detects if a character can't be found in a font and falls back to another font for such characters (not the whole string). That's how Japanese, Hebrew, or Arabic text is shown in Android and that's exactly why these scripts don't have bold face! (Their font is selected through fallback and fallback only selects one font file.)
Unfortunately, this mechanism is not exposed through the APIs, so you have to build something similar on your own. It seems complicated, but it is easier than it looks. All you have to do is:
Prepare lists of the characters available in each font file.
For every string, find the font that covers the most characters of the string.
Getting list of characters in each font
You don't have to do this on-the-fly in your Android app. You can prepare the list of characters in each font and put these lists in your app. I say that because this is way easier with tools that may not be available in Android. I would do that through Python scripting in a font app (most serious font tools have awesome Python scripting environments), but these apps are expensive and are for serious type designers. Since you're an Android developer, I recommend using sfntly, a library in Java and C++. Doing what you need (getting a list of Unicode characters available in a font file) is easy with sfntly. This sample works with CMap tables (tables that hold character to glyph mapping) and should be a good starting point for you.
Now the interesting part is that sfntly is in Java, so you may be able to include it in your Android app and do everything automatically. That's awesome, but I recommend you start by getting familiar with sfntly.
Selecting the font
After the previous part you'll have a list of Unicode characters for every font, and based on these lists, selecting the font file that provides the most characters of each string is trivial.
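
The selection step could be sketched like this in Java, assuming the per-font character lists have already been prepared as sets of code points. The class and method names and the font file names in the usage note are invented for the example; this is not the API of sfntly or of the Antisquare library mentioned in the question.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

class FontPicker {
    /** Counts how many code points of the text are present in a font's character set. */
    static int countCovered(String text, Set<Integer> supported) {
        return (int) text.codePoints().filter(cp -> supported.contains(cp)).count();
    }

    /** Returns font names ordered from best to worst coverage of the text. */
    static List<String> rankFonts(String text, Map<String, Set<Integer>> coverageByFont) {
        List<String> fonts = new ArrayList<>(coverageByFont.keySet());
        // Stable sort: fonts with equal coverage keep the map's iteration order.
        fonts.sort(Comparator.comparingInt(
                (String font) -> countCovered(text, coverageByFont.get(font))).reversed());
        return fonts;
    }

    /** True if the font has a character for every code point in the text. */
    static boolean coversAll(String text, Set<Integer> supported) {
        return countCovered(text, supported) == text.codePointCount(0, text.length());
    }
}
```

Given a map with a hypothetical "greek.ttf" covering U+03B1 and U+03B2 and a "tibetan.ttf" covering U+0F40 and U+0F41, ranking the string "\u03B1\u03B2" puts greek.ttf first. Filtering the ranked list with `coversAll` gives the list of acceptable fonts for a string rather than a single winner.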
