Replace unified composed Unicode sequences with Google's single Unicode code points - Android

I need to replace the unified composed Unicode sequences (the keycaps 0 to 9 and the country flags).
I'm implementing emojis in my Android application, so I need to convert each unified composed sequence to Google's single code point, like this:
For keycap 0, the composed sequence is \u0030 \u20E3 and Google's single code point is U+FE837. I have a regex to find the composed sequences (only for the number keycaps):
[\u0030-\u20E3]
The keycap sequences start with a digit in the range \u0030 to \u0039. Is there a way to convert them to the Google equivalents with a regex? Or do you know another way to do this?
Thanks.
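One possible approach, sketched in Java: match a digit followed by the combining keycap U+20E3 (with an optional variation selector U+FE0F in between) and look the digit up in a table. The class name KeycapToGoogle and the table are hypothetical; only the entry for keycap 0 (U+FE837) comes from the question, and the remaining digits and the flags would have to be filled in from a trusted mapping.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeycapToGoogle {
    // A digit, an optional variation selector U+FE0F, then the combining keycap U+20E3.
    private static final Pattern KEYCAP = Pattern.compile("([0-9])\\uFE0F?\\u20E3");

    // Hypothetical lookup table. Only keycap 0 is taken from the question;
    // the other digits are deliberately left out rather than guessed.
    private static final Map<Character, Integer> GOOGLE = new HashMap<>();
    static {
        GOOGLE.put('0', 0xFE837);
    }

    static String convert(String input) {
        Matcher m = KEYCAP.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = GOOGLE.get(m.group(1).charAt(0));
            // Sequences without a known mapping are copied through unchanged.
            String replacement = (codePoint == null)
                    ? m.group()
                    : new String(Character.toChars(codePoint));
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Note that a character class such as [\u0030-\u20E3] matches any single character between the two endpoints, so the sketch matches the sequence as two consecutive items instead.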

Related

Kotlin Android allow only emojis and letters in a text

I've been trying to find a good way to keep only emojis and letters in a given text, but I had no success with any of the articles I found.
I've tried to use a regex, but I can't seem to make it work.
I've tried emoji4j, but that library seems to work with emojis in the form ":)", which doesn't help me because my emojis are groups of Unicode characters.
The result I want is the following:
"This is. a text 👨‍👩‍👧‍👦,,1234" => "This is a text 👨‍👩‍👧‍👦"
"👨‍👩‍👧‍👦" => "👨‍👩‍👧‍👦"
"👨‍👩‍👧‍👦😃123abc👨‍👩‍👧‍👦" => "👨‍👩‍👧‍👦😃abc👨‍👩‍👧‍👦"
Here's the emoji regex: (?:[\u2700-\u27bf]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?(?:\u200d(?:[^\ud800-\udfff]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?)*|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|[\ud83c\udd70-\ud83c\udd71]|[\ud83c\udd7e-\ud83c\udd7f]|\ud83c\udd8e|[\ud83c\udd91-\ud83c\udd9a]|[\ud83c\udde6-\ud83c\uddff]|[\ud83c\ude01-\ud83c\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c\ude32-\ud83c\ude3a]|[\ud83c\ude50-\ud83c\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff]
If I try something like:
val regex = "the_whole_regex_above | [^a-zA-Z]".toRegex()
myText.replace(regex, "")
it won't replace anything; basically every character passes through.
I want to achieve pretty much the same thing as in this question, but using Kotlin.
You want to remove all punctuation, symbols (other than those used to form emojis) and digits.
To do that, you may use
myText = myText.replace("""[\p{N}\p{P}\p{S}&&[^\p{So}]]+""".toRegex(), "")
See the online Kotlin demo.
Details
[ - start of a character class that matches:
\p{N} - any Unicode number (decimal digits as well as letter-like and other numeric characters)
\p{P} - any Unicode punctuation
\p{S} - any Unicode symbol
&&[^\p{So}] - except symbols in the "Symbol, other" (So) category, which is the category most emoji code points belong to
]+ - end of the class, matched 1 or more times.
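For comparison, here is the same character class as a minimal Java sketch (Kotlin on the JVM uses the same java.util.regex engine, so the pattern carries over unchanged). The class and method names are made up for illustration, and the emoji are written as escapes so the example does not depend on the source file encoding.

```java
final class EmojiLetterFilter {
    // (numbers OR punctuation OR symbols) AND NOT "Symbol, other".
    private static final String NOISE = "[\\p{N}\\p{P}\\p{S}&&[^\\p{So}]]+";

    static String keepLettersAndEmoji(String text) {
        return text.replaceAll(NOISE, "");
    }

    public static void main(String[] args) {
        // Family emoji: man, woman, girl, boy joined by ZWJ (U+200D).
        String family = "\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC67\u200D\uD83D\uDC66";
        System.out.println(keepLettersAndEmoji("This is. a text " + family + ",,1234"));
    }
}
```

The zero-width joiner is a format character (Cf), so it is not touched by the class and the joined emoji stays intact.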

Best Way to store and render Unicode Emojis in Android

I have a simple memory game as a project. For the memory tiles I wanted to use emojis. I tried it this way:
emojiCard.setText(new String(Character.toChars(Integer.parseInt(1F60D, 16))));
Now I just have to save 1F60D to a variable and I can show the emoji.
That works for simple emojis, but I cannot use the "new" ones because then I have to use surrogate pairs, and I don't know how to do this.
Is there a better way, like saving the Unicode value?
Sorry, I'm really new to Android development and have already tried a lot of things.
Thanks.
Integer.parseInt() takes a String as input, so presumably you meant to say Integer.parseInt("1F60D", 16) instead. That would be wasted overhead, though, when you can simply pass the numeric literal 0x1F60D to Character.toChars().
Java strings use UTF-16 encoding. When encoded to UTF-16, code point U+1F60D already uses a surrogate pair, so surrogates are not your issue.
Assuming you are referring to how newer emojis support modifiers (to change their genders, colors, etc), then that has nothing to do with surrogates. You simply append the modifier codepoint(s) you want after the base emoji codepoint. For example:
emojiCard.setText(new String(Character.toChars(0x1F466)) + new String(Character.toChars(0x1F3FE)));
(👦 + 🏾 = 👦🏾)
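As a small sketch of how a tile's emoji could be stored and turned into a String, one option is to keep the code points as an int array and build the String from it. The class and method names below are hypothetical; new String(int[], int, int) is the standard constructor that takes code points.

```java
final class EmojiStrings {
    // Builds a String from one or more Unicode code points;
    // code points above U+FFFF become surrogate pairs automatically.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        String boy = fromCodePoints(0x1F466, 0x1F3FE); // base emoji + skin tone modifier
        System.out.println(boy.length());                          // UTF-16 code units
        System.out.println(boy.codePointCount(0, boy.length()));   // Unicode code points
    }
}
```

With this, a card only needs to hold something like int[] {0x1F466, 0x1F3FE}, and the result can be passed to setText() as in the line above.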

Error using native smiles of Android in text field (Unity3d)

I am making an application with Unity3d and building it for Android. When I type native Android smileys into an input field, I get an error
(invalid utf-16 sequence at 1411555520 (missing surrogate tail)) on this line:
r.font.RequestCharactersInTexture(chars, size, style);
chars contains a string that contains native Android smileys. How can I support native smileys? I use my own class for the input field.
Unfortunately, supporting emojis with Unity is hard. When I implemented this feature, it took about a month to finish it, with a custom text layout engine and string class. So, if this requirement is not particularly important, I would suggest axing this feature.
The reason behind this particular error is that Unity gets characters from the input string one by one and updates the visual string after every character. From a layman's point of view, this makes complete sense. However, it doesn't take into account how UTF-16 encoding, which is used in C#, works.
UTF-16 uses one 16-bit code unit per Unicode character for almost all characters that you would normally use. (And, as every developer knows, "almost all" is a red flag that will lie dormant for a long time and then will explode and destroy everything you love.) But it so happens that emoji characters do not fit into a single 16-bit UTF-16 code unit and use a special case: a surrogate pair.
A surrogate pair is a pair of UTF-16 code units that together represent a single Unicode character. The two halves don't have any meaning on their own, and when you try to render a UTF-16 code unit that is a surrogate head or a surrogate tail, you can expect to get an error like this, or something similar.
Essentially, what you need to implement is some kind of buffer that accepts C# UTF-16 characters one by one and passes them to the rendering code only once it has verified that all surrogate pairs are closed.
Oh, and I almost forgot! Some emoji, like country flags, are represented by two Unicode characters, which means that they can take up to four UTF-16 code units. Aren't text encodings fun?
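The buffering idea can be sketched as follows. This is written in Java rather than C# (both use UTF-16 strings, and C# has char.IsHighSurrogate and char.IsLowSurrogate as counterparts); the SurrogateBuffer class is hypothetical and only illustrates the technique, it is not Unity's API. Grouping the two code points of a flag into one unit is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class SurrogateBuffer {
    private char pendingHead = 0;                 // 0 means "no head waiting for its tail"
    private final List<String> completed = new ArrayList<>();

    // Accepts one UTF-16 code unit at a time, as an input field would deliver it.
    void accept(char unit) {
        if (Character.isHighSurrogate(unit)) {
            pendingHead = unit;                   // hold back until the tail arrives
        } else if (Character.isLowSurrogate(unit)) {
            if (pendingHead != 0) {
                completed.add(new String(new char[] {pendingHead, unit}));
                pendingHead = 0;
            }
            // A tail without a head is malformed input and is dropped here.
        } else {
            pendingHead = 0;                      // a dangling head is dropped as well
            completed.add(String.valueOf(unit));
        }
    }

    // Complete characters that are safe to hand to the rendering code.
    List<String> completed() {
        return completed;
    }
}
```

Only the entries of completed() would be passed on to the font call, so the renderer never sees half of a pair.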

How can I put utf-16 characters in Android string resource?

I want to use Emojis in my app's strings. All strings reside, of course, in strings.xml
The problem is that not all emojis fit in 16 bits. Some can be written as a "normal" 16-bit hex escape, '\u26FF', but others are code points above U+FFFF (two code units in UTF-16), usually written as '\x1F600'. I have no problem dealing with those inside the app, in code. But the strings.xml resource file is UTF-8 encoded and does not deal properly with escapes wider than 16 bits.
I tried using '\x1F600' because I saw that '\u26FF' works just fine, but it does not seem to accept the 'x' escape. Nor did it like the regexp notation '\x{1F600}'.
So I ended up using a string placeholder '%1$s' and filling in the emoji in code like this:
// greeting_3 is defined as: "hello there %1$s!"
String s = context.getString(R.string.greeting_3, "😜");
// OR:
String s = context.getString(R.string.greeting_3, new String(Character.toChars(0x1F61C)));
This is not a very elegant solution... Is there a proper way to put characters beyond 16 bits in strings.xml?
Regarding "But the strings.xml resource file is UTF8":
If it's UTF-8 encoded, you can put your emojis in directly. But then you risk that your editor or another piece of software destroys them.
If you are putting them in XML, you can try using XML character references such as &#x1F600;. I'm not sure how well Android supports them, though.
You can also use surrogate pairs: convert the emoji to UTF-16 and use the standard \u escape. You can, for example, check out this page, which even tells you how to write it as a string literal in Java: http://www.fileformat.info/info/unicode/char/1F600/index.htm
😀 → U+1F600 → "\uD83D\uDE00"
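If you would rather compute the surrogate pair than look it up, a minimal Java sketch could look like this. The class and method names are made up; Character.toChars does the actual UTF-16 conversion.

```java
final class Utf16Escapes {
    // Returns the backslash-u escapes for one code point,
    // one escape per UTF-16 code unit.
    static String toEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toEscapes(0x1F600)); // two escapes: a surrogate pair
        System.out.println(toEscapes(0x26FF));  // one escape: fits in 16 bits
    }
}
```

The printed text can then be pasted into the string resource in place of the emoji.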
The easiest way is just copying and pasting the emoji; this works in Android Studio 3.0 and newer.
Add the resource as follows:
<string name="string_title">This is an emoji example 🙂</string>

Special characters such as PI, or subscripts on the xml of Android

I am coding a maths app and I want to show special characters such as PI, E, or subscripts and all those things.
I want to show them on the xml file of the layout.
How can I do it?
Thank you guys for all!
You can use the Unicode value for the symbol, preceded by \u. For example, the pi character is "\u03C0"
This site: http://www.dionysia.org/html/entities/symbols.html has a list of entities which can be used in XML. Just look at the second element of each entry. For example:
String square = "&#8730;"; // square root sign
Then you need to convert it. For example:
CharSequence symbol = Html.fromHtml(square);
An alternative link is here: http://www.hrupin.com/2011/12/how-to-put-some-special-math-symbols-in-textview-editview-or-other-android-ui-element
The characters in a string resource are unicode. You can include special characters using the \unnnn notation.
There are many places to look up the unicode values on the web. Google found this one for me:
http://inamidst.com/stuff/unidata/
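For subscripts specifically, the digits 0 to 9 have dedicated subscript characters at U+2080 to U+2089, so they can be generated instead of looked up one by one. A minimal Java sketch, with hypothetical class and method names, and assuming the font used by your TextView contains these glyphs:

```java
final class MathSymbols {
    static final String PI = "\u03C0"; // Greek small letter pi

    // Converts a non-negative number to Unicode subscript digits (U+2080..U+2089).
    static String subscript(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("only non-negative numbers are handled here");
        }
        StringBuilder sb = new StringBuilder();
        for (char digit : Integer.toString(n).toCharArray()) {
            sb.append((char) ('\u2080' + (digit - '0')));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("x" + subscript(12) + " = 2" + PI);
    }
}
```

The resulting String can be set on a view from code; in the layout or strings.xml itself you would write the corresponding escapes directly.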
