I'm writing an Android application and I have an issue with an EditText.
When the user enters text with accents (e.g. "Nous sommes en été"),
the resulting string isn't correct. I guess it converts my accented characters to UTF-8.
How can I deal with this?
PS: my application is in French, so I really need accents to work.
MY CODE :
String description = ((EditText) findViewById(id.description)).getText().toString();
Log.i("UTF8",description.toString());
description = description.replace("\n", "");
Log.i("UTF8",description.toString());
There shouldn't be a problem. Strings are represented as UTF-16 [1] in Java / Android. I tried the following code on Android and it works fine; the accented é displays correctly.
EditText findViewById =(EditText) findViewById(R.id.test_text);
findViewById.setText("École. Nous sommes en été");
Editable text = findViewById.getText();
Log.d(TAG, text.toString());
Note that the first parameter of Log.i() is a tag; Log.i() does not take an encoding as a parameter. Can you post what those statements print?
[1] - http://docs.oracle.com/javase/7/docs/api/java/lang/String.html
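To check whether the string itself is intact (as opposed to the log output being displayed with the wrong encoding), one option is to print its UTF-16 code units. This is only a sketch in plain Java; the class and method names (AccentCheck, codeUnits) are made up for illustration, and on Android the result would be passed to Log.d instead of System.out.

```java
import java.nio.charset.StandardCharsets;

final class AccentCheck {
    // Returns every UTF-16 code unit of the string as a 4-digit hex value,
    // separated by spaces (hypothetical helper, not part of any Android API).
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String word = "\u00E9t\u00E9"; // "été", written with escapes to be independent of the file encoding
        System.out.println(codeUnits(word));
        System.out.println(utf8Length(word));
    }
}
```

If the dump shows 00E9 for each é, the String is fine and the problem lies in how the output is displayed or stored afterwards.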
I've been trying to find a good way to keep only emojis and letters in a given text, but none of the articles I found worked for me.
I've tried to use a regex, but I can't make it work.
I've tried emoji4j, but that library seems to work with emojis in the form ":)", which doesn't help me, because my emojis are groups of Unicode characters.
The result I want is the following :
"This is. a text 👨👩👧👦,,1234" => "This is a text 👨👩👧👦"
"👨👩👧👦" => "👨👩👧👦"
"👨👩👧👦😃123abc👨👩👧👦" => "👨👩👧👦😃abc👨👩👧👦"
Here's the emoji regex : ?:[\u2700-\u27bf]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?(?:\u200d(?:[^\ud800-\udfff]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?)*|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|[\ud83c\udd70-\ud83c\udd71]|[\ud83c\udd7e-\ud83c\udd7f]|\ud83c\udd8e|[\ud83c\udd91-\ud83c\udd9a]|[\ud83c\udde6-\ud83c\uddff]|[\ud83c\ude01-\ud83c\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c\ude32-\ud83c\ude3a]|[\ud83c\ude50-\ud83c\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff] .
If I try something like :
val regex = "the_whole_regex_above | [^a-zA-Z]".toRegex()
myText.replace(regex,""), it won't replace anything, basically every character will pass
Basically I want to achieve pretty much the same thing as in this question, but using Kotlin.
You want to remove all punctuation, symbols (other than those used to form emojis) and digits.
To do that, you may use
myText = myText.replace("""[\p{N}\p{P}\p{S}&&[^\p{So}]]+""".toRegex(), "")
See the online Kotlin demo.
Details
[ - start of a character class that matches:
\p{N} - any Unicode number (digits included)
\p{P} - any Unicode punctuation
\p{S} - any Unicode symbol
&&[^\p{So}] - EXCEPT the symbols in the "Symbol, other" (So) Unicode category, which are the ones mostly used to form emojis
]+ - 1 or more occurrences.
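For comparison, the same character class can be tried from plain Java, since Kotlin on the JVM uses the java.util.regex engine. This is a minimal sketch; the class and method names (EmojiFilter, stripDigitsPunctuationSymbols) are hypothetical, and the emoji is written as a surrogate-pair escape so the example does not depend on the source file encoding.

```java
final class EmojiFilter {
    // Same pattern as above; backslashes are doubled because this is a
    // regular Java string literal rather than a Kotlin raw string.
    private static final String UNWANTED = "[\\p{N}\\p{P}\\p{S}&&[^\\p{So}]]+";

    // Removes numbers, punctuation and symbols, but keeps "Symbol, other" characters.
    static String stripDigitsPunctuationSymbols(String text) {
        return text.replaceAll(UNWANTED, "");
    }

    public static void main(String[] args) {
        // \uD83D\uDE03 is the surrogate pair for U+1F603, a single emoji code point
        String input = "This is. a text \uD83D\uDE03,,1234";
        System.out.println(stripDigitsPunctuationSymbols(input));
    }
}
```

Whitespace and letters are not in any of the listed categories, so they pass through untouched.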
I'm working on an Android project and have a problem with EditText when I type Vietnamese.
For example, I type the word "thử" into the EditText and read the string from it:
String text = edittext.getText().toString();
It always returns a String object with 4 characters: "t", "h", "ư" and the accent character.
But if I create the String in code, like String text = "thử";, it contains only 3 characters: "t", "h" and "ử". So the two do not match when I compare them. I want the String object to contain 3 characters, not 4.
I also thought about looping through all the characters and replacing them manually. But Vietnamese has 12 vowels and 6 accents, so I would have to check 72 cases. I don't think that is a good way. Is there any way to get the proper text from the EditText? Or any good way to replace the text manually?
UPDATE:
I have found out why the EditText always returns the weird String. It is caused by the phone's keyboard app. I am using an LG Magna with its default keyboard app, which always encodes base vowels and accents separately for everything I type. I installed another keyboard app and it works like a charm.
Now I have to find a way to make sure that the text is always returned properly, whatever the keyboard app.
Android uses UTF-8, so please make sure that you are typing your Vietnamese characters as UTF-8 and not in a legacy code page such as Windows-1258.
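One possible way to make the comparison independent of the keyboard app is Unicode normalization. This assumes the extra fourth character is a Unicode combining mark (which the "ư" plus separate accent description suggests). In that case java.text.Normalizer with form NFC composes base letter and accent into the single precomposed character. The class and method names below (VietnameseText, compose) are hypothetical.

```java
import java.text.Normalizer;

final class VietnameseText {
    // Converts text to NFC, so that a base letter followed by combining marks
    // becomes a single precomposed character wherever Unicode defines one.
    static String compose(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String typed = "th\u01B0\u0309"; // t, h, ư, COMBINING HOOK ABOVE: 4 chars, as some keyboards produce
        String literal = "th\u1EED";     // t, h, ử: 3 chars, as written in source code

        System.out.println(typed.equals(literal));          // compares the raw strings
        System.out.println(compose(typed).equals(literal)); // compares after normalization
        System.out.println(compose(typed).length());
    }
}
```

Normalizing both sides before comparing (the text from the EditText and whatever it is compared against) avoids having to enumerate vowel and accent combinations by hand.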
I'm new to Android. I've created an EditText, and if I set the android:text attribute in the XML to a word with accents (I tried àèìòù), the text is displayed correctly.
If I try to set the same string from code on the EditText edit_message, I get the unknown-character symbol. This is my code:
EditText editText=(EditText)findViewById(R.id.edit_message);
editText.setText("àèìòù");
I think it's an encoding problem, but it seems strange.
Shouldn't the string be UTF-8 by default?
Use HTML entity-codes via Html.fromHtml:
editText.setText(Html.fromHtml("&agrave;&eacute; ..."));
A list of entity codes is available here:
http://symbolcodes.tlt.psu.edu/web/codehtml.html
You could use the htmlEncode method of the TextUtils class to automatically convert your input text to an encoded format:
String encodedText = TextUtils.htmlEncode("àèìòù");
editText.setText(Html.fromHtml(encodedText));
In my app I have a message which can be customised by the user and then displayed in the app.
If the user enters "£100", the pound sign is not displayed correctly.
I tried to use a font which contains this symbol but it didn't fix my problem.
Typeface font = Typeface.createFromAsset(getActivity().getAssets(), "arial_unicode.ttf");
alertTextView.setTypeface(font);
alertTextView.setText(message);
I tried Arial Unicode MS, Verdana, Arial, Code2000... but the problem persists.
Any ideas?
Use escape codes like \u00A3 (pound sign), or try to change the encoding of the string like this:
byte[] myByteArray = message.getBytes(Charsets.UTF_8);
String encodedMessage = new String(myByteArray, Charsets.UTF_8);
alertTextView.setText(encodedMessage);
I fixed my issue.
The problem was in the way I fetch and format the data in my message variable.
As I said in my comment above, I fetch this information from JSON data through an API. The character encoding I was using to decode it wasn't correct: I had to change it from "iso-8859-1" to "utf-8" to read my currency symbol correctly.
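The effect of decoding with the wrong charset can be reproduced in a few lines of plain Java. This is only an illustration, assuming the API sends UTF-8 bytes; the class and method names (CharsetMismatch, utf8Bytes, decodeLatin1, decodeUtf8) are hypothetical, and the byte array stands in for the raw response body.

```java
import java.nio.charset.StandardCharsets;

final class CharsetMismatch {
    // Simulates the bytes a server sends when it encodes its response as UTF-8.
    static byte[] utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Wrong: interprets every byte as one ISO-8859-1 character.
    static String decodeLatin1(byte[] body) {
        return new String(body, StandardCharsets.ISO_8859_1);
    }

    // Right: decodes the bytes with the charset they were encoded in.
    static String decodeUtf8(byte[] body) {
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] body = utf8Bytes("\u00A3100"); // "£100"; the pound sign becomes the two bytes C2 A3

        System.out.println(decodeLatin1(body)); // an extra character appears before the pound sign
        System.out.println(decodeUtf8(body));   // the original text
    }
}
```

No font can fix the first result, because the extra character is already part of the String by the time it reaches the TextView.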
In my app I have a TextView with some text. I'm trying to get input from the user and then highlight words in the TextView according to that input.
For instance if the text is
Hello stackoverflow
and the input for the user is
hello
I want to replace the text with:
<font color='red'>Hello</font> stackoverflow
This is my code:
String input = //GETTING INPUT FROM THE USER
text= text.replaceAll(input,"<font color='red'>"+input+"</font>");
Textview.setText(Html.fromHtml(text));
The replacement works, but the problem is that my current code changes the case of the original word, for example:
Text: HeLLo stackoverflow
Input: hello
What I get: <font color='red'>hello</font> stackoverflow
What I want: <font color='red'>HeLLo</font> stackoverflow
You have to think in terms of regular expressions.
replaceAll accepts a regular expression, so you can replace the text with the exact occurrence that was found.
For instance, if Hello is found, it is replaced with <font color='red'>Hello</font>.
If HeLLo is found, it is replaced with <font color='red'>HeLLo</font>.
Your code should be something as simple as:
String highlighted = text.replaceAll("(?i)("+input+")","<font color='red'>$1</font>");
This means:
(?i) : search case-insensitively
"("+input+")" : input is between ( and ) because we are creating a group, so that this group can be referred to later
"<font color='red'>$1</font>" : instead of replacing with input, which would change the case, we replace with $1, which is a reference to the first matched group. This means the replacement uses the exact word that was found.
But please try it and keep experimenting, since regular expressions are tricky.
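One thing to watch for: because input is concatenated into the pattern, characters such as +, ( or . typed by the user are interpreted as regex syntax. A possible variant, sketched here with hypothetical names (Highlighter, highlight), quotes the input with Pattern.quote and uses $0 (the whole match) in the replacement:

```java
import java.util.regex.Pattern;

final class Highlighter {
    // Wraps every case-insensitive occurrence of input in a red <font> tag,
    // keeping the original casing of the matched text.
    static String highlight(String text, String input) {
        if (input.isEmpty()) {
            return text; // nothing to highlight; avoids matching at every position
        }
        Pattern pattern = Pattern.compile(
                Pattern.quote(input), // treat the user's input as a literal, not as a regex
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        // $0 refers to the whole match, i.e. the text exactly as it appears in the original
        return pattern.matcher(text).replaceAll("<font color='red'>$0</font>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("HeLLo stackoverflow", "hello"));
    }
}
```

The returned string can then be passed to Html.fromHtml as in the question.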
Further reading
It is easier and clearer if you use the Pattern class.
You can read more here:
http://developer.android.com/reference/java/util/regex/Pattern.html
Also, you can take a look at how to do it:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
public String replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.
An invocation of this method of the form str.replaceAll(regex, repl) yields exactly the same result as the expression
Pattern.compile(regex).matcher(str).replaceAll(repl)
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.
Parameters:
regex - the regular expression to which this string is to be matched
replacement - the string to be substituted for each match
Returns:
The resulting String
UPDATE
You can test your regular expressions in this page:
http://www.regexplanet.com/advanced/java/index.html