In my app I have a TextView with some text. I'm trying to get input from the user and then highlight words in the TextView according to that input.
For instance if the text is
Hello stackoverflow
and the input for the user is
hello
I want to replace the text with:
<font color='red'>Hello</font> stackoverflow
This is my code:
String input = //GETTING INPUT FROM THE USER
text= text.replaceAll(input,"<font color='red'>"+input+"</font>");
Textview.setText(Html.fromHtml(text));
And the replacement is working, but the problem is that my current code changes the original word cases, for example :
Text: HeLLo stackoverflow
Input: hello
What I get: <font color='red'>hello</font> stackoverflow
What I want: <font color='red'>HeLLo</font> stackoverflow
Regular expressions can solve this.
replaceAll accepts a regular expression, so you can replace each match with the exact text that was found.
For instance, if Hello is found, it is replaced with <font color='red'>Hello</font>.
If HeLLo is found, it is replaced with <font color='red'>HeLLo</font>.
Your code could be something as simple as:
String highlighted = text.replaceAll("(?i)("+input+")","<font color='red'>$1</font>");
This means:
(?i) : search case-insensitively
"("+input+")" : input is between ( and ) because we are creating a group, so that the group can be referred to later
"<font color='red'>$1</font>" : instead of replacing with input, which would change the case, we replace with $1, a reference to the first matched group. This means the replacement uses the exact word that was found.
But please try it and keep experimenting, since regular expressions are tricky.
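One caveat with the one-liner above: input is concatenated straight into the pattern, so characters such as . or ( typed by the user would be read as regex syntax. A minimal sketch of a safer variant, assuming plain Java (no Android classes) and a hypothetical helper named highlight, quotes the input with Pattern.quote first:

```java
import java.util.regex.Pattern;

class HighlightDemo {
    // Hypothetical helper: wraps every case-insensitive occurrence of input
    // in a red <font> tag, keeping the casing found in the text.
    static String highlight(String text, String input) {
        if (input.isEmpty()) {
            return text; // an empty pattern would match between every character
        }
        // Pattern.quote makes the user's input match literally.
        Pattern p = Pattern.compile("(" + Pattern.quote(input) + ")",
                Pattern.CASE_INSENSITIVE);
        // $1 refers to the text that was actually matched.
        return p.matcher(text).replaceAll("<font color='red'>$1</font>");
    }

    public static void main(String[] args) {
        System.out.println(highlight("HeLLo stackoverflow", "hello"));
    }
}
```

The returned string can then be passed to Html.fromHtml as in the question. Note that CASE_INSENSITIVE on its own only folds ASCII letters; adding Pattern.UNICODE_CASE extends it to other alphabets.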
Other reads
It is easier and clearer if you use the Pattern class.
You can read more here:
http://developer.android.com/reference/java/util/regex/Pattern.html
Also, you can take a look at how to do it:
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#replaceAll%28java.lang.String,%20java.lang.String%29
public String replaceAll(String regex, String replacement)
Replaces each substring of this string that matches the given regular expression with the given replacement.
An invocation of this method of the form str.replaceAll(regex, repl) yields exactly the same result as the expression
Pattern.compile(regex).matcher(str).replaceAll(repl)
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string; see Matcher.replaceAll. Use Matcher.quoteReplacement(java.lang.String) to suppress the special meaning of these characters, if desired.
Parameters:
regex - the regular expression to which this string is to be matched
replacement - the string to be substituted for each match
Returns:
The resulting String
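The note about backslashes and dollar signs matters when the replacement text is not under your control. A small sketch, using a made-up price string and a hypothetical helper named replaceLiterally, of how Matcher.quoteReplacement keeps a $ in the replacement literal:

```java
import java.util.regex.Matcher;

class QuoteReplacementDemo {
    // Replaces every match of regex with replacement taken literally,
    // so "$" and "\" in it are not interpreted as group references or escapes.
    static String replaceLiterally(String text, String regex, String replacement) {
        return text.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        // Without quoteReplacement, "$5" would be read as a reference to
        // group 5, which this pattern does not have.
        System.out.println(replaceLiterally("cost: PRICE", "PRICE", "$5"));
    }
}
```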
UPDATE
You can test your regular expressions on this page:
http://www.regexplanet.com/advanced/java/index.html
Related
I've been trying to find a good way to keep only emojis and letters in a given text, but none of the articles I found worked for me.
I've tried to use regex, but I can't seem to make it work.
I've tried emoji4j, but that library seems to work with emojis in the form ":)", which doesn't help me, because my emojis are groups of Unicode characters.
The result I want is the following:
"This is. a text 👨‍👩‍👧‍👦,,1234" => "This is a text 👨‍👩‍👧‍👦"
"👨‍👩‍👧‍👦" => "👨‍👩‍👧‍👦"
"๐จโ๐ฉโ๐งโ๐ฆ๐123abc๐จโ๐ฉโ๐งโ๐ฆ" => "๐จโ๐ฉโ๐งโ๐ฆ๐abc๐จโ๐ฉโ๐งโ๐ฆ"
Here's the emoji regex: (?:[\u2700-\u27bf]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?(?:\u200d(?:[^\ud800-\udfff]|(?:[\ud83c\udde6-\ud83c\uddff]){2}|[\ud800\udc00-\uDBFF\uDFFF]|[\u2600-\u26FF])[\ufe0e\ufe0f]?(?:[\u0300-\u036f\ufe20-\ufe23\u20d0-\u20f0]|[\ud83c\udffb-\ud83c\udfff])?)*|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|[\ud83c\udd70-\ud83c\udd71]|[\ud83c\udd7e-\ud83c\udd7f]|\ud83c\udd8e|[\ud83c\udd91-\ud83c\udd9a]|[\ud83c\udde6-\ud83c\uddff]|[\ud83c\ude01-\ud83c\ude02]|\ud83c\ude1a|\ud83c\ude2f|[\ud83c\ude32-\ud83c\ude3a]|[\ud83c\ude50-\ud83c\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff]
If I try something like:
val regex = "the_whole_regex_above | [^a-zA-Z]".toRegex()
myText.replace(regex,"")
it doesn't replace anything; basically every character passes through.
Basically I want to achieve pretty much the same thing as in this question, but using Kotlin.
You want to remove all punctuation, symbols (other than those used to form emojis) and digits.
To do that, you may use
myText = myText.replace("""[\p{N}\p{P}\p{S}&&[^\p{So}]]+""".toRegex(), "")
See the online Kotlin demo.
Details
[ - start of a character class that matches:
\p{N} - any Unicode number
\p{P} - any Unicode punctuation
\p{S} - any Unicode symbol
&&[^\p{So}] - EXCEPT symbols in the Symbol, other (So) Unicode category, which covers most of the characters used to form emojis
]+ - end of the class, 1 or more occurrences.
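Kotlin on the JVM uses java.util.regex, so the same character-class intersection can be tried from Java as well. A minimal sketch, assuming a hypothetical helper named strip and a shortened two-person emoji sequence as sample input:

```java
class KeepLettersAndEmoji {
    // Removes numbers, punctuation and symbols, except code points in the
    // "Symbol, other" (So) category, which is where most emoji live.
    static String strip(String text) {
        return text.replaceAll("[\\p{N}\\p{P}\\p{S}&&[^\\p{So}]]+", "");
    }

    public static void main(String[] args) {
        // U+1F468 (man) and U+1F469 (woman) joined by U+200D (zero-width joiner).
        String pair = "\uD83D\uDC68\u200D\uD83D\uDC69";
        System.out.println(strip("This is. a text " + pair + ",,1234"));
    }
}
```

The zero-width joiner survives because it belongs to the Format (Cf) category, which the class does not touch. Emoji skin-tone modifiers are in Symbol, modifier (Sk), so this pattern would strip them.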
I have to implement a function that checks whether a string complies with a regular expression. I have written a method that parses a list of file names, and for each file name I need to check whether it matches the regexp.
The file names look like this (just an example):
verbale.pdf.001.001
image.jpg.002.001
The string is always composed of:
extension (only jpg or pdf), ".", a group of three digits, ".", a group of three digits
With this regexp I need to check whether the input string ends as described above. I have currently implemented this:
Pattern rexExp = Pattern.compile("((\\.jpg)|(\\.pdf))\\.[0-9]{3}\\.[0-9]{3}");
But it does not work properly. Is it a good idea to use a regexp to check whether a file name ends with a certain pattern?
Less greedy than the other answer; I think it suits you:
\\w+\\.(jpg|pdf)(\\.\\d{3}){2}
file name, only composed of letters, numbers and _
dot
jpg or pdf formats
another dot
three digits
the dot and the three digits repeated
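A short sketch of how this pattern could be applied to a list of file names with matches(), which requires the whole name to fit. The class and method names are illustrative, and notes.txt.001.001 is a made-up counterexample:

```java
import java.util.regex.Pattern;

class FileNameCheck {
    private static final Pattern NAME =
            Pattern.compile("\\w+\\.(jpg|pdf)(\\.\\d{3}){2}");

    // matches() succeeds only if the entire file name fits the pattern.
    static boolean isValid(String fileName) {
        return NAME.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] names = {"verbale.pdf.001.001", "image.jpg.002.001", "notes.txt.001.001"};
        for (String name : names) {
            System.out.println(name + " -> " + isValid(name));
        }
    }
}
```

Because \w covers only letters, digits and underscore, a name such as my-file.pdf.001.001 would be rejected by this version.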
This should work:
.*\\w{3}\\.\\d{3}\\.\\d{3}
.* = any characters (like "verbale123")
\\w{3} = any 3 alphanumeric characters or underscores
\\. = a dot
\\d{3} = any three numeric characters
To check if a string ends with pdf or jpg and two sequences of . and 3 digits, you may use
(?i)(?:jpg|pdf)(?:\.[0-9]{3}){2}$
See the regex demo
Details
(?i) - case insensitive flag
(?:jpg|pdf) - either jpg or pdf
(?:\.[0-9]{3}){2} - 2 repetitions of a . and 3 digits
$ - end of string.
Use with Matcher#find() (as matches() anchors the match at the start and end of the string, while a partial match is required when using this pattern), example demo:
String s = "verbale.pdf.001.001";
Matcher matcher = Pattern.compile("(?i)(?:jpg|pdf)(?:\\.[0-9]{3}){2}$").matcher(s);
if (matcher.find()) {
    System.out.println("Valid!");
}
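To make the find() versus matches() distinction concrete, here is a small sketch with the same pattern. The helper names endsWithSuffix and isExactlySuffix are made up for illustration:

```java
import java.util.regex.Pattern;

class SuffixCheck {
    private static final Pattern SUFFIX =
            Pattern.compile("(?i)(?:jpg|pdf)(?:\\.[0-9]{3}){2}$");

    // find() looks for the pattern anywhere; the trailing $ pins it to the end.
    static boolean endsWithSuffix(String s) {
        return SUFFIX.matcher(s).find();
    }

    // matches() needs the pattern to cover the entire string.
    static boolean isExactlySuffix(String s) {
        return SUFFIX.matcher(s).matches();
    }

    public static void main(String[] args) {
        String s = "verbale.pdf.001.001";
        System.out.println(endsWithSuffix(s));  // the ending is present
        System.out.println(isExactlySuffix(s)); // "verbale." is not covered by the pattern
    }
}
```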
Hmm, I can't find the man page for 'replace' in Google Apps Script; I only see 'replaceText'. Anyway, from what I gather from the SO posts, the code below should work; hopefully someone can spot the problem easily.
The String in the Cell is "[pro] all, everybody" and I want to remove the bracketed word '[pro]' so the result is 'all, everybody'.
It does work just fine with:
Cell = Cell.toString().replace("\[pro\]","");
but when I try to make it generic, it fails with all these (not sure what the pattern matching rules are, thus the question for the man page):
Cell = Cell.toString().replace("\[pr.\]","");
Cell = Cell.toString().replace("\[pr.*\]","");
Cell = Cell.toString().replace("\[.*\]","");
They should work, no? What am I missing?
Also, how would I use 'replaceText', I can't seem to apply it directly to the 'Cell' object.
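String#replace is a JavaScript function. A plain string argument is matched literally, which is why "\[pro\]" worked and the versions with . and * did not. To match a pattern, pass a regex, written either in regex literal notation or with the new RegExp("pattern", "modifiers") constructor notation: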
Cell = Cell.toString().replace(/\[pr[^\]]*]/,"");
When using a regex literal, backslashes are treated as literal backslashes, and /\d/ matches a digit. The constructor notation equivalent is new RegExp("\\d").
The /\[pr[^\]]*]/ regex matches the first instance of:
\[pr - literal substring [pr
[^\]]* - 0+ chars other than ]
] - a literal ] symbol.
And replaces with an empty string.
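For comparison only, the same pattern can be exercised in Java, where it has to be written as a string literal with doubled backslashes, much like the RegExp constructor form. This is a sketch with a hypothetical helper named removeTag, not Apps Script code:

```java
class BracketTagDemo {
    // Removes the first "[pr...]" tag. replaceFirst mirrors JavaScript's
    // replace with a non-global regex, which also stops after one match.
    static String removeTag(String cell) {
        return cell.replaceFirst("\\[pr[^\\]]*\\]", "");
    }

    public static void main(String[] args) {
        // The space after the tag is left in place; trim() drops it.
        System.out.println(removeTag("[pro] all, everybody").trim());
    }
}
```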
I want to check whether a web page address appears in a text. For example, I have the page address www.taktik.com/trow and want to check whether such an address is in the text.
I use Matcher mW = Pattern.compile("[a-zA-Z0-9_.+-]+.[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+\\/[a-zA-Z0-9-.]+").matcher(question); but I don't get any results. How can I check if text xxx.xxxx.xxx/xxx is in my String?
How can I check if text xxx.xxxx.xxx/xxx is in my String?
Fixing your regex, the pattern may look like
[a-zA-Z0-9_.+-]+\\.[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+/[a-zA-Z0-9.-]+
Note that I escaped the first dot and placed the hyphen at the end of the last two character classes (in yours, you have 9-. that creates a range that matches more than you'd want).
I tried to shorten the pattern a bit, but it's difficult since \w also matches Unicode characters in Android. Here is a possible regex:
(?i)[A-Z0-9_+-]+(?:\\.[A-Z0-9-]+){2}/[A-Z0-9-]+
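A minimal sketch of using the first fixed pattern with Matcher#find() to pull the address out of a longer text. The class name, the findAddress helper and the sample sentence are assumptions for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressFinder {
    private static final Pattern ADDRESS = Pattern.compile(
            "[a-zA-Z0-9_.+-]+\\.[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+/[a-zA-Z0-9.-]+");

    // Returns the first xxx.xxxx.xxx/xxx-style substring, or null if none is present.
    static String findAddress(String text) {
        Matcher m = ADDRESS.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(findAddress("My page is www.taktik.com/trow today"));
    }
}
```

Since the last character class includes the dot, a sentence-ending period directly after the address would be captured as part of the match.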
I am using Robotium to test an Android project. I have a test case where I need to verify that a message consisting of special characters is posted correctly.
So I created a constant consisting of special characters :
public static final String PostMessageWithSpecialchars = "Hey hi,* Have a good day*.:()[]-=/&!?\"'+;##";
and I am using the following code to search for it and assert that the posted message is exactly like the message in the constant PostMessageWithSpecialchars
assertTrue(solo.searchText(PostMessageWithSpecialchars));
but the test fails at the assertTrue line.
What should I do to search for the PostMessageWithSpecialchars text? I don't want to use escape characters because that will ignore special characters. I want to make sure that the special characters in the PostMessageWithSpecialchars message are posted correctly.
The method solo.searchText() accepts a regex pattern. Your search string contains characters that have a special meaning in patterns. You can quote the string so that it is matched literally:
assertTrue(solo.searchText(Pattern.quote(PostMessageWithSpecialchars)));
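The effect of Pattern.quote can be checked outside Robotium with plain java.util.regex. This is a sketch under that assumption; containsLiteral is a hypothetical helper, and the haystack strings are made up:

```java
import java.util.regex.Pattern;

class QuoteDemo {
    static final String MESSAGE = "Hey hi,* Have a good day*.:()[]-=/&!?\"'+;##";

    // Searches haystack for needle as literal text by quoting the needle
    // before compiling it as a regex.
    static boolean containsLiteral(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        // The quoted message is found only when every special character is present.
        System.out.println(containsLiteral("Posted: " + MESSAGE, MESSAGE));
        System.out.println(containsLiteral("Hey hi, Have a good day", MESSAGE));
    }
}
```

Quoting does not ignore the special characters; each of them must appear in the searched text exactly as written in the constant.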