Extract specific string by regular expression in Android [duplicate]

This question already has an answer here:
Select part of line in regular expression
(1 answer)
Closed 4 years ago.
I have these strings: https://regex101.com/r/7Er0Ch/6
I want to put every http://esupb.tabriz.ir:808x/srvSC.svc URL into an array list. To do that, I used a Matcher like this:
String regx= "#\\d+#";
Pattern pattern = Pattern.compile(regx);
Matcher matcher = pattern.matcher(url);
String[] metadata = new String[4];
while (matcher.find()) {
metadata[0] = matcher.group(1);
metadata[1] = matcher.group(2);
metadata[2] = matcher.group(3);
metadata[3] = matcher.group(4);
}
but I did not get the expected result. What is my mistake?

Based on your requirement, your regex should be
"(#\d+#)(http[^#]*svc)(#\d+#)"
group(0): the entire match, (#\\d+#)(http[^#]*svc)(#\\d+#)
group(1): (#\\d+#)
group(2): (http[^#]*svc)
group(3): (#\\d+#)
Change your code to
List<String> urls = new ArrayList<>();
String url =
"#1#http://test.com:8080/srv.svc#1# " +
"#2#http://test.com:8081/srv.svc#2# " +
"#3#http://test.com:8082/srv.svc#3# " +
"#4#http://test.com:8083/srv.svc#4# " +
"#5#http://test.com:8084/srv.svc#5# ";
String regx = "(#\\d+#)(http[^#]*svc)(#\\d+#)";
Pattern pattern = Pattern.compile(regx);
Matcher matcher = pattern.matcher(url);
int from = 0;
while (matcher.find(from)) {
urls.add(matcher.group(2));
from = matcher.start() + 1;
}
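For reference, here is a self-contained sketch of the same idea. The class and method names (UrlExtractor, extractUrls) are made up for illustration, and it assumes only the URL needs capturing, so the pattern has a single group. It also relies on plain find(), which resumes after the previous match, rather than tracking an offset by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class UrlExtractor {
    // Group 1 captures the URL that sits between the "#n#" markers.
    private static final Pattern URL_BETWEEN_MARKERS =
            Pattern.compile("#\\d+#(http[^#]*svc)#\\d+#");

    static List<String> extractUrls(String input) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL_BETWEEN_MARKERS.matcher(input);
        // find() continues after the previous match, so no manual offset is needed.
        while (matcher.find()) {
            urls.add(matcher.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        String input = "#1#http://test.com:8080/srv.svc#1# "
                + "#2#http://test.com:8081/srv.svc#2# ";
        System.out.println(extractUrls(input));
    }
}
```

If no entry matches, the method returns an empty list, which is usually easier to handle than a fixed-size array.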

Your regex #\\d+# matches a #, followed by one or more digits, and then another #. It does not use capturing groups, so there is no group(1) to read.
For your example data, you could remove those matches from the string, which leaves the desired result without having to write a pattern for the text in between. Note, however, that the pattern can also match inside the string instead of only at the start and the end.
To match your example string(s) like http://esupb.tabriz.ir:808x/srvSC.svc, you might use your regex to match the start and the end, and capture in a group what is in between.
^#\d+#(https?://test.ir:808\d/srvSC\.svc)#\d+#$
In Java
^#\\d+#(https?://test.ir:808\\d/srvSC\\.svc)#\\d+#$
Explanation
^ Assert the start of the string
#\d+# Match #, one or more digits, and another #
( Start capturing group
https?://test.ir:808\d Match the start of the URL with an optional s (s?) and a single digit after 808. Use \d+ instead to match one or more digits.
/srvSC\.svc Match /srvSC.svc
) Close capturing group
#\d+# Match #, one or more digits, and another #
$ Assert the end of the string
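A minimal sketch of how the anchored pattern might be used on a single entry. The names AnchoredUrlMatch and urlOf are hypothetical. It uses the test.ir host from the pattern above and, as a small tightening, also escapes the dot in the host name so that it matches a literal dot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class AnchoredUrlMatch {
    private static final Pattern ENTRY =
            Pattern.compile("^#\\d+#(https?://test\\.ir:808\\d/srvSC\\.svc)#\\d+#$");

    /** Returns the URL captured in group 1, or null when the entry does not match. */
    static String urlOf(String entry) {
        Matcher m = ENTRY.matcher(entry);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(urlOf("#1#http://test.ir:8080/srvSC.svc#1#"));
        System.out.println(urlOf("prefix #1#http://test.ir:8080/srvSC.svc#1#"));
    }
}
```

Because of the ^ and $ anchors, this only works when each entry is checked on its own; for several entries in one string, split the input first or drop the anchors and loop with find().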

Related

Parsing a string?

So I have the following string:
String text = "\t\t\torder #168\n\t\t\tpaid\n\t\t\tview 4 items\n\t\t\tpicked up\n\t\t\tcomplete pickup\n\t\t\t2 stops";
How do I parse this string so that I always get the 2 in front of stops? I have tried the following, but it always returns 2 stops.
String substr = "complete pickup";
String numberOfStops = text.substring(text.indexOf(substr) + substr.length());
numberOfStops = numberOfStops.replaceAll("^\\s+","").replaceAll("\\s+$","");

The short way:
numberOfStops = numberOfStops.replace("stops", "").trim();
The flexible way is to use a regex with the Pattern and Matcher classes. Let me know if you need it.
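In case it helps, the regex route could look roughly like this. It is only a sketch: the class and method names are invented, and it assumes the count always appears as digits followed by whitespace and the word "stop" or "stops". It returns -1 when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class StopsParser {
    // Group 1 captures the digits directly in front of "stop" or "stops".
    private static final Pattern STOPS = Pattern.compile("(\\d+)\\s+stops?");

    /** Returns the number of stops, or -1 if the text contains no stop count. */
    static int parseStops(String text) {
        Matcher m = STOPS.matcher(text);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String text = "\t\t\torder #168\n\t\t\tpaid\n\t\t\tview 4 items\n"
                + "\t\t\tpicked up\n\t\t\tcomplete pickup\n\t\t\t2 stops";
        System.out.println(parseStops(text));
    }
}
```

Unlike the substring approach, this does not depend on the "complete pickup" line being present, and the other numbers in the text (168, 4) are ignored because they are not followed by "stops".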

Regex with error checking

I've done a bunch of searching, but I'm terrible with regex statements and my google-fu in this instance has not been strong.
Scenario:
In push notifications, we're passed a URL that contains a 9-digit content ID.
Example URL: http://www.something.com/foo/bar/Some-title-Goes-here-123456789.html (123456789 is the content ID in this scenario)
Current regex to parse the content ID:
public String getContentIdFromPathAndQueryString(String path, String queryString) {
String contentId = null;
if (StringUtils.isNonEmpty(path)) {
Pattern p = Pattern.compile("([\\d]{9})(?=.html)");
Matcher m = p.matcher(path);
if (m.find()) {
contentId = m.group();
} else if (StringUtils.isNonEmpty(queryString)) {
p = Pattern.compile("(?:contentId=)([\\d]{9})(?=.html)");
m = p.matcher(queryString);
if (m.find()) {
contentId = m.group();
}
}
}
Log.d(LOG_TAG, "Content id " + (contentId == null ? "not found" : (" found - " + contentId)));
if (StringUtils.isEmpty(contentId)) {
Answers.getInstance().logCustom(new CustomEvent("eid_url")
.putCustomAttribute("contentId", "empty")
.putCustomAttribute("path", path)
.putCustomAttribute("query", queryString));
}
return contentId;
}
The problem:
This does the job but there's a specific error scenario that I need to account for.
Whoever creates the push may enter a content ID of the wrong length, and we need to grab it regardless, so assume it can be any number of digits. The title can also contain digits, which is annoying. The content ID will ALWAYS be followed by ".html".

While the basic answer here would be just "replace the {9} limiting quantifier, which matches exactly 9 occurrences, with a + quantifier, which matches 1 or more occurrences", there are several other things in your two patterns that can be improved.
The unescaped dot should be escaped in the pattern to match a literal dot.
If you have no overlapping matches, there is no need for a positive lookahead after the capturing group; just keep the capturing group and grab the .group(1) value.
A non-capturing group (?:...) is still a consuming pattern, so (?:contentId=) is equivalent to contentId= (you may remove (?: and )).
There is no need to wrap a single atom in a character class; use \\d instead of [\\d]. That [\\d] is a source of misunderstanding: some may think it is a grouping construct and try adding alternative sequences inside the square brackets, whereas [...] matches a single char.
So, your code can look like
Pattern p = Pattern.compile("(\\d+)\\.html"); // No lookahead, + instead of {9}
Matcher m = p.matcher(path);
if (m.find()) {
contentId = m.group(1); // (1) refers to Group 1
} else if (StringUtils.isNonEmpty(queryString)) {
p = Pattern.compile("contentId=(\\d+)\\.html");
m = p.matcher(queryString);
if (m.find()) {
contentId = m.group(1);
}
}
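To check the path pattern in isolation, here is a small standalone sketch. ContentIdParser and contentIdFromPath are made-up names, and the StringUtils and logging calls from the question are left out. The second sample path is an invented example of a title that contains digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class ContentIdParser {
    // Group 1 captures the run of digits immediately before a literal ".html".
    private static final Pattern PATH_ID = Pattern.compile("(\\d+)\\.html");

    /** Returns the content ID from the path, or null when none is present. */
    static String contentIdFromPath(String path) {
        if (path == null) {
            return null;
        }
        Matcher m = PATH_ID.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(contentIdFromPath("/foo/bar/Some-title-Goes-here-123456789.html"));
        System.out.println(contentIdFromPath("/foo/bar/Top-10-tips-4567.html"));
    }
}
```

Digits in the title are skipped because they are not directly followed by ".html", so only the trailing ID is captured regardless of its length.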

accepting acents in regex android

Hi all,
I am having trouble getting a regex to accept accented characters on Android. Everything we have tried that works in plain Java does not work properly here; Android does not accept our accented vowels.
I have the following regex:
Pattern pattern = Pattern.compile("[a-zA-ZñÑáéíóúÁÉÍÓÚ]+");
Any tips on how to include ñ and accented vowels on Android?
Thanks very much in advance.
Here is our validation function:
public static boolean validarNombres(String nameToValidate){
byte step = 1;
byte minWords = 2;
byte maxWords = 5;
boolean validName = false;
String[] aux;
Matcher matcher = null;
Pattern pattern = Pattern.compile("[\\p{L}\\p{M}]+");
aux = nameToValidate.split(" ");
// STEP 2: check that the name has from 2 to 5 words
if(aux.length >= minWords && aux.length <= maxWords){
step++;
matcher = pattern.matcher(nameToValidate);
}
// STEP 3: check that the name matches our regex
if(step==2 && matcher.matches()){
validName = true;
}
return validName;
}
EDIT: I think we found the mistake. We are not allowing the blank space between the first and second name. It works fine when we check a single word, but not for the full name.
What do we need to add to our regex to allow a blank space?
Thanks very much

To validate a string consisting of 2 to 5 words separated with whitespace(s), you may use
public static boolean validarNombres(String nameToValidate) {
return nameToValidate.matches("[\\p{L}\\p{M}]+(?:\\s[\\p{L}\\p{M}]+){1,4}");
}
The regex is anchored by default when used with the .matches() method, so there is no need to add ^ and $.
Pattern details:
[\\p{L}\\p{M}]+ - 1 or more letters or/and diacritics
(?:\\s[\\p{L}\\p{M}]+){1,4} - 1 to 4 (so, 2 to 5 in total) sequences of:
\\s - a single whitespace
[\\p{L}\\p{M}]+ - 1 or more letters or/and diacritics
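A quick way to try the pattern outside the app is a sketch like the one below. The NameValidator class and the null check are additions for illustration, and the accented characters are written as Unicode escapes so the file compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class for illustration only.
class NameValidator {
    // 2 to 5 words of letters or combining marks, separated by single whitespace chars.
    private static final Pattern FULL_NAME =
            Pattern.compile("[\\p{L}\\p{M}]+(?:\\s[\\p{L}\\p{M}]+){1,4}");

    static boolean validarNombres(String nameToValidate) {
        // matches() requires the whole string to match, so no ^ or $ is needed.
        return nameToValidate != null && FULL_NAME.matcher(nameToValidate).matches();
    }

    public static void main(String[] args) {
        // "Jose Pena" written with an accented e and an n with tilde
        System.out.println(validarNombres("Jos\u00e9 Pe\u00f1a"));
        System.out.println(validarNombres("Jos\u00e9"));
    }
}
```

Note that \\s without a quantifier allows exactly one whitespace character between words; change it to \\s+ if repeated spaces should be accepted.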

Is there any method to add some string to what found by regex in Android Studio?(ctrl+shift+R)

Basically, what I am trying to do is add double quote to the heads and tails of the numbers
String a = 1;
String b = 2;
String c = 3;
to
String a = "1";
String b = "2";
String c = "3";
So, I used [1-9] to find all the numbers. Then I realized that I don't know how to reuse the value the regex found, that is, what to put between the double quotes in the replacement.
Hence, I am wondering whether this is possible.

You should use \d+ instead of [1-9], or at the very least [0-9]+, to include 0.
You need the + because otherwise your regex would not match 10 or any other number with more than one digit. You can reference the groups you captured with $1 (first group), $2 (second group), and so on. So you could use (\d+) as the search and "$1" as the replacement, although a more specific regex is safer, e.g.:
=\s*(\d+);
replace with
= "$1";
See https://regex101.com/r/SaT6nK/1
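The same search and replacement can be tried in plain Java, since String.replaceAll also uses $1 for the first captured group. This is just a sketch with an invented class name; the IDE's replace dialog needs no code at all.

```java
// Hypothetical helper class for illustration only.
class QuoteNumbers {
    /** Wraps every bare integer assigned with "=" in double quotes. */
    static String quoteNumericLiterals(String source) {
        // Search: =\s*(\d+);   Replacement: = "$1";
        return source.replaceAll("=\\s*(\\d+);", "= \"$1\";");
    }

    public static void main(String[] args) {
        String source = "String a = 1;\nString b = 20;\nString c = 3;";
        System.out.println(quoteNumericLiterals(source));
    }
}
```

Keeping the + inside the parentheses matters: with (\d)+ the group would only hold the last digit, so 20 would be replaced by "0".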

how to get text with using substring

I am trying to get only the part "9916-4203" from "Region Code:9916-4203 " in Android. How can I do this?
I tried the code below using the substring method, but it doesn't work:
firstNumber = Integer.parseInt(message.substring(11, 19));

If you know that the string contains "Region Code:", couldn't you do a replace?
message = message.replace("Region Code:", "");

Assuming that you have only one number in your String, the following will remove any non-digit characters and parse the resulting number:
public static int getNumber(String num){
String tmp = "";
for(int i=0;i<num.length();i++){
if(Character.isDigit(num.charAt(i)))
tmp += num.charAt(i);
}
return Integer.parseInt(tmp);
}
Output in your case: 99164203
And as already mentioned, you won't be able to parse a String to an Integer if it contains any non-digit characters.

I'm going to guess that what you want to extract is the full region code text minus the title. So maybe a regex would be a good, simple fit for you?
String myString = "Region Code:9916-4203";
String match = "";
// Capture everything after the colon in group 1
String pattern = ":(.*)";
Pattern regEx = Pattern.compile(pattern);
// Find the first instance of the pattern
Matcher m = regEx.matcher(myString);
if (m.find()) {
match = m.group(1);
}
The variable match will contain "9916-4203".
This should work for you.
Java code sourced from http://android-elements.blogspot.in/2011/04/regular-expressions-in-android.html

In Java, the substring() method treats the first parameter as inclusive and the second as exclusive, meaning "Hello".substring(0, 2) results in the string He.
In addition to not parsing something that isn't a number, as @Opiatefuchs mentioned, your substring call should instead be message.substring(12, 21).
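To avoid counting indexes by hand, the start offset can be derived from the label itself. A minimal sketch, with made-up class and method names, assuming the label is always exactly "Region Code:":

```java
// Hypothetical helper class for illustration only.
class RegionCode {
    private static final String LABEL = "Region Code:";

    /** Returns the text after "Region Code:", trimmed, or null if the label is missing. */
    static String extract(String message) {
        int start = message.indexOf(LABEL);
        if (start < 0) {
            return null;
        }
        // substring(begin) runs to the end of the string; trim() drops the trailing space.
        return message.substring(start + LABEL.length()).trim();
    }

    public static void main(String[] args) {
        System.out.println(extract("Region Code:9916-4203 "));
        // Equivalent fixed-index form: begin index inclusive, end index exclusive.
        System.out.println("Region Code:9916-4203 ".substring(12, 21));
    }
}
```

The result still contains a hyphen, so passing it to Integer.parseInt would throw a NumberFormatException; keep it as a String or remove the hyphen first.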
