I need to parse a string like this:
Small Intestine (T N M - Stage 0)
What I want to save is everything before the first bracket, but I don't know whether I'll have one, two or more words.
How can I do this in Java? What is the correct regexp to use?
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegExpTest
{
    public static void main( String args[] ){
        // String to be scanned to find the pattern.
        String line = "Small Intestine (T N M - Stage 0)";
        String pattern = "^.+?(?= *\\()";
        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        // Now create matcher object.
        Matcher m = r.matcher(line);
        if (m.find( )) {
            System.out.println("Found value: " + m.group(0));
        } else {
            System.out.println("NO MATCH");
        }
    }
}
You can use this regex:
^.+?(?= *\()
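If a regex feels like overkill here, a plain `indexOf` may do the same job. This is only a sketch: `BeforeBracket` and `beforeFirstBracket` are names made up for illustration, and unlike the regex above (which reports no match when there is no bracket) it assumes you want the whole trimmed string back in that case.

```java
class BeforeBracket {
    // Hypothetical helper: returns the text before the first '(' with
    // surrounding whitespace trimmed, or the whole trimmed string when
    // the input contains no bracket at all (an assumption, see above).
    static String beforeFirstBracket(String line) {
        int idx = line.indexOf('(');
        String head = idx < 0 ? line : line.substring(0, idx);
        return head.trim();
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstBracket("Small Intestine (T N M - Stage 0)"));
        System.out.println(beforeFirstBracket("Liver"));
    }
}
```

This works the same way whether one, two or more words come before the bracket.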
I am creating an Android app that randomly generates a category, and I would like to extract the category word from the generated string. This is an example string:
String s = "The category Animals that starts with a letter J";
or
String s = "The category Colors that starts with a letter V";
I need to get the word Animals or Colors every time a random String is generated.
A not so advanced solution, but easy to understand:
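public void findCategory() {
    String string = "The category Colors that starts with a letter V";
    String[] split = string.split(" ");
    // Find the word "category"; the word right after it is the category name.
    // Stopping at length - 1 avoids an ArrayIndexOutOfBoundsException when
    // "category" is missing or is the last word.
    for (int i = 0; i < split.length - 1; i++) {
        if ("category".equals(split[i])) {
            System.out.println(split[i + 1]);
            break;
        }
    }
}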
You may use regex.
Matcher m = Pattern.compile("\\bcategory\\s+(\\S+)").matcher(str);
while(m.find()) {
System.out.println(m.group(1));
}
OR
Matcher m = Pattern.compile("(?<=\\bcategory\\s)\\S+").matcher(str);
while(m.find()) {
System.out.println(m.group());
}
Please use Matcher and Pattern -
String input = "The category Animals that starts with a letter J";
Matcher m1 = Pattern.compile("^The category (.*) that starts with a letter (.*)$").matcher(input);
if(m1.find()) {
String _thirdWord = m1.group(1); // Animals
String _lastWord = m1.group(2); // J
System.out.println("Third word : "+_thirdWord);
System.out.println("Last Word : "+_lastWord);
}
Use this, it might fix your issue
String string = "The category Colors that starts with a letter V";
String[] ar = string.split(" ");
System.out.println(ar[2]);
I want to split this string
String info = "0.542008835 meters height from ground";
From this I want to get only two decimals, like this: 0.54.
I am getting that by using this:
String[] _new = rhs.split("(?<=\\G....)");
But I am facing a problem here if the string doesn't contain any decimals, like this string:
String info = "1 meters height from ground";
For this string I am getting the first 4 characters of the string, i.e. "1 me".
I want to split off only the number, and only if it has decimals. How can I solve this problem?
if(info.contains("."))
{
String[] _new = rhs.split("(?<=\\G....)");
}
I think you can check for whitespace after the first value.
If you find a space, then take the first character only.
To check whether a string contains whitespace, use a Matcher and call its find method.
Pattern pattern = Pattern.compile("\\s");
Matcher matcher = pattern.matcher(s);
boolean found = matcher.find();
If you want to check if it only consists of whitespace then you can use String.matches:
boolean isWhitespace = s.matches("^\\s*$");
As an alternative to Deepzz's method, you could use a regex. This handles the case where there is a '.' in the later part of the String; I've included an example below. It's not clear from your question whether you actually want the remaining part of the String, but you could add a second group to the regex to capture it.
public static void main(String[] args) {
final String test1 = "1.23 foo";
final String test2 = "1 foo";
final String test3 = "1.234 foo";
final String test4 = "1.234 fo.o";
final String test5 = "1 fo.o";
getStartingDecimal(test1);
getStartingDecimal(test2);
getStartingDecimal(test3);
getStartingDecimal(test4);
getStartingDecimal(test5);
}
private static void getStartingDecimal(final String s) {
System.out.print(s + " : ");
Pattern pattern = Pattern.compile("^(\\d+\\.\\d\\d)");
Matcher matcher = pattern.matcher(s);
if(matcher.find()) {
System.out.println(matcher.group(1));
} else {
System.out.println("Doesn't start with decimal");
}
}
Assuming the number is always the first part of the string:
String numStr = rhs.split(" ")[0];
Double num = Double.parseDouble(numStr);
After that you can use the String Formatter to get the desired representation of the number.
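As a rough sketch of that last step, `String.format` with `%.2f` gives two decimals but rounds, while `BigDecimal.setScale` with `RoundingMode.DOWN` cuts the extra digits off. The class and method names here (`TwoDecimals`, `firstToken`, `rounded`, `truncated`) are made up for illustration, and the code assumes the number really is the first whitespace-separated token.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;

class TwoDecimals {
    // Assumption: the number is the first whitespace-separated token.
    static String firstToken(String info) {
        return info.trim().split("\\s+")[0];
    }

    // Rounds to two decimals; Locale.ROOT keeps '.' as the decimal separator.
    static String rounded(String info) {
        double num = Double.parseDouble(firstToken(info));
        return String.format(Locale.ROOT, "%.2f", num);
    }

    // Truncates to two decimals instead of rounding.
    static String truncated(String info) {
        return new BigDecimal(firstToken(info))
                .setScale(2, RoundingMode.DOWN)
                .toPlainString();
    }

    public static void main(String[] args) {
        String info = "0.542008835 meters height from ground";
        System.out.println(rounded(info));
        System.out.println(truncated(info));
    }
}
```

Note that both variants print a whole number such as 1 as 1.00; if you want it left as 1, check for a '.' in the token first.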
This will work when you know the text that follows the number, and it handles both int and double values.
String a ="0.542008835 meters height from ground";
String b = a.replace(" meters height from ground", "");
int c = (int) ((Double.parseDouble(b))*100);
double d = ((double)c/100);
I want to split a string and extract a single word from it. My data in the database is as follows.
Mohandas Karamchand Gandhi (1869-1948), also known as Mahatma Gandhi, was born in Porbandar in the present day state of Gujarat in India on October 2, 1869.
He was raised in a very conservative family that had affiliations with the ruling family of Kathiawad. He was educated in law at University College, London.
src="/Leaders/gandhi.png"
From the above paragraph I want to get the image name "gandhi". I am already getting the index of "src=", but how can I get the image name, i.e. "gandhi", from there?
My Code:
int index1;
public static String htmldata = "src=";
if(paragraph.contains("src="))
{
index1 = paragraph.indexOf(htmldata);
System.out.println("index1 val"+index1);
}
else
System.out.println("not found");
You can use the StringTokenizer class (from java.util package ):
StringTokenizer tokens = new StringTokenizer(CurrentString, ":");
String first = tokens.nextToken();// this will contain one word
String second = tokens.nextToken();// this will contain the other words
// in the case above I assumed the string has always that syntax (foo: bar)
// but you may want to check if there are tokens or not using the hasMoreTokens method
Try this code and check whether it works for you.
public String getString(String input)
{
Pattern pt = Pattern.compile("src=.*/(.*)\\..*");
Matcher mt = pt.matcher(input);
if(mt.find())
{
return mt.group(1);
}
return null;
}
Update:
A version that handles multiple items:
public ArrayList<String> getString(String input)
{
ArrayList<String> ret = new ArrayList<String>();
Pattern pt = Pattern.compile("src=.*/(.*)\\..*");
Matcher mt = pt.matcher(input);
while(mt.find())
{
ret.add(mt.group(1));
}
return ret;
}
Now you'll get an ArrayList with all the names. If there is no name, you'll get an empty ArrayList (size 0), so always check the size.
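One thing to watch with the greedy `.*` in that pattern: if two `src=` attributes end up on the same line, the match can run from the first one to the last. A tighter pattern that never crosses a quote may be safer. The following is only a sketch under the assumption that every `src` value is wrapped in double quotes; `ImageNames` and `imageNames` are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageNames {
    // src="<optional path/>name<optional .extension>"
    // Assumption: attribute values are double-quoted.
    private static final Pattern SRC =
            Pattern.compile("src=\"(?:[^\"]*/)?([^\"/]+?)(?:\\.[^\"/.]*)?\"");

    // Returns the file name (without directory and extension) of every src attribute.
    static List<String> imageNames(String input) {
        List<String> names = new ArrayList<String>();
        Matcher m = SRC.matcher(input);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<img src=\"/Leaders/gandhi.png\"> and <img src=\"/Leaders/nehru.jpg\">";
        System.out.println(imageNames(html));
    }
}
```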
What I want to do...
I have a WebView in my Android app. I get a large chunk of HTML from the server as a string, and a search string from the application user (the Android phone user). I break up the search string and build a regex out of it. I want all the HTML content that matches my regex to be highlighted when I display it in my WebView.
What I tried...
Since it is HTML, I just want to wrap the words matched by the regex in a pair of tags with a yellow background.
A simple regex and replaceAll on the HTML content that I get. This is very wrong because it also replaces what is inside the '<' and '>'.
I tried using the Matcher and Pattern combo. It is difficult to skip what is inside the tags.
I used JSOUP Parser and it worked!
I traverse the HTML using the NodeTraversor class, and I use the Matcher and Pattern classes to find the matched words and wrap them in tags as I wanted.
But it is very slow. I want to use it on Android, and the library is about 284kB. I removed some unneeded classes and it is now 201kB, but that is still too much for an Android device. Additionally, the HTML content can be really large. I looked into the jsoup source as well: it iterates over every single character when it parses. I do not know whether all parsers do the same, but it is definitely slow for large HTML documents.
Here is my code -
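import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.nodes.DataNode;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;
import org.jsoup.select.NodeTraversor;
import org.jsoup.select.NodeVisitor;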
public class Highlighter {
private String regex;
private String htmlContent;
Pattern pat;
Matcher mat;
public Highlighter(String searchString, String htmlString) {
regex = buildRegexFromQuery(searchString);
htmlContent = htmlString;
pat = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
}
public String getHighlightedHtml() {
Document doc = Jsoup.parse(htmlContent);
final List<TextNode> nodesToChange = new ArrayList<TextNode>();
NodeTraversor nd = new NodeTraversor(new NodeVisitor() {
@Override
public void tail(Node node, int depth) {
if (node instanceof TextNode) {
TextNode textNode = (TextNode) node;
String text = textNode.getWholeText();
mat = pat.matcher(text);
if(mat.find()) {
nodesToChange.add(textNode);
}
}
}
@Override
public void head(Node node, int depth) {
}
});
nd.traverse(doc.body());
for (TextNode textNode : nodesToChange) {
Node newNode = buildElementForText(textNode);
textNode.replaceWith(newNode);
}
return doc.toString();
}
private static String buildRegexFromQuery(String queryString) {
String regex = "";
String queryToConvert = queryString;
/* Clean up query */
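// Use '+' rather than '*': a pattern that can match the empty string makes
// replaceAll insert a space between every character.
queryToConvert = queryToConvert.replaceAll("\\p{Punct}+", " ");
queryToConvert = queryToConvert.replaceAll("\\s+", " ").trim();
// Split the cleaned-up query, not the original one.
String[] regexArray = queryToConvert.split(" ");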
regex = "(";
for(int i = 0; i < regexArray.length - 1; i++) {
String item = regexArray[i];
regex += "(\\b)" + item + "(\\b)|";
}
regex += "(\\b)" + regexArray[regexArray.length - 1] + "[a-zA-Z0-9]*?(\\b))";
return regex;
}
private Node buildElementForText(TextNode textNode) {
String text = textNode.getWholeText().trim();
ArrayList<MatchedWord> matchedWordSet = new ArrayList<MatchedWord>();
mat = pat.matcher(text);
while(mat.find()) {
matchedWordSet.add(new MatchedWord(mat.start(), mat.end()));
}
StringBuffer newText = new StringBuffer(text);
for(int i = matchedWordSet.size() - 1; i >= 0; i-- ) {
String wordToReplace = newText.substring(matchedWordSet.get(i).start, matchedWordSet.get(i).end);
wordToReplace = "<b>" + wordToReplace+ "</b>";
newText = newText.replace(matchedWordSet.get(i).start, matchedWordSet.get(i).end, wordToReplace);
}
return new DataNode(newText.toString(), textNode.baseUri());
}
class MatchedWord {
public int start;
public int end;
public MatchedWord(int start, int end) {
this.start = start;
this.end = end;
}
}
}
Here is how I call it -
htmlString = getHtmlFromServer();
Highlighter hl = new Highlighter("Hello World!", htmlString);
String highlightedHtml = hl.getHighlightedHtml();
I am sure what I'm doing is not the optimal way, but I can't think of anything else.
I want to
- reduce the time it takes to highlight it.
- reduce the size of library
Any suggestions?
How about highlighting them using JavaScript?
Everybody loves JavaScript, and you can find examples such as this blog.
JTidy and HtmlCleaner are also among the best Java HTML parsers.
see
Comparison between different Java HTML Parser
and
What are the pros and cons of the leading Java HTML parsers?
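If a full parser turns out to be too heavy, one regex-only approach that may be worth trying is to let the pattern match whole tags as one alternative and the search words as another, then copy the tags through untouched and wrap only the words. The sketch below is an illustration under loud assumptions, not a replacement for a real parser: `TagAwareHighlighter` and `highlight` are made-up names, anything between `<` and `>` is treated as a tag, and script/style blocks, comments, and `>` inside attribute values are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagAwareHighlighter {
    // Hypothetical helper: wraps every whole-word occurrence of a query word
    // in <b>...</b>, but leaves text inside <...> untouched.
    static String highlight(String html, String query) {
        // Build "word1|word2|..." with each word quoted so it is matched literally.
        StringBuilder alternation = new StringBuilder();
        for (String word : query.trim().split("[\\s\\p{Punct}]+")) {
            if (word.isEmpty()) {
                continue;
            }
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(word));
        }
        if (alternation.length() == 0) {
            return html; // nothing to search for
        }
        // Group 1: a whole tag. Group 2: one of the query words.
        Pattern p = Pattern.compile(
                "(<[^>]*>)|\\b(" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = m.group(1) != null
                    ? m.group(1)                      // tag: copy as is
                    : "<b>" + m.group(2) + "</b>";    // word: wrap it
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<p class=\"hello\">Hello world</p>";
        System.out.println(highlight(html, "Hello World!"));
    }
}
```

Because this is a single pass over the string with no DOM built, it may be lighter than parsing, but whether it is fast enough for very large documents is something you would need to measure.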
I've made a class which holds some strings and integers. In that class I made a function to convert the data in the class into a readable string:
public String GetConditions() {
String BigString = null;
String eol = System.getProperty("line.separator");
try {
BigString += "Depth: " + ci(Depth) + eol;
and so on...
Because I have to convert many integers, I made an extra function to convert an integer to a string:
public String ci(Integer i) {
// convert integer to string
if (i != null) {
String a = new Integer(i).toString();
return a;
} else {
return "n/a";
}
}
This throws a NullPointerException on return a. I'm quite new to Java, so this is probably a noob question... Sorry about that, and thanks in advance!
There is a much simpler way to convert an Integer to a String: use String#valueOf(int).
public String ci(Integer i)
{
return i == null ? "n/a" : String.valueOf(i);
}
Try converting the Integer you pass to your method to a string directly, instead of instantiating a new one.
You can do it in a straightforward way like:
String a = i.toString();
or
String a = Integer.toString(i.intValue());
Thanks guys, but I found the problem: I was trying to append to a String that was null, on this line:
String BigString = null;
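For anyone landing here later, a minimal sketch of one way around that line. In Java, `+=` on a null String reference does not throw by itself; it produces the text "null" followed by whatever was appended, so the output silently starts with "null". Building the text with a `StringBuilder` avoids that. `ConditionsDemo` and `conditions` are names made up for illustration; `ci` follows the helper from the answer above.

```java
class ConditionsDemo {
    // Same idea as the ci helper above: "n/a" for a missing value.
    static String ci(Integer i) {
        return i == null ? "n/a" : String.valueOf(i);
    }

    // Builds the readable text with a StringBuilder instead of starting from null.
    static String conditions(Integer depth) {
        StringBuilder sb = new StringBuilder();
        sb.append("Depth: ").append(ci(depth)).append(System.lineSeparator());
        return sb.toString();
    }

    public static void main(String[] args) {
        String bigString = null;
        bigString += "Depth: 5";            // becomes "nullDepth: 5"
        System.out.println(bigString);

        System.out.print(conditions(5));    // no "null" prefix
        System.out.print(conditions(null)); // missing value shown as n/a
    }
}
```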