I am converting some HTML text from a webpage into a String by doing the following:
mydescription = Html.fromHtml(data.getBody()).toString();
This is what data.getBody() returns:
<div><p>It's great to have great dynamic companies to work with, and NXP is no exception.</p><p><img alt="This is an image of NXP Logo" src="https://anprodstorage.blob.core.windows.net/b75ef288-0381-45c4-a4cd-809097370bec/untitled.png" style="margin:5px;" /><br></p><div><iframe width="560" height="315" src="https://www.youtube.com/embed/I6191gXXGog" frameborder="0"></iframe> </div><p><br></p></div>
But that HTML also contains an image tag. When I run the code above, I get a small square with "obj" written inside it where the image would be.
I just want to get the text and not the image. How do I do that?
Try it this way; I hope it helps you solve your problem.
String htmlString = "<div><p>It's great to have great dynamic companies to work with, and NXP is no exception.</p><p><img alt=\"This is an image of NXP Logo\" src=\"https://anprodstorage.blob.core.windows.net/b75ef288-0381-45c4-a4cd-809097370bec/untitled.png\" style=\"margin:5px;\" /><br></p><div><iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/I6191gXXGog\" frameborder=\"0\"></iframe> </div><p><br></p></div>";
String first = htmlString.substring(0,htmlString.indexOf("<img"));
String second = htmlString.substring(htmlString.indexOf("/>",htmlString.indexOf("<img"))+2,htmlString.length());
textview.setText(Html.fromHtml(first+second));
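Another option, sketched under an assumption: the square with "obj" inside is most likely the Unicode object replacement character (U+FFFC), which Html.fromHtml() leaves in the text as a placeholder for each <img> tag. If that is the case, the character can simply be dropped after the conversion. The class and method names below are made up for illustration, and the Android call appears only in a comment so that the snippet runs as plain Java.

```java
final class ObjCharStripper {
    // U+FFFC OBJECT REPLACEMENT CHARACTER, assumed to be the "obj" square.
    private static final String OBJ = "\uFFFC";

    // Removes the image placeholders and trims the whitespace left around them.
    static String stripImagePlaceholders(String text) {
        return text.replace(OBJ, "").trim();
    }

    public static void main(String[] args) {
        // On Android the input would be Html.fromHtml(data.getBody()).toString().
        String converted = "It's great to have great dynamic companies to work with.\n\n\uFFFC\n\n";
        System.out.println(stripImagePlaceholders(converted));
    }
}
```

This keeps the HTML parsing in Html.fromHtml() and only cleans up the resulting String, so it does not depend on how the <img> tag is written.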
Use this code:
String clippedBody = htmlString.replaceAll("<img[^>]*>", "");
I advise using a library such as jsoup when working with HTML (with jsoup you can get only the text by calling Jsoup.parse(html).text()).
I haven't tried it myself:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// matches any <img ...> tag, including self-closing ones with attributes
private static final Pattern REMOVE_TAGS = Pattern.compile("<img[^>]*>");
public static String removeTags(String string) {
    if (string == null || string.length() == 0) {
        return string;
    }
    Matcher m = REMOVE_TAGS.matcher(string);
    return m.replaceAll("");
}
If you want to strip out all the HTML tags, you can use:
replaceAll("\\<[^>]*>","")
For your second question (from Source 2):
// the pattern we want to search for: lazily capture everything between <p> and </p>, spaces included
Pattern p = Pattern.compile("<p>(.*?)</p>");
Matcher m = p.matcher(string);
// if we find a match, get the group
if (m.find())
{
// get the matching group
String codeGroup = m.group(1);
// print the group
System.out.format("'%s'\n", codeGroup);
}
Source: 1, 2 and 3
For the input text:
<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?
I run the following code:
Whitelist list = Whitelist.simpleText().addTags("br");
// Some other code...
// plaintext is the string shown above
retVal = Jsoup.clean(plaintext, StringUtils.EMPTY, list,
new Document.OutputSettings().prettyPrint(false));
I get the output:
Arbit string <b>of</b>
text. <em>What</em> to <strong>do</strong> with it?
I don't want Jsoup to convert the <br> tags to line breaks, I want to keep them as-is. How can I do that?
Try this:
Document doc2deal = Jsoup.parse(inputText);
doc2deal.select("br").append("br"); //or append("<br>")
This is not reproducible for me. Using Jsoup 1.8.3 and this code:
String html = "<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?";
String cleaned = Jsoup.clean(html,
"",
Whitelist.simpleText().addTags("br"),
new Document.OutputSettings().prettyPrint(false));
System.out.println(cleaned);
I get the following output:
Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?
Your problem must be somewhere else I guess.
I want to find an Arabic word with nunation (diacritics) in a TextView and highlight it.
For example, if my word is "اشهد" without nunation, I want to find its position in "وَ اَشْهَدُ اَنْ لا اِلهَ اِلاَّ اللَّهُ", which has nunation.
Hi, please see the class I created below. It is very basic and does not worry about memory consumption; you can optimise it yourself.
http://freshinfresh.com/sample/ABHArabicDiacritics.java
If you want to check whether an Arabic String contains a word, ignoring nunation (harakat):
ABHArabicDiacritics objSearchd = new ABHArabicDiacritics();
objSearchd.getDiacriticinsensitive("وَ اَشْهَدُ اَنْ لا اِلهَ اِلاَّ اللَّهُ").contains("اشهد");
If you want the searched portion returned highlighted (coloured red) in the String, use the code below:
ABHArabicDiacritics objSearch = new ABHArabicDiacritics("وَ اَشْهَدُ اَنْ لا اِلهَ اِلاَّ اللَّهُ", "اشهد");
SpannableString spoutput=objSearch.getSearchHighlightedSpan();
textView.setText(spoutput);
To get the start and end positions of the search text, use the methods below:
/** check whether the text contains the search word */
objSearch.isContain();
objSearch.getSearchHighlightedSpan();
objSearch.getSearchTextStartPosition();
objSearch.getSearchTextEndPosition();
Please copy the shared Java class and enjoy.
I will spend more time on more features if you request them.
Thanks
To search for ولد in INPUT:
public void RegexMatches() {
String INPUT ="ى لَیْلَهِ تَمامِهِ وَکَمالِهِ فَما کانَتْ اِلاّ ساعَهً وَاِذا بِوَلَدِىَ الْحَسَنِ قَدْ" ;
Pattern p = Pattern.compile("و[\\u064B-\\u064F\\u0650-\\u0656]*ل[\\u064B-\\u064F\\u0650-\\u0656]*د");
Matcher m = p.matcher(INPUT); // get a matcher object
int count = 0;
while(m.find()) {
count++;
System.out.println("Match number "+count);
System.out.println("start(): "+m.start());
System.out.println("end(): "+m.end());
}
}
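The hard-coded pattern above can be generalised. What follows is a minimal sketch, not the ABHArabicDiacritics class: it builds the same kind of pattern from any plain search word by allowing optional diacritics after every letter. The class name, the method names and the chosen mark range U+064B to U+0652 (tanwin, fatha, damma, kasra, shadda, sukun) are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArabicSearch {
    // Tanwin, fatha, damma, kasra, shadda and sukun (U+064B..U+0652), zero or more.
    private static final String MARKS = "[\\u064B-\\u0652]*";

    // Builds a pattern that allows optional marks after every letter of the plain word.
    static Pattern diacriticInsensitive(String plainWord) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < plainWord.length(); i++) {
            regex.append(Pattern.quote(String.valueOf(plainWord.charAt(i)))).append(MARKS);
        }
        return Pattern.compile(regex.toString());
    }

    // Returns {start, end} of the first match, or null when the word is absent.
    static int[] find(String text, String plainWord) {
        Matcher m = diacriticInsensitive(plainWord).matcher(text);
        return m.find() ? new int[] {m.start(), m.end()} : null;
    }
}
```

On Android, the returned start and end offsets could then be passed to setSpan() on a SpannableString to highlight the match in the TextView.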
I am uploading text to a server, and I want to upload the string in HTML format.
Example input:
Do you know the relation between two eyes...???
They never see each other... BUT
They blink together.
They move together.
They cry together.
They see together.
They sleep together.
They share a very deep bonded relationship...
However, when they see a pretty woman, one will blink and another will not...
sendtext = adding_textjoke.getText().toString();
//String htmlString = Html.toHtml(sendtext);
String str = "(?i)\\b((?:https?://|www\\d{0,3}[.]|[a-z0-9.\\-]+[.][a-z]{2,4}/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:\'\".,<>?«»“”‘’]))";
Pattern patt = Pattern.compile(str);
Matcher matcher = patt.matcher(sendtext);
sendtext = matcher.replaceAll("$1");
System.out.println(sendtext);
Log.e("sendtext", sendtext);
new AddJokesTask().execute(sendtext);
How can I do this in Android?
You can do it like this:
SpannableString contentText = (SpannableString) contentView.getText();
String htmlEncodedString = Html.toHtml(contentText);
or, if the view returns a SpannableStringBuilder:
SpannableStringBuilder text = (SpannableStringBuilder) contentView.getText();
String htmlEncodedString = Html.toHtml(text);
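Html.toHtml() takes a Spanned, so it cannot be called directly on the plain String from getText().toString(). If the text carries no styling and only the line breaks need to survive, a small conversion by hand may be enough. The following is a rough sketch in plain Java, with a hypothetical class and method name, that escapes the HTML special characters and turns each line break into <br>.

```java
final class PlainTextToHtml {
    // Escapes &, < and > and converts line breaks into <br> tags.
    static String toHtml(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '\n': out.append("<br>"); break;
                case '\r': break; // dropped; the following '\n' yields the <br>
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

In the question's code this would be called as sendtext = PlainTextToHtml.toHtml(sendtext) before the upload task is started.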
I have the following piece of HTML:
<p>Bla bla bla...</p>
<p><strong>4. others</strong></p>
<p> </p>
It contains a random <p> </p> tag combination which needs to be filtered out in my Android app. I'm using the following Java code for it:
String html = object.get("Content").toString(); // this is the HTML
html = html.replace("<p> </p>", "");
html = html.replace("<p></p>", "");
html = html.replace("<p><span></span></p>", "");
content.setText(Html.fromHtml(html));
However, when I debug and put a break point on the replace functions, it doesn't replace the strings. Now I have useless <p> tags which I don't want. How do I solve this?
You need to replace the tags separately, because replace() only matches the exact, case-sensitive character sequence you pass in.
So use the code below
String html = object.get("Content").toString().trim(); // this is the HTML
html = html.replace("<p>", "");
html = html.replace("</p>", "");
html = html.replace("<span>", "");
html = html.replace("</span>", "");
content.setText(Html.fromHtml(html));
instead of the combined strings in your replace() calls:
String html = object.get("Content").toString(); // this is the HTML
html = html.replace("<p> </p>", "");
html = html.replace("<p></p>", "");
html = html.replace("<p><span></span></p>", "");
content.setText(Html.fromHtml(html));
You will get the result you want.
Note: remember that replace() is case-sensitive and only matches the exact character sequence, so whenever you want to replace a character or string, pass it exactly as it appears in the text.
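One possible reason the original replace("<p> </p>", "") call never matches is that the character between the tags is not a regular space but a non-breaking space (U+00A0 or the entity &nbsp;); this is an assumption, since the raw HTML is not shown. Under that assumption, a regular expression can remove only the empty paragraphs and leave the other <p> tags in place. The class name and the pattern below are illustrative.

```java
final class EmptyParagraphs {
    // A <p> whose content is only whitespace, &nbsp;, U+00A0 or empty <span> pairs.
    private static final String EMPTY_P =
            "(?i)<p>(?:\\s|&nbsp;|\\u00A0|<span>\\s*</span>)*</p>";

    // Returns the HTML without the empty paragraphs; other paragraphs are kept as-is.
    static String remove(String html) {
        return html.replaceAll(EMPTY_P, "");
    }
}
```

Usage would be html = EmptyParagraphs.remove(html) before the call to Html.fromHtml(html).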
You are doing it the wrong way.
Correct way:
1. Use StringBuffer instead of String.
2. Find the index of < and put it in a variable, say int startIdx
3. Find the index of the next > and put it in a variable, say int endIdx
4. Then use delete() of StringBuffer with startIdx and endIdx + 1 (the end index is exclusive) to delete the HTML tag.
Repeating this until no < is left will remove all HTML tags from your string.
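The steps above can be sketched as follows. This is a minimal illustration with a made-up class name; it treats everything between a < and the next > as a tag, so it is not a real HTML parser and will misbehave on text that contains a bare < character.

```java
final class TagStripper {
    // Deletes every "<...>" run from the input and returns what is left.
    static String stripTags(String html) {
        StringBuffer buff = new StringBuffer(html);
        int startIdx;
        while ((startIdx = buff.indexOf("<")) != -1) {
            int endIdx = buff.indexOf(">", startIdx);
            if (endIdx == -1) {
                break; // unmatched '<': stop instead of looping forever
            }
            buff.delete(startIdx, endIdx + 1); // end index is exclusive
        }
        return buff.toString();
    }
}
```

Note that this removes every tag, not only the empty <p> pairs, so it only fits when plain text is the goal.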
OK, so as per your comment, this is how you do it:
String replaceThis = "<p></p>";
int len = replaceThis.length();
StringBuffer buff = new StringBuffer("your html string from which you want to replace");
int i;
// delete every occurrence of replaceThis
while ((i = buff.indexOf(replaceThis)) != -1)
{
    buff.delete(i, i + len); // the end index is exclusive
}
I am trying to get only the part "9916-4203" from "Region Code:9916-4203 " in Android. How can I do this?
I tried the code below using the substring method, but it doesn't work:
firstNumber = Integer.parseInt(message.substring(11, 19));
If you know that the string contains "Region Code:", couldn't you do a replace?
message = message.replace("Region Code:", "");
Assuming that you have only one phone number in your String, the following will remove any non-digit characters and parse the resulting number:
public static int getNumber(String num){
String tmp = "";
for(int i=0;i<num.length();i++){
if(Character.isDigit(num.charAt(i)))
tmp += num.charAt(i);
}
return Integer.parseInt(tmp);
}
Output in your case: 99164203
And as already mentioned, you won't be able to parse a String to an Integer if it contains any non-digit characters.
I'm going to guess that what you want to extract is the full region code text minus the title. So maybe a regex would be a good, simple fit for you?
String myString = "Region Code:9916-4203";
String match = "";
// capture everything after the colon
String pattern = ":(.*)";
Pattern regEx = Pattern.compile(pattern);
// Find instance of pattern matches
Matcher m = regEx.matcher(myString);
if (m.find()) {
    // group(1) is the captured part; group(0) would include the colon
    match = m.group(1);
}
The variable match will contain "9916-4203".
This should work for you.
Java code sourced from http://android-elements.blogspot.in/2011/04/regular-expressions-in-android.html
In Java, the substring() method treats its first parameter as inclusive and its second parameter as exclusive, meaning "Hello".substring(0, 2) results in the string "He".
As @Opiatefuchs mentioned, you cannot parse something that isn't a number; in addition, your substring call should be message.substring(12, 21).
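To avoid counting indices by hand, the start offset can be derived from the prefix itself. The sketch below uses a hypothetical RegionCode class: it takes everything after "Region Code:", trims the trailing space, and only removes the hyphen when a number is really needed, because Integer.parseInt() rejects "9916-4203" as it stands.

```java
final class RegionCode {
    private static final String PREFIX = "Region Code:";

    // Returns the text after the prefix, trimmed, or null when the prefix is missing.
    static String extract(String message) {
        int at = message.indexOf(PREFIX);
        if (at == -1) {
            return null;
        }
        return message.substring(at + PREFIX.length()).trim();
    }

    // Drops the hyphen so that the remaining digits can be parsed as an int.
    static int toNumber(String code) {
        return Integer.parseInt(code.replace("-", ""));
    }
}
```

If the code should stay in its "9916-4203" form, keep it as a String and skip toNumber() altogether.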