How to convert HTML to a String on Android and print that string - android

I followed this and this to print receipts on an EPSON printer as part of a POS (Point of Sale) app.
Here I am getting JSON data from a URL (inside the JSON object there is an HTML print template):
{
"response": {
"status": "<table>.... </table>"
}
}
So, with an intent, I passed the JSON response above along as a string and converted it from HTML:
method = "addFeedLine";
mPrinter.addFeedLine(1);
textData.append("Test print Sample string\n"); // this is sample text
// status is the JSON response, which is nothing but HTML code, so I am converting it to a string
textData.append(Html.fromHtml(status + "\n"));
There I use status as a string, so whatever that string contains gets printed.
If it is not HTML but just plain text, I print it like this:
method = "addFeedLine";
mPrinter.addFeedLine(1);
textData.append(status);
Here is an example of what status looks like
"status": "The store list Sample\nSTORE DIRECTOR – XYZ\n01/01/01 16:58 6153 05 0191 134\nST# 21 OP# 001 TE# 01 TR# 747\n------------------------------\n400 OHEIDA 3PK SPRINGF 9.99 R\n410 3 CUP BLK TEAPOT 9.99 R\n445 EMERIL GRIDDLE/PAN 17.99 R\n438 CANDYMAKER ASSORT 4.99 R\n474 TRIPOD 8.99 R\n433 BLK LOGO PRNTED ZO 7.99 R\n458 AQUA MICROTERRY SC 6.99 R\n493 30 L BLK FF DRESS 16.99 R\n407 LEVITATING DESKTOP 7.99 R\n441 ** Blue Overprint P 2.99 R\n476 REPOSE 4 PCPM CHOC 5.49 R\n461 WESTGATE BLACK 25 59.99 R\n------------------------------\nSUBTOTAL 160.38\nTAX 14.43\nTOTAL 174.81\nCASH 200.00\nCHANGE 25.19\n------------------------------\nPurchased item total number\nSign Up and Save!\nWith Preferred Saving Card\n"
Now, here I have a plain HTML page:
Search Images Maps Play YouTube News Gmail Drive More »
Web History | Settings | Sign in
Louisa May Alcott’s 184th birthday
[ ] Advanced
searchLanguage
[Google Search][I'm Feeling Lucky] tools
Advertising ProgrammesBusiness Solutions+GoogleAbout GoogleGoogle.com
© 2016 - Privacy - Terms
I need to print this from a URL.
Can anyone suggest how to print this plain text?
There are no HTML tags and no JSON data.

You can convert HTML to a String with the Html.fromHtml method. It returns a Spanned, so call toString() on the result:
String strToHtml = Html.fromHtml(htmlContentInStringFormat).toString();
Log.e(TAG, "strToHtml :: " + strToHtml);

If you really want to print it as the HTML it is, I recommend taking the original HTML in status (before parsing it and so on) and loading it into a WebView like this:
webview.loadDataWithBaseURL("", status, "text/html", "UTF-8", "");
Otherwise, if you just want to show it on the screen, you can use a simple TextView: text_view.setText(textData.toString())

This is the original html value:
String htmldescription = school2.getJSONObject(0).getString("description");
This is the html formatted value:
Spanned spanned = Html.fromHtml(htmldescription);
And this is the String conversion:
String formattedText = spanned.toString();
Got it from here: how to save encoded html in string
If this doesn't work out you should check out the developer docs
Hope it Helps, Good Luck!

You can use a simple regex to convert an HTML template to plain text. It matches anything that looks like an HTML tag, but there may be loopholes, such as a > inside an attribute value.
For example:
// Regex pattern
private static final String STR_PATTERN = "\\<[^\\>]*\\>";
public static String htmlToPlainText(final String template) {
// replaceAll(String regex, String replacement)
return (template.replaceAll(STR_PATTERN, ""));
}
I hope it helps
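For receipt printing, removing tags alone runs the lines together. The following is a minimal sketch of the same regex idea, extended so that <br> and a few closing block tags become line breaks before the remaining tags are removed. The class and method names are made up for illustration, only four common entities are decoded, and table cells are not separated; a real HTML parser is more robust.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not part of any library.
class HtmlToPlainText {
    // <br> in any form, and a few closing block-level tags, become line breaks.
    private static final Pattern LINE_BREAKS =
            Pattern.compile("(?i)<br\\s*/?>|</(p|div|tr|h[1-6]|li)\\s*>");
    // Any remaining tag: '<', then anything that is not '>', then '>'.
    private static final Pattern ANY_TAG = Pattern.compile("<[^>]*>");

    static String htmlToPlainText(String html) {
        String text = LINE_BREAKS.matcher(html).replaceAll("\n");
        text = ANY_TAG.matcher(text).replaceAll("");
        // Decode only the few entities this sketch knows about; &amp; goes last
        // so that "&amp;lt;" ends up as "&lt;" and is not decoded twice.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&amp;", "&");
        return text.trim();
    }

    public static void main(String[] args) {
        String html = "<p>SUBTOTAL 160.38</p><p>TAX 14.43<br/>TOTAL 174.81</p>";
        System.out.println(htmlToPlainText(html));
    }
}
```

The returned string can then be appended to textData in the same way as the plain-text status.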

Related

Android TextView not showing multiple lines, even though String has newlines [duplicate]

For the input text:
<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?
I run the following code:
Whitelist list = Whitelist.simpleText().addTags("br");
// Some other code...
// plaintext is the string shown above
retVal = Jsoup.clean(plaintext, StringUtils.EMPTY, list,
new Document.OutputSettings().prettyPrint(false));
I get the output:
Arbit string <b>of</b>
text. <em>What</em> to <strong>do</strong> with it?
I don't want Jsoup to convert the <br> tags to line breaks, I want to keep them as-is. How can I do that?
Try this:
Document doc2deal = Jsoup.parse(inputText);
doc2deal.select("br").append("br"); //or append("<br>")
This is not reproducible for me. Using Jsoup 1.8.3 and this code:
String html = "<p>Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?";
String cleaned = Jsoup.clean(html,
"",
Whitelist.simpleText().addTags("br"),
new Document.OutputSettings().prettyPrint(false));
System.out.println(cleaned);
I get the following output:
Arbit string <b>of</b><br><br>text. <em>What</em> to <strong>do</strong> with it?
Your problem must be somewhere else I guess.

HTML String tags are not replaced (Android)

I have the following piece of HTML:
<p>Bla bla bla...</p>
<p><strong>4. others</strong></p>
<p> </p>
It contains a random <p> </p> tag combination which needs to be filtered out in my Android app. I'm using the following Java code for it:
String html = object.get("Content").toString(); // this is the HTML
html = html.replace("<p> </p>", "");
html = html.replace("<p></p>", "");
html = html.replace("<p><span></span></p>", "");
content.setText(Html.fromHtml(html));
However, when I debug and put a break point on the replace functions, it doesn't replace the strings. Now I have useless <p> tags which I don't want. How do I solve this?
It contains a random tag combination which needs to be filtered out in my Android app.
You need to replace each tag separately, because replace() is case-sensitive and only matches the exact sequence of characters you pass in.
So, use the code below
String html = object.get("Content").toString().trim(); // this is the HTML
html = html.replace("<p>", "");
html = html.replace("</p>", "");
html = html.replace("<span>", "");
html = html.replace("</span>", "");
content.setText(Html.fromHtml(html));
instead of the joint strings in your replace() calls:
String html = object.get("Content").toString(); // this is the HTML
html = html.replace("<p> </p>", "");
html = html.replace("<p></p>", "");
html = html.replace("<p><span></span></p>", "");
content.setText(Html.fromHtml(html));
You will get the result you want.
Note: remember that whenever you replace a character or string, replace() matches the exact sequence, case-sensitively, so the search string has to match the text exactly.
You are doing it the wrong way.
Correct way:
1. Use StringBuffer instead of String.
2. Find the index of < and put it in a variable, say int startIdx
3. Find the index of > and put it in a variable, say int endIdx
4. Then use delete() of StringBuffer with startIdx and endIdx + 1 (the end index is exclusive) to delete the HTML tag.
Repeating these steps removes all HTML tags from your string.
OK, so as per your comment, this is how you do it:
String replaceThis = "<p></p>";
int len = replaceThis.length();
// wrap the HTML string in a StringBuffer so it can be edited in place
StringBuffer buff = new StringBuffer("your html string from which you want to replace");
int i;
// delete every occurrence; the end index passed to delete() is exclusive
while ((i = buff.indexOf(replaceThis)) != -1)
{
buff.delete(i, i + len);
}
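One possible reason the original replace("<p> </p>", "") calls never match is that the character between the tags is not a regular space but a non-breaking space (U+00A0) or an &nbsp; entity; that is an assumption, since the raw bytes are not shown in the question. Under that assumption, a regex can cover all the "empty paragraph" variants at once. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class EmptyParagraphs {
    // An "empty" paragraph: <p>, then only whitespace, non-breaking spaces
    // (U+00A0), &nbsp; entities or empty <span></span> pairs, then </p>.
    private static final Pattern EMPTY_P = Pattern.compile(
            "(?i)<p>(?:\\s|\\u00A0|&nbsp;|<span>\\s*</span>)*</p>");

    static String removeEmptyParagraphs(String html) {
        return EMPTY_P.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<p>Bla bla bla...</p>\n"
                    + "<p><strong>4. others</strong></p>\n"
                    + "<p>\u00A0</p>";
        // The first two paragraphs are kept; the last one is removed.
        System.out.println(removeEmptyParagraphs(html));
    }
}
```

On Android the cleaned string would then go to Html.fromHtml as before. Paragraphs with attributes (for example <p class="...">) are not covered by this pattern.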

Regular expression with Hebrew

I have text like:
לשלום קוראים לי משהmy test is עלות 39.40, כל מיני data 1.1.2015 ויש גם data 123456 מידע
This text has Hebrew and English characters. I need to eliminate everything except the 6-digit number (it may also be 5 digits; here the number is 123456).
Can you help me with a regular expression for this?
I tried:
String patternS = "[אבגדהוזחטיכךלמםנןסעפףצץקרשתa-fA-F0-9]{5,10}.*";
Pattern pattern = Pattern.compile(patternS);
With no success.
To match everything except the number use:
\d+(?:[^\d]\d+)+|[\p{L}\p{M}\p{Z}\p{P}\p{S}\p{C}]+
String resultString = subjectString.replaceAll("\\d+(?:[^\\d]\\d+)+|[\\p{L}\\p{M}\\p{Z}\\p{P}\\p{S}\\p{C}]+", "");
This will give you every 6-digit combination in your string:
(\d{6,6})
We can't give you a more detailed regex since we do not know the pattern of those strings.
In case there is always the "data " prefix you can also use this to make the pattern more accurate:
data (\d{6,6})
Try something like this:
String patternS = "(\\d{5,6})";
Pattern pattern = Pattern.compile(patternS);
Matcher m = pattern.matcher(yourText);
// find() has to succeed before group() can be read
if (m.find()) {
int number = Integer.parseInt(m.group(1));
}
where yourText is the Hebrew/English text you want to match.
This would work for this specific example.
String s = " לשלום קוראים לי מש my test is עלות 39.40, כל מיני data 1.1.2015 ויש גם data 123456 מידע1234";
System.out.println(s.replaceAll(".*\\b(\\d{5,6})\\b.*", "$1"));
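Instead of deleting everything around the number, it can be simpler to search for the number directly. A minimal sketch, assuming the wanted value is any run of exactly 5 or 6 ASCII digits that is not part of a longer digit run; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class NumberExtractor {
    // 5 or 6 digits, with no digit directly before or after the run.
    private static final Pattern FIVE_OR_SIX_DIGITS =
            Pattern.compile("(?<!\\d)\\d{5,6}(?!\\d)");

    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = FIVE_OR_SIX_DIGITS.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "my test is 39.40, data 1.1.2015 and data 123456 info";
        System.out.println(extract(text)); // prints [123456]
    }
}
```

The lookarounds keep 7-digit or longer runs from producing a partial match, and the Hebrew text around the number needs no special handling because the pattern only looks at digits.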

Skipping images while converting HTML text to a String on Android

I am converting some HTML text from a webpage into a String by doing the following:
mydescription = Html.fromHtml(data.getBody()).toString();
This is what data.getBody() returns:
<div><p>​It's great to have great dynamic companies to work with, and NXP is no exception.</p><p><img alt="This is an image of NXP Logo" src="https://anprodstorage.blob.core.windows.net/b75ef288-0381-45c4-a4cd-809097370bec/untitled.png" style="margin:5px;" /><br></p><div><iframe width="560" height="315" src="https://www.youtube.com/embed/I6191gXXGog" frameborder="0"></iframe> </div><p>​<br></p></div>
But that HTML text also contains an image. When I do the above, I get a small square with "obj" written inside it in place of the image.
I just want to get the text and not the image.
How do I do that?
Try it this way; I hope it helps you solve your problem.
String htmlString = "<div><p>​It's great to have great dynamic companies to work with, and NXP is no exception.</p><p><img alt=\"This is an image of NXP Logo\" src=\"https://anprodstorage.blob.core.windows.net/b75ef288-0381-45c4-a4cd-809097370bec/untitled.png\" style=\"margin:5px;\" /><br></p><div><iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/I6191gXXGog\" frameborder=\"0\"></iframe> </div><p>​<br></p></div>";
String first = htmlString.substring(0,htmlString.indexOf("<img"));
String second = htmlString.substring(htmlString.indexOf("/>",htmlString.indexOf("<img"))+2,htmlString.length());
textview.setText(Html.fromHtml(first+second));
Use this code:
String clippedBody = htmlString.replaceAll("<img[^>]*>", "");
I advise using a library such as jsoup when working with HTML (with jsoup you can get only the text by calling Jsoup.parse(html).text()).
I haven't tried it myself:
private static final Pattern REMOVE_TAGS = Pattern.compile("<img[^>]*>");
public static String removeTags(String string) {
if (string == null || string.length() == 0) {
return string;
}
Matcher m = REMOVE_TAGS.matcher(string);
return m.replaceAll("");
}
If you want to strip down all the HTML code, then you can use:
replaceAll("\\<[^>]*>","")
For your second question (from Source 2):
// the pattern we want to search for
Pattern p = Pattern.compile("<p>(\\S+)</p>");
Matcher m = p.matcher(string);
// if we find a match, get the group
if (m.find())
{
// get the matching group
String codeGroup = m.group(1);
// print the group
System.out.format("'%s'\n", codeGroup);
}
Source: 1, 2 and 3
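The square with "obj" in it is the object replacement character U+FFFC, which Html.fromHtml inserts where an <img> tag was. A minimal sketch of two ways to avoid it in plain Java: remove the <img> tags before conversion, or remove the placeholder character from the converted text. The class and method names are hypothetical, and the pattern assumes no attribute value contains a > character.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class ImageStripper {
    // <img ...> or <img ... />, in any letter case.
    private static final Pattern IMG_TAG = Pattern.compile("(?i)<img\\b[^>]*>");

    // Option 1: drop the tags before the HTML is converted.
    static String removeImgTags(String html) {
        return IMG_TAG.matcher(html).replaceAll("");
    }

    // Option 2: drop the placeholder character after conversion.
    static String removeObjectPlaceholders(String text) {
        return text.replace("\uFFFC", "");
    }

    public static void main(String[] args) {
        String html = "<p>NXP is no exception.</p>"
                    + "<p><img alt=\"logo\" src=\"untitled.png\" style=\"margin:5px;\" /><br></p>";
        System.out.println(removeImgTags(html));
    }
}
```

On Android, option 1 would be used as Html.fromHtml(removeImgTags(data.getBody())).toString(), and option 2 as removeObjectPlaceholders(Html.fromHtml(data.getBody()).toString()).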

Showing a UTF16-LE encoded string in textview for Android

I have a UTF-16LE encoded string that comes from a server. I would like to print that string in Textview of my activity. However, the string prints with spaces in between them. So, "Hello" prints as "H e l l o" and doesn't look all that nice in my screen.
Any help is appreciated.
Thanks
Assuming you have a stream (or array of bytes) containing a UTF-16LE encoded string:
String str = "Hello, I am a UTF-16LE encoded String";
byte[] utf16le = str.getBytes("UTF-16LE");
If you convert these bytes back without stating the character set used, the resulting String will contain a lot of 0 bytes (UTF-16LE uses 16-bit code units, so every ASCII character is followed by a zero byte).
String wrong = new String(utf16le); // This will produce garbage with \0 characters in it.
String correct = new String(utf16le, "UTF-16LE"); // This will be the actual string.
Note: If you put garbage Strings like these into a TextView on ICS, it will remove the garbage for you and not print "H e l l o".
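The same round trip as a self-contained sketch, using the StandardCharsets constants so no checked UnsupportedEncodingException has to be handled. The class and method names are hypothetical, and UTF-8 is used explicitly for the wrong decode to stand in for whatever the platform default charset is.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class Utf16LeDemo {
    // Decode bytes that are known to be UTF-16LE.
    static String decodeUtf16Le(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] utf16le = "Hello".getBytes(StandardCharsets.UTF_16LE);

        // Wrong: every second byte is 0, so the text contains NUL characters,
        // which a TextView may render as gaps between the letters.
        String wrong = new String(utf16le, StandardCharsets.UTF_8);
        System.out.println(wrong.length()); // prints 10

        // Right: name the charset the bytes were written with.
        String correct = decodeUtf16Le(utf16le);
        System.out.println(correct); // prints Hello
    }
}
```

If the server response is read through a Reader rather than as raw bytes, the charset has to be given where the reader is created, for example in the InputStreamReader constructor.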
