Android - Parse text from website - android

I have a web page containing this simple text, which changes over time:
<html><head><style type="text/css"></style></head><body>69766</body></html>
I need to parse only the number 69766 and save it to a variable as a String or an int. Is it possible to parse this number without adding libraries?
Thanks for your answers!

You can do it like this:
URL url = new URL("http://url for your webpage");
URLConnection yc = url.openConnection();
BufferedReader in = new BufferedReader(
        new InputStreamReader(yc.getInputStream()));
String inputLine;
StringBuilder builder = new StringBuilder();
while ((inputLine = in.readLine()) != null) {
    builder.append(inputLine.trim());
}
in.close();
String htmlPage = builder.toString();
// Strip everything that looks like a tag, leaving only the text.
String yourNumber = htmlPage.replaceAll("<.*?>", "");
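Since the question also asks for an int, the tag stripping and the conversion can be isolated in a small helper. The following is a minimal sketch rather than part of the original answer: the class name PageNumberParser and the method extractNumber are hypothetical, and it assumes the page contains no other digits once the tags are removed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PageNumberParser {
    // Matches the first run of digits, optionally preceded by a minus sign.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+");

    // Removes anything that looks like a tag, then parses the first integer found.
    static int extractNumber(String html) {
        String text = html.replaceAll("<[^>]*>", "").trim();
        Matcher m = NUMBER.matcher(text);
        if (!m.find()) {
            throw new NumberFormatException("No number found in: " + text);
        }
        return Integer.parseInt(m.group());
    }

    public static void main(String[] args) {
        String html = "<html><head><style type=\"text/css\"></style></head>"
                + "<body>69766</body></html>";
        System.out.println(extractNumber(html)); // prints 69766
    }
}
```

If the style block ever contains digits, the regex approach would pick those up first, so a real HTML parser is the safer choice for anything beyond a page this simple.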

For your basic needs, you should take a look at the Html class.

This link shows how to parse XML with the SAX parser. It's pretty straightforward.
http://www.codeproject.com/Articles/334859/Parsing-XML-in-Android-with-SAX

Related

Parse HTML text in Android

I'm trying to parse some HTML in my Android app and I need to get the text:
Pan Artesano Elaborado por Panadería La Constancia. ¡Esta Buenísimo!
in
Is there any easy way to get only the text and remove all html tags?
The behavior that I need is exactly the one shown in this PHP code http://php.net/manual/es/function.strip-tags.php
Document doc = Jsoup.parse(html);
Element content = doc.getElementById("someid");
Elements p = content.getElementsByTag("p");
String pConcatenated = "";
for (Element x : p) {
    pConcatenated += x.text();
}
System.out.println(pConcatenated); // sometext another p tag
If you just want to display it, a WebView will help: load that string into the WebView and you are done.
If you want to use it elsewhere, I am afraid I don't know how :D.
String data = "your html here";
WebView webview= (WebView)this.findViewById(R.id.webview);
webview.getSettings().setJavaScriptEnabled(true);
webview.loadDataWithBaseURL("", data, "text/html", "UTF-8", "");
You can also load a web URL directly with webview.loadUrl("url");
First, get the HTML code with:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);
String html = "";
InputStream in = response.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder str = new StringBuilder();
String line = null;
while((line = reader.readLine()) != null)
{
str.append(line);
}
in.close();
html = str.toString();
Then I recommend creating a custom tag in the HTML, such as <toAndroid></toAndroid>, so that you can get the text with:
String result = html.substring(html.indexOf("<toAndroid>") + "<toAndroid>".length(), html.indexOf("</toAndroid>"));
For example, this HTML:
<toAndroid>Hello world!</toAndroid>
will result in:
Hello world!
Note that you can place <p> tags inside the <toAndroid> tags and then remove them from the result in Java.
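The marker-tag idea can be wrapped in a helper that also copes with a missing tag. This is only a sketch under assumptions: the class TagTextExtractor and the method between are hypothetical names, and the code assumes the tag appears without attributes and is not nested.

```java
final class TagTextExtractor {
    // Returns the text between <tag> and </tag>, or null if either marker is missing.
    static String between(String html, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = html.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();            // skip past the opening tag itself
        int end = html.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end);
    }

    public static void main(String[] args) {
        String html = "<html><body><toAndroid>Hello world!</toAndroid></body></html>";
        System.out.println(between(html, "toAndroid")); // prints Hello world!
    }
}
```

Returning null for a missing marker avoids the StringIndexOutOfBoundsException that substring would otherwise throw when indexOf returns -1.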

How to download a String file which contains Slovenian special characters

I am trying to download a JSON file which contains Slovenian characters. When downloading the JSON file as a string, I get garbled special characters in the JSON data, as shown below:
"send_mail": "Po�lji elektronsko sporocilo.",
"str_comments_likes": "Komentarji, v�ecki in mejniki",
Code which I am using
URL url = new URL(f_url[0]);
URLConnection conection = url.openConnection();
conection.connect();
try {
InputStream input1 = new BufferedInputStream(url.openStream(), 300);
String myData = "";
BufferedReader r = new BufferedReader(new InputStreamReader(input1));
StringBuilder totalValue = new StringBuilder();
String line;
while ((line = r.readLine()) != null) {
totalValue.append(line).append('\n');
}
input1.close();
String value = totalValue.toString();
Log.v("To Check Problem from http paramers", value);
} catch (Exception e) {
Log.v("Exception Character Isssue", "" + e.getMessage());
}
I want to know how to get characters downloaded properly.
You need to decode the string bytes as UTF-8. Please check the following code:
String slovenianJSON = new String(value.getBytes([Original Code]),"utf-8");
JSONObject newJSON = new JSONObject(slovenianJSON);
String javaStringValue = newJSON.getString("content");
I hope it will help you!
Encoding each line inside the while loop can work. You should also open your connection inside the try/catch block in case of an IOException:
URL url = new URL(f_url[0]);
try {
URLConnection conection = url.openConnection();
conection.connect();
InputStream input1 = new BufferedInputStream(url.openStream(), 300);
String myData = "";
BufferedReader r = new BufferedReader(new InputStreamReader(input1));
StringBuilder totalValue = new StringBuilder();
String line;
while ((line = r.readLine()) != null) {
line = URLEncoder.encode(line, "UTF8");
totalValue.append(line).append('\n');
}
input1.close();
String value = totalValue.toString();
Log.v("To Check Problem from http paramers", value);
} catch (Exception e) {
Log.v("Exception Character Isssue", "" + e.getMessage());
}
It's not entirely clear why you're not using Android's JSONObject class (and related classes). You can try this, however:
String str = new String(value.getBytes("ISO-8859-1"), "UTF-8");
But you really should use the JSON libraries rather than parsing it yourself.
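To show when the re-decoding trick above actually helps, here is a minimal sketch. The class MojibakeRepair and the method repair are hypothetical names, and the example assumes the text was UTF-8 that had been decoded as ISO-8859-1 by mistake.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeRepair {
    // Re-encodes the string as ISO-8859-1 to recover the original bytes,
    // then decodes those bytes as UTF-8.
    static String repair(String misdecoded) {
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Po\u0161lji"; // "Pošlji", written with an escape
        // Simulate the mistake: UTF-8 bytes read as if they were ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(repair(misdecoded).equals(original)); // prints true
    }
}
```

The round trip works because ISO-8859-1 maps every byte to exactly one character, so no information is lost. Once the replacement character (�) has appeared in the string, as in the question's sample data, the original bytes are gone and the charset has to be fixed where the stream is read.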
When creating the InputStreamReader at this line:
BufferedReader r = new BufferedReader(new InputStreamReader(input1));
pass the charset to the InputStreamReader constructor like this:
BufferedReader r = new BufferedReader(new InputStreamReader(input1, Charset.forName("UTF-8")));
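The effect of the charset argument can be checked without a network connection. The sketch below is an illustration only: CharsetReader and readAll are hypothetical names, and an in-memory byte array stands in for the server response.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetReader {
    // Reads the whole stream line by line, decoding bytes with the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "Po\u0161lji"; // "Pošlji"
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String right = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);
        System.out.println(right.trim().equals(original)); // prints true
        System.out.println(wrong.trim().equals(original)); // prints false
    }
}
```

The same bytes produce the correct text only when the reader is given the charset they were written in, which is why relying on the platform default is fragile.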
The problem is the character set.
According to Wikipedia, the Slovene alphabet is supported by UTF-8, UTF-16, and ISO/IEC 8859-2 (Latin-2). Find out which character set the server uses, and use the same character set for decoding.
If it is UTF-8, decode like this:
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream, Charset.forName("UTF-8")));
If the server uses a different character set, use that one instead.
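One way to find out which character set the server declares is to read the charset parameter of the Content-Type header, which URLConnection.getContentType() returns. The following is a rough sketch, assuming the server sends that parameter at all; ContentTypeCharset and fromContentType are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class ContentTypeCharset {
    // Extracts the charset parameter from a Content-Type value such as
    // "application/json; charset=ISO-8859-2". Returns the fallback when the
    // header is missing, has no charset parameter, or names an unknown charset.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset cs = fromContentType(
                "application/json; charset=ISO-8859-1", StandardCharsets.UTF_8);
        System.out.println(cs.name()); // prints ISO-8859-1
    }
}
```

The resulting Charset can then be passed to the InputStreamReader constructor. A server may declare one charset and send another, so the header is a hint rather than a guarantee.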
I faced the same issue with Swedish characters.
I used a BufferedReader to resolve it: I decoded the response using StandardCharsets.ISO_8859_1 and used the resulting string. Please find my answer below.
BufferedReader r = new BufferedReader(new InputStreamReader(response.body().byteStream(), StandardCharsets.ISO_8859_1));
StringBuilder total = new StringBuilder();
String line;
while ((line = r.readLine()) != null)
{
total.append(line).append('\n');
}
I then used total.toString() and assigned that response to my class.
I used Retrofit to call the web service.
I finally found this approach, which worked for me:
InputStream input1 = new BufferedInputStream(conection.getInputStream(), 300);
BufferedReader r = new BufferedReader(new InputStreamReader(input1, "Windows-1252"));
I figured out that it was Windows-1252 by putting the JSON file in the assets folder of the Android application, where it showed the same special characters as above. Android Studio offered auto-suggestion options to change the encoding to UTF-8, ISO-8859-1, ASCII, or Windows-1252. Changing it to Windows-1252 worked in Android Studio, so I replicated the same in our code, which also worked.

Get HTML code from url in android

I was wondering if there is any way to get the HTML code from any URL and save it as a String in code?
I have a method:
private String getHtmlData(Context context, String data){
String head = "<head><style>#font-face {font-family: 'verdana';src: url('file://"+ context.getFilesDir().getAbsolutePath()+ "/verdana.ttf');}body {font-family: 'verdana';}</style></head>";
String htmlData= "<html>"+head+"<body>"+data+"</body></html>" ;
return htmlData;
}
and I want to get this "data" from a URL. How can I do that?
Try this (written from memory):
URL google = new URL("http://www.google.com/");
BufferedReader in = new BufferedReader(new InputStreamReader(google.openStream()));
String input;
StringBuffer stringBuffer = new StringBuffer();
while ((input = in.readLine()) != null)
{
stringBuffer.append(input);
}
in.close();
String htmlData = stringBuffer.toString();
Sure you can. That's actually the response body. You can get it like this:
HttpResponse response = client.execute(post);
String htmlPage = EntityUtils.toString(response.getEntity(), "ISO-8859-1");
Please take a look at this. Any other parser will work too, or you can even write your own by checking the strings and retrieving just the part you want.

How to get data from html in android

I have fetched HTML data from a web page, but I want to get only the data, excluding the HTML tags.
I have tried this:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(urlText.getText().toString());
// Get the response
HttpResponse response = client.execute(request);
BufferedReader rd = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
StringBuilder sb = new StringBuilder();
String line = "";
while ((line = rd.readLine()) != null)
{
textView.append(line);
sb.append(line+"\n");
}
This gives me the whole HTML. Please tell me how I can get only the data.
Have you tried using Html.fromHtml(source)? Alternatively, use any Java HTML parser (if it works on Android) for this.
Here, source is your whole HTML-formatted data.
EDIT:
while ((line = rd.readLine()) != null)
{
sb.append(line+"\n");
}
String source = sb.toString();
textView.setText(Html.fromHtml(source));
Look at this example Android Parsing HTML Content Containing Links.

How to programmatically download an HTML page in Android and get its HTML?

I need to download an HTML page programmatically and then get its HTML. I am mainly concerned with downloading the page. If I download the page, where do I put it?
Do I have to keep it in a String variable? If yes, then how?
This site provides a good explanation of how to download a file, and also how to set the location where it should be stored. You do not have to, and should not, keep it in a string variable. If you need to manipulate the data, I would suggest you use an XML parser.
You can call this method in doInBackground of an AsyncTask:
String html = "";
String url = "ENTER URL TO DOWNLOAD";
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);
InputStream in = response.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder str = new StringBuilder();
String line = null;
while((line = reader.readLine()) != null)
{
str.append(line);
}
in.close();
html = str.toString();
