Error fetching website source-code

Error fetching website source-code - android

I am trying to create a simple Android app that will have the possibility to fetch the source code of a website. Anyways, I have written the following:
WebView webView = (WebView) findViewById(R.id.webView);
try {
webView.setWebViewClient(new WebViewClient());
InputStream input = (InputStream) new URL(url.toString()).getContent();
webView.loadDataWithBaseURL("", "<html><body><p>"+input.toString()+"</p></body></html>", "text/html", Encoding.UTF_8.toString(),"");
setContentView(webView);
} catch (Exception e) {
Alert alert = new Alert(getApplicationContext(),
"Error fetching data", e.getMessage());
}
I've tried to change the 3rd line several times to other methods that will fetch the source code, but they all redirect me to the alert (error with no message, only the title).
What am I doing wrong?

Is there a particular reason why you can't just use this to load the webpage?
webView.loadUrl("http://www.example.com");
If you really want to grab the source code into a String so you can manipulate it and display it as you are trying to do, open a stream to the content and then use standard Java methods to read the data into a String, which you can then do whatever you want with:
InputStream in = new URL("http://www.example.com").openStream();
InputStreamReader isr = new InputStreamReader(in);
StringBuilder sb = new StringBuilder();
BufferedReader br = new BufferedReader(isr);
String read = br.readLine();
while (read != null) {
sb.append(read);
read = br.readLine();
}
br.close();
String sourceCodeString = sb.toString();
webView.loadDataWithBaseURL("http://www.example.com/", "<html><body><p>"+sourceCodeString+"</p></body></html>", "text/html", Encoding.UTF_8.toString(),"about:blank");
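A few things may explain the empty alert in the question. Calling toString() on an InputStream returns the object's identity, not its content. On recent Android versions, network access on the main thread throws NetworkOnMainThreadException, whose message is typically null, and the app also needs the INTERNET permission. Also, if the goal is to show the source rather than render it, the markup has to be escaped first. Here is a minimal sketch of the reading and escaping steps in plain Java. SourceViewer, readAll, escapeHtml, and toDisplayPage are hypothetical names, and UTF-8 is an assumption about the page encoding:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Android SDK.
final class SourceViewer {

    // Reads the whole stream as UTF-8 text (assumed encoding), keeping line breaks.
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Escapes the characters that would otherwise be interpreted as markup.
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Wraps the escaped source in a page that shows it as preformatted text.
    static String toDisplayPage(String source) {
        return "<html><body><pre>" + escapeHtml(source) + "</pre></body></html>";
    }
}
```

On Android you would call readAll on a background thread with the stream from the URL, then pass the result of toDisplayPage to webView.loadDataWithBaseURL on the UI thread.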

Related

Parse HTML text in Android

I'm trying to parse some HTML in my Android app and I need to get the text:
Pan Artesano Elaborado por Panadería La Constancia. ¡Esta Buenísimo!
in
Is there any easy way to get only the text and remove all html tags?
The behavior that I need is exactly the one shown in this PHP code http://php.net/manual/es/function.strip-tags.php

Document doc = Jsoup.parse(html);
Element content = doc.getElementById("someid");
Elements p= content.getElementsByTag("p");
String pConcatenated="";
for (Element x: p) {
pConcatenated+= x.text();
}
System.out.println(pConcatenated);//sometext another p tag

If you just want to display it, a WebView will do: pass the string to the WebView and it will render it.
If you want to use the text elsewhere, I don't know how to do that.
String data = "your html here";
WebView webview= (WebView)this.findViewById(R.id.webview);
webview.getSettings().setJavaScriptEnabled(true);
webview.loadDataWithBaseURL("", data, "text/html", "UTF-8", "");
You can also load a web URL directly: webview.loadUrl(url);

First, get the HTML with:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);
String html = "";
InputStream in = response.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder str = new StringBuilder();
String line = null;
while((line = reader.readLine()) != null)
{
str.append(line);
}
in.close();
html = str.toString();
Then I recommend wrapping the text in a custom tag in the HTML, such as <toAndroid></toAndroid>, so you can extract it with:
String result = html.substring(html.indexOf("<toAndroid>") + "<toAndroid>".length(), html.indexOf("</toAndroid>"));
For example, this HTML:
<toAndroid>Hello world!</toAndroid>
gives the result:
Hello world!
Note that you can place <p> tags inside the <toAndroid> tags and then remove them from the result in Java.
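The extraction step can be wrapped in a small helper that also copes with a missing tag. This is only a sketch: TagExtractor and extractBetween are hypothetical names, and it assumes the tag has no attributes and appears at most once.

```java
// Hypothetical helper for pulling the text out of a custom marker tag.
final class TagExtractor {

    // Returns the text between <tagName> and </tagName>, or null if either is missing.
    static String extractBetween(String html, String tagName) {
        String open = "<" + tagName + ">";
        String close = "</" + tagName + ">";
        int start = html.indexOf(open);
        if (start < 0) {
            return null;
        }
        start += open.length();           // skip past the opening tag itself
        int end = html.indexOf(close, start);
        if (end < 0) {
            return null;
        }
        return html.substring(start, end);
    }
}
```

Calling TagExtractor.extractBetween(html, "toAndroid") on the example above should give the text inside the tag. For anything more complex than a single marker tag, an HTML parser such as Jsoup is the safer choice.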

Trying to parse XML data from a web service: SAXParseException

I am attempting to parse a data document from open weather app. I am successfully reading in the entire file. I can put that entire file into a text view. I just need to parse that data. I get this error when I try to parse:
org.xml.sax.SAXParseException: Unexpected end of document
Here is my code for parse and reading the document in.
public void Weather(View view){
InputStream data;
final String OPEN_WEATHER_MAP_API =
"http://api.openweathermap.org/data/2.5/weather?q=";
StrictMode.ThreadPolicy policy = new StrictMode.ThreadPolicy.Builder().permitAll().build();
StrictMode.setThreadPolicy(policy);
try {
URL url = new URL(String.format(OPEN_WEATHER_MAP_API + City + "&mode=xml&appid=40f9dad632ecd4d87b55cb512d538b75"));
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
// connection.addRequestProperty("x-api-key", this.getString(R.string.open_weather_maps_app_id));
data = connection.getInputStream();
InputStreamReader inputStreamReader = new InputStreamReader(data);
BufferedReader Reader = new BufferedReader(inputStreamReader);
StringBuffer Weatherdata = new StringBuffer();
String storage;
while ((storage = Reader.readLine()) != null) {
Weatherdata.append(storage + "\n");
}
cityField.setText(Weatherdata.toString());
}
catch(Exception e){
e.printStackTrace();
cityField.setText("Fail");
return;
}
try {
DocumentBuilderFactory documetBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documetBuilderFactory.newDocumentBuilder();
Document xmlDocument = documentBuilder.parse(data);
Element rootElement = xmlDocument.getDocumentElement();
}
catch (Exception e){
e.printStackTrace();
}
}
I did a quick Google search; the only other person I found with this error hit it when the file was stored on the computer/phone.

It occurs because you have already reached the end of the InputStream by the time you try to parse your XML.
Indeed, when you display the stream content through the InputStreamReader, you move the stream "cursor" to the end of the stream.
So when you then try to parse it, the XML parser raises this end-of-document exception (if you replace the parsing code with a call to data.read(), it will return -1, which means you have already reached the end of the stream).
If you remove the InputStreamReader-related code, you will be able to parse the XML.
If you want to keep this code, note that the reset method (which moves the cursor back to the beginning of the stream) is not supported on HttpInputStream, so you should copy its content to a StringBuilder or a BufferedInputStream, for example.
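One way to keep both the display and the parsing is to read the stream once into a String and then hand the parser a fresh reader over that String. A minimal sketch, assuming the response is UTF-8; BufferedXml, readAll, and parse are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: consume the network stream once, parse from the copy.
final class BufferedXml {

    // Drains the stream into a String (UTF-8 is an assumption about the service).
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Parses XML text; a new StringReader starts at the beginning every time.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

In the question's method, that would mean calling readAll(connection.getInputStream()) once, passing the result to cityField.setText(...), and then passing the same String to parse(...).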

How to parse a full HTML page in Android

I am calling an HTML page via a web service. I need to get the whole source code of the HTML page.
My problem is that when I convert the HTTP response to a string, I get only part of the HTML page. How can I get the whole HTML page? Please help me.
// paramString1 = url, paramString2 = referer header, paramList = parameters
public String a(String paramString1, String paramString2, List paramList)
{
String str1 = null;
HttpPost localHttpPost = new HttpPost(paramString1);
localHttpPost.addHeader("Accept-Encoding", "gzip");
InputStream localInputStream = null;
try
{
localHttpPost.setEntity(new UrlEncodedFormEntity(paramList));
localHttpPost.setHeader("Referer", paramString2);
HttpResponse localHttpResponse = this.c.execute(localHttpPost);
int i = localHttpResponse.getStatusLine().getStatusCode();
localInputStream = localHttpResponse.getEntity().getContent();
Header localHeader = localHttpResponse.getFirstHeader("Content-Encoding");
if ((localHeader != null) && (localHeader.getValue().equalsIgnoreCase("gzip")))
{
GZIPInputStream localObject = null;
localObject = new GZIPInputStream(localInputStream);
Log.d("API", "GZIP Response decoded!");
BufferedReader localBufferedReader = new BufferedReader(new InputStreamReader((InputStream)localObject, "UTF-8"));
StringBuilder localStringBuilder = new StringBuilder();
while(true){
String str2 = localBufferedReader.readLine();
if (str2 == null)
break;
localHttpResponse.getEntity().consumeContent();
str1 = localStringBuilder.toString();
localStringBuilder.append(str2);
continue;
}
}
}
catch (IOException localIOException)
{
localHttpPost.abort();
}
catch (Exception localException)
{
localHttpPost.abort();
}
Object localObject = localInputStream;
return (String)str1;

Are you receiving the HTML in the variable paramString1? In that case, are you encoding the String somehow, or is it just plain HTML?
Maybe the HTML special characters are breaking your response. Try encoding the String with URL-safe Base64 on the server side and decoding it on the client side.
You can use the Base64 class from Apache Commons Codec.
Server Side:
Base64 encoder = new Base64(true);
encoder.encode(yourBytes);
Client side:
Base64 decoder = new Base64(true);
byte[] decodedBytes = decoder.decode(paramString1);
HttpPost localHttpPost = new HttpPost(new String(decodedBytes));

You may not be getting the complete source code in your StringBuilder; it may be exceeding what the StringBuilder can hold, since a StringBuilder is backed by an array. If you want to store the whole source, try this: write the InputStream data (which contains the HTML source) directly to a file. You will then have the complete source in that file and can perform whatever file operations you require. See if this helps.
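Another thing worth checking is the read loop in the question itself. There, str1 is assigned from the StringBuilder before the current line is appended, so the last line never reaches the result. consumeContent() is called inside the loop, which may end the read early. And nothing is read at all when the response is not gzipped. Here is a minimal sketch of a loop that avoids those three issues, using only java.io and java.util.zip. ResponseReader and readBody are hypothetical names, and UTF-8 is assumed as in the question:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: reads a complete response body, gzipped or not.
final class ResponseReader {

    static String readBody(InputStream raw, boolean gzipped) throws IOException {
        // Decompress only when the server said the body is gzip-encoded.
        InputStream in = gzipped ? new GZIPInputStream(raw) : raw;
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            // Read until end of stream; append first, build the String afterwards.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }
}
```

In the question's method, the gzipped flag would come from the existing Content-Encoding header check, and the stream from localHttpResponse.getEntity().getContent(). Reading by character buffer instead of by line also keeps the original line breaks.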

Working with ePub files in android

I followed this siegmann Android tutorial
and successfully logged the title, author name, and table of contents.
Now I read that the whole book can be viewed in a WebView.
But I can't find any tutorial on displaying an ePub file.
As for creating an ePub file, I found this on SO,
but I'm unable to implement it as I don't know anything about main.xml.
Kindly suggest a tutorial for creating and displaying an ePub file.
For creating an ePub, I tried to follow this siegmann example,
but I'm not able to understand it properly.
Do I need to provide an .html file for each chapter and a .css file in order to create an ePub file?
I know this question is a little unclear, as I'm an absolute beginner with ePub, so any suggestions/help are appreciated.

Try this in logTableOfContents()
while ((line = r.readLine()) != null) {
line1 = line1.concat(Html.fromHtml(line).toString());
}
finalstr = finalstr.concat("\n").concat(line1);

You can also walk through the ePub content via its spine:
Spine spine = book.getSpine();
List<SpineReference> spineList = spine.getSpineReferences() ;
int count = spineList.size();
StringBuilder string = new StringBuilder();
for (int i = 0; count > i; i++) {
Resource res = spine.getResource(i);
try {
InputStream is = res.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
try {
while ((line = reader.readLine()) != null) {
linez = string.append(line + "\n").toString();
System.err.println("res media"+res.getMediaType());
htmlTextStr = Html.fromHtml(linez).toString();
Log.e("Html content.",htmlTextStr);
speak(htmlTextStr);
}
} catch (IOException e) {e.printStackTrace();}
//do something with stream
} catch (IOException e) {
e.printStackTrace();
}
}
webview.getSettings().setAllowFileAccess(true);
webview.getSettings().setBuiltInZoomControls(true);
webview.getSettings().setJavaScriptEnabled(true);
webview.loadDataWithBaseURL("file:///android_asset/", linez, "application/xhtml+xml", "UTF-8", null);

android: parse html from page

I would like to parse some text out of a page.
Is there an easy way to save the product info into a string, for example? Example URL: http://upcdata.info/upc/7310870008741
Thanks

Jsoup is excellent at parsing simple HTML from Android applications:
http://jsoup.org/
To get the page, just do this:
URL url = new URL("http://upcdata.info/upc/7310870008741");
Document document = Jsoup.parse(url, 5000);
Then you can parse out whatever you need from the Document. Check out this link for a brief description of how to extract parts of the page:
http://jsoup.org/cookbook/extracting-data/dom-navigation

If you want to read from a URL into a String:
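StringBuilder myString = new StringBuilder();
try {
String thisLine;
URL u = new URL("http://www.google.com");
// DataInputStream.readLine() is deprecated; BufferedReader decodes characters properly.
BufferedReader theHTML = new BufferedReader(new InputStreamReader(u.openStream(), "UTF-8"));
while ((thisLine = theHTML.readLine()) != null) {
myString.append(thisLine);
}
theHTML.close();
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}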
// call toString() on myString to get the contents of the file your URL is
// pointing to.
This will give you a plain old string, HTML markup and all.

String tmpHtml = "<html>a whole bunch of html stuff</html>";
String htmlTextStr = Html.fromHtml(tmpHtml).toString();
