I am loading a website, for example 'http://example.com', in a WebView.
Suppose this page contains an <a> element:
<a class="x" href="/test1">Click here</a>
How can I get this element 'x' and the value of its href from the WebView?
You need a way to read the HTML back into your Java code, as shown in this answer:
how to get html content from a webview?
Once you have the HTML content, you need a parser to extract the element (and data) you need.
In the past I've used jsoup to navigate the HTML and it worked really well; you can find it at https://jsoup.org/.
You could extract the class names and the href value like this (only a concept):
Document yourPage = Jsoup.parse(htmlString);
// "a.x" is a CSS selector for the <a class="x"> element from the question
Element aElement = yourPage.select("a.x").first(); // null if nothing matches
Set<String> classNames = aElement.classNames();
String url = aElement.attr("href");
If you need help, the jsoup documentation has a pretty nice intro tutorial.
Related
I'm using jsoup and need to know the text position of an Element or Node. Example: given the HTML <p><span>1</span></p>, I need to know that the position of <p> is 0, <span> is 4, </span> is 10... How can I do that?
Currently you can't do this in jsoup, because it does not keep track of the positions of tags in the original input. There was an earlier discussion about this (JSOUP HTML Parser).
The solution is to use another parser that explicitly supports this feature. The other post suggested Jericho.
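For small, well-formed fragments like the one in the question, a plain regex scan over the original string is a library-free alternative. This is a swapped-in technique, not what Jericho or jsoup do, and it will misfire on comments, scripts, or a '>' inside an attribute value. The names TagOffsets and find are hypothetical, and the offsets are zero-based, so <span> is reported at 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal sketch: report where each tag starts in the original HTML string. */
final class TagOffsets {

    // Matches an opening or closing tag such as <p>, <span class="x"> or </span>.
    private static final Pattern TAG = Pattern.compile("</?[A-Za-z][^>]*>");

    /** Returns entries like "<span>@3", using zero-based character offsets. */
    static List<String> find(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            result.add(m.group() + "@" + m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [<p>@0, <span>@3, </span>@10, </p>@17]
        System.out.println(find("<p><span>1</span></p>"));
    }
}
```

For real pages, a parser that records source positions remains the more robust choice.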
I am currently trying to extract selected headlines from the HTML content in my WebView. I am looking at a wide variety of options, such as JSON parsing; any hack will do. Has anyone had experience with this, or a brief idea of how to go about it?
Here's my example:
This is my html file content:
<div><h1><span class = "headline"> Some depressing title </span> <span class = "source" > ABCD </span> </h1> <br/> <span class = "body"> crappy body content which I do not need </span></div>
I just want to retrieve the "headline" and "source" from this HTML in my WebView, nothing else (not the body). How do I define a parameter to retrieve these? Any clues on how to do it?
Thanks!
Step 1: Get the HTML source from your WebView (see this question). You basically create a JS interface that passes the HTML source to a Java String.
Step 2: Use an HTML parser (for example jsoup) to parse the Java String into a format that you can handle easily.
Step 3: Use the parser to extract the relevant information. Here, you could use getElementsByTag("span") to get all the spans and then filter by class, or you could directly use getElementsByClass("headline") and getElementsByClass("source").
In general, you can retrieve the HTML source and parse the DOM in all cases.
Edit: if you don't want to use a parser, you can extract the information by searching the HTML source string (finding the correct classes, then finding the indexes of the '<' and '>' characters to parse out the information). This way is harder, less efficient, and less flexible, but it can be done.
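The parser-free route could look roughly like the sketch below. It assumes the class attribute is written exactly as in the question (class = "headline", with spaces around the equals sign) and that the span contains only text; the class and method names (SpanTextExtractor, textOfSpan) are made up for illustration.

```java
/** Minimal sketch: pull the text of the first span with a given class using plain string searches. */
final class SpanTextExtractor {

    /**
     * Returns the trimmed text of the first span whose class attribute is written
     * exactly as class = "name" (the spacing used in the question), or null if absent.
     */
    static String textOfSpan(String html, String className) {
        String marker = "class = \"" + className + "\"";
        int attr = html.indexOf(marker);
        if (attr < 0) return null;
        int open = html.indexOf('>', attr);       // end of the opening <span ...> tag
        if (open < 0) return null;
        int close = html.indexOf('<', open + 1);  // start of the closing </span> tag
        if (close < 0) return null;
        return html.substring(open + 1, close).trim();
    }

    public static void main(String[] args) {
        String html = "<div><h1><span class = \"headline\"> Some depressing title </span> "
                + "<span class = \"source\" > ABCD </span> </h1> <br/> "
                + "<span class = \"body\"> crappy body content which I do not need </span></div>";
        System.out.println(textOfSpan(html, "headline")); // Some depressing title
        System.out.println(textOfSpan(html, "source"));   // ABCD
    }
}
```

Any change in attribute spacing, quoting, or nesting breaks this lookup, which is why the parser route is usually the better one.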
Instead of going through VoiceOver or similar software, I want a function which can take an element-id as parameter and return the alt text or label so that I can validate whether the text is correct.
Any other suggestions welcome.
You could use HttpClient to fetch the HTML code from the web and the jsoup library to parse it, then read the attributes of the selected element. Download the jsoup jar and put it into the lib directory of your project.
Document doc = Jsoup.parse("..."); // ... is the string of HTML code
Element inputElement = doc.select("#...").first(); // ... is the id of your element
String alt = inputElement.attr("alt"); // read the "alt" attribute
I am fetching some HTML content from my server using JSON parsing, but this converts my HTML content to unicode values.
For example, <p>Spend minimum $10 (in a single same-day receipt) at any outlet<\/p> is converted to
&lt;p&gt;Spend minimum $10 (in a single same-day receipt) at any outlet &lt;/p&gt;
Now if I try to set this in my WebView, it displays the HTML tags themselves. If I try to encode the data using TextUtils.encode, it displays the text with unicode values.
Can anyone help me with this?
How should I fetch HTML content and display it in a WebView?
I don't fully understand your question, but if you want to load HTML in a WebView you can use
webView.loadDataWithBaseURL(null, html, "text/html", "UTF-8", null);
and if you want to convert notation like &lt; and &gt;, you can use the jsoup library.
Thanks for your help, everyone, but I have solved this issue myself. Here is how I solved it.
What I did:
1) Convert the escaped value to a Spanned like this:
Spanned ss = Html.fromHtml("&lt;p&gt;Spend minimum $10 (in a single same-day receipt) at any outlet &lt;/p&gt;");
2) Now convert this Spanned to a String like this:
String tempString=ss.toString();
3) And now load it into the WebView, which solved the problem:
webView.loadData(tempString, "text/html","UTF-8");
Actually it isn't the JSON encoder that converts the data to HTML entities but some other layer, before the data is passed to the JSON encoder.
JSON has nothing to do with HTML tags; usually only quotes are escaped (Unicode is supported by most parsers).
You probably need to either change the way the server returns the data, so that the angle brackets of HTML tags are not encoded as HTML entities, or decode the entities back in your app.
Update:
To decode HTML entities used in HTML tags (and others too) you may use StringEscapeUtils.unescapeHtml4() from Apache Commons Lang 3 (unescapeHtml() in Commons Lang 2).
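If pulling in a library just for this is not an option, the few entities that matter here can be decoded by hand. The sketch below is a minimal illustration, not a full decoder: it only handles &lt;, &gt;, &quot;, &#39; and &amp;, and the class and method names (EntityDecoder, decodeBasicEntities) are made up for this example.

```java
/** Minimal sketch: decode the handful of HTML entities that hide tag brackets. */
final class EntityDecoder {

    static String decodeBasicEntities(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#39;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
    }

    public static void main(String[] args) {
        String fromServer = "&lt;p&gt;Spend minimum $10 at any outlet&lt;/p&gt;";
        // prints <p>Spend minimum $10 at any outlet</p>
        System.out.println(decodeBasicEntities(fromServer));
    }
}
```

The decoded string can then be passed to WebView.loadDataWithBaseURL as ordinary HTML. For anything beyond these five entities, a library decoder is the safer choice.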
To show an HTML page inside the WebView, why do you need JSON? Create a WebView in the XML layout and put the code below inside the Activity; you will then see the HTML page.
webView = (WebView)findViewById(R.id.webView);
FrameLayout mContentView = (FrameLayout) getWindow().
getDecorView().findViewById(android.R.id.content);
final View zoom = this.webView.getZoomControls();
mContentView.addView(zoom, ZOOM_PARAMS);
zoom.setVisibility(View.GONE);
webView.loadUrl("http://www.google.co.in/");
I am trying to retrieve the image URL from this HTML page.
The image is inside the editions box on the webpage. How would I go about getting it using the jsoup selector method?
Such as
Document doc = Jsoup.connect(url).get();
Element png = doc.select(//What would the tag be?);
I have an idea of how to set it up, just not how to retrieve the tag.
doc.select("div.box-art").select("img").attr("abs:src");
From the docs it looks like doc.select(".box-art img") should do the trick. (It selects img elements that are descendants of an element with class box-art.) Note that select returns an Elements collection, so this can match multiple imgs; call first() to take only the first one.