This is a simple scenario, but I have tried multiple times and still do not receive the data I am after. I am using an imported library called JSoup, which parses HTML.
I fetch the webpage's HTML document:
// url - The URL of the HTML document:
Document document = Jsoup.connect(url).get();
From there I know you parse data from tags. I want data inside this tag:
<pre>
Example scenario:
<pre> This is the String data inside this tag I wish to collect </pre>
If anyone could help me, I would be grateful (-:
Thanks all (-:
First, check the source of the page at the URL you are accessing to confirm that the pre tag actually exists.
Then you can use jsoup's select method to extract the pre tag. Sample code:
Document document = Jsoup.connect(url).get();
Elements eles = document.select("pre");
for (Element ele : eles) {
    // text() returns the text inside the tag; toString() would also include the <pre> markup
    System.out.println(ele.text());
}
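As a minimal sketch of the select approach above, the following parses an inline HTML string instead of fetching a URL, so it runs without a network connection. The class name PreTextExample and the helper extractPreText are made up for illustration; Jsoup.parse, select and text() are regular jsoup calls.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PreTextExample {

    // Hypothetical helper: returns the text of every <pre> element in the given HTML.
    static List<String> extractPreText(String html) {
        Document document = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element ele : document.select("pre")) {
            // text() gives the content without the surrounding <pre> markup
            result.add(ele.text().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<html><body><pre> This is the String data inside this tag I wish to collect </pre></body></html>";
        for (String s : extractPreText(html)) {
            System.out.println(s);
        }
    }
}
```

If the list comes back empty for a real page, the pre tag is probably not in the HTML that the server returns, which is why checking the page source first is worthwhile.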
I want to fetch some data from my website using Jsoup in my webview. The website is still in development so I can't post any code but here's what I want to achieve:
So the user visits the website where all data I require in the app is loaded onto one page. So I want to fetch all that data as separate strings and use them to fill my table layout. The website has all that I want with each string in a p tag with a unique id.
How can I achieve this? I already have jsoup installed but I can't get my head around how to use it.
Generally, to extract an element that has an id, use select("element_name#id_name").
To extract the text that the element contains, use .text().
Again, please show us the HTML you want to extract the text from.
In the meantime, try this code:
try {
Document doc = Jsoup.connect("your url").get();
System.out.println(doc.select("p#id_name").text());
} catch (Exception e) {
e.printStackTrace();
}
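Since the question asks for every p tag with an id as a separate string, here is a rough sketch that collects them all in one pass. It assumes the page looks roughly like the inline HTML below (the ids name and city are invented, as are the class ParagraphsById and the helper textById); for a live page you would replace Jsoup.parse(html) with Jsoup.connect(url).get().

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ParagraphsById {

    // Hypothetical helper: maps each <p> element's id to its text, in document order.
    static Map<String, String> textById(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> result = new LinkedHashMap<>();
        // "p[id]" matches every <p> that has an id attribute
        for (Element p : doc.select("p[id]")) {
            result.put(p.id(), p.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample markup; the real page structure is not shown in the question.
        String html = "<p id=\"name\">Alice</p><p id=\"city\">Oslo</p>";
        Map<String, String> values = textById(html);
        System.out.println(values.get("name"));
        System.out.println(values.get("city"));
    }
}
```

Each value can then be looked up by its id when filling the table layout.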
I am implementing a method to get JSON cell content from a Google spreadsheet.
Below is part of my content:
[{"scheme":"http://schemas.google.com/spreadsheets/2006","term":"http://schemas.google.com/spreadsheets/2006#list"}],"title":{"type":"text","$t":"traditional"},
"content":{"type":"text","$t":"ridicule: http://hkgalden.com/face/hkg/369.gif, adore: http://hkgalden.com/face/hkg/adore.gif, agree: http://hkgalden
.com/face/hkg/agree.gif, angel: http://hkgalden.com/face/hkg/a
ngel.gif, angry: http://hkgalden.com/face/hkg/angry
.gif, ass: http://hkgalden.com/face/hkg/ass.gif, banghead: http://hkgalden.com/face/hkg/
banghead.gif, biggrin: http://hkgalden.com/face/hkg/biggrin.gif
I want to extract the link from each entry, going from
biggrin: http://hkgalden.com/face/hkg/biggrin.gif
to
http://hkgalden.com/face/hkg/biggrin.gif
Using
link = data[0].split(": ")[1];
Sometimes this fails to get the value, so when I insert the record using ORMLite
I end up with an empty link.
What should I do next?
Would using substring with lastIndexOf help?
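One possible approach, sketched under the assumption that the failures come from entries where the name and the URL are not separated by exactly ": " (for example because of a line break, as in the wrapped content above): cut at the first colon and strip whitespace from the rest. The class LinkExtractor and the method extractLink are hypothetical names. Note that a plain lastIndexOf(':') would land on the colon inside "http:", so indexOf is used here instead.

```java
public class LinkExtractor {

    // Hypothetical helper: returns the URL part of an entry such as
    // "biggrin: http://hkgalden.com/face/hkg/biggrin.gif", or "" if there is no colon.
    static String extractLink(String entry) {
        int colon = entry.indexOf(':'); // first colon separates the name from the URL
        if (colon < 0) {
            return "";
        }
        // Remove spaces and line breaks; a URL should not contain any whitespace
        return entry.substring(colon + 1).replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(extractLink("biggrin: http://hkgalden.com/face/hkg/biggrin.gif"));
        System.out.println(extractLink("angel: http://hkgalden.com/face/hkg/a\nngel.gif"));
    }
}
```

Checking for an empty result before inserting with ORMLite would keep empty links out of the database.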
The string below is what I retrieve from one of the fields of a JSON response. How do I get the value of src in it? I really appreciate any help. Thanks in advance.
Getting better doesn’t stop because it’s getting colder. The best athletes don’t just overcome the elements, they embrace them with Nike Hyperwarm. Gear up for winter: http://www.nike.com/hyperwarm<br/><br/><img class="img" src="http://vthumb.ak.fbcdn.net/hvthumb-ak-ash3/t15/1095964_10151882078663445_10151882076318445_40450_2013_b.jpg" alt="" style="height:90px;" /><br/>Winning in a Winter Wonderland
Try the jsoup HTML parsing API: it has dedicated functionality for HTML parsing and also gives you an extensible solution.
For your case (I escaped the quotes with \ to make it a valid Java string):
String str = "Getting better doesn’t stop because it’s getting colder. The best athletes don’t just overcome the elements, they embrace them with Nike Hyperwarm. Gear up for winter: http://www.nike.com/hyperwarm<br/><br/><img class=\"img\" src=\"http://vthumb.ak.fbcdn.net/hvthumb-ak-ash3/t15/1095964_10151882078663445_10151882076318445_40450_2013_b.jpg\" alt=\"\" style=\"height:90px;\" /><br/>Winning in a Winter Wonderland";
Document doc = Jsoup.parse(str);
Element element = doc.select("img").first();
System.out.println(element.attr("src"));
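Element element2 = doc.select("a").first(); // first anchor tag, or null if the HTML has none
if (element2 != null) { // this sample string contains no <a>, so guard against null
    System.out.println(element2.attr("onclick")); // onclick attribute of the anchor tag
}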
Output:
http://vthumb.ak.fbcdn.net/hvthumb-ak-ash3/t15/1095964_10151882078663445_10151882076318445_40450_2013_b.jpg
OK, what I want to achieve is to put each result JSoup fetches into a separate String. Is this possible? I can get the first and last with a function, but then the rest is lost.
Right now I have this in my doInBackground:
// Connect to the web site
Document document = Jsoup.connect(url).get();
// Using Elements to get the Meta data
Elements titleElement = document.select("h2[property=schema:name]");
// Convert all matched elements to a single HTML string
date1 = titleElement.toString();
Log.e("Date", String.valueOf(Html.fromHtml(date1)));
With this I get a list of results, which is nice, but I'd like to have every result in a separate String.
Thanks in advance, if you need anything more please ask :)
I read through the documentation carefully again and found this:
titleElement.eq(n).text()
The n is the zero-based position of the match to get; .text() strips all the HTML and returns readable text.
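To get every result rather than one position at a time, a loop over the Elements works as well. This is only a sketch: the class SeparateResults, the helper textsOf and the sample HTML are assumptions, and inside doInBackground you would pass the document fetched with Jsoup.connect(url).get() instead of parsing a string.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SeparateResults {

    // Hypothetical helper: one String per element matched by the CSS query.
    static List<String> textsOf(String html, String cssQuery) {
        Document document = Jsoup.parse(html);
        Elements matches = document.select(cssQuery);
        List<String> texts = new ArrayList<>();
        for (Element match : matches) {
            texts.add(match.text()); // text() drops the tags and keeps the readable text
        }
        return texts;
    }

    public static void main(String[] args) {
        // Assumed sample markup using the selector from the question
        String html = "<h2 property=\"schema:name\">First</h2>"
                + "<h2 property=\"schema:name\">Second</h2>";
        List<String> dates = textsOf(html, "h2[property=schema:name]");
        for (String date : dates) {
            System.out.println(date);
        }
    }
}
```

Each entry of the list is then its own String, so dates.get(0) corresponds to eq(0).text().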
I am making an application for Android, and one element of its functionality is to return results from an online search of a library's catalogue. The application needs to display the results of the search, which is carried out by way of a custom HTML form, in a manner in keeping with the rest of the application. I.e., the results of the search need to be parsed and the useful elements displayed. I was just wondering if/how this could be achieved in Android?
You would use an HTML parser. One that I use and that works VERY well is JSoup.
It is a good place to start with parsing HTML. The Jericho HTML Parser is another good one.
You would retrieve the HTML document as a DOM and use the JSoup select() method to select any tags that you would like to get, either by tag, id, or class.
Solution
Use the Jsoup.connect(String url) method:
Document doc = Jsoup.connect("http://example.com/").get();
This connects to the HTML page at the URL and stores the parsed result as the Document doc, a DOM tree. You then read from it using the select() method.
Description
The connect(String url) method creates a new Connection, and get()
fetches and parses a HTML file. If an error occurs whilst fetching the
URL, it will throw an IOException, which you should handle
appropriately.
The Connection interface is designed for method chaining to build
specific requests:
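Document doc = Jsoup.connect("http://example.com")
        .userAgent("Mozilla") // illustrative values, not taken from the original post
        .timeout(3000)
        .get();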
If you read through the documentation on Jsoup you should be able to achieve this.
EDIT: Here is how you would use the select method
// Once the Document is retrieved above, use these selectors to extract the data you want by tag, id, or CSS class
Elements links = doc.select("a[href]"); // a with href
Elements pngs = doc.select("img[src$=.png]"); // img with src ending .png
Element masthead = doc.select("div.masthead").first(); // div with class=masthead
Elements resultLinks = doc.select("h3.r > a"); // direct a after h3
EDIT: Using JSOUP you can also get attributes, text, and HTML from an element:
Document doc = Jsoup.connect("http://example.com").get();
Element link = doc.select("a").first();
// The values in the comments assume the page body is:
// <p>An <a href='http://example.com/'><b>example</b></a> link.</p>
String text = doc.body().text(); // "An example link."
String linkHref = link.attr("href"); // "http://example.com/"
String linkText = link.text(); // "example"
String linkOuterH = link.outerHtml(); // "<a href="http://example.com/"><b>example</b></a>"
String linkInnerH = link.html(); // "<b>example</b>"
You can use XmlPullParser for parsing XML.
For example, see http://developer.android.com/reference/org/xmlpull/v1/XmlPullParser.html
Since the search results are HTML, and HTML is a markup language, you can use Android's XmlPullParser to parse the results, provided the HTML is well-formed enough to be read as XML.