I am trying to parse the source of a website called dgtle.com.
To get the top news, I wrote:
Document doc = Jsoup.connect("http://www.dgtle.com").get();
Elements blocks = doc.getElementsByClass("listcs1");
I get nothing but a NullPointerException when doing this, even though the div with the class "listcs1" really exists. This troubles me, and I'm wondering whether anybody can help me deal with it.
Use doc.select instead of getElementsByClass.
For example:
Elements blocks = doc.select("div.listcs1");
This will get all divs with the class "listcs1".
Jsoup lists all of the selection combinations here: http://jsoup.org/cookbook/extracting-data/selector-syntax
HtmlSpanner with css
I have found a library called HtmlSpanner that should help me display an HTML string with CSS in a TextView,
but I can't find any documentation on it except
(new HtmlSpanner()).fromHtml()
and nothing on how to include a CSS file or how to create a TagManager for handling the CSS.
Can anyone help me?
Out of the box HtmlSpanner does parse <style> blocks, and is able to apply the CSS styles in those blocks to the text.
The code for that is in the StyleNodeHandler handler class.
Now the good news is that it's pretty easy to add new TagNodeHandler classes, and in your case all you'd need is to add one that does the following:
1. Read the "href" property from the CSS link
2. Retrieve the URL that the href points to and read it into a String
3. Parse the String into a CSS rule
4. Register that CSS rule
Steps 3 and 4 are already in the StyleNodeHandler class (in the parseCSSFromText method), so you'd only need to implement steps 1 and 2.
Here's a quick Gist of what you'd need to add:
link
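For step 2, here is a minimal plain-Java sketch of reading a stylesheet into a String. The class name StylesheetLoader and the method readAll are made up for illustration and are not part of HtmlSpanner; the sketch also assumes the CSS is UTF-8 encoded. In a real handler the stream would come from the URL found in the href attribute, opened off the main thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HtmlSpanner.
final class StylesheetLoader {

    // Reads the whole stream into a String, assuming UTF-8 encoded CSS.
    static String readAll(InputStream in) {
        try (InputStream input = in;
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = input.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the stream you would get from the stylesheet URL.
        InputStream fake = new ByteArrayInputStream(
                "p { color: red; }".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll(fake));
    }
}
```

The resulting String is what you would then hand to the existing CSS parsing code mentioned above.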
I am reading the Head First Android Development book. In the third chapter, they build an app from the NASA RSS feed. In the book the author uses the SAX parser for Java. I looked online and some of the answers here on SO suggest that SAX is outdated and there are newer solutions.
However I'm not sure what the easier to use ones are for Java. I have used Nokogiri for Ruby and something similar would be awesome. I looked at jsoup and it looked alright, but I am wondering what suggestions you guys might have.
I'm the author of Head First Android Development, so just wanted to chime in with a few thoughts. SAX is definitely a bit cumbersome, but straightforward and was built into Android for a while (hence the decision to use that in the book). I'm also a rails developer and I'm a big fan of nokogiri and use it often. Looking at jsoup, I could definitely see that being useful. That said, I haven't tried it out, so I can't give any first hand experience with it.
Another option to look at is the XML PullParser built into Android. It's still pretty SAX-like, but a bit more full featured.
Hope this helps.
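To give a feel for the pull style, here is a small sketch that collects every <title> from a feed. Note that it uses the JDK's StAX API (javax.xml.stream) rather than Android's XmlPullParser, so it runs on a desktop JVM; the loop over events is similar in spirit. The class name FeedTitles and the sample feed are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; StAX stands in for XmlPullParser here.
final class FeedTitles {

    // Collects the text of every <title> element, in document order.
    static List<String> titles(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the element's text and moves to its end tag.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up feed, only for illustration.
        String feed = "<rss><channel><title>Feed</title>"
                + "<item><title>Image of the Day</title></item></channel></rss>";
        System.out.println(titles(feed));
    }
}
```

Unlike SAX, the code asks the parser for the next event instead of being called back, which usually keeps the state handling simpler.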
The code in chapter 3 halts because Android doesn't allow networking on its main thread.
So you can use any parser, such as XmlPullParser, but make sure you do the networking (downloading the feed, etc.) off the main thread. You can use AsyncTask to move the networking off the main thread, or create a new Thread() and do the networking in that thread (recommended).
Actually, in the 4th chapter they DID create a new thread to do the networking, so if you use the chapter 4 code instead, it will work.
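As a rough plain-Java illustration of moving the download onto a worker thread, something like the following could work. The class BackgroundFetch and the method downloadFeed are placeholders invented here, and downloadFeed only pretends to download. The sketch blocks on the result just so it can print it; in an Android app you would hand the result back to the UI thread instead of waiting for it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example class.
final class BackgroundFetch {

    // Stand-in for the real download; a real app would open a network connection here.
    static String downloadFeed() {
        return "feed fetched on " + Thread.currentThread().getName();
    }

    // Runs downloadFeed() on a dedicated worker thread and returns its result.
    static String fetchOnWorker() {
        ExecutorService executor =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "feed-worker"));
        try {
            Callable<String> task = BackgroundFetch::downloadFeed;
            return executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the feed", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Download failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchOnWorker());
    }
}
```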
Another problem you might face is an OutOfMemoryError, because the NASA daily images are really big these days. So you'll have to decode the image with inSampleSize. You can check other questions on decoding an image correctly to get what you want. Good luck.
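For the sample size, a common approach is to keep doubling it while the downscaled image would still be at least as large as the size you need. The sketch below shows only that arithmetic in plain Java; the class name SampleSize and the example dimensions are made up. On Android the result would go into BitmapFactory.Options.inSampleSize before decoding.

```java
// Hypothetical helper showing only the sample-size arithmetic.
final class SampleSize {

    // Returns the largest power of two that keeps both dimensions
    // at or above the requested width and height.
    static int calculateInSampleSize(int width, int height, int reqWidth, int reqHeight) {
        int inSampleSize = 1;
        if (height > reqHeight || width > reqWidth) {
            int halfHeight = height / 2;
            int halfWidth = width / 2;
            while ((halfHeight / inSampleSize) >= reqHeight
                    && (halfWidth / inSampleSize) >= reqWidth) {
                inSampleSize *= 2;
            }
        }
        return inSampleSize;
    }

    public static void main(String[] args) {
        // A 4000x3000 source shown in a 500x500 view.
        System.out.println(calculateInSampleSize(4000, 3000, 500, 500)); // prints 4
    }
}
```

With a sample size of 4, the 4000x3000 image is decoded at roughly 1000x750, which needs about a sixteenth of the memory.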
I'm a big fan of Jsoup. I only recently started using it and it's amazing. I used to write some super hairy regex patterns to do pattern matching because I wanted to avoid SAX like the plague, and that was quite tedious, as you can imagine. Jsoup let me parse out specific items from a <table> in just a few lines of code.
Let's say I want to take the first 7 rows of a table where the <tr class=...> is GridItem or GridAltItem. Then, let's say we want to print the 1st, 2nd, and 5th columns as text and then the first <a href> link that appears in the row. Sounds goofy, but I had to do this and I can do this easily:
String page = "... some html markup fetched from somewhere ...";
Document doc = Jsoup.parse(page);
// Rows whose class attribute ends in "Item", i.e. GridItem and GridAltItem
Elements rows = doc.select("tr[class$=Item]");
for (int x = 0; x < 7 && x < rows.size(); x++) {
    Element gridItem = rows.get(x);
    Elements cells = gridItem.select("td");
    System.out.println("row: " + cells.get(0).text() + " " + cells.get(1).text() + " " + cells.get(4).text() + " " + gridItem.select("a").get(0).attr("href"));
}
It's that simple with Jsoup. Make sure you add the Jsoup jar file to your project as a library and import the classes you need: you don't want to import the wrong Document or Element class...
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
Enjoy!
I think SAX is the default way to achieve it, though nothing stops you from trying something new :)
Since version 1.6.2, Jsoup officially also supports XML parsing. This allows you to parse XML and select elements using jQuery-like CSS selectors. To create an XML document with Jsoup, use the following instead of the Jsoup#parse() method:
Document document = Parser.xmlParser().parseInput(xmlString, "");
// ...
This way the input won't implicitly be treated as HTML5 (so, no auto-included <html><head> tags and so on).
I know that in Jsoup when you want to find a certain Element with a link in it, you can do this:
Document doc = Jsoup.parse(text);
Elements links = doc.select("[href]");
This however, takes all links to every website in the page...
But what if I have multiple links, and I only want to retrieve the ones specifically linking to Google? For instance:
Google
Bing
Another Google
And I want it to take only those with google in them. I tried doing something like this:
Elements links = doc.select("[href=\"http://www.google.com\"]");
But this doesn't work. Does anyone have a suggestion?
Have you tried simply this:
Elements links = doc.select("[href=http://www.google.com]");
// Or:
links = doc.select("a[href=http://www.google.com]");
// Or, with the 'attribute contains' form, the most likely to work:
links = doc.select("a[href*=google]");
What would be the best approach to:
get the HTML by calling a URL and display it in some views
make sure that when I edit the value of the title tag, it is automatically updated at the URL
How can I do this?
Check out the JSoup cookbook.
http://jsoup.org/cookbook/
This has a lot of good information in it and should be able to answer most of your questions.
You are probably looking for something along the lines of this: http://jsoup.org/cookbook/input/parse-body-fragment
I need to get data from an XML file in Android. On the iPhone environment, my code is:
NSURL *thisURL = [[NSURL alloc] initWithString:@"http://www.xxx.com/file.xml"];
NSArray *myArray = [[NSArray alloc] initWithContentsOfURL:thisURL];
myArray is now an array of dictionary items initialized with contents from file.xml.
Is there any way to do this in Android? Can someone point me to doc or sample code?
I'm new to the Android environment and just need some direction.
Thanks,
Kevin
See Working with XML in Android for a variety of methods for dealing with XML. Which method to use depends on how big your XML is, and what you want to do with it.
I'm not sure how it makes any sense to turn XML into an array, so no, none of the methods do that. If you want something similar to that, use JSON instead of XML.
After a bit of research, it appears to me that using the Simple XML Serialization framework is going to be my best bet, especially since I do have a relatively simple XML file to read. The result will be a 'list' class with several 'entry' classes which seems like a viable way to handle this...probably better than having an array of classes as was done in the iPhone app.
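For comparison, here is a rough sketch of getting something close to the iPhone "array of dictionaries" with only the DOM parser that ships with Java, not with the Simple framework. The class name EntryReader, the element names (list, entry, name, url) and the sample data are all assumptions made up for the example; the real file.xml will look different.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class.
final class EntryReader {

    // Turns every <entry> element into a map of child element name to text.
    static List<Map<String, String>> readEntries(String xml) {
        List<Map<String, String>> entries = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Map<String, String> fields = new LinkedHashMap<>();
                NodeList children = nodes.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes between elements.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        fields.put(child.getNodeName(), child.getTextContent().trim());
                    }
                }
                entries.add(fields);
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return entries;
    }

    public static void main(String[] args) {
        // Made-up data, only for illustration.
        String xml = "<list>"
                + "<entry><name>Alpha</name><url>http://example.com/a</url></entry>"
                + "<entry><name>Beta</name><url>http://example.com/b</url></entry>"
                + "</list>";
        for (Map<String, String> entry : readEntries(xml)) {
            System.out.println(entry.get("name") + " -> " + entry.get("url"));
        }
    }
}
```

A typed 'entry' class, as planned above, is usually nicer to work with than maps once the set of fields is known.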