There is a remote XML file, and I need to get a set of XML nodes from it and show them in an activity. One way is to download it completely, evaluate an XPath expression, and get the node set as follows:
// using node I can drill further down to collect my relevant info
NodeList names = (NodeList) xPath.evaluate("/library/artists/name", new InputSource(new FileReader(getDowloadedXMLFileLocation())), XPathConstants.NODESET);
The other way is to use a pull parser, which I guess would take a bit more code for the simple job of extracting a node set. Which method performs better in terms of memory and speed?
Customized code versus a generic solution equates to high performance versus ease of use.
XPath (and XSLT, JDOM, etc.) builds a DOM-like model of the document and iterates over it, which takes more RAM and CPU. XPath is easy to use, but it can be really slow for large files, and it requires you to have the complete file at hand. Use it for small files.
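For illustration, here is a minimal, self-contained sketch of the XPath approach. It assumes the XML is already in memory as a string; the class name XPathArtists and the sample document are made up for the example. The whole input is parsed into a tree before the expression is evaluated.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class XPathArtists {
    // Returns the text of every /library/artists/name element.
    static List<String> artistNames(String xml) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            // The complete document is turned into a node tree before matching.
            NodeList nodes = (NodeList) xPath.evaluate(
                    "/library/artists/name",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><artists><name>Ann</name><name>Bob</name></artists></library>";
        System.out.println(artistNames(xml)); // prints [Ann, Bob]
    }
}
```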
A pull parser, on the other hand, hands you only the parts you ask for: it still scans the input sequentially, but you skip over everything you are not interested in. Also, only one node is held in RAM at a time. There is no limit on how large the input can be, and it does not matter whether it is fully available or still being streamed over a medium. Use it for large or streamed files.
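A rough sketch of the pull style follows. To keep it runnable on a plain JVM it uses the JDK's StAX XMLStreamReader as a stand-in for Android's XmlPullParser; the loop has the same shape, but the class and method names differ, so treat this as an assumption-laden illustration rather than Android code. The class name PullArtists is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class; StAX stands in for Android's XmlPullParser here.
class PullArtists {
    private static final List<String> TARGET = Arrays.asList("library", "artists", "name");

    // Collects the text of every /library/artists/name element without building a tree.
    static List<String> artistNames(String xml) {
        List<String> names = new ArrayList<>();
        List<String> path = new ArrayList<>(); // names of the currently open elements
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    path.add(reader.getLocalName());
                    if (path.equals(TARGET)) {
                        // getElementText() consumes up to the matching end tag,
                        // so the element has to be closed in our path by hand.
                        names.add(reader.getElementText());
                        path.remove(path.size() - 1);
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    path.remove(path.size() - 1);
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<library><artists><name>Ann</name><name>Bob</name></artists>"
                + "<genres><name>Jazz</name></genres></library>";
        System.out.println(artistNames(xml)); // prints [Ann, Bob]
    }
}
```

Only the path of open element names and the collected results are kept in memory, regardless of how large the document is.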
An excellent answer I found at Best practices for parsing XML
You can also use XmlPullParser to parse XML. But I suggest that, if possible, you use JSON instead of XML: it uses less memory, and Google provides a wrapper for it in Gson, which works very well.
Links.
1) http://vtd-xml.sourceforge.net/
Related
Looking for an explanation on tree parser vs stream parser.
From what I have been researching, the built-in JSON parser in Android is a tree parser and the Jackson JSON parser is a stream parser. Also, Android's XML pull parser is a stream parser.
My question is: what is a tree parser, and could you explain the difference between a stream parser and a tree parser? A Google I/O presenter mentioned that tree parsers use much more battery and should be avoided in favor of a stream parser.
UPDATE: Is a tree parser the same thing as a DOM parser? I mean, are the terms equivalent?
A tree parser returns a complete parse of the text. Therefore, it doesn't give an answer until the entire text has been parsed.
In contrast, a stream parser returns information as it is processing the text. It is up to you then to build a tree if you so choose. In algorithms, this difference is the difference between what is called a batch or off-line algorithm (tree parsing) versus an on-line algorithm (stream parser).
See What's the difference between an on-line and off-line algorithm?.
So why would you choose one versus the other? The Google I/O presenter mentioned battery life. But that's a result of the more general principle that you need more memory to store a tree for the entire text and more processing time to read the entire text (assuming the stream parser can quit early).
If you are looking for specific information that uses a small part of the text such as finding the first tag in a DOM or an XML document, then the stream approach is probably the way to go.
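As a sketch of the "quit early" case, the following reads events only until the first element with a given name and then stops. It uses the JDK's StAX XMLStreamReader as one example of a stream parser; the class name FirstMatch and the method firstText are made up, and the code assumes the matching element contains text only.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of any library.
class FirstMatch {
    // Returns the text of the first <tag> element, or null if there is none.
    // Assumes that element holds text only (getElementText() rejects child elements).
    static String firstText(String xml, String tag) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(tag)) {
                        // Returning here means later events are never requested.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library><artists><name>Ann</name><name>Bob</name></artists></library>";
        System.out.println(firstText(xml, "name")); // prints Ann
    }
}
```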
If, on the other hand, you need to find all tags, or tags of various sorts (which you might think of as several conceptual passes over the document), or if you'll come back to that text/tree over and over again, then you might want to do the parsing once and work off the resulting tree rather than make several passes over the text.
Similarly if the kind of information you need is best answered by thinking about the problem as a tree: getting or passing information from children nodes, sibling nodes, and/or ancestor nodes, then you'll probably want to go the tree approach. But...
In theory you can always turn a streaming parser into a tree parser by doing the work to build the tree as you go along. And that's extra code you have to write.
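To give an idea of how much extra code that is, here is a minimal sketch that builds a tree from SAX events. The TreeNode class is a made-up, deliberately tiny structure (name, text, children) and ignores attributes and namespaces; a real tree model would need more.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library.
class TreeFromStream {
    // Minimal tree node: element name, accumulated text, child elements.
    static final class TreeNode {
        final String name;
        final StringBuilder text = new StringBuilder();
        final List<TreeNode> children = new ArrayList<>();
        TreeNode(String name) { this.name = name; }
    }

    static TreeNode parse(String xml) {
        final Deque<TreeNode> open = new ArrayDeque<>(); // currently open elements
        final TreeNode[] root = new TreeNode[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                TreeNode node = new TreeNode(qName);
                if (open.isEmpty()) {
                    root[0] = node;                    // first element is the root
                } else {
                    open.peek().children.add(node);    // attach to the innermost open element
                }
                open.push(node);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (!open.isEmpty()) {
                    open.peek().text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.pop();
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        TreeNode root = parse("<library><artists><name>Ann</name><name>Bob</name></artists></library>");
        TreeNode artists = root.children.get(0);
        System.out.println(root.name + " -> " + artists.name + " -> " + artists.children.get(1).text);
    }
}
```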
The difference between a stream parser and a tree parser is like the difference between a Python iterator/generator and a list (equivalently, a Ruby enumerator and an array).
I want to create an app that uses a potentially large XML file. The app will also modify the document and, ideally, be able to traverse it in reverse.
I know there is SAX, DOM, and the XML pull parser. The pull parser is out, unless I spend memory on creating my own tree of objects which does not seem feasible.
That leaves SAX and DOM unless there is another parser out there that can do what I want. Highly improbable, I know.
Yes, I saw this answer: https://stackoverflow.com/questions/7498616/which-xml-parser-should-i-use-for-android
Thoughts on having tree like usability without having to use DOM?
There are a lot of options when it comes to parsing XML, but which parser you should use depends on your own requirements. For that you need to know the basic differences between the parsers. Here is some basic information.
A SAX parser is one where your code is notified as the parser walks through the XML tree, and you are responsible for keeping track of state and constructing any objects you might want in order to keep track of the data as the parser marches through.
A DOM parser reads the entire document and builds up an in-memory representation that you can query for different elements. Often, you can even construct XPath queries to pull out particular pieces.
And since, as you said, you have a large file and also want faster performance, I suggest that you use a StAX parser. Here is a link for that.
Hope this will help you...
Also refer to this link.
DOM is better for most cases, since it loads the whole XML document at once. But if the XML is very big, we should go for a SAX parser, which reads through the document sequentially from the start, tag by tag, without keeping it all in memory.
If the XML is really big, it is better to filter on the server side by sending the requirements in the request; otherwise we can use pagination, which is advisable.
Which parsers are available on Android for XML parsing? Right now I know only the SAX, XmlPullParser and DOM parsers. It would be really great if someone could give an efficiency comparison of these parsers.
Thanks,
Stone
This article covers exactly what you asked about:
http://www.ibm.com/developerworks/opensource/library/x-android/
As for efficiency, the author discusses it toward the end.
SAX:
1. Parses node by node
2. Doesn't store the XML in memory
3. We can't insert or delete a node
4. Top-to-bottom traversal
DOM:
1. Stores the entire XML document in memory before processing
2. Occupies more memory
3. We can insert or delete nodes
4. Traversal in any direction
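The DOM points above (any-direction traversal, deleting nodes) can be sketched with the standard org.w3c.dom API as follows. The class name DomNavigation and the sample document are made up; the sample contains no whitespace between tags, so there are no whitespace-only text nodes to step over.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of any library.
class DomNavigation {
    // Loads the whole document into memory as a DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Walks the child elements of parent from last to first.
    static List<String> childTextsInReverse(Node parent) {
        List<String> texts = new ArrayList<>();
        for (Node n = parent.getLastChild(); n != null; n = n.getPreviousSibling()) {
            if (n instanceof Element) {
                texts.add(n.getTextContent());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        Document doc = parse("<library><artists><name>Ann</name><name>Bob</name></artists></library>");
        Node artists = doc.getElementsByTagName("artists").item(0);
        System.out.println(childTextsInReverse(artists));           // backwards traversal
        Node first = artists.getFirstChild();
        System.out.println(first.getParentNode().getNodeName());    // upwards: parent of <name>
        artists.removeChild(first);                                  // delete a node
        System.out.println(childTextsInReverse(artists));
    }
}
```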
check this:
SAX parser vs XMLPull parser
http://www.developer.com/ws/article.php/3824221/Android-XML-Parser-Performance.htm
http://www.differencebetween.net/technology/difference-between-sax-and-dom/
Another XML Parser that I've found very useful is the Simple Java XML Parser (SJXP) available at http://www.thebuzzmedia.com/software/simple-java-xml-parser-sjxp/. It uses the XPP3 pull parser, and aims for efficiency while still being very simple to use.
The source code is also available, with great inline commenting if you want to see how it works.
I want to keep some information in an XML file, and I want to let the user update that file. Later I will parse and use that information in my app.
Before rolling my own code to create the UI to let the user do this, I was wondering if Android already has something along those lines that I could use?
Android doesn't provide XML generation code itself, but there are plenty of Java resources for it, such as JAXP.
Take a look at using the DOM parser library, that Android has as standard. There are a number of tutorials for this online. For general parsing you might tend to use the SAX parser library due to the fact that it has lower memory requirements, but the DOM parser library appears to contain all of the standard methods that you would use to modify the DOM structure once it's in memory. Just as an example, the Node class has an appendChild() method.
Once you've modified the DOM in memory, hopefully there is some way of persisting the modified Document object for later use (e.g. persist to file), though I have no first-hand experience of doing that.
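One possible way to do the persisting step, sketched under the assumption that the javax.xml.transform package is available on your platform: parse into a DOM, append a child with appendChild(), then serialize the Document with an identity Transformer. The class name DomEditAndSave, the method addSetting, and the element names "settings" and "setting" are invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class, not part of any library.
class DomEditAndSave {
    // Appends <setting key="...">value</setting> to the root and returns the new XML.
    static String addSetting(String xml, String key, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify the tree in memory.
            Element setting = doc.createElement("setting");
            setting.setAttribute("key", key);
            setting.setTextContent(value);
            doc.getDocumentElement().appendChild(setting);

            // Serialize the modified tree with an identity transform.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalArgumentException("Could not update XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addSetting("<settings/>", "theme", "dark"));
    }
}
```

To write to a file rather than a string, a StreamResult can also be constructed from a java.io.File or an OutputStream.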
My application shall parse XML received via HTTP. As far as I understand there are three major ways of parsing XML:
SAX
DOM
XmlPullParser
It is said that SAX is the fastest of these while DOM is not optimal for larger XML documents. But what is a large XML document in terms of parsing? What would be a recommended parser for the following?
XML document size between 1-5 kB
Easy traversing through the document, i.e. I need to know not only the current element but also the parent elements.
As far as I understand there are three major ways of parsing XML:
- SAX
- DOM
- XmlPullParser
Wrong! None of those is the best way. What you really want is annotation-based parsing using the Simple XML Framework. To see why, follow this logic:
Java works with objects.
XML can be represented using Java objects. (see JAXB)
Annotations could be used to map that XML to your Java objects and vice versa.
The Simple XML Framework uses Annotations to allow you to map your Java and XML together.
Simple XML is capable of running on Android (unlike JAXB).
You should use Simple XML for all of your XML needs on Android.
And to help you do exactly that I will point you to my own blog post that explains exactly how to use the Simple library on Android.
Unless you have a 100 MB XML file, Simple will be more than fast enough for you. It is for me; I use it on all of my Android XML projects.
N.B. I should point out that if you require the user to download XML files that are more than 1MB on Android then you may want to rethink your strategy. You might be doing it wrong.
I'm afraid this is a case of, it depends ...
As a rule of thumb, using Java to build a DOM tree from an XML document will consume between 4 and 10 times that document's native size (assuming Western text and UTF-8 encoding), depending on the underlying implementation. So if speed and memory-use are not critical it will not be a problem for the small documents you mention.
DOM is generally regarded as quite an unpleasant way to work with XML. For background you might want to look at Elliotte Rusty Harold's presentation: What's Wrong with XML APIs (and how to fix them).
However, using SAX can be even more tedious as the document is processed one item at a time. SAX however is fast and consumes very little memory. If you can find a pull parser you like then by all means try that.
Another approach (not super-efficient, but clean and maintainable) is to build an in-memory tree of your XML (using DOM, say) and then use XPath expressions to select the information you are interested in.