Abort SAX parsing mid-document? - android

I'm parsing a very simple XML schema with a SAX parser in Android.
An example file would be
<Lists>
<List name="foo">
<Note title="note 1" .../>
<Note title="note 2" .../>
</List>
<List name="bar">
<Note title="note 3" .../>
</List>
</Lists>
The ... represents more note data as attributes that aren't important to the question.
I use a SAX parser to parse the document and only implement the startElement and endElement methods of the HandlerBase to handle Note and List nodes.
However, in some cases the files can be very large and take some time to process. I'd like to be able to abort the parsing process at any time (e.g. when the user presses a cancel button).
The best way I've come up with is to throw an exception from my startElement method when certain conditions are met (e.g. a boolean stopParsing is true).
Is there a better way to do this?
I've always used DOM style parsers, so I don't fully understand the SAX parser.
One final note: I'm running this on Android, so the parser will run on a worker thread to keep the UI responsive. If you know how I can kill the thread safely while the parser is running, that would answer my question as well.

You might want to look into XmlPullParser, since with a pull parser you can abort simply by no longer calling the next() method. You are in more control of the parsing.
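XmlPullParser xpp = getResources().getXml(R.xml.lists);
// Pull events until the document ends; to abort early, just leave the loop.
while (xpp.getEventType() != XmlPullParser.END_DOCUMENT) {
    if (xpp.getEventType() == XmlPullParser.START_TAG) {
        // Element names are case-sensitive; the sample file uses <List name="...">.
        if (xpp.getName().equals("List")) {
            items.add(xpp.getAttributeValue(null, "name"));
        }
    }
    xpp.next();
}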
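For comparison, the exception-based approach described in the question can be sketched with plain SAX as below. This is only a minimal sketch: the names CancellableSaxDemo, NoteHandler, ParseCancelledException and cancel() are made up for illustration, and it uses DefaultHandler rather than HandlerBase. The flag is volatile on the assumption that cancel() is called from a different thread than the one running the parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CancellableSaxDemo {

    // Hypothetical marker type so a cancel can be told apart from a real parse error.
    static class ParseCancelledException extends SAXException {
        ParseCancelledException() {
            super("parsing cancelled");
        }
    }

    // Hypothetical handler: collects Note titles until cancel() is called.
    static class NoteHandler extends DefaultHandler {
        // volatile because cancel() is expected to be called from another thread
        private volatile boolean cancelled;
        final List<String> titles = new ArrayList<>();

        void cancel() {
            cancelled = true;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs)
                throws SAXException {
            // Checked on every element, so the parser stops at the next start tag.
            if (cancelled) {
                throw new ParseCancelledException();
            }
            if ("Note".equals(qName)) {
                titles.add(attrs.getValue("title"));
            }
        }
    }

    // Returns true if the whole document was parsed, false if parsing was cancelled.
    static boolean parse(String xml, NoteHandler handler) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return true;
        } catch (ParseCancelledException e) {
            return false;
        }
    }
}
```

Using a dedicated exception type keeps "the user cancelled" separate from genuine parse failures, which still surface as ordinary exceptions from parse().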

Related

Is SAX appropriate for this xml parsing in Android?

Provided a similar xml as follows,
<data>
<train>
<departing>
<schedule>
<time />
<time />
<time />
</schedule>
<locations>
<name>
<name/>
<name>
<name/>
</locations>
</departing>
</train>
<train>
<departing>
<schedule>
<time />
<time />
<time />
</schedule>
<locations>
<name>
<name/>
<name>
<name/>
</locations>
</departing>
</train>
</data>
For simplicity, I have omitted the attributes. The data I'm interested in is the following:
Attributes in every train element.
Elements within a particular train element.
In reality, this xml would be much longer (10+ times). Since this is for Android, I want to choose the lightest way to parse this xml. DOM is not an option for me since it stores too many elements I don't need.
I have been thinking of using SAX. I would use a boolean variable to skip unnecessary elements, but I still have to at least reach all of the train elements before I break out of parsing (by throwing an exception).
Am I looking at the right approach with SAX for this?
I would first try the XPath option since that will solve your problem with the least amount of code.
I have no knowledge of the internal Android XPath implementation details, but if it turns out to be too heavyweight, I would look at the pull parser packages before resorting to SAX. Pull parsing is usually more straightforward to work with than SAX.
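A rough sketch of the XPath option, using the standard javax.xml.xpath API, might look like this. The attribute names id and value are assumptions, since the question omits the real attributes, and TrainXPathDemo and its methods are illustrative names only. Keep in mind that evaluating XPath this way typically builds an in-memory tree behind the scenes, which is the memory cost the question wants to avoid.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TrainXPathDemo {

    // Evaluates an XPath expression against the XML text and returns
    // the node value of each match (for attribute nodes, the attribute value).
    static List<String> select(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    // "id" is a hypothetical attribute on every train element.
    static List<String> trainIds(String xml) throws Exception {
        return select(xml, "/data/train/@id");
    }

    // "value" is a hypothetical attribute on each time element.
    // trainId is spliced into the expression, so it must not contain quotes.
    static List<String> departureTimes(String xml, String trainId) throws Exception {
        return select(xml,
                "/data/train[@id='" + trainId + "']/departing/schedule/time/@value");
    }
}
```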

Can I get the value of a specified tag quickly with SAX parser?

I want to make a progress bar while importing my articles from an XML feed.
The parsing works fine, but for my progress bar I need to quickly know the total number of <item>s in the feed so I can determine the percentage that has been loaded.
My thought was, it would be a lot faster to just do this in PHP and add the "count" to the feed itself - something like this:
<?xml version="1.0" encoding="utf-8"?>
<channel>
<title>My Apps Feed</title>
<link>http://link_to_this_fiel</link>
<language>en-us</language>
<count>42</count>
But then I need to be able to quickly access that "count" number.
At the moment, I have an RSSHandler.java that's being called like this:
//Add all items from the parsed XML
for(NewsItem item : parser.getParsedItems())
{
//...
Note: Min API level 8 for my app
You can use XPath to get a specific node value from the XML. A sample (this uses the SAXBuilder and XPath classes from the JDOM library) would be:
SAXBuilder builder = new SAXBuilder();
Document doc = builder.build(.....); // get document for your xml here
Element elem = (Element) XPath.selectSingleNode(doc, "channel/count");
System.out.println(elem.getValue());
This way you can read the count value directly. Note that SAXBuilder builds the whole document in memory first, so I'm not sure this is the most efficient or fastest way to do it. As an option you can use this.
Also spend some time reading this: SAX parsing - efficient way to get text nodes
SAX reads the tags in the order they come in. Make sure to put your tag at the beginning of the XML, otherwise you will need to parse it all anyways. For easier parsing, I'd put it into an attribute of a self-closing tag with a unique name. Then you wait until the SAX parser calls your startElement method, check if the tag name matches the name of your count tag, extract the attribute, and display it to the user.
If you want to stop the parser after displaying the count, you can throw a SAXException to do so.
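The idea of reading the count from an early tag and then stopping could be sketched as follows. The element name feedInfo, its count attribute, and the class and exception names are all assumptions made up for this example; the real feed would need a matching tag near the top.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CountPeekDemo {

    // Hypothetical exception that carries the count out of the parser.
    static class CountFoundException extends SAXException {
        final int count;

        CountFoundException(int count) {
            super("count found");
            this.count = count;
        }
    }

    // Returns the count attribute of the first <feedInfo> element, or -1 if there is none.
    static int peekCount(String xml) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs)
                    throws SAXException {
                if ("feedInfo".equals(qName)) {
                    String value = attrs.getValue("count");
                    if (value != null) {
                        // Throwing here stops the parser; the rest of the feed is not processed.
                        throw new CountFoundException(Integer.parseInt(value));
                    }
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return -1; // reached the end of the document without seeing the tag
        } catch (CountFoundException e) {
            return e.count;
        }
    }
}
```

The -1 return value is just one way to signal "no count present"; the caller could then fall back to an indeterminate progress bar.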
I assume you already know how to do the parsing work off the main thread, as you mention a progress bar and imply that parsing can take some time (and doing long tasks on the main thread gives you ANRs). In case someone stumbles upon this question who doesn't know: You can use AsyncTask and the publishProgress (called in the "worker" thread by your code) and onProgressUpdate (called by Android on the UI thread once you call publishProgress) methods to take care of that.

XML Parser in android

I need to know the best way to parse an XML file in Android. I know there are three parsers (XmlPullParser, DOM parser and SAX parser), so what is the difference between them, and is there any sample code?
SAX parser: Simple API for XML. Parses node by node in a single top-down pass without storing the XML in memory, so it is faster and lighter than DOM. Because nothing is stored, manipulating nodes (insertion or deletion) is NOT possible. Needs SAXParserFactory.
DOM parser: Document Object Model. Loads the entire XML into memory before processing, lets you traverse in any direction, and allows manipulating nodes (insertion or deletion). Needs DocumentBuilderFactory.
Pull parser: Streams the document like SAX, but your code asks for the next event instead of receiving callbacks, which gives you more control than the other two while staying fast.
Android training recommends XmlPullParser.
http://developer.android.com/training/basics/network-ops/xml.html
"We recommend XmlPullParser, which is an efficient and maintainable way to parse XML on Android."
They also give some code examples.

XML sax parser - ignore unbound prefix exception

When parsing an xml file in Android, I do it like this:
try
{
InputStream is = ...
MyContentHandler ch = new MyContentHandler();
Xml.parse(is, Encoding.UTF_8, ch);
}
catch ...
The problem is that sometimes the file I'm trying to parse is not well-formed.
In my case, undeclared namespaces may be present.
The data I'm interested in is not inside those tags so I could simply ignore it, but I get an exception of unbound prefix not inside the content handler but in the parser itself; this means that if the exception occurs the entire parsing process is interrupted.
Is there a way of using the sax parser ignoring this kind of error (or namespaces at all)?
p.s. I want to avoid loading all the file in memory as a string and strip namespaces out of it, or having to rewrite the file.
I found the solution in another thread.
Instead of using Xml.parse you need to manually instantiate a sax parser through the SAXParserFactory and get a reader.
You can then set the reader features.
Among the available features, one disables namespaces and that does the trick.
Reference -> LINK
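As a hedged sketch of that approach, the following obtains an XMLReader through SAXParserFactory and switches off the standard SAX namespaces feature before parsing. The class and method names are illustrative, and how strictly a given parser treats undeclared prefixes can differ between platforms, so treat this as a starting point rather than a guaranteed fix.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class LenientNamespaceDemo {

    // Parses the XML with namespace processing disabled and returns
    // the qualified name of every element in document order.
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(false);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Standard SAX2 feature id; false means prefixes are not resolved to namespaces.
        reader.setFeature("http://xml.org/sax/features/namespaces", false);

        final List<String> names = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // With namespaces off, use qName; localName may be empty.
                names.add(qName);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return names;
    }
}
```

With namespace processing off, a handler should match on qName (for example "x:junk") instead of localName.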

Grabbing XML File Data from a Website

Just wondering what would be the best way to grab the following data and parse it.
Here's an example of some the data I want to pull.
<?xml version="1.0" encoding="UTF-8" ?>
<eveapi version="2">
<currentTime>2010-11-19 19:23:44</currentTime>
<result>
<rowset name="characters" key="characterID" columns="name,characterID,corporationName,corporationID">
<row name="jennyhills" characterID="90052591" corporationName="Imperial Academy" corporationID="1000166" />
</rowset>
</result>
<cachedUntil>2010-11-19 20:20:44</cachedUntil>
</eveapi>
I've seen some examples of how to parse XML data, but they are all based on if statements, and that's a lot of hard coding. Is there a more generic way to do this?
Parsers are quite hardcoded; that's the way they work. You can only check if a certain tag matches a certain pattern and then decide what to do. Especially for simple documents like yours, that is absolutely no problem.
If you have more than one type of document to parse then I recommend reading this SO answer.
The "parsing", taking the term literally, is easy. Parsing is the process of taking a text string (in your case, from an http response) and turning it into a data structure such as an XML document tree. That process is handled for you by an XML parser, and you typically don't need to worry about it.
The part you're facing is how to query data from the parsed XML document, right? The easiest way depends greatly on what you need to do with the data. But XPath is a good way to select data without a lot of verbose if statements and get-child function calls.
See also this question on using XPath in Android.
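For the sample document in the question, a more generic XPath-based reader could look roughly like the sketch below. It selects the row elements of a named rowset and copies whatever attributes each row carries into a map, so no attribute names are hard-coded. RowsetDemo and rows() are illustrative names, not part of any API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RowsetDemo {

    // Returns one attribute-name -> value map per <row> in the named rowset.
    // rowsetName is spliced into the expression, so it must not contain quotes.
    static List<Map<String, String>> rows(String xml, String rowsetName) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expression = "/eveapi/result/rowset[@name='" + rowsetName + "']/row";
        NodeList rowNodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < rowNodes.getLength(); i++) {
            NamedNodeMap attrs = rowNodes.item(i).getAttributes();
            Map<String, String> row = new LinkedHashMap<>();
            // Copy every attribute without knowing its name in advance.
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                row.put(attr.getNodeName(), attr.getNodeValue());
            }
            result.add(row);
        }
        return result;
    }
}
```

The order of attributes inside each map is whatever the underlying parser reports, so look values up by key rather than relying on position.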
