Android DOM parser - illegal characters exception

I need to parse an XML document in my Android application and I'm using the DOM parser. The encoding of my XML file is set to UTF-8. The code I'm using for parsing is as follows:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputStream inStream = getAssets().open("words.xml");
InputSource inSource = new InputSource(inStream);
inSource.setEncoding("UTF-8");
Document doc = db.parse(inSource);
The problem is that I get an illegal character exception. The problematic node has the following structure:
<obriši>
<item>obriši</item>
<item>ukloni</item>
</obriši>
What could be the problem?

Try with
inSource.setEncoding("windows-1251");
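Before changing the declared encoding, it may be worth checking whether the bytes in the file really are UTF-8. This is only a guess at the cause, not a confirmed diagnosis. The sketch below uses the standard java.nio CharsetDecoder in strict mode to report whether a byte array decodes cleanly as UTF-8; the names Utf8Check and isValidUtf8 are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original question.
class Utf8Check {
    // Returns true only if the whole byte array is well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Fail on bad input instead of silently substituting characters.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

If this returns false for the contents of words.xml, the file was probably saved in a different encoding than the one it declares. Re-saving it as UTF-8, or declaring the encoding it actually uses, would then be the thing to try.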

Related

XMLParser encoding problems

public XMLParser(InputStream is) {
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db;
db = dbf.newDocumentBuilder();
Document doc = db.parse(is);
node = doc.getDocumentElement();
} catch (Exception e) {
DebugLog.log(e);
}
}
The inputStream contains content like: "Hey there this is a ü character."
The character 'ü' arrives as the HTML entity '&uuml;'.
When reading the node's content with System.out.println(node.getTextContent()), I receive "hey there this is a character." The ü is cut off.
Well, is this a valid document? Does it have an encoding specified? See http://www.w3schools.com/XML/xml_encoding.asp
Those might help:
Howto let the SAX parser determine the encoding from the xml declaration?
http://www.coderanch.com/t/127052/XML/XML-parsers-encoding-byte-order
The problem was XML entities versus HTML entities.
I request a web page which returns data with HTML entities.
I had to convert the HTML entities to XML entities, and then it worked!
Check this answer for some code
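As a rough illustration of that conversion (not the code from the linked answer), the sketch below rewrites a few named HTML entities as numeric character references, which any XML parser accepts. The class name HtmlEntityFixer, the method name, and the small entity table are assumptions for this example; a real table would need many more entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps a handful of HTML entity names to code points.
class HtmlEntityFixer {
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("uuml", 0xFC);
        NAMED.put("auml", 0xE4);
        NAMED.put("ouml", 0xF6);
        NAMED.put("szlig", 0xDF);
        NAMED.put("nbsp", 0xA0);
    }

    private static final Pattern ENTITY = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces known HTML entities with &#NNN; references.
    // Unknown names, including XML's own &amp; and &lt;, are left untouched.
    static String toNumericReferences(String input) {
        Matcher m = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = NAMED.get(m.group(1));
            String replacement = (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The response text would be passed through toNumericReferences before being handed to db.parse.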

IO Exception looking for File Path using DOM in android

There are a few posts on this topic, but I can't figure out why this won't work.
I keep getting an IOException; I'm guessing it can't find the file. Cheers.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("//res/xml/xml_data.xml");
Replace your code with this:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("res/xml/xml_data.xml");
Your URL is incorrect. Resources take the form:
"android.resource://[package]/[res type]/[res name]"
or
"android.resource://[package]/[res id]"
i.e.
"android.resource://com.org.example/xml/xml_data" // No extension
"android.resource://com.org.example/" + R.xml.xml_data

Parsing XML file stored in internal storage using DOM parser in android.

I have created an XML file in the device's internal storage as described on the Android developers website. I now want to parse the file using the DOM parser. What do I need to do to make the DOM parser read my XML file?
Here's a snippet:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(new InputSource(new StringReader(data)));
dom.getDocumentElement().normalize();
What do I need to put in place of "data" in:
Document dom = db.parse(new InputSource(new StringReader(data)));
I know it's silly, but any help would be appreciated.
You can make an input stream from the XML string as shown below, then read the nodes from the parsed document to get the values.
InputStream is = new ByteArrayInputStream(theXMLString.getBytes("UTF-8"));
// Build XML document
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(is);
Remember that you are passing the XML file's content as a string.
You can pass a FileInputStream to the InputSource:
Document dom = db.parse(new InputSource(new FileInputStream(data)));
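A slightly fuller sketch of the same idea, for what it's worth: parse the file's bytes directly and close the stream afterwards. The class name FileDomParser and the method name are made up for this example. On Android, the File would typically be built from getFilesDir() and the file name, or the stream obtained from openFileInput(name).

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper, not from the original thread.
class FileDomParser {
    // Parses the file's bytes directly, so the parser can pick up the
    // encoding from the XML declaration instead of from a pre-decoded String.
    static Document parseFile(File file) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (InputStream in = new FileInputStream(file)) {
            Document dom = db.parse(in);
            dom.getDocumentElement().normalize();
            return dom;
        }
    }
}
```

With this approach there is no need to read the file into a String first.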
To read the XML file into a string, try the following:
FileInputStream in = new FileInputStream("/sdcard/text.txt");
StringBuffer data = new StringBuffer();
InputStreamReader isr = new InputStreamReader(in);
BufferedReader inRd = new BufferedReader(isr);
String text;
// Append each line to the same buffer that is converted to a String below.
while ((text = inRd.readLine()) != null) {
data.append(text);
data.append("\n");
}
in.close();
String finalData = data.toString(); // Here is your data.
I hope this is useful to you.
Try this code for parsing from the assets folder using the DOM parser:
DocumentBuilderFactory DBF;
DocumentBuilder DB;
Document dom;
Element elt;
DBF = DocumentBuilderFactory.newInstance();
DB = DBF.newDocumentBuilder();
dom = DB.parse(new InputSource(getAssets().open("city.xml")));
elt = dom.getDocumentElement();
NodeList items = elt.getElementsByTagName("item");
Here, item is the node element name; add a try/catch block as required.

android parsing error on tablet but not emulator

I have this exception:
org.xml.sax.SAXParseException: Unexpected token (position TEXT#1:2...)
but it occurs only when running my .apk on a tablet PC. The same data, when parsed on the Android emulator, never causes this exception and works 100%. Any ideas?
Here's the code that throws the exception:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource inputSource = new InputSource();
inputSource.setCharacterStream(new StringReader(xmlData));
Document doc = db.parse(inputSource);
And here is a part from the file:
<Results> <Result title="08 07 2011"><Field title="blah blah" value="blah blah" /> </Result></Results>
Default charset differs, maybe? Does the XML have a charset in it?
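If xmlData is built from raw bytes (a download or a file), one way to rule out a default-charset difference between devices is to decode with an explicit charset. The sketch below assumes the data is UTF-8, which the question does not confirm. The names XmlTextReader and readUtf8 are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a stream fully and decodes it as UTF-8,
// independent of the platform's default charset.
class XmlTextReader {
    static String readUtf8(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        // Decode once at the end so multi-byte sequences are never split across chunks.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The resulting string can then be passed to the StringReader as in the question.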

Problem with parsing of UTF-8 encoded xml files in Android 3.1 sdk

The XML parsing API throws a SAXParseException if I try to parse an XML file which has attributes on the root node.
One thing I have noticed is that this happens if there is a UTF-8 BOM character at the start of the string; if I remove the BOM character, things work fine. This code works fine on the 3.0 SDK and below; I saw this problem only in 3.1.
I am using the following parser:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = docFactory.newDocumentBuilder();
Document doc = null;
StringReader sr = new StringReader(xmlString);
InputSource is = new InputSource(sr);
doc = builder.parse(is);
Try this:
public Document parse(String xml) throws ParsingFailedException {
try {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
// Encode the XML as UTF-8
ByteArrayInputStream encXML = new ByteArrayInputStream(xml.getBytes("UTF8"));
Document doc = builder.parse(encXML);
log.error("XML parsing OK");
return doc;
} catch (Exception e) {
log.error("Parser Error:" + e.getMessage());
throw new ParsingFailedException("Failed to parse XML : Document not well formed", e);
}
}
Thanks evilone,
I have opened an issue with Google, and they will be fixing this in their branch.
http://code.google.com/p/android/issues/detail?id=16892
Comment from a Google developer:
"I've prepared a fix for the root problem in our internal Honeycomb tree. But you don't need the fix for your code. Your parseXml method should just take an InputStream rather than a String. You can pass that directly to the InputSource constructor."

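For anyone landing here, a minimal sketch of both workarounds mentioned in this thread: removing a leading BOM character before parsing from a String, and passing an InputStream straight to the InputSource as the Google developer suggests. The class and method names (BomSafeParser, stripBom, parseString, parseStream) are made up for illustration, and this was not checked against the Android 3.1 parser itself.

```java
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not from the original thread.
class BomSafeParser {
    // Drops a single leading U+FEFF (the BOM once decoded into a String), if present.
    static String stripBom(String xml) {
        return (!xml.isEmpty() && xml.charAt(0) == '\uFEFF') ? xml.substring(1) : xml;
    }

    // Workaround 1: parse from a String after removing the BOM character.
    static Document parseString(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(stripBom(xml))));
    }

    // Workaround 2: hand the raw bytes to the parser and let it handle BOM and encoding.
    static Document parseStream(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(in));
    }
}
```

The stream variant avoids decoding the document to a String at all, which is why it sidesteps the problem.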