I am trying to parse an XML file with SAXParser on Android.
This is my xml file:
<?xml version="1.0" encoding="UTF-8"?>
<cars>
<car model="CitroenC3">
<maintenances>
<xm:maintenance xmlns:xm="it.a.b.android.c.car.m" distance="" price="">
<xm:type></xm:type>
</xm:maintenance>
</maintenances>
<chargings>
<xc:charging xmlns:xc="it.a.b.c.fuelconsumption.car.m" quantity="18" price="20" distance="400" consumption="14"/>
</chargings>
</car>
</cars>
And this is the code:
// Handling XML
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
XmlResourceParser parser = getResources().getXml(R.xml.data);
// Create handler to handle XML Tags ( extends DefaultHandler )
DataSaxHandler myXMLHandler = new DataSaxHandler();
xr.setContentHandler(myXMLHandler);
InputStream is= getResources().openRawResource(R.xml.data);
xr.parse(new InputSource(is));
After xr.parse I have the Exception:
03-22 15:24:04.248: INFO/System.out(415): XML Pasing Excpetion =
org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: not well-formed (invalid token)
What may be wrong?
Thanks a lot.
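AFAIR, XML files under the res/ folder (res/raw excepted) are compiled into a binary format before they are placed in the .apk, so openRawResource() on R.xml.data does not return plain XML text.
Try moving your XML file to the assets/ folder and loading it from there: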
xr.parse(new InputSource(getAssets().open("data.xml")));
Related
I have an XML that actually has url-encoded quote marks and other symbols (" appears as %22, ' as %27 and so on).
I'm trying to parse this page without these symbols, but they still appear as %22 and %27.
This is my code:
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
url = new URL(feed);
xr.setContentHandler(rh);
InputSource is = new InputSource(url.openStream());
is.setEncoding("UTF-8");
xr.parse(is);
I'm working on an app which is in the German language. I'm getting the data in XML form. I used SAX parser for parsing these XMLs and display the data in the TextView. Everything is working fine except the special-characters issue which I got after the parsing.
This is my XML which I got through the URL Link. This XML has utf-8 encoding. All the characters are fine in this XML file.
<?xml version="1.0" encoding="utf-8"?>
<posts>
<page id="001">
<title><![CDATA[Sie kaufen bei uns ausschließlich Holzkunst- und Volkskunst-Produkte ]]></title>
<detial><![CDATA[Durch enge Beziehungen mit unseren Lieferanten können wir attraktive rückläufig
Preise und schnelle Lieferungen gewährleisten. Caroline Féry and Laura Herbst Universität Potsdam Mein
Flugzeug hatte zwölf Stunden VERSPÄTUNG </p>]]></detial>
</page>
</posts>
I used SAX parser for parsing this XML:- (and displaying the parsed data in the TextView.)
public class GermanParseActivity extends Activity {
/** Called when the activity is first created. */
static final String URL = "http://www.xyz.com/id=1";
ItemList itemList;
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.main);
XMLParser parser = new XMLParser();
String XML = parser.getXmlFromUrl(URL);
System.out.println("This XML is ========>"+XML);
try
{
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
/** Create handler to handle XML Tags ( extends DefaultHandler ) */
MyXMLHandler myXMLHandler = new MyXMLHandler();
xr.setContentHandler(myXMLHandler);
ByteArrayInputStream is = new ByteArrayInputStream(XML.getBytes());
xr.parse(new InputSource(is));
}
catch(Exception e)
{
}
itemList = MyXMLHandler.itemList;
ArrayList<String> listItem= itemList.getTitle();
ListView lview = (ListView) findViewById(R.id.listview1);
myAdapter adapter = new myAdapter(this, listItem);
lview.setAdapter(adapter);
}
}
But after parsing I get strange characters that are not in the XML file and appear only in the parsed output.
Like these characters:
before parsing after parsing
können ---> können
rückläufig ---> rückläufig
gewährleisten ---> gewährleisten
Can anyone please suggest the proper way to fix this issue?
You need to re-encode your input. The problem is that the text is UTF-8 but is being interpreted as ISO-8859-1. That seems to be a bug in SAX.
String output=new String(input.getBytes("8859_1"), "utf-8");
That line encodes the wrongly decoded string back into its ISO-8859-1 bytes and then decodes those bytes as UTF-8, which restores the original characters.
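To make the round trip concrete, here is a minimal sketch in plain Java. The class and method names (ReencodeDemo, garble, repair) are made up for illustration, and garble() only simulates the fault; where the wrong decode really happens in the asker's code is an assumption. If you control the fetching code, decoding the bytes as UTF-8 in the first place is cleaner than repairing the string afterwards.

```java
import java.nio.charset.StandardCharsets;

class ReencodeDemo {
    /** Simulates the fault: UTF-8 bytes wrongly decoded as ISO-8859-1. */
    static String garble(String correct) {
        byte[] utf8 = correct.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Reverses the fault: recover the raw bytes, then decode them as UTF-8. */
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String correct = "k\u00f6nnen"; // the word from the question, written with an escape
        String garbled = garble(correct);
        // The two-byte UTF-8 sequence for the umlaut became two separate characters.
        System.out.println(garbled.length() - correct.length());
        System.out.println(repair(garbled).equals(correct));
    }
}
```

This works because ISO-8859-1 maps every byte to exactly one character, so the original bytes survive the wrong decode and can be recovered.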
Got my answer from here.
They suggest that the heading should be:
<?xml version="1.0" encoding="ISO-8859-1"?>
instead of
<?xml version="1.0" encoding="utf-8"?>
Hope that is the answer. Edit: I just saw that you don't have control over the XML,
so this will not help; rekire's answer is then an option.
I am parsing XML from URL using SAXParser. The XML has some data with ampersand (&) sign. XML data is not read after the ampersand. How would I resolve this issue?
URL website = new URL(FullURL);
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
HandlingXMLStuff doingwork = new HandlingXMLStuff();
xr.setContentHandler(doingwork);
xr.parse(new InputSource(website.openStream()));
String information = doingwork.getInformation();
XML tag has data like
<choice>Cat & Dog</choice>
I am getting output as
Cat
To have a naked '&' rather than "&amp;", you need to wrap "Cat & Dog" in a CDATA section: <![CDATA[Cat & Dog]]>.
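A small sketch of both spellings, using only the standard javax.xml.parsers API. The names AmpersandDemo, ChoiceHandler and parseChoice are invented for this example. One assumption worth flagging: the handler appends in characters() instead of overwriting, because SAX is allowed to deliver the text of one element in several chunks, and a handler that keeps only one chunk may end up with just part of the text.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class AmpersandDemo {
    /** Collects all character data in the document (here: the text of <choice>). */
    static class ChoiceHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append rather than assign: this callback can fire more than once per element.
            text.append(ch, start, length);
        }
    }

    static String parseChoice(String xml) {
        ChoiceHandler handler = new ChoiceHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
        return handler.text.toString();
    }

    public static void main(String[] args) {
        // Escaped ampersand and CDATA section both produce the same text.
        System.out.println(parseChoice("<choice>Cat &amp; Dog</choice>"));
        System.out.println(parseChoice("<choice><![CDATA[Cat & Dog]]></choice>"));
    }
}
```

A bare & outside CDATA is not well-formed XML, so if the feed really contains one, it has to be fixed on the producing side.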
I am trying to parse my XML file resource with SAXParser. I have created my DataHandler, but I don't know how to tell the XMLReader where data.xml, which is in res/xml/, is located.
What is the correct parameter for InputSource object?
XmlResourceParser parser = getResources().getXml(R.xml.data);
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
// Create handler to handle XML Tags ( extends DefaultHandler )
DataSaxHandler myXMLHandler = new DataSaxHandler();
xr.setContentHandler(myXMLHandler);
//R.xml.data is my xml file
InputSource is=new InputSource(getResources().getXml(R.xml.data)); //getResources... is wrong say Eclipse
xr.parse(is);
Thanks a lot.
The problem is that the call to getResources().getXml(int id) returns an XmlResourceParser, and there is no InputSource constructor that takes an XmlResourceParser.
If you want to stick with the SaxParser, you'll need to open up an InputStream via Resources#openRawResource(int id), and then pass that to the InputSource constructor. You'll also need to move the file to res/raw to use the openRawResource function.
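The InputStream route can be sketched like this. To keep it runnable off-device, a ByteArrayInputStream stands in for the stream you would get from openRawResource(R.raw.data) (that resource id is an assumption, matching the move to res/raw), and StreamParseDemo, CarHandler and readModel are names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class StreamParseDemo {
    /** Remembers the model attribute of the last <car> element seen. */
    static class CarHandler extends DefaultHandler {
        String model;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("car".equals(qName)) {
                model = attrs.getValue("model");
            }
        }
    }

    static String readModel(InputStream in) {
        CarHandler handler = new CarHandler();
        try {
            XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            xr.setContentHandler(handler);
            // InputSource has a constructor that takes an InputStream directly.
            xr.parse(new InputSource(in));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse stream", e);
        }
        return handler.model;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<cars><car model=\"CitroenC3\"/></cars>";
        // On Android this stream would come from getResources().openRawResource(...).
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(readModel(in));
    }
}
```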
I am having a problem parsing XML data. Below is my code:
URL url=new URL(Urls.statusUrl);
SAXParserFactory spf=SAXParserFactory.newInstance();
SAXParser sp=spf.newSAXParser();
XMLReader xr=sp.getXMLReader();
FOIA_Parsing_Handler mxmlparsing=new FOIA_Parsing_Handler();
xr.setContentHandler(mxmlparsing);
xr.parse(new InputSource(url.openStream()));//Exception at this line
mStatus=mxmlparsing.getRecords();
m_adapter=new StatusAdapter(Status_Inquiry.this, R.layout.status_inquiry_view, mStatus);
mLstViewStatusInquiry.setAdapter(m_adapter);
I am getting
org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 8299: not well-formed (invalid token)
at above line
Regards
Saurabh
As suggested by st0le, the problem is in the XML data: I am getting some garbage in the XML, so the code is correct.
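For anyone hitting the same message, one way to confirm that the data rather than the code is at fault is to catch SAXParseException and read the position it reports. This is only a sketch with invented names (ParseErrorDemo, firstError), and the control character in main() is just a stand-in for whatever garbage the server actually sent.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ParseErrorDemo {
    /** Returns "line:column" of the first well-formedness error, or null if the data parses. */
    static String firstError(byte[] data) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new ByteArrayInputStream(data)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // The exception carries the position where the parser gave up.
            return e.getLineNumber() + ":" + e.getColumnNumber();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parser failed for another reason", e);
        }
    }

    public static void main(String[] args) {
        byte[] clean = "<a>ok</a>".getBytes(StandardCharsets.UTF_8);
        byte[] dirty = "<a>ok\u0001</a>".getBytes(StandardCharsets.UTF_8); // control char is illegal in XML 1.0
        System.out.println(firstError(clean) == null);
        System.out.println(firstError(dirty) != null);
    }
}
```

With the reported column in hand, you can dump the bytes around that offset of the response and see what the server put there.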