Android: why does XPath string() return an empty string?

I have this XML:
<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0" xmlns:l="http://www.w3.org/1999/xlink">
<description>
<title-info>
<genre>love_contemporary</genre>
<author>
<first-name>Sylvain</first-name>
<last-name>Reynard</last-name>
</author>
<book-title>Gabriel's Inferno</book-title>
<annotation>
<p>Enigmatic and sexy, Professor Gabriel Emerson is a well respected Dante specialist by day, but by night he devotes himself to an uninhibited life of pleasure. He uses his notorious good looks and sophisticated charm to gratify his every whim, but is secretly tortured by his dark past and consumed by the profound belief that he is beyond all hope of redemption. When the sweet and innocent Julia Mitchell enrolls as his graduate student, his attraction and mysterious connection to her not only jeopardizes his career, but sends him on a journey in which his past and his present collide. An intriguing and sinful exploration of seduction, forbidden love and redemption, Gabriel's Inferno is a captivating and wildly passionate tale of one man's escape from his own personal hell as he tries to earn the impossible…forgiveness and love.</p>
</annotation>
<date/>
<coverpage>
<image l:href="#_0.jpg"/>
</coverpage>
<lang>en</lang>
<src-lang>en</src-lang>
<sequence name="Gabriel's Inferno" number="1"/>
</title-info>
<document-info>
<author>
<first-name/>
<last-name/>
</author>
<date/>
<id>2aec7273-a8a4-4edc-803a-820c4d76bc3f</id>
<version>1.0</version>
</document-info>
<publish-info>
<book-name>Gabriel's Inferno</book-name>
<year>2011</year>
</publish-info>
</description>
</FictionBook>
My expression to get the value of the attribute:
string(//coverpage/image/@l:href)
Code in the Android program:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String expression;
String attrValue;
expression = "string(//coverpage/image/@l:href)";
try {
attrValue = xpath.compile(expression).evaluate(obj,
XPathConstants.STRING).toString();
System.out.println("VAL XML:"+attrValue);
} catch (XPathExpressionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
But on the console I get only:
VAL XML:
Why? What am I doing wrong?
I tried http://www.freeformatter.com/xpath-tester.html#ad-output for online testing, and everything works fine there: I get the string #_0.jpg.

Your problem is that the attribute you're trying to select is in an XML namespace (its l: prefix is bound to the XLink namespace), and the XPath object isn't aware of that prefix. I see two solutions for this:
Without defining the namespace
Avoid the issue by using local-name() to ignore namespaces altogether.
//*[local-name() = 'coverpage']/*[local-name() = 'image']/@*[local-name() = 'href']
(//coverpage/image/@*[local-name() = 'href'] might work as well)
Defining the namespace
Make the XPath object aware of the namespaces so that it knows which one the prefix refers to.
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
...
xpath.setNamespaceContext(new MyNamespaceContext());
attrValue = xpath.compile(expression).evaluate(obj,
XPathConstants.STRING).toString();
...
private static class MyNamespaceContext implements NamespaceContext {
public String getNamespaceURI(String prefix) {
if("l".equals(prefix)) {
return "http://www.w3.org/1999/xlink";
}
return null;
}
public String getPrefix(String namespaceURI) {
return null;
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
(possible duplicate: How to use XPath on xml docs having default namespace)
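To compare the two approaches side by side, here is a minimal desktop-Java sketch. It assumes the document is parsed with a namespace-aware DocumentBuilderFactory. The class name CoverHref, the shortened XML string, and the fb prefix for the default FictionBook namespace are assumptions made for illustration and are not part of the original question.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CoverHref {
    // Shortened copy of the FictionBook document from the question (illustrative).
    static final String XML =
        "<FictionBook xmlns=\"http://www.gribuser.ru/xml/fictionbook/2.0\""
        + " xmlns:l=\"http://www.w3.org/1999/xlink\">"
        + "<description><title-info><coverpage>"
        + "<image l:href=\"#_0.jpg\"/>"
        + "</coverpage></title-info></description></FictionBook>";

    // Parse with namespace awareness switched on (an assumption of this sketch).
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Approach 1: ignore namespaces by matching on local-name() only.
    static String viaLocalName(Document doc) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(
                "string(//*[local-name()='coverpage']/*[local-name()='image']"
                + "/@*[local-name()='href'])", doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Approach 2: bind prefixes through a NamespaceContext.
    // "fb" is a hypothetical prefix chosen here for the default namespace.
    static String viaNamespaceContext(Document doc) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    if ("fb".equals(prefix)) return "http://www.gribuser.ru/xml/fictionbook/2.0";
                    if ("l".equals(prefix)) return "http://www.w3.org/1999/xlink";
                    return XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String namespaceURI) { return null; }
                public Iterator<String> getPrefixes(String namespaceURI) { return null; }
            });
            return xpath.evaluate("string(//fb:coverpage/fb:image/@l:href)", doc);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println("local-name(): " + viaLocalName(doc));
        System.out.println("NamespaceContext: " + viaNamespaceContext(doc));
    }
}
```

With a namespace-aware parse, the elements live in the default FictionBook namespace, so the second approach binds a prefix for the elements as well as for the attribute.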

Related

Android parseDouble returns Infinity when it should throw NumberFormatException

I'm trying to determine if a String represents a Double. I expect the following code to throw a NumberFormatException:
String s = "type1234";
try {
Double val = Double.parseDouble(s);
} catch (NumberFormatException e) {
// Handle exception
e.printStackTrace();
}
Instead, val ends up with Infinity. I ran the code in a standard JVM and it does, indeed, throw NumberFormatException.
It looks like Android is ignoring the leading characters 'typ' and then parsing the rest as e^1234, which is out of Double's range.
Is this the expected behavior? If so, what is a more reliable way to determine if a String can be parsed as a Double?
Use a regular expression to validate the string yourself if you don't trust the framework. Here's an example of one way to do this.
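// You can get much fancier than this to handle all cases, but this should handle most.
// Note: this pattern also accepts the empty string, so check for that separately.
String regexDouble = "^-?\\d*(\\.\\d+)?$";
// Match against the input String s (val is the parsed Double, not the text)
boolean isDouble = s.matches(regexDouble);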
Fastest way to check if a String can be parsed to Double in Java
The answer on this post works on android. I have tested it on ICS.
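A slightly stricter variant of the regex idea can be sketched as follows. It rejects the empty string and accepts an optional exponent before handing the text to Double.parseDouble. The class and method names (DoubleCheck, looksLikeDouble, parseOrNull) are hypothetical, and the pattern deliberately does not cover everything Double.parseDouble accepts (for example "NaN" or hexadecimal floats).

```java
import java.util.regex.Pattern;

class DoubleCheck {
    // Optional sign, then digits with an optional fraction (or a leading-dot
    // fraction), then an optional exponent. Assumption: plain decimal notation only.
    private static final Pattern DECIMAL =
        Pattern.compile("[-+]?(\\d+(\\.\\d*)?|\\.\\d+)([eE][-+]?\\d+)?");

    // True only when the whole string is a plain decimal number.
    static boolean looksLikeDouble(String s) {
        return s != null && DECIMAL.matcher(s).matches();
    }

    // Returns the parsed value, or null when the text is not a plain decimal.
    static Double parseOrNull(String s) {
        if (!looksLikeDouble(s)) {
            return null;
        }
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikeDouble("type1234"));
        System.out.println(parseOrNull("12.5"));
    }
}
```

Validating first means the result no longer depends on how a particular runtime treats leading garbage such as "typ".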

android dom parser issue

I have this RSS feed to parse that contains several tags. I am able to retrieve the value (child element) for all of them except the description tag node. Please find the RSS feed below:
<fflag>0</fflag>
<tflag>0</tflag>
<ens1:org>C Opera Production</ens1:org>
<description>
<p>Opera to be announced</p>
<p>$15 adults/$12 seniors/$10 for college students<span style="white-space: pre;"> </span></p>
</description>
The code that I am using for this is:
StringBuffer descriptionAccumulator = new StringBuffer();
else if (property.getNodeName().equals("description")){
try{
String desc = (property.getFirstChild().getNodeValue());
if(property.getNodeName().equals("p")){
descriptionAccumulator.append(property.getFirstChild().getNodeValue());
}
}
catch(Exception e){
Log.i(tag, "No desc");
}
else if (property.getNodeName().equals("ens1:org")){
try{
event.setOrganization(property.getFirstChild().getNodeValue());
Log.i(tag,"org"+(property.getFirstChild().getNodeValue()));
}
catch(Exception e){
}
else if (property.getNodeName().equals("area")||property.getNodeName().equals("fflag") || property.getNodeName().equals("tflag") || property.getNodeName().equals("guid")){
try{
//event.setOrganization(property.getFirstChild().getNodeValue());
Log.i(tag,"org"+(property.getFirstChild().getNodeValue()));
}
catch(Exception e){
}
else if(property.getNodeName().equals("p") || property.getNodeName().equals("em") || property.getNodeName().equals("br") || property.getNodeName().startsWith("em") || property.getNodeName().startsWith("span") || property.getNodeName().startsWith("a") || property.getNodeName().startsWith("div") || property.getNodeName().equals("div") || property.getNodeName().startsWith("p")){
descriptionAccumulator.append(property.getFirstChild().getNodeValue());
descriptionAccumulator.append(".");
System.out.println("description added:"+descriptionAccumulator);
Log.i("Description",descriptionAccumulator+property.getFirstChild().getNodeValue());
}
I tried capturing the value of the <description> tag but that didn't work out, so I tried handling all the usual HTML formatting tags instead, but still no luck. Using any other parser is not an option. Could somebody please help me out with this? Thanks.
I believe something is wrong with the RSS XML. For instance, check what XML is returned by the StackOverflow RSS feed. Specifically, pay attention to what the <summary type="html"> node content looks like: it has no child XML nodes inside, only pure XML-escaped text. So if it is acceptable in your case, spend your effort on proper RSS XML generation rather than on fixing the consequences.
You are parsing this as XML, so the description tag doesn't have a string value; it has multiple children. You might try getting the description node and pretty-printing its children. See LSSerializer for printing to XML.
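Since <description> holds child elements rather than a single text node, one way to collect its text with the same DOM parser is to walk its element children and read each one with getTextContent(), which concatenates all descendant text. The following is a rough sketch; the class name DescriptionText, the <item> wrapper element, and the newline separator are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DescriptionText {
    // Joins the text of every element child of the first <description> with newlines.
    static String descriptionText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList found = doc.getElementsByTagName("description");
            if (found.getLength() == 0) {
                return "";
            }
            StringBuilder text = new StringBuilder();
            NodeList children = found.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace-only text nodes between the <p> elements.
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue;
                }
                if (text.length() > 0) {
                    text.append('\n');
                }
                // getTextContent() also includes text nested in <span>, <em>, etc.
                text.append(child.getTextContent().trim());
            }
            return text.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<item><description>\n<p>Opera to be announced</p>\n"
                + "<p>$15 adults/$12 seniors</p>\n</description></item>";
        System.out.println(descriptionText(xml));
    }
}
```

This avoids listing every possible HTML tag name in the if/else chain, because nested markup is flattened by getTextContent().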

Is there a htmlDecode?

I am downloading a JSONObject from a web site. The entries are, however, HTML-encoded, using
&quot;
and
&amp;
tags. Is there an easy way to get these to Java strings? Short of writing the converter myself, of course.
Thanks RG
PS: I am using the stuff in a ListView. Probably I can use Html.fromHtml as I can for TextView. Don't know.
OK, I simply went to write my own quick fix. Not efficient, but that's OK for the purpose. A 5-minutes-solution.
public static String unescape (String s)
{
while (true)
{
int n=s.indexOf("&#");
if (n<0) break;
int m=s.indexOf(";",n+2);
if (m<0) break;
try
{
s=s.substring(0,n)+(char)(Integer.parseInt(s.substring(n+2,m)))+
s.substring(m+1);
}
catch (Exception e)
{
return s;
}
}
s=s.replace("&quot;","\"");
s=s.replace("&lt;","<");
s=s.replace("&gt;",">");
s=s.replace("&amp;","&");
return s;
}
I've heard of success in using the Apache Commons on Android.
You should be able to use StringEscapeUtils.unescapeHtml() (from the Lang package).
Here are the (fairly straightforward) directions on using the Apache Commons libraries in your Android apps: Importing org.apache.commons into android applications.
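If pulling in a library is not wanted, a single-pass variant of the hand-written converter could look like the sketch below. It handles only the five entities quot, amp, lt, gt and apos plus decimal and hexadecimal numeric references; anything else is left untouched. The class name EntityDecoder is hypothetical, and this is not a full HTML entity decoder.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EntityDecoder {
    // Decimal reference, hex reference, or one of five named entities.
    private static final Pattern ENTITY =
        Pattern.compile("&(#\\d+|#[xX][0-9a-fA-F]+|quot|amp|lt|gt|apos);");

    static String unescape(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(decode(m.group(), m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Decodes one entity body; falls back to the original text if the number is invalid.
    private static String decode(String whole, String body) {
        try {
            if (body.startsWith("#x") || body.startsWith("#X")) {
                return new String(Character.toChars(Integer.parseInt(body.substring(2), 16)));
            }
            if (body.startsWith("#")) {
                return new String(Character.toChars(Integer.parseInt(body.substring(1))));
            }
        } catch (IllegalArgumentException e) {
            // Covers NumberFormatException (overflow) and invalid code points.
            return whole;
        }
        if (body.equals("quot")) return "\"";
        if (body.equals("amp")) return "&";
        if (body.equals("lt")) return "<";
        if (body.equals("gt")) return ">";
        return "'"; // apos
    }

    public static void main(String[] args) {
        System.out.println(unescape("&quot;Tom &amp; Jerry&quot; &#65;&#x42;"));
    }
}
```

Because every entity is decoded in one pass over the input, text such as "&amp;lt;" becomes "&lt;" and is not decoded a second time.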

Using java.util.regex in Android apps - are there issues with this?

In an Android app I have a utility class that I use to parse strings for 2 regEx's. I compile the 2 patterns in a static initializer so they only get compiled once, then activities can use the parsing methods statically.
This works fine except that the first time the class is accessed and loaded, and the static initializer compiles the pattern, the UI hangs for close to a MINUTE while it compiles the pattern! After the first time, it flies on all subsequent calls to parseString().
My regEx that I am using is rather large - 847 characters, but in a normal java webapp this is lightning fast. I am testing this so far only in the emulator with a 1.5 AVD.
Could this just be an emulator issue or is there some other reason that this pattern is taking so long to compile?
private static final String exp1 = "(insertratherlong---847character--regexhere)";
private static Pattern regex1 = null;
private static final String newLineAndTagsExp = "[<>\\s]";
private static Pattern regexNewLineAndTags = null;
static {
regex1 = Pattern.compile(exp1, Pattern.CASE_INSENSITIVE);
regexNewLineAndTags = Pattern.compile(newLineAndTagsExp);
}
public static String parseString(CharSequence inputStr) {
String replacementStr = "replaceMentText";
String resultString = "none";
try {
Matcher regexMatcher = regex1.matcher(inputStr);
try {
resultString = regexMatcher.replaceAll(replacementStr);
} catch (IllegalArgumentException ex) {
} catch (IndexOutOfBoundsException ex) {
}
} catch (PatternSyntaxException ex) {
}
return resultString;
}
Please file a reproducible test case at http://code.google.com/p/android/issues/entry and I'll have a look. Note that I will need a regular expression that reproduces the problem. (Our regular expressions are implemented by ICU4C, so the compilation actually happens in native code and this may end up being an ICU bug, but if you file an Android bug I'll worry about upstream.)
If you launched with debugging you can expect it to be about twice as slow as a regular launch. However a minute does seem extraordinary. Some things to suggest, i. look at the console output to see if warnings are being spat out, ii. when it is doing the compile, in the debugger press 'pause' and just see what it is doing. There are ways to get the source, but even so just looking at the call stack may reveal something.
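Independently of why the compile is slow, the cost does not have to be paid on the UI thread. The sketch below is a workaround under assumptions, not a fix for the compile time itself: it defers compilation with the initialization-on-demand holder idiom and lets a background thread trigger it early. The names LazyPatterns, tags and warmUp are hypothetical; the short pattern is the one from the question, standing in for the 847-character expression.

```java
import java.util.regex.Pattern;

class LazyPatterns {
    // The nested class is initialized (and the pattern compiled) only on first access.
    private static final class Holder {
        static final Pattern TAGS = Pattern.compile("[<>\\s]");
    }

    // Returns the shared, compiled pattern; the first call pays the compile cost.
    static Pattern tags() {
        return Holder.TAGS;
    }

    // Starts a background thread that forces compilation ahead of first real use.
    static Thread warmUp() {
        Thread t = new Thread(new Runnable() {
            @Override
            public void run() {
                tags();
            }
        }, "regex-warmup");
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        warmUp().join();
        System.out.println(tags().matcher("a <b>").replaceAll(""));
    }
}
```

Class initialization is thread-safe, so a caller that reaches tags() while the warm-up is still running simply waits for it to finish instead of compiling twice.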

Parse HTML in Android

I am trying to parse HTML in Android from a webpage, and since the webpage is not well formed, I get a SAXException.
Is there a way to parse HTML in Android?
I just encountered this problem. I tried a few things, but settled on using JSoup. The jar is about 132k, which is a bit big, but if you download the source and take out some of the methods you will not be using, then it is not as big.
=> Good thing about it is that it will handle badly formed HTML
Here's a good example from their site.
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
//http://jsoup.org/cookbook/input/load-document-from-url
//Document doc = Jsoup.connect("http://example.com/").get();
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
String linkHref = link.attr("href");
String linkText = link.text();
}
Have you tried using Html.fromHtml(source)?
I think that class is pretty liberal with respect to source quality (it uses TagSoup internally, which was designed with real-life, bad HTML in mind). It doesn't support all HTML tags though, but it does come with a handler you can implement to react on tags it doesn't understand.
String tmpHtml = "<html>a whole bunch of html stuff</html>";
String htmlTextStr = Html.fromHtml(tmpHtml).toString();
We all know that programming has endless possibilities. There are many solutions to a single problem, and the ones above may well help someone, but this is the one that saved my day.
The code goes like this:
private void getWebsite() {
new Thread(new Runnable() {
@Override
public void run() {
final StringBuilder builder = new StringBuilder();
try {
Document doc = Jsoup.connect("http://www.ssaurel.com/blog").get();
String title = doc.title();
Elements links = doc.select("a[href]");
builder.append(title).append("\n");
for (Element link : links) {
builder.append("\n").append("Link : ").append(link.attr("href"))
.append("\n").append("Text : ").append(link.text());
}
} catch (IOException e) {
builder.append("Error : ").append(e.getMessage()).append("\n");
}
runOnUiThread(new Runnable() {
@Override
public void run() {
result.setText(builder.toString());
}
});
}
}).start();
}
You just have to call the above function in the onCreate method of your MainActivity.
I hope this one is also helpful for you.
Also read the original blog at Medium
Maybe you can use WebView, but as you can see in the doc WebView doesn't support javascript and other stuff like widgets by default.
http://developer.android.com/reference/android/webkit/WebView.html
I think that you can enable javascript if you need it.
