I am trying to get an iframe src from an XML document so that I can show it in a WebView. I am using XPath to get the value. Currently I am trying to parse this iframe, but I am not getting any value at all.
I have tried this XPath:
"//GoodreadsResponse/book/reviews_widget/iframe[@id=\"the_iframe\"]/@src/text()"
Is my XPath wrong for getting this iframe src? Full xml is here.
Perhaps another answer will correct me, but I don't think XPath will parse data within the CDATA section.
You can do this in two steps, however.
First, grab the text content of //GoodreadsResponse/book/reviews_widget.
This is the content of the CDATA section. It is not well-formed XML as it stands (it has multiple root elements), but we can wrap it in a parent element and then parse it.
I will include a .NET snippet which hopefully you can convert.
XmlNode node = document.SelectSingleNode("//GoodreadsResponse/book/reviews_widget");
String cdataText = node.InnerText;
// The cdataText here isn't quite XML, as it has multiple roots.
// let's surround it by a single root element
String xml = "<root>" + cdataText + "</root>";
XmlDocument innerDoc = new XmlDocument();
innerDoc.LoadXml(xml);
XmlNode srcAttr = innerDoc.SelectSingleNode("/root/div/iframe[@id=\"the_iframe\"]/@src");
// This prints out https://www.goodreads.com/api/reviews_widget_iframe?did=DEVELOPER_ID&format=html&isbn=0307277674&links=660&min_rating=&review_back=fff&stars=000&text=000
Console.WriteLine(srcAttr.Value);
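Since the question is about a WebView, here is a rough Java translation of the same two-step idea, using only the standard javax.xml APIs. Treat it as a sketch under assumptions: the class name IframeSrcExtractor, the method names, and the shortened sample response in main are made up for illustration, and it only works if the markup inside the CDATA section is itself well-formed once wrapped in a root element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name is not from the original post.
public class IframeSrcExtractor {

    // Parses an XML string into a DOM document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static String extractIframeSrc(String responseXml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Step 1: the string value of reviews_widget is the raw CDATA text.
        String cdataText = xpath.evaluate(
                "/GoodreadsResponse/book/reviews_widget", parse(responseXml));

        // Step 2: the CDATA text has several top-level elements,
        // so wrap it in a single root before parsing it again.
        Document inner = parse("<root>" + cdataText + "</root>");
        return xpath.evaluate("/root/div/iframe[@id='the_iframe']/@src", inner);
    }

    public static void main(String[] args) throws Exception {
        // Made-up, shortened sample standing in for the real response.
        String sample =
                "<GoodreadsResponse><book><reviews_widget><![CDATA["
                + "<style>#goodreads-widget { font-family: georgia; }</style>"
                + "<div id=\"goodreads-widget\">"
                + "<iframe id=\"the_iframe\" src=\"https://www.goodreads.com/api/"
                + "reviews_widget_iframe?isbn=0307277674&amp;format=html\"></iframe>"
                + "</div>"
                + "]]></reviews_widget></book></GoodreadsResponse>";
        System.out.println(extractIframeSrc(sample));
    }
}
```

If the embedded markup turns out not to be well-formed, the second parse will throw, and an HTML parser would be the safer choice for step 2.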
Can anyone please tell me how to parse the below string?
"Testing the parser - <tag><name>ANKIT</name><id>7</id></tag> <tag><name>VIKRAM</name><id>8</id></tag>. Some random text here"
How can I get the name "ANKIT" which is inside the <tag><name> ?
I tried SAX parser.
I think XML parsers work only when the first line is <?xml version="1.0"?>.
Is my understanding correct?
Since your text isn't valid XML but has a looser, HTML-like structure, consider using an HTML parser such as jsoup (https://jsoup.org/). With this library your code could look like this:
String myXML = "Testing the parser - <tag><name>ANKIT</name><id>7</id></tag> <tag><name>VIKRAM</name><id>8</id></tag>. Some random text here";
Document doc = Jsoup.parse(myXML, "", Parser.xmlParser());
String tag_name_text = doc.select("tag name")//CSS query to find <name> elements inside <tag> elements
.first()//take first result
.text();//get text it would generate in browser
System.out.println(tag_name_text);
Output: ANKIT
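On the question itself: as far as I know, the XML declaration is optional, so its absence is not what breaks a strict parser here. The problem is that the string has text and several elements at the top level instead of a single root element. If you would rather not add a library, a minimal sketch with the JDK's own DOM parser is to wrap the string in a root element first. The class and method names below are hypothetical, and this assumes the surrounding text contains no stray & or < characters.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of the original answer.
public class FragmentNameReader {

    // Returns the text of the first <name> element, or null if there is none.
    public static String firstName(String fragment) throws Exception {
        // Wrapping gives the parser the single root element it requires.
        String wrapped = "<root>" + fragment + "</root>";
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));

        NodeList names = doc.getElementsByTagName("name");
        return names.getLength() > 0 ? names.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        String text = "Testing the parser - <tag><name>ANKIT</name><id>7</id></tag> "
                + "<tag><name>VIKRAM</name><id>8</id></tag>. Some random text here";
        System.out.println(firstName(text)); // prints ANKIT
    }
}
```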
If I understand you right, try
STRINGNAME.substring(STRINGNAME.indexOf("<name>") + "<name>".length(), STRINGNAME.indexOf("</name>"));
I am grabbing an XML file off the internet and feeding its data into a database. Most of it works just fine, but I have a problem with XML tags that have the same name but different values attached.
So, we have an xml file like this:
<Overtag>
<Tag> Name </Tag>
<SubTag> TextSubTag </SubTag>
<TagWithValue value="SomeValue"> TextTagWithValue </TagWithValue>
</Overtag>
<Overtag>...
I can set up a NodeList by Overtag. I can get a NodeList of Overtag's children, which I call Children.
So, I run this over a for loop: for (int nN = 0; nN < Children.getLength(); nN++)
I grab the Text of the Tags themselves:
String sTag = Children.item(nN).getNodeName();
I can even grab the text between the Tags:
Children.item(nN).getTextContent()
BUT I need to organize this text based on the value.
What call can I use to get "SomeValue" when nN is the child-list index of <TagWithValue> (in this case 2)?
As in: Children.item(nN).?
Found it...
If you are using a NodeList to list your XML tags, you need to cast the node to an Element and then call getAttribute. So, for the <TagWithValue> tag (assuming the list containing it is a NodeList called Children):
Element eChild = (Element) Children.item(nN);
String sAtt = eChild.getAttribute("value");
This will give you sAtt = "SomeValue". Sorry to waste space by posting then finding the answer two hours later; hopefully someone else finds this useful.
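One caveat worth adding: getChildNodes() also returns the whitespace text nodes between the tags, and casting one of those to Element throws a ClassCastException. Here is a small self-contained sketch that checks the node type before casting. The class name ChildAttributeReader and the method name are made up for this example; the XML in main is the sample from the question.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration only.
public class ChildAttributeReader {

    // Maps each child element name of the first <Overtag> to its "value" attribute.
    public static Map<String, String> valueAttributes(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Node overtag = doc.getElementsByTagName("Overtag").item(0);
        NodeList children = overtag.getChildNodes();

        Map<String, String> result = new LinkedHashMap<>();
        for (int n = 0; n < children.getLength(); n++) {
            Node child = children.item(n);
            // Skip whitespace text nodes; only elements can carry attributes.
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element element = (Element) child;
            // getAttribute returns an empty string when the attribute is absent.
            result.put(element.getNodeName(), element.getAttribute("value"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Overtag>\n"
                + "<Tag> Name </Tag>\n"
                + "<SubTag> TextSubTag </SubTag>\n"
                + "<TagWithValue value=\"SomeValue\"> TextTagWithValue </TagWithValue>\n"
                + "</Overtag>";
        System.out.println(valueAttributes(xml).get("TagWithValue")); // prints SomeValue
    }
}
```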
Instead of going through VoiceOver or similar software, I want a function which can take an element-id as parameter and return the alt text or label so that I can validate whether the text is correct.
Any other suggestions welcome.
You could use HttpClient to fetch the HTML code from web, and use the jsoup library to parse the code, then find out the attributes of selected element. Download jsoup jar and put it into the lib directory of your project.
Document doc = Jsoup.parse("..."); // ... is the string of HTML code
Element inputElement = doc.select("#...").first(); // ... is the id of your element
String alt = inputElement.attr("alt"); // read the "alt" attribute
I am trying to parse the following using Dom parser in Android.
<offerURL>
http://statTest.dealtime.com/DealFrame/DealFrame.cmp?bm=553&BEFID=93767&aon=%5E1&MerchantID=434524&crawler_id=1909400&dealId=TCk4NTG97Aa3wSQgh2U3FQ%3D%3D&url=http%3A%2F%2Frover.ebay.com%2Frover%2F1%2F707-64686-24023-0%2F2%3Fipn%3Dpsmain%26icep_item_id%3D190622592957%26icep_vectorid%3D260601%26kwid%3D1%26mtid%3D637%26crlp%3D1_260601%26kw%3D%7Bquery%7D%26query%3D%7Bquery%7D%26linkin_id%3D%7Blinkin_id%7D%26sortbid%3D%7Bbidamount%7D%26fitem%3D190622592957%26mt_id%3D637&linkin_id=7000251&Issdt=120323134700&searchID=p2.77722a731149145f60fa&DealName=Samsung+B2100+Outdoor+In+Schwarz+%28black%29+Orig.+Neuware&dlprc=89.95&crn=&istrsmrc=1&isathrsl=0&AR=1&NG=3&NDP=6&PN=1&ST=7&DB=sdcprod&MT=phx-pkadu-intl-dc20&FPT=DSP&NDS=&NMS=&MRS=&PD=95929320&brnId=14863&IsFtr=0&IsSmart=0&DMT=&op=&CM=&DlLng=7&RR=1&cid=&semid1=&semid2=&IsLps=0&CC=0&SL=0&FS=1&code=&acode=538&category=&HasLink=&frameId=&ND=&MN=&PT=&prjID=&GR=&lnkId=&VK=
</offerURL>
For parsing I am using following code :
Node node = .....
String nodeName = node.getNodeName();
if (nodeName.equalsIgnoreCase("offerURL")) {
String offerUrl = node.getFirstChild().getNodeValue();
Log.d("offerURL", "offerUrl => " + offerUrl);
}
It works fine, but the value of the <offerURL> tag is getting truncated.
The value of the variable offerUrl printed in logcat is "http://statTest.dealtime.com/DealFrame/DealFrame.cmp?bm=553"
Not sure what exactly the issue is. Please help.
& is a special character in XML (it starts an entity reference) and must be escaped. In the URL, if you change every & to &amp; that should work.
Predefined entities in XML will tell you all the predefined entities in XML and how to represent them.
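To make the escaping concrete, here is a minimal sketch: it escapes a URL before putting it into the element, then reads it back with getTextContent(). The class name, the method names, and the shortened example.com URL are assumptions for illustration. I use getTextContent() rather than getFirstChild().getNodeValue() because some DOM implementations may split the text into several adjacent text nodes around entity references, in which case the first child holds only the part before the first &.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are not from the original post.
public class OfferUrlDemo {

    // Escapes the characters that may not appear raw in XML element text.
    // The ampersand must be replaced first so later replacements are not re-escaped.
    public static String escapeXml(String raw) {
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // Returns the full text of the root element, with entities decoded.
    public static String readOfferUrl(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // getTextContent concatenates all text nodes under the element.
        return doc.getDocumentElement().getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up, shortened URL standing in for the one in the question.
        String url = "http://example.com/DealFrame.cmp?bm=553&BEFID=93767&aon=%5E1";
        String xml = "<offerURL>\n" + escapeXml(url) + "\n</offerURL>";
        System.out.println(readOfferUrl(xml));
    }
}
```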
While using the android DOM parser, I'm picking up the following element:
<![CDATA[
Widget1… Widget2… Widget3…
]]>
I've tried populating a textview using Html.fromHtml as well as trying to shove the contents of that element into a Webview.
Both methods display the content, but they seem to strip out the named character entities.
Anything I can do to retain the formatting/markup?
I am reading the data from an RSS feed if that helps.
This is what my WebView call looks like:
webview.loadData(((Node) nodeList.item(0)).getNodeValue(), "text/html", "utf-8");
I had a similar issue recently and here's what worked for me. I used the jsoup.jar from http://jsoup.org/. Once I added the jar to my project, I just needed to use the following line:
String htmlFreeString = Jsoup.parse(stringWithHtml).text();
Then, I just set the string as the text to my TextView without issue.
Hope that helps.