Inconsistent DOM structure of the loaded HTML page in webview - android

I am building a highlighter for Android that runs in a WebView.
For the highlighting, one example uses jQuery and Rangy, and another uses pure JavaScript and XPath. I am also testing the same highlighting in desktop browsers.
Please refer to my previous questions for the problems I ran into while highlighting:
-->> Problem when using XPath
-->> Problem when using Rangy (Not answered yet!)
From the responses to these questions, I concluded that the DOM structure of an HTML page loaded in WebView is inconsistent and differs from that of the same page loaded in a desktop browser, mostly with regard to text nodes.
To support this conclusion, I have created a jsFiddle (link in the question).
However, I also think that WebView may not be changing the DOM structure itself, but it surely returns incorrect text nodes inside a div.
Now the question is: is there any way to stop this change in the DOM structure of the HTML in WebView?
Any insights?
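
One thing that might be worth trying, assuming the difference really is only that one run of text ends up split across several adjacent text nodes: the DOM has a standard normalize() method that merges adjacent text nodes, so offsets computed afterwards do not depend on how the text was split. The sketch below shows the effect with the JDK's org.w3c.dom implementation. The class and helper names are made up for illustration, and in a WebView the equivalent call would be element.normalize() in the page's JavaScript before computing XPath or Rangy offsets. Whether this fixes the highlighting problem described above is an assumption, not something confirmed here.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, for illustration only.
class TextNodeNormalizer {

    // Counts the direct child text nodes of an element.
    static int countTextChildren(Element element) {
        int count = 0;
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Builds a <div> whose text is deliberately split into several adjacent
    // text nodes, imitating the fragmented structure described in the question.
    static Element buildSplitDiv(String... chunks) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element div = doc.createElement("div");
            doc.appendChild(div);
            for (String chunk : chunks) {
                div.appendChild(doc.createTextNode(chunk));
            }
            return div;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Element div = buildSplitDiv("Some ", "highlightable ", "text");
        System.out.println("text nodes before normalize(): " + countTextChildren(div));
        div.normalize(); // merges adjacent text nodes in place
        System.out.println("text nodes after normalize(): " + countTextChildren(div));
    }
}
```

The text content of the div is the same before and after; only the number of text nodes changes.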

Related

Android Webview: Get current viewport/rendered content

I am writing an Android WebView app to record which HTML content I am currently looking at. Is there any way to capture the rendered HTML content (or DOM elements)?
I checked all the Android WebView APIs, and none provides such functionality.
I also tried overriding the scroll listener, but it doesn't give me the content I am looking at (the content rendered on the current screen) either.
A previous question, "how to get HTML content from a Webview?", shows how to get the entire HTML content from the WebView, but it doesn't solve my issue: even with the page content, I don't know which part I am looking at.
Please let me know if you have any suggestions or reasonable solutions.

Qt 5 parsing HTML on Android

I'm developing a cross-platform mobile app with Qt 5.3.1. I need to load various HTML pages and parse DOM element values from them. At the moment I have successfully loaded a page with QNetworkAccessManager and stored it in a QByteArray, but I hit a wall trying to parse the valuable data out of it.
A couple of points:
I can't use QWebkit, since it's not supported on Android in Qt 5.
The HTML can't be assumed to be strict markup, so Qt's XML readers or DOM parsers won't work on their own.
I'm only parsing text from pages. The information is all I need, not the visual style.
What options do I have? It sounds a little bit stupid that WebKit would be the only way of doing this, since I don't need to display any graphical data from web pages. Is writing my own DOM parser for HTML the way to go?
http://qt-project.org/wiki/Handling_HTML has a pretty good list of HTML parsers that are available.
Sometimes a good regular expression can catch what you need, but it isn't as robust as a good HTML parser.
The first link on that page looks pretty promising:
http://tidy.sourceforge.net/libintro.html
I don't know how difficult it would be to build the libraries for Qt on Android, but it looks doable, and it works with standard tools.
Hope that helps.
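
To make the regular-expression option mentioned above a bit more concrete, here is a minimal sketch. It is written in Java only for illustration; the pattern itself is not Java-specific and should carry over to Qt's QRegularExpression, though that is untested here. The class name, method name, and sample tag are hypothetical. It pulls the text out of every element with a given tag name and strips any nested tags.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RegexTextExtractor {

    // Returns the text inside every <tag ...>...</tag> pair found in the HTML.
    // Nested markup inside the element is stripped, and the result is trimmed.
    static List<String> textOf(String html, String tag) {
        Pattern element = Pattern.compile(
                "<" + Pattern.quote(tag) + "\\b[^>]*>(.*?)</" + Pattern.quote(tag) + "\\s*>",
                Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        Matcher matcher = element.matcher(html);
        List<String> texts = new ArrayList<>();
        while (matcher.find()) {
            texts.add(matcher.group(1).replaceAll("<[^>]+>", "").trim());
        }
        return texts;
    }

    public static void main(String[] args) {
        String html = "<html><body><H2 class=\"x\">First</h2><p>skip</p>"
                + "<h2>Second <b>bold</b></h2></body></html>";
        System.out.println(textOf(html, "h2"));
    }
}
```

This also shows why the approach is less robust than a real parser: an unclosed tag, or an element of the same name nested inside itself, is silently missed or mismatched.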

Load website with local js/css

Is there a way to load someone else's website into a WebView and then apply your own CSS to it (stored in your app or on some other server)?
I have been searching for an answer for a month now, and nothing seems to work. I tried using some WebView functions that load and store the site's HTML and then load it with my URL (or some similar solutions). The CSS is applied, but then the site's JavaScript doesn't work, the links are messed up, and all sorts of other problems occur. (Also, jsoap is not the answer; I tried it, or maybe I didn't figure it out correctly.)
Long story short: can I load www.some-site.com but make it use my CSS and remain fully functional?
Easily done with WebViewClient.shouldInterceptRequest.
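
WebViewClient.shouldInterceptRequest lets the app hand the WebView its own response for a request instead of the one from the network. One way to use it, sketched below under assumptions, is to fetch the page yourself, add your stylesheet to the HTML, and return the modified page. Only the string-rewriting step is shown, in plain Java; the class and method names are hypothetical, and the Android wiring (fetching the page and wrapping the result in a WebResourceResponse) is left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class CssInjector {

    private static final Pattern HEAD_END =
            Pattern.compile("</head\\s*>", Pattern.CASE_INSENSITIVE);

    // Inserts a <style> block just before the first </head>.
    // If the page has no </head>, the block is placed in front of the markup.
    static String injectCss(String html, String css) {
        String styleTag = "<style>" + css + "</style>";
        Matcher matcher = HEAD_END.matcher(html);
        if (!matcher.find()) {
            return styleTag + html;
        }
        int headEnd = matcher.start();
        return html.substring(0, headEnd) + styleTag + html.substring(headEnd);
    }

    public static void main(String[] args) {
        String page = "<html><head><title>t</title></head><body>Hello</body></html>";
        System.out.println(injectCss(page, "body{color:red}"));
    }
}
```

Because the rest of the page is untouched and still loaded from its original URL, the site's own scripts and relative links should keep working, which is the problem the question ran into when loading stored HTML.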

Can WebView be used for HTML parsing?

I came across the WebView class in android.webkit and was impressed by how it "does everything for you" (as a programmer), as far as rendering visual HTTP content on the screen.
My question: Is it possible to use the WebView class as a shortcut for parsing rendered HTML for non-visual purposes?
(that is, retrieve certain elements from a web page for text processing, etc.)
If so, how would one go about this?
You don't need to. As far as I know, Android uses TagSoup to parse HTML, and you can use it too.

xml parsing with no browser use

I have read the RSS parsing example from the IBM site (http://www.ibm.com/developerworks/opensource/library/x-android/).
In this example, the RSS items are shown in a ListView, and if you press an announcement you can see it in the device's web browser. How could I view them inside the app, without using the device browser?
Thanks a lot
Create a layout with a WebView then load the URL from each "announcement" using WebView.loadUrl.
I'm a little confused but you seem to have answered your own question.
You say you don't want to use the web browser on the device but the example in your question doesn't use the browser. It does exactly what you're asking for.
The idea is that you download the HTML from the website, use the parser to break it up into separate "announcements", and store them in list view items in your program.
I have done a bit of this type of thing myself on Android. I used the jsoup Java library, which makes breaking the HTML into the bits you want to display really easy.
If you want some more help, I can give you an example of an app I made that pulls movie times from google.com/movies. Here are links to the classes where I did the HTML download and parse:
ScreenScraper.java
HtmlParser.java
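
The parse-then-list idea from the answer above can be sketched roughly as follows. This is not the code from the linked classes, and it does not use jsoup. It assumes the feed is well-formed RSS and uses the DOM parser that ships with Java, with a made-up Announcement type, to turn the feed text into a list that could back the ListView items. Downloading the feed and the UI code are left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class FeedParser {

    // One entry of the feed, reduced to what a list row needs.
    static final class Announcement {
        final String title;
        final String link;

        Announcement(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    // Parses RSS text and returns one Announcement per <item> element.
    static List<Announcement> parse(String rssXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(rssXml)));
            NodeList items = doc.getElementsByTagName("item");
            List<Announcement> announcements = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                announcements.add(new Announcement(childText(item, "title"), childText(item, "link")));
            }
            return announcements;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }
    }

    // Returns the trimmed text of the first descendant with the given tag, or "" if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }
}
```

Each Announcement's link can then be passed to WebView.loadUrl when its row is tapped, so the page opens inside the app.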
