I pulled a website to a WebView via HTTP GET. The problem is that the website isn't formatted for mobile. I found that if I edit the HTML, I can comment out the scripting that makes the left pane on the site.
Method:
Download the page to a string, find the first occurrence of the substring <link and replace it with <!--, write the result to a file, and load that file into the WebView.
That works great until it comes to a link. Clicking on it causes the WebView to attempt to load file:///index.php/Whatever_the_page_was.
What I want to do is capture that link request and change the file:/// part to www.wurmpedia.com, and then run it through my parser to remove the script like the first, and repeat the process on any other link click that follows.
I could not find any other way to pull this off and this is what I made up. Any help would be appreciated, either through URL modification or with a more efficient method.
How about intercepting the link request using
WebViewClient.shouldInterceptRequest
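The URL rewrite and the string edit described in the question can be sketched in plain Java, independent of the WebView. This is only a sketch under assumptions: the class and method names are hypothetical, and the `http://` scheme in the base URL is assumed, since the question only names the host www.wurmpedia.com.

```java
// Hypothetical helper; names are illustrative and not from the original post.
final class WurmpediaRewriter {
    // Assumed base URL; the question only gives the host name.
    static final String BASE = "http://www.wurmpedia.com/";
    private static final String FILE_PREFIX = "file:///";

    // Turn file:///index.php/Page into http://www.wurmpedia.com/index.php/Page.
    // URLs that do not start with file:/// are returned unchanged.
    static String toRemoteUrl(String url) {
        if (url != null && url.startsWith(FILE_PREFIX)) {
            return BASE + url.substring(FILE_PREFIX.length());
        }
        return url;
    }

    // Replace only the first "<link" with "<!--", as the question describes.
    static String commentOutFirstLink(String html) {
        int i = html.indexOf("<link");
        if (i < 0) {
            return html; // nothing to replace
        }
        return html.substring(0, i) + "<!--" + html.substring(i + "<link".length());
    }
}
```

On Android, a WebViewClient callback such as shouldOverrideUrlLoading could pass the clicked URL through toRemoteUrl, download that page, run commentOutFirstLink on it, and load the result. Loading the edited HTML with WebView.loadDataWithBaseURL and the site as base URL may also avoid the file:/// links in the first place.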
I know how to fetch the source code of a website with URLConnection and BufferedReader, but sometimes a website itself loads data from elsewhere and displays it on the page.
e.g. I am now working on this page
http://bet.hkjc.com/marksix/userinfo.aspx?file=lucky_ocbs.asp&lang=en
and the names of the 10 branches and the other details in the table on the page are not in the page's source code.
Question:
Instead of extracting data from the source code, is there any way to extract the text from the final rendered page? If yes, how could it be done?
Thanks a lot.
Yes, there is a way to extract the information from a website even if it performs client-side operations, such as loading data from an external site before displaying it. It is a tricky solution, though; if you have the opportunity to reach an agreement with the website's owner and ask for an API for your application, I'd choose that option.
OK, for your question you can try using Android's WebView to render the website first, and then get the HTML content using one of the methods described here. The trickiest part is making it user-friendly: you have to cover the WebView with a progress bar while your app waits for the WebView's onPageFinished callback. I'm not sure the WebView behaves properly in that case, but it's worth a try.
Short Answer: You can't.
Reason: HTML is rendered on the client side, e.g. by browsers such as Chrome, Firefox, or Internet Explorer. Since you don't have an interpreter for the markup language, you can't fetch only the content of a given tag; even browsers download the whole page, which is how HTTP works.
Workaround: Since you mentioned that some branches are not in the page source, I assume they are loaded on the client side by some JavaScript. What you can do is check what the client is executing and perform the same requests in your code, since your client is the app.
Also see: Jsoup
You cannot extract only the information you want without downloading the source HTML. After you have downloaded the source, you can use jsoup to select just the information you want.
Add this to your app-level build.gradle file:
compile 'org.jsoup:jsoup:1.9.2'
Then you can download and parse the page source.
String url = "http://bet.hkjc.com/marksix/userinfo.aspx?file=lucky_ocbs.asp&lang=en";
InputStream input = new URL(url).openStream();
Document doc = Jsoup.parse(input, "ISO-8859-9", url);
Elements sectionElements = doc.select("div#general-info-panel");
Elements imageElements = sectionElements.select("img[src]");
You need to adapt the selectors in the code above to your page's HTML. You can find examples of how to use jsoup online.
http://phantomjs.org/ can be used to extract a website's content after JavaScript execution. I'm not sure whether there is an Android build.
I've created a simple Android browser. I use an EditText for the URL and a WebView for loading web pages. The browser works fine when I enter the complete URL, but I need predictions/auto-completion for partially entered URLs. Please let me know whether the following options are valid:
Using an AutoCompleteTextView instead of the EditText, backed by a database of all the websites on the internet (more than 1 billion websites! Where can I get such a dynamically updating database?)
Saving the URLs the user visits frequently and using them as predictions in an AutoCompleteTextView. (How can I add these URLs to a dynamic AutoCompleteTextView/database?)
Currently I am using Google search as a workaround. Please share your views on how this can be achieved.
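The second option (remembering the user's own URLs and suggesting them by prefix) could look roughly like the following. This is a minimal in-memory sketch, not a complete solution: the UrlHistory class and its method names are hypothetical, and persistence (for example in a database) is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory history; names are illustrative.
final class UrlHistory {
    // Visit count per URL.
    private final Map<String, Integer> visits = new HashMap<>();

    // Record one visit to the given URL.
    void record(String url) {
        visits.merge(url, 1, Integer::sum);
    }

    // Return up to 'limit' recorded URLs starting with 'prefix',
    // most visited first; ties are ordered alphabetically.
    List<String> suggest(String prefix, int limit) {
        List<String> matches = new ArrayList<>();
        for (String url : visits.keySet()) {
            if (url.startsWith(prefix)) {
                matches.add(url);
            }
        }
        matches.sort((a, b) -> {
            int byCount = Integer.compare(visits.get(b), visits.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return matches.size() > limit
                ? new ArrayList<>(matches.subList(0, limit))
                : matches;
    }
}
```

Calling record each time a page finishes loading and feeding the result of suggest into an ArrayAdapter set on the AutoCompleteTextView would be one way to wire this up.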
I have an SDK (Android). Is there any way to get an entire web page (HTML, JavaScript, the whole page) by making an API call?
There are many ways to get the URL of a web page. I don't want the URL; I want the full HTML of the page directly so that I can use it in a mobile app.
Is it possible? If so, how? And if not, why not?
Any help is appreciated.
If you have the URL of the page, the API is called HTTP GET and you only request the page with that URL - that will give you all the HTML that is on the page, including all the embedding tags that refer to external javascript, css, images etc.
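As a rough illustration of that HTTP GET in plain Java, the sketch below uses HttpURLConnection. The PageFetcher class and its method names are hypothetical, and the body is assumed to be UTF-8; a more careful client would read the charset from the Content-Type header.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; names are illustrative.
final class PageFetcher {
    // Read the whole stream into a String using the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    // Issue an HTTP GET and return the response body as text.
    // Assumption: the page is encoded as UTF-8.
    static String fetch(String pageUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(pageUrl).openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return readAll(in, StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android, fetch has to run off the main thread, since network access on the main thread throws NetworkOnMainThreadException. The returned string contains only the HTML as served; content that scripts add later is not included.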
I have a WebView with some web page in it. Now I want to retrieve complete HTML contents of what is inside the WebView.
I use loadUrl("javascript:...") and WebView's javascript interface feature to retrieve this HTML using something like this:
document.getElementsByTagName('html')[0].innerHTML / outerHTML
document.documentElement.outerHTML
...
In each case I receive partial HTML contents - exactly the first 10000 characters! So my question is - how do I get the complete HTML content? Is it device-specific, and are there any workarounds?
Btw, web pages are created dynamically with javascript - I can't simply download the file from server.
Also, I tried printing HTML contents in javascript with console.log and found exactly the same behavior.
Thanks in advance!
My mistake - it was not related to JavaScript, nor did it have anything to do with the specific device I tested on.
So, in short, any of those js properties work correctly.
I've been struggling with this for the past few days and can't find a good solution. My task is to load a YouTube link in a WebView. The given URL is VideoLink. If I load this link directly in the Android WebView, it won't play. When I load the embed code for the link, it loads successfully. My problem is that I get the embed code manually (i.e. load the URL in a desktop browser -> right click -> select copy embed html), but I have a lot of links like this, so doing it manually is not feasible. Is it possible to get the embed HTML code of a YouTube link programmatically?
Why can't you create an HTTP connection to the URL, read the InputStream into a String, and give it to the WebView? If you can explain a bit more, I can suggest a better idea.
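One way to build the embed code without copying it by hand is to take the video ID from the link and put it into an iframe that points at https://www.youtube.com/embed/ID. The sketch below assumes the links are ordinary watch?v= or youtu.be links (the question does not show the actual URL); the YouTubeEmbed class, its method names, and the iframe attributes are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative.
final class YouTubeEmbed {
    // Matches the v= query parameter of a watch URL,
    // or the path segment of a youtu.be short link.
    private static final Pattern ID =
            Pattern.compile("(?:[?&]v=|youtu\\.be/)([A-Za-z0-9_-]+)");

    // Return the video ID, or null if the URL has no recognizable ID.
    static String videoId(String url) {
        Matcher m = ID.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    // Build iframe embed HTML for the link, or null if no ID was found.
    static String embedHtml(String url, int width, int height) {
        String id = videoId(url);
        if (id == null) {
            return null;
        }
        return "<iframe width=\"" + width + "\" height=\"" + height
                + "\" src=\"https://www.youtube.com/embed/" + id
                + "\" frameborder=\"0\" allowfullscreen></iframe>";
    }
}
```

The resulting string could then be passed to WebView.loadData with the MIME type text/html, with JavaScript enabled in the WebView settings.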