How to extract text from html page? - android

How can I extract text from an HTML page? For example, I want to take the text from the page at http://www.atempodihockey.it/campionati/campionati-hil/serie-a1-2013-2014/calendario.html. I need the team names and the match results.

I think the code below can help you.
webView = (WebView) findViewById(R.id.webterms);
webView.getSettings().setJavaScriptEnabled(true);
webView.getSettings().setPluginsEnabled(true);
webView.getSettings()
.setUserAgentString(
"Mozilla/5.0 (Linux; U; Android 2.0; en-us; Droid Build/ESD20) AppleWebKit/530.17 (KHTML, like Gecko) Version/4.0 Mobile Safari/530.17");
After creating your WebView, load your URL or HTML page:
webView.addJavascriptInterface(new MyJavaScriptInterface(),"HTMLOUT");
webView.setWebViewClient(new WebViewClient() {
@Override
public boolean shouldOverrideUrlLoading(WebView view, String url) {
view.loadUrl(url);
return false;
}
@Override
public void onPageFinished(WebView view, String url1) {
if (pDialog.isShowing()) {
pDialog.dismiss();
}
webView.loadUrl("javascript:window.HTMLOUT.processHTML(document.documentElement.innerText);");
}
});
webView.loadUrl(url);
Then create a class with a single method for processing the page text:
class MyJavaScriptInterface {
// Required when targeting API 17 or higher so that JavaScript can call this method.
@JavascriptInterface
public void processHTML(String html) {
if (null != html && html.trim().length() > 0) {
System.out.println("your Html ->" + html);
}
}
}
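The string that reaches processHTML is the page's innerText, which usually contains blank lines and tab-separated table cells. A minimal sketch of tidying it into lines before looking for team names and results follows; the class name PageTextLines and the sample rows are hypothetical, not taken from the real page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cleans up the text passed to processHTML.
final class PageTextLines {
    // Splits page text into trimmed, non-empty lines with whitespace runs collapsed.
    static List<String> nonEmptyLines(String pageText) {
        List<String> lines = new ArrayList<>();
        if (pageText == null) {
            return lines;
        }
        for (String raw : pageText.split("\\r?\\n")) {
            // Tabs between table cells become single spaces.
            String line = raw.replaceAll("\\s+", " ").trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up sample; the real page layout may differ.
        String sample = "Team A\t3 - 2\tTeam B\n\n   \nTeam C\t1 - 1\tTeam D\n";
        for (String line : nonEmptyLines(sample)) {
            System.out.println(line);
        }
    }
}
```

Each cleaned line can then be matched against whatever pattern the actual calendar page uses for a fixture.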

For this purpose, you can use HtmlAgilityPack (a .NET library).
Do it as follows:
Add a reference to HtmlAgilityPack in your project.
using HtmlAgilityPack;
Then pass the URL to load the full page:
HtmlWeb webGet = new HtmlWeb();
HtmlDocument document = webGet.Load("http://www.atempodihockey.it/campionati/campionati-hil/serie-a1-2013-2014/calendario.html");
You can then extract the text you need from the HTML in the 'document' variable.
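HtmlAgilityPack targets .NET, so on Android the same idea needs a Java equivalent. As a rough sketch only, assuming the results sit in ordinary <td> cells (not verified against the real page), the cell text can be pulled out of already-downloaded HTML with a regular expression. The class name TableCellText is hypothetical, and a real HTML parser would be more robust than this: nested tables and HTML entities are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the visible text of each <td> cell.
final class TableCellText {
    // One <td ...>...</td> cell; DOTALL lets a cell span several lines.
    private static final Pattern CELL = Pattern.compile(
            "<td\\b[^>]*>(.*?)</td>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> cellTexts(String html) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            String text = m.group(1)
                    .replaceAll("<[^>]+>", "")  // drop tags nested in the cell
                    .replaceAll("\\s+", " ")    // collapse whitespace
                    .trim();
            cells.add(text);
        }
        return cells;
    }

    public static void main(String[] args) {
        // Made-up row for illustration.
        String html = "<table><tr><td class=\"t\"><b>Team A</b></td>"
                + "<td>3 - 2</td><td> Team B </td></tr></table>";
        System.out.println(cellTexts(html));
    }
}
```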

Related

Webview webcontent not loading

I am trying to load some HTML content (it is a hyperlink) in WebView. But when I click that link nothing is opening. But the same link is working in browser or iOS WebView perfectly.
The code I tried:
htmlContent.getSettings().setJavaScriptEnabled(true);
htmlContent.setWebViewClient(new WebViewClient() {
@Override
public boolean shouldOverrideUrlLoading(WebView view, String url) {
view.loadUrl(url);
return true;
}
});
htmlContent.loadDataWithBaseURL(null, Html.fromHtml(categoryItemsBean.getContent()).toString(), "text/html", "utf-8", null);
Value of categoryItemsBean.getContent():
<p><a title="كتاب مساعدة الأصدقاء"
href="https://www.manhal.com/platform/stories/demo/index.php?storyid=31"
target="_blank">كتاب مساعدة الأصدقاء </a></p>
Use the loadData() method to load HTML content into the WebView.
Sample:
String unencodedHtml ="<html><body>'%23' is the percent code for ‘#‘ </body></html>";
String encodedHtml = Base64.encodeToString(unencodedHtml.getBytes(),Base64.NO_PADDING);
htmlContent.loadData(encodedHtml, "text/html", "base64");
Convert your HTML content to Base64 before passing it to loadData().
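The point of the Base64 step is that the encoded string contains no characters such as '#' or '%' that loadData() could misread as URL syntax. A small sketch of the encoding on its own, using java.util.Base64 from the standard library instead of android.util.Base64 so that it runs off-device; the class name HtmlBase64 is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: Base64 round trip for an HTML string.
final class HtmlBase64 {
    // Encodes without '=' padding, mirroring Base64.NO_PADDING in the sample.
    static String encode(String html) {
        return Base64.getEncoder()
                .withoutPadding()
                .encodeToString(html.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes back to the original HTML (the decoder accepts unpadded input).
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String html = "<html><body>'%23' is the percent code for '#'</body></html>";
        String encoded = encode(html);
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```

Encoding the bytes explicitly as UTF-8 keeps non-ASCII content, such as the Arabic link text in the question, intact through the round trip.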

Webview InputText not supported in 2.1 and 2.2

In Android we have a URL
"http://offers.bartizan.com/we-want-ileads-at-our-shows"
which contains a lot of data.
We load this URL in a WebView. On Android 2.3 and later it works fine, meaning all of the page content is displayed, but on Android 2.1 and 2.2 the text input form is not displayed.
Here is the code I use to load the URL in the WebView:
webView.getSettings().setJavaScriptEnabled(true);
webView.addJavascriptInterface(new MyJavaScriptInterface(), "HTMLOUT");
webView.getSettings().setPluginsEnabled(true);
webView.getSettings().setDomStorageEnabled(true);
webView.getSettings().setJavaScriptCanOpenWindowsAutomatically(true);
webView.getSettings().setRenderPriority(RenderPriority.HIGH);
webView.enablePlatformNotifications();
webView.getSettings().setCacheMode(android.webkit.WebSettings.LOAD_DEFAULT);
webView.loadUrl("http://offers.bartizan.com/we-want-ileads-at-our-shows");
webView.setWebViewClient(new WebViewClient(){
@Override
public boolean shouldOverrideUrlLoading(WebView view, String url) {
//if(url.contains("http://www.youtube.com/embed/")){
Log.d("WBC", "This is a YouTube link: " + url);
Log.e("url---Video->", url);
view.loadUrl(url);
return true;
} });
class MyJavaScriptInterface {
public void showHTML(String html) {
String htmlString = html;
Log.e("<---MyJavaScriptInterface---showHTML----->", htmlString);
}
}
The form below is not displayed on Android 2.1 and 2.2:
First name *
Last name
Email address *
List the names of your shows here. Please include the show manager name, email and phone. *
1) Showing blank area above image
2) Textbox we require

Force webview to display desktop sites

I have the below webview client which sets the user agent to a desktop browser when we are viewing a page that does not contain the word google in the URL. (Also does other stuff but that all works fine).
webView.setWebViewClient(new WebViewClient() {
@Override
public boolean shouldOverrideUrlLoading(WebView view, String url) {
if (!url.contains("google")) {
String newUA= "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.4) Gecko/20100101 Firefox/4.0";
webView.getSettings().setUserAgentString(newUA);
view.loadUrl(url);
}else {
webView.getSettings().setUserAgentString(null);
view.loadUrl(url);
}
return true;
}
public void onPageStarted(WebView view, String url, Bitmap favicon)
{
super.onPageStarted(view, url, favicon);
progressBar.setVisibility(View.VISIBLE);
}
public void onPageFinished(WebView view, String url) {
// TODO Auto-generated method stub
super.onPageFinished(view, url);
String page = webView.getUrl();
if (!(page.contains("google"))){
grabit.setVisibility(View.VISIBLE);
}else{
grabit.setVisibility(View.GONE);
}
webView.loadUrl("javascript: function loadScript(scriptURL) { var scriptElem = document.createElement('SCRIPT'); scriptElem.setAttribute('language', 'JavaScript'); scriptElem.setAttribute('src', scriptURL); document.body.appendChild(scriptElem);} loadScript('"+CFG.Bookmarklet+"');");
progressBar.setVisibility(View.INVISIBLE);
if (webView.canGoBack()){
left.setImageResource(R.drawable.ic_arrowleft);
}else{
left.setImageResource(R.drawable.ic_arrowleft_gray);
}
if (webView.canGoForward()){
right.setImageResource(R.drawable.ic_arrowright);
}else{
right.setImageResource(R.drawable.ic_arrowright_gray);
}
}
});
The issue is that it works perfectly on some sites, does not work on others, and on some it seems to just change the viewport.
A few examples are:
> Argos - shows mobile
> Tesco - shows mobile but viewport has changed
> Amazon - works
> John Lewis - shows mobile but viewport has changed
> Play.com - works
So is there something I am missing? Is there another way that the sites it does not work on are checking the browser to decide what to display?
The 'show desktop version' option in Chrome seems to work fine for these sites, so perhaps Chrome does something else too?
Thanks
You can use setDesktopMode(true) from this WebView subclass or read how it's implemented in detail.
This way, you don't have to set a fixed user-agent string. Moreover, setting the user-agent string only is usually not enough. It depends on the site, of course, since every website uses their own way of determining whether the client is on mobile or desktop.
You can try this:
webview.getSettings().setJavaScriptEnabled(true);
webview.getSettings().setUserAgentString("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.4) Gecko/20100101 Firefox/4.0");
The only solution that worked for me (the JavaScript will be executed many times, but this is the only working solution for now):
@Override
public void onLoadResource(WebView view, String url) {
super.onLoadResource(view, url);
view.evaluateJavascript("document.querySelector('meta[name=\"viewport\"]').setAttribute('content', 'width=1024px, initial-scale=' + (document.documentElement.clientWidth / 1024));", null);
}
You can set a desktop UA string too:
webView.getSettings().setUserAgentString("Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36");
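The client in the question picks the user agent with url.contains("google"), which also matches URLs that merely mention "google" in a path or query string. A hedged sketch of making that decision on the host instead is below; the class UserAgentChooser and its method are hypothetical, and the desktop string is the one quoted above. Returning null is meant to be passed to setUserAgentString(null), which restores the default user agent.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper: chooses a user agent string from the URL's host.
final class UserAgentChooser {
    static final String DESKTOP_UA =
            "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
            + "Chrome/41.0.2228.0 Safari/537.36";

    // Returns null for Google hosts (use the default UA), the desktop UA otherwise.
    static String userAgentFor(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            host = null; // malformed URL: fall through to the desktop UA
        }
        boolean isGoogle = host != null
                && host.toLowerCase(Locale.ROOT).contains("google");
        return isGoogle ? null : DESKTOP_UA;
    }

    public static void main(String[] args) {
        System.out.println(userAgentFor("https://www.google.com/search?q=x"));
        System.out.println(userAgentFor("https://example.com/?ref=google"));
    }
}
```

As the answers note, a user agent alone is often not enough, so this only covers the part of the logic that selects the string.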

Android webView blinks

The WebView in my app (where I present some RSS news) flickers when I scroll, and sometimes while I am reading an article.
How can I fix it?
The problem is the same on my 4.1.1 device and in the emulator.
This is my WebView:
final WebView desc = (WebView) view.findViewById(R.id.desc);
desc.setWebViewClient(new WebViewClient() {
@Override
public void onPageFinished(WebView view, String url) {
// super.onPageFinished(view, url);
desc.loadUrl("javascript:(function() { "
+ "document.getElementsByTagName('img')[0].style.display = 'none'; "
+ "})()");
}
});
// Set webview properties
WebSettings ws = desc.getSettings();
ws.setSupportZoom(true);
// ws.setDisplayZoomControls(true);
ws.setLayoutAlgorithm(LayoutAlgorithm.SINGLE_COLUMN);
ws.setLightTouchEnabled(false);
ws.setPluginState(PluginState.ON);
ws.setJavaScriptEnabled(true);
ws.setUserAgentString("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.36 (KHTML, like Gecko) Chrome/13.0.766.0 Safari/534.36");
// ws.setLoadsImagesAutomatically(false);
// desc.requestFocus(View.FOCUS_DOWN);
desc.loadDataWithBaseURL("", DESC,
"text/html", "UTF-8", null);
Add android:hardwareAccelerated="false" to that activity.
I hope it helps.
Thanks,
Chaitanya.K
If you set android:hardwareAccelerated="false" on that activity, you had better not use animations in it; otherwise the animations will perform badly.
Alternatively, you can call webview.setLayerType(View.LAYER_TYPE_SOFTWARE, null);
This also disables hardware acceleration, but only for that view.

Webview not being seen as a mobile browser

In my WebView I am loading http://www.nsopw.gov/Core/OffenderSearchCriteria.aspx
When this is done via the Android browser, the mobile site is shown.
But in my app's WebView it is not seen as a mobile browser, so it is not redirected to the mobile version.
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
requestWindowFeature(Window.FEATURE_NO_TITLE);
setContentView(R.layout.buttons);
wb = new WebView(this);
wb.getSettings().setJavaScriptEnabled(true);
wb.setWebViewClient(new HelloWebViewClient());
wb.getSettings().setSupportZoom(true);
wb.getSettings().setBuiltInZoomControls(true);
wb.getSettings().setDomStorageEnabled(true);
String[] loading = getResources().getStringArray(R.array.array_loading);
Random r = new Random();
int rN = r.nextInt(12 - 1) + 1;
progressBar = ProgressDialog.show(Sex_Offenders.this, loading[rN],
"Loading...");
final String urlToLoad ="http://www.nsopw.gov/Core/OffenderSearchCriteria.aspx";
//final String urlToLoad = "http://m.familywatchdog.us/m_v2/msa.asp?es=&l=0&w=0&brtp=html&rstp=xlarge&imgtp=jpg&imgw=310&imgh=320";
wb.setWebViewClient(new HelloWebViewClient() {
public void onPageFinished(WebView view, String url) {
if (progressBar.isShowing()) {
progressBar.dismiss();
}
}
});
Context context = getApplicationContext();
if (Repeatables.isNetworkAvailable(context) == true) {
wb.getSettings().setCacheMode(WebSettings.LOAD_DEFAULT);
wb.loadUrl(urlToLoad);
} else {
wb.getSettings().setCacheMode(WebSettings.LOAD_CACHE_ELSE_NETWORK);
wb.loadUrl(urlToLoad);
Repeatables.NoConnectionAlert(this);
}
setContentView(wb);
Try forcing the user agent string yourself, as it appears that the WebView's user agent string is not identifying it as a mobile browser.
Use
webview.getSettings().setUserAgentString(...)
You can search for a user agent string; I used
Mozilla/5.0 (Linux; U; Android 2.2; en-us; Nexus One Build/FRF91) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1
and it worked with the link you provided in your question and loaded the mobile version.
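How a site decides that a client is mobile is up to the site. As an illustration only, one common heuristic is to look for the "Mobile" token in the user agent string. The sketch below assumes that heuristic (the real server may use something else entirely) and uses a hypothetical class name, MobileUaCheck.

```java
// Hypothetical illustration of a server-side "is this mobile?" heuristic.
final class MobileUaCheck {
    // Assumption: the site treats any UA containing the "Mobile" token as mobile.
    static boolean looksMobile(String userAgent) {
        return userAgent != null && userAgent.contains("Mobile");
    }

    public static void main(String[] args) {
        String nexusOne = "Mozilla/5.0 (Linux; U; Android 2.2; en-us; Nexus One Build/FRF91) "
                + "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1";
        String desktop = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 "
                + "(KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36";
        System.out.println(looksMobile(nexusOne));
        System.out.println(looksMobile(desktop));
    }
}
```

Logging the value of webview.getSettings().getUserAgentString() before overriding it shows what the site was actually receiving.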
