Extract specific table data using Jsoup - android

I'm trying to extract the football fixtures from this webpage. This is the code I currently use to extract the fixtures from the table on that page:
private class LoadFixtures extends AsyncTask<Void,Void,Void> {
String stringDT="",stringHome="",stringAway="";
String url = "http://www.bbc.com/sport/football/spanish-la-liga/fixtures";
String stringTime="";
@Override
protected Void doInBackground(Void... params) {
Document doc = null;
try {
doc = Jsoup.connect(url).timeout(0).get();
Elements matchDetails = doc.select("td.match-details");
Elements ele_hTeam = matchDetails.select("span.team-home.teams");
Elements ele_aTeam = doc.select("span.team-away.teams");
Elements ele_time = doc.select("td.kickoff");
int tsize = ele_hTeam.size();
for(int i=0;i<tsize;i++) {
stringTime+="\n\n"+ele_time.get(i).text();
stringHome+="\n\n"+ele_hTeam.get(i).text();
stringAway+="\n\n"+ele_aTeam.get(i).text();
}
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void aVoid) {
homeTeam.setText(stringHome);
awayTeam.setText(stringAway);
timeView.setText(stringTime);
super.onPostExecute(aVoid);
}
}
This code gives me the whole list of fixtures, but I only want the fixtures for a specific date. For example, say I want only the fixtures for Saturday 16th January 2016.

The code below does what you ask. The date to look up is supplied in a String variable. The code loops over each table on the page; each table contains some number of fixtures. If a table's caption contains the date you provided, the code enters that table and selects the home team and the away team for each match. Hope this helps!
String dateLookup= "16th January 2016";
String url = "http://www.bbc.com/sport/football/spanish-la-liga/fixtures";
try {
Document document = Jsoup.connect(url).timeout(0).get();
Elements tableElements = document.select("table.table-stats");
for (Element e : tableElements) {
if (e.select("caption").text().contains(dateLookup)) {
Elements matchElements = e.select("tr.preview");
for (Element match : matchElements) {
System.out.println("Home Team: " + match.select("span.team-home").text());
System.out.println("Away Team: " + match.select("span.team-away").text() + "\n");
}
}
}
} catch (IOException e) {
e.printStackTrace();
}
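The lookup string has to match the caption text as the page writes it. As a minimal sketch, assuming the captions keep the form seen in the question (for example "Saturday 16th January 2016"), the string could be built from a date instead of typed by hand. FixtureDate and its methods are hypothetical names, and java.time needs API level 26 (or desugaring) on Android.

```java
import java.time.LocalDate;
import java.time.format.TextStyle;
import java.util.Locale;

// Hypothetical helper, not part of Jsoup or the original answer.
final class FixtureDate {

    // Returns the day with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 11th.
    static String ordinal(int day) {
        if (day >= 11 && day <= 13) {
            return day + "th"; // 11th, 12th and 13th are irregular
        }
        switch (day % 10) {
            case 1: return day + "st";
            case 2: return day + "nd";
            case 3: return day + "rd";
            default: return day + "th";
        }
    }

    // Builds a string such as "16th January 2016" (assumed caption format).
    static String captionDate(LocalDate date) {
        return ordinal(date.getDayOfMonth()) + " "
                + date.getMonth().getDisplayName(TextStyle.FULL, Locale.ENGLISH) + " "
                + date.getYear();
    }
}
```

The result can then be passed as dateLookup, for example `String dateLookup = FixtureDate.captionDate(LocalDate.of(2016, 1, 16));`.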

Related

Unable to find comments from URL using Jsoup

I am working on Android and want to extract the comments from a web page using the jsoup library. This is what I have tried, but it does not return the comments. Can anyone help?
public void fun() {
Document doc = null;
try {
doc = Jsoup.connect("http://tribune.com.pk/story/1164751/federal-govt-dodged-chinese-govt-cpec/").timeout(10 * 1000).get();
} catch (IOException e) {
e.printStackTrace();
}
Elements pa = doc.getElementsByClass("span-12 last");
int count = 1;
for (Element iter : pa) {
System.out.println( iter.text());
count = count + 1;
}
}
You have two issues here:
Your program fails because the server expects a User-Agent string and returns a 403 error when none is sent.
The comments are located under the "li-comment" class.
This code works for me:
Document doc = null;
try {
doc = Jsoup.connect("http://tribune.com.pk/story/1164751/federal-govt-dodged-chinese-govt-cpec/").timeout(10 * 1000)
.userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0")
.get();
} catch (IOException e) {
e.printStackTrace();
}
Elements el = doc.getElementsByClass("li-comment");
for (Element e : el) {
System.out.println(e.text());
System.out.println("-----------------");
}
You should also handle the case where li-comment is empty or does not exist, which happens when there are no comments on the page.
On button click I used this:
public void fetchData(View v) {
Toast.makeText(getApplicationContext(),
"Data is fetching from The Hindu wait some time ",
Toast.LENGTH_LONG).show();
new Thread(new Runnable() {
@Override
public void run() {
try {
// get the Document object from the site. Enter the link of
// site you want to fetch
/*
* Document document = Jsoup.connect(
* "http://javalanguageprogramming.blogspot.in/") .get();
*/
Document document = Jsoup.connect(
"http://www.thehindu.com/").get();
title = document.text().toString();
// Get the title of blog using title tag
/* title = document.select("h1.title").text().toString(); */
// set the title of text view
// Get all the elements with h3 tag and has attribute
// a[href]
/*
* Elements elements = document.select("div.post-outer")
* .select("h3").select("a[href]"); int length =
* elements.size();
*/
Elements elements = document.select("div.fltrt")
.select("h3").select("a[href]");
int length = elements.size();
for (int i = 0; i < length; i++) {
// store each post heading in the string
posts += elements.get(i).text();
}
// Run this on ui thread because another thread cannot touch
// the views of main thread
runOnUiThread(new Runnable() {
@Override
public void run() {
// set both the text views
titleText.setText(title);
postText.setText(posts);
}
});
} catch (Exception e) {
e.printStackTrace();
}
}
}).start();
}

Get html from list of pages

I have a list of web pages (over 100) that I have to visit and collect data from.
I decided to save the HTML from all of them to one file, and then use Jsoup to find the interesting data.
The problem is that I do not know how to run 100 threads and save the responses into one file. Any ideas?
Maybe it's not a masterpiece, but it works, and I wanted to make it as simple as possible.
ArrayList<String> links = new ArrayList<>();
Elements myDiv;
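private void saveDetails() throws IOException {
// Save the result of the page that was just fetched (null if the fetch failed)
if (myDiv != null) {
saveFile(myDiv.toString());
myDiv = null;
}
repeat++;
// Only request the next page while the index is still inside the list
if (repeat < links.size()) {
textView.setText(String.valueOf(repeat));
getDetails(links.get(repeat));
} else {
textView.setText("finished");
}
}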
private void getDetails(String urlStr) {
final String detailsUrl = urlStr;
new Thread() {
@Override
public void run() {
Message msg = Message.obtain();
try {
Document doc = Jsoup.connect(detailsUrl).get();
myDiv = doc.select(".exhibitor-contact");
} catch (IOException e1) {
e1.printStackTrace();
}
detailsHandler.sendMessage(msg);
}
}.start();
}
private Handler detailsHandler = new Handler() {
public void handleMessage(Message msg) {
super.handleMessage(msg);
try {
saveDetails();
} catch (IOException e) {
e.printStackTrace();
}
}
};
You don't need to save all of them in a file and then process them. You can gather the information page by page. This is my suggestion:
List<String> urls = new ArrayList<>(); // add your 100 site URLs here
for (String url : urls) {
try {
Document doc = Jsoup.connect(url).get();
// now process doc (or doc.toString(), with a regular expression for example)
// and save the information you need
} catch (IOException e) {
e.printStackTrace(); // skip pages that fail to load
}
}
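If the pages should be downloaded in parallel rather than one by one, a small fixed-size thread pool is usually enough; 100 separate threads are not needed. The following is a rough sketch under assumptions: PageCollector, collect and the fetcher parameter are hypothetical names, and in the app the fetcher would wrap something like Jsoup.connect(url).get().html(). Note that java.util.function.Function needs API level 24 on Android.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: fetches many pages on a small pool and keeps the URL order.
final class PageCollector {

    static Map<String, String> collect(List<String> urls,
                                       Function<String, String> fetcher,
                                       int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per URL; the pool limits how many run at once.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                futures.add(pool.submit(task));
            }
            // Collect results in the original order; a failed page maps to null.
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    results.put(urls.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    results.put(urls.get(i), null);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    results.put(urls.get(i), null);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the results come back as one ordered map, they can be processed directly or written to a single file from one thread, which avoids concurrent writes.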

Getting data from xml in android

I am working on an Android application. My app receives an XML response from the server and stores it in a string. Now I need to read each value from that XML and display it in a dropdown. How can I do that? Any help is appreciated.
My XML data:
<?xml version="1.0" encoding="utf-8"?>
<root>
<status>first</status>
<description>very good</description>
<Firstnames>
<name>CoderzHeaven</name>
<name>Android</name>
<name>iphone</name>
</Firstnames>
<SecondNames>
<name>Google</name>
<name>Android</name>
</SecondNames>
</root>
I am getting the XML above from the server and need to display it in a ListView. How can I read those values using an XML parser? I tried several examples but none worked for me.
You will need to create an extra class and parameterize your adapter with objects of this class. An example data model would look like:
public class DataClass {
private String status, description;
private ArrayList<String> fnames, lnames;
public DataClass() {
fnames = new ArrayList<String>();
lnames = new ArrayList<String>();
}
public String getStatus() {
return status;
}
public void setStatus(String status) {
this.status = status;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
public ArrayList<String> getFnames() {
return fnames;
}
public ArrayList<String> getLnames() {
return lnames;
}
}
As for the XML parser, there are plenty of examples available, so searching will serve you well. Just to give you a starting point: tutorials one, two, three, four.
If you run into problems, post your efforts, the code that didn't work, and what you have tried. Then you'll get help; otherwise nobody on SO is going to write the code for you. https://stackoverflow.com/help/how-to-ask
Here's how you can do it if the XML is inside your app's assets folder.
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
InputStream input = null;
try {
input = getApplicationContext().getAssets().open("data.xml");
} catch (IOException e) {
e.printStackTrace();
}
DocumentBuilder builder = null;
try {
builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
} catch (ParserConfigurationException e) {
e.printStackTrace();
}
Document doc = null;
if (builder == null) {
Log.e("TAG", "Builder is empty.");
return;
}
try {
doc = builder.parse(input);
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
if (doc == null) {
Log.e("TAG", "Document is empty.");
return;
}
// Get Firstnames element
Element firstNames = (Element) doc.getElementsByTagName("Firstnames").item(0);
// Get name nodes from Firstnames
NodeList nameNodes = firstNames.getElementsByTagName("name");
// Get count of names inside of Firstnames
int cChildren = nameNodes.getLength();
List<String> names = new ArrayList<String>(cChildren);
for (int i=0; i<cChildren; i++) {
names.add(nameNodes.item(i).getTextContent());
Log.d("TAG","Name: "+names.get(i));
}
// Do same with SecondNames
}
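Since the question says the XML already sits in a String rather than in the assets folder, the same DOM approach can read it through a StringReader. This is a minimal sketch, assuming well-formed XML with the structure shown in the question; NameXmlParser and namesUnder are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the <name> values below a given parent element.
final class NameXmlParser {

    static List<String> namesUnder(String xml, String parentTag) {
        List<String> names = new ArrayList<>();
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Parse directly from the in-memory string
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList parents = doc.getElementsByTagName(parentTag);
            if (parents.getLength() == 0) {
                return names; // parent element missing: return an empty list
            }
            NodeList nameNodes = ((Element) parents.item(0)).getElementsByTagName("name");
            for (int i = 0; i < nameNodes.getLength(); i++) {
                names.add(nameNodes.item(i).getTextContent());
            }
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }
}
```

The returned list can be handed to an ArrayAdapter for the dropdown or ListView; calling the method with "SecondNames" gives the second group.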

Only first word of two strings gets added to db

When trying to add words to a database via php, only the first word of both strings gets added.
I send the text via this code:
public void sendTextToDB() {
valcom = editText1.getText().toString();
valnm = editText2.getText().toString();
t = new Thread() {
public void run() {
try {
url = new URL("http://10.0.2.2/HB/hikebuddy.php?function=setcomm&comment="+valcom+"&name="+valnm);
h = (HttpURLConnection)url.openConnection();
if( h.getResponseCode() == HttpURLConnection.HTTP_OK){
is = h.getInputStream();
}else{
is = h.getErrorStream();
}
h.disconnect();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
Log.d("Test", "CONNECTION FAILED 1");
}
}
};
t.start();
}
When tested with spaces and commas etc. in a browser, the php function adds all text.
The strings also return the full value when inserted into a dialog.
How do I fix this?
Thank you.
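You need to URL-encode valcom and valnm when putting them into the URL. A raw space is not valid inside a URL, so the server most likely sees only the text up to the first space.
See java.net.URLEncoder.encode: http://developer.android.com/reference/java/net/URLEncoder.html
// The one-argument encode(String) is deprecated; pass the charset explicitly
url = new URL("http://10.0.2.2/HB/hikebuddy.php?function=setcomm&comment="
+ URLEncoder.encode(valcom, "UTF-8")
+ "&name=" + URLEncoder.encode(valnm, "UTF-8"));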
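To see what the encoding does to the values, the URL building can be pulled into a small helper and checked on its own. This is only a sketch: CommentUrlBuilder and buildUrl are hypothetical names, and the base URL is the one from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the request URL with both values encoded.
final class CommentUrlBuilder {

    static String buildUrl(String comment, String name) {
        try {
            return "http://10.0.2.2/HB/hikebuddy.php?function=setcomm"
                    + "&comment=" + URLEncoder.encode(comment, "UTF-8")
                    + "&name=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this should never happen
            throw new IllegalStateException(e);
        }
    }
}
```

For example, `buildUrl("nice hike, thanks", "John Doe")` yields a query of `comment=nice+hike%2C+thanks&name=John+Doe`: spaces become `+` and the comma becomes `%2C`, so the whole text reaches the PHP script.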

Jsoup in Android, extracting the "middle word" in text from an element after title=

I'm making an Android app and I need help targeting a specific piece of text in an Element.
This is where I'm at:
Elements bookietemp = item.getElementsByClass("name");
String bookie1 = bookietemp.select("a[title]").first().text(); // This doesn't work
Log.d("test", bookie1);
I have tried the above, but it doesn't work or return anything.
"bookietemp" will contain the following markup. From it I want to extract only "Toto" or "Tobet" (the second word, i.e. the word after "Open " in the title attribute).
This is the value of "bookietemp":
<a rel="nofollow" class="name" title="Open Toto website!" target="_blank" href="/bookmakers/toto/web/"><span class="BK b6"> </span></a>
<a rel="nofollow" class="name" title="Open Tobet website!" target="_blank" href="/bookmakers/tobet/web/"><span class="BK b36"> </span></a>
And my full code is here:
public class AsyncTaskActivity extends Activity {
Button btn_start;
TextView state;
TextView output;
ProgressDialog dialog;
Document doc;
String test;
Element test2;
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_async_task);
btn_start = (Button) findViewById(R.id.btn_start);
state = (TextView) findViewById(R.id.state);
output = (TextView) findViewById(R.id.output);
btn_start.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View v) {
btn_start.setEnabled(false);
new ShowDialogAsyncTask().execute();
}
});
}
private class ShowDialogAsyncTask extends AsyncTask<String, Void, ArrayList<String>> {
ArrayList<String> arr_linkText=new ArrayList<String>();
@Override
protected void onPreExecute() {
// update the UI immediately after the task is executed
super.onPreExecute();
Toast.makeText(AsyncTaskActivity.this, "Invoke onPreExecute()",
Toast.LENGTH_SHORT).show();
output.setText("Please Wait!");
}
@Override
protected ArrayList<String> doInBackground(String... String) {
// String linkText = "";
try {
doc = Jsoup.connect("http://www.bmbets.com/sure-bets/").get();
// linkText = el.attr("href");
// arr_linkText.add(linkText);
Elements widgets = doc.getElementsByClass("surebets-widget");
for (Element widget : widgets){
//Log.d("test", el.toString());
Elements items = widget.getElementsByClass("item"); // This gives you about 8 items.
for (Element item : items)
{
Elements matchtemp = item.getElementsByClass("odd");
String matchname = matchtemp.select("a[title]").first().text();
Log.d("test", matchname);
//Here is the problem
Elements bookietemp = item.getElementsByClass("name");
String bookie1 = bookietemp.select("a[title]").first().text();
Log.d("test", bookie1);
Elements tipvals = item.getElementsByClass("tip-val");
if (tipvals.size() == 2)
{
Log.d("test", "Head to Head kamp");
Element tipval1 = tipvals.get(0);
String oddshome = tipval1.text().trim();
Element tipval2 = tipvals.get(1);
String oddsaway = tipval2.text().trim();
Log.d("test", oddshome + " " + oddsaway);
}
else
{
Log.d("test", "3 way");
Element tipval1 = tipvals.get(0);
String oddshome = tipval1.text().trim();
Element tipval2 = tipvals.get(1);
String oddsdraw = tipval2.text().trim();
Element tipval3 = tipvals.get(2);
String oddsaway = tipval3.text().trim();
Log.d("test", oddshome + " " + oddsdraw + " " + oddsaway);
}
}
// arr_linkText.add(linkText);
}
// return test2;
} catch (IOException e) {
e.printStackTrace();
}
return arr_linkText;
}
@Override
protected void onProgressUpdate(Void... values) {
super.onProgressUpdate(values);
// // progressBar.setProgress(values[0]);
// // txt_percentage.setText("downloading " +values[0]+"%");
}
@Override
protected void onPostExecute(ArrayList<String> result) {
// super.onPostExecute(result);
Toast.makeText(AsyncTaskActivity.this, "Invoke onPostExecute()",
Toast.LENGTH_SHORT).show();
state.setText("Done!");
//output.setText(result);
for (String temp_result : result){
output.append (temp_result +"\n");
}
btn_start.setEnabled(true);
}
}
Note that I have something similar for extracting another text, and it works:
Elements matchtemp = item.getElementsByClass("odd");
String matchname = matchtemp.select("a[title]").first().text();
Log.d("test", matchname);
I finally figured it out myself using:
Elements bookietemp = item.getElementsByClass("name");
String bookie1 = bookietemp.attr("title"); // Gets the full title, e.g. "Open Toto website!"
String arr[] = bookie1.split(" ", 3); // Splits the title into at most 3 parts
String theRest = arr[1]; // Selects the second word
Edit:
If anyone has a simpler way, or a way to combine these lines, I'm still interested.
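One possible way to combine the split and index steps is a regular expression that captures whatever sits between "Open " and " website!". This is a sketch, assuming every title follows the "Open <name> website!" form shown in the question; BookieTitle and nameFromTitle are hypothetical names. Unlike split, it also copes with names that contain a space and returns an empty string instead of throwing when the title has another shape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the bookmaker name out of the title attribute.
final class BookieTitle {

    // Assumed title format: "Open <name> website!"
    private static final Pattern TITLE = Pattern.compile("^Open\\s+(.+?)\\s+website!$");

    static String nameFromTitle(String title) {
        Matcher m = TITLE.matcher(title.trim());
        return m.matches() ? m.group(1) : ""; // empty string when the format differs
    }
}
```

With the Jsoup code above, it would be called as `BookieTitle.nameFromTitle(bookietemp.attr("title"))`.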
