Want to get href and title from the table using Jsoup - android

I want to parse an HTML table using Jsoup, but I am having trouble getting the required data from it. I want to get href and title from each row of this table but I am getting the whole data from the table.
<table class="FullWidth gv" cellspacing="0" rules="all" border="1" id="ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION" style="border-collapse:collapse;">
<tr>
<th scope="col">S#</th>
<th scope="col">Code</th>
<th scope="col">Registered Course Title</th>
<th scope="col">Credits</th>
<th scope="col">Offered Course Title</th>
<th scope="col">Class</th>
<th scope="col">Teacher</th>
<th scope="col">Fee</th>
<th scope="col"> </th>
</tr>
<tr>
<td class="Center">
1</td>
<td class="NoWrap">GSC 220</td>
<td class="Width33">Complex Variables & Transforms</td>
<td class="Center">3</td>
<td class="Width33">Complex Variables & Transforms</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">AMMAR AJMAL</td>
<td>YES</td>
<td>
<a title="Complex Variables & Transforms" class="a" href="Attendance.aspx?COID=21480" target="_blank">Attendance</a>
</td>
</tr>
<tr class="Alternating">
<td class="Center">
2</td>
<td class="NoWrap">CSC 221</td>
<td class="Width33">Data Structure and Algorithm</td>
<td class="Center">3</td>
<td class="Width33">Data Structure and Algorithm</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">ABU BAKAR</td>
<td>YES</td>
<td>
<a title="Data Structure and Algorithm" class="a" href="Attendance.aspx?COID=21478" target="_blank">Attendance</a>
</td>
</tr>
<tr>
<td class="Center">
3</td>
<td class="NoWrap">CSL 221</td>
<td class="Width33">Data Structures and Algorithm Lab</td>
<td class="Center">1</td>
<td class="Width33">Data Structures and Algorithm Lab</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">ABU BAKAR</td>
<td>YES</td>
<td>
<a title="Data Structures and Algorithm Lab" class="a" href="Attendance.aspx?COID=21479" target="_blank">Attendance</a>
</td>
</tr>
<tr class="Alternating">
<td class="Center">
4</td>
<td class="NoWrap">CSC 220</td>
<td class="Width33">Database Management System</td>
<td class="Center">3</td>
<td class="Width33">Database Management System</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">BUSHRA SABIR</td>
<td>YES</td>
<td>
<a title="Database Management System" class="a" href="Attendance.aspx?COID=21481" target="_blank">Attendance</a>
</td>
</tr>
<tr>
<td class="Center">
5</td>
<td class="NoWrap">CSL 220</td>
<td class="Width33">Database Management System Lab</td>
<td class="Center">1</td>
<td class="Width33">Database Management System Lab</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">BUSHRA SABIR</td>
<td>YES</td>
<td>
<a title="Database Management System Lab" class="a" href="Attendance.aspx?COID=21482" target="_blank">Attendance</a>
</td>
</tr>
<tr class="Alternating">
<td class="Center">
6</td>
<td class="NoWrap">CSC 320</td>
<td class="Width33">Operating System</td>
<td class="Center">3</td>
<td class="Width33">Operating System</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">BUSHRA SABIR</td>
<td>YES</td>
<td>
<a title="Operating System" class="a" href="Attendance.aspx?COID=21474" target="_blank">Attendance</a>
</td>
</tr>
<tr>
<td class="Center">
7</td>
<td class="NoWrap">CSL 320</td>
<td class="Width33">Operating System Lab</td>
<td class="Center">1</td>
<td class="Width33">Operating System Lab</td>
<td class="NoWrap">BCE-4 (A) MORNING</td>
<td class="Width33">BUSHRA SABIR</td>
<td>YES</td>
<td>
<a title="Operating System Lab" class="a" href="Attendance.aspx?COID=21475" target="_blank">Attendance</a>
</td>
</tr>
<tr class="gvFooter">
<td> </td>
<td> </td>
<td> </td>
<td class="Center">15</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
This is what I have tried:
Document doce = Jsoup.connect(urlofthewebsite)
.cookies(hashMap)
.get();
Element tableheader = doce.select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION}").first();
for(Element element : tableheader.children())
{
System.out.println(element.text());
}

First of all, your example has a typo in
select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION}")
since you are ending the attribute selector with } instead of ].
You can avoid such errors by using #identifier instead of [id=identifier] and .className instead of [class=className].
Also, by calling
.select("table[id=ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION]")
.first();
you are not getting the first row of the table (such as the header row) but the first table with this id, since tables with that id are what your selector finds. Iterating over its children() then visits the tbody element that the parser inserts around the rows, so text() prints the whole table at once.
If you want the headers, simply select the th tags, like
Element table = doce.select("table#ctl00_Body_STUDENT_SSS_ctrl0_COURSE_REGISTRATION").first();
for(Element column : table.select("th")) {
System.out.println(column.text());
}
Now based on
I want to get href and title from each row of this table but I am getting the whole data from the table.
you may want to use something like
for (Element link : table.select("a")){
System.out.println(link.attr("title")+" -> "+link.attr("href"));
//you can also use abs:href to get absolute path
}
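If you need the values row by row rather than for all links at once, a minimal sketch could look like the following. It assumes a simplified table with the hypothetical id courses and a made-up base URL; when the document comes from Jsoup.connect(...).get(), the base URI is the page URL, so abs:href resolves against that instead. The class and method names are only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class AttendanceLinks {
    // Returns one "title -> absolute href" string per table row that contains a link.
    static List<String> extract(String html, String baseUri) {
        // The base URI lets Jsoup resolve relative hrefs when abs:href is requested.
        Document doc = Jsoup.parse(html, baseUri);
        List<String> result = new ArrayList<>();
        // "courses" is a hypothetical id; use the real table id in your page.
        for (Element row : doc.select("table#courses tr")) {
            Element link = row.selectFirst("a[href]");
            if (link == null) {
                continue; // header and footer rows have no link
            }
            result.add(link.attr("title") + " -> " + link.attr("abs:href"));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<table id=\"courses\">"
                + "<tr><th>Code</th><th>Link</th></tr>"
                + "<tr><td>CSC 221</td><td><a title=\"Data Structure and Algorithm\""
                + " href=\"Attendance.aspx?COID=21478\">Attendance</a></td></tr>"
                + "</table>";
        // Assumed page URL, used only to resolve the relative href.
        for (String line : extract(html, "http://example.com/portal/Courses.aspx")) {
            System.out.println(line);
        }
    }
}
```

Skipping rows where selectFirst returns null keeps the header and footer rows from causing a NullPointerException.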

Related

Parse specific table to array of strings Jsoup

I have this rather complex HTML that I want to parse using Jsoup. I have tried several things, but none of them work. Basically, I want to get the second table, read all its rows, and append them to a string.
What I have tried
val document = Jsoup.parse(it.data)
val tableElements = document.select("table:eq(2) > tbody")
for (element in tableElements) {
val data = element.select("td")
try {
Timber.i("${data[0].select("small").text()} : ${data[1].select("small").text()}")
} catch (e: Exception) {
}
}
What part I want to extract
<table>
<tbody>
<tr class="">
<td class="odsazena" align="left"><small>User's identification number: </small></td>
<td class="odsazena" align="left"><small>34565</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Study programme: </small></td>
<td class="odsazena" align="left"><small>Informatics</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Type of study: </small></td>
<td class="odsazena" align="left"><small>Bachelor</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Form of study: </small></td>
<td class="odsazena" align="left"><small>full-time, attendance method</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Standard length of study: </small></td>
<td class="odsazena" align="left"><small>3</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits required to complete your study: </small></td>
<td class="odsazena" align="left"><small>180</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits enrolled for the whole study: </small></td>
<td class="odsazena" align="left"><small>120</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Number of credits obtained during your whole course of study: </small></td>
<td class="odsazena" align="left"><small>90</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Your prospective academic degree: </small></td>
<td class="odsazena" align="left"><small>Bc.</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Beginning of study: </small></td>
<td class="odsazena" align="left"><small>09/01/2017</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Resolution of admission: </small></td>
<td class="odsazena" align="left"><small>Admitted without the entrance exam</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Progress of study: </small></td>
<td class="odsazena" align="left"><small>enrolled</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Mode of completion: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Current financing: </small></td>
<td class="odsazena" align="left"><small>study fully financed from ME SK</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Final thesis topic: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Final thesis supervisor: </small></td>
<td class="odsazena" align="left"><small><i>not stated</i></small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Language of study: </small></td>
<td class="odsazena" align="left"><small>Slovak</small></td>
</tr>
<tr class="">
<td class="odsazena" align="left"><small>Card number:</small></td>
<td class="odsazena" align="left"><small>123456</small></td>
</tr>
</tbody>
</table>
And now, what exactly is the problem? With what I have tried, the code does not print what I want; in its current state it simply skips the for loop. What I wanted to achieve is to get the second table with "table:eq(2)" and then the elements inside its "tbody".
I think you should select the "tr" elements as well and iterate over them, as it looks like you are iterating over "tbody". This is a solution in Java, as I don't know Kotlin syntax, but maybe it helps:
Elements tableElements = doc.select("table").get(1).select("tbody").select("tr");
for (Element element : tableElements) {
Elements data = element.select("td");
System.out.println(data.select("small").first().text() +" : "
+ data.select("small").last().text());
}
This is Java code to do what you want.
You can apply a selector to elements.
@Test
public void selectSecondTable() {
String html = "" +
"<table></table>" +
"<table>\n" +
" <tbody>\n" +
" <tr class=\"\">\n" +
" <td class=\"odsazena\" align=\"left\"><small>User's identification number: </small></td>\n" +
" <td class=\"odsazena\" align=\"left\"><small>34565</small></td>\n" +
" </tr>\n" +
" </tbody>\n" +
"</table>";
Document doc = Jsoup.parse(html);
//select tr from the table at sibling index 1 (here, the second table in the document):
for (Element e : doc.select("table:eq(1) tr")) {
//for each table row select text from small tag and print to console:
System.out.println(e.select("small").text());
}
}
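One caveat with the :eq(n) selectors above: in Jsoup, :eq(n) matches on an element's zero-based index among its siblings, not on its position among all tables in the document. So table:eq(1) only finds the second table when the tables sit next to each other under the same parent with nothing in front of them. A sketch that instead picks the second table by its position in the select("table") result is shown below; the class name, method name, and sample HTML are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class SecondTable {
    // Returns "label value" for every two-cell row of the second table in the document.
    static List<String> rows(String html) {
        Document doc = Jsoup.parse(html);
        Elements tables = doc.select("table"); // all tables, in document order
        List<String> out = new ArrayList<>();
        if (tables.size() < 2) {
            return out; // there is no second table
        }
        for (Element row : tables.get(1).select("tr")) {
            Elements cells = row.select("td");
            if (cells.size() < 2) {
                continue; // skip rows that do not have a label and a value
            }
            out.add(cells.get(0).text() + " " + cells.get(1).text());
        }
        return out;
    }

    public static void main(String[] args) {
        // The first table is at sibling index 1 (after the <p>), the second at index 0,
        // so "table:eq(1)" would pick the wrong one here.
        String html = "<div><p>intro</p><table><tr><td>a</td><td>b</td></tr></table></div>"
                + "<div><table><tr><td><small>Study programme: </small></td>"
                + "<td><small>Informatics</small></td></tr></table></div>";
        for (String line : rows(html)) {
            System.out.println(line);
        }
    }
}
```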

How to login by filling a form in a website using Android

I'm trying to log on to a website that has a form to which you should provide user-name and password, check a box, and press a login button. I tried all kinds of httpClient POST messages, but it seems that it is not working. Can anyone assist and point to an example of skeleton of android Java way to login? Here is the form from the html page:
<form name="loginForm" method="post" action="/login.do">
<table border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="10px"> </td>
<td><label class="formLabel" for="loginID">Username</label></td>
</tr>
<tr>
<td> </td>
<td><input type="text" name="username" value="" class="formTextField"></td>
</tr>
<tr>
<td> </td>
<td><label class="formLabel" for="password"> Password</label></td>
</tr>
<tr>
<td> </td>
<td><input type="password" name="password" value="" class="formTextField"></td>
</tr>
<tr>
<td> </td>
<td> </td>
</tr>
<tr>
<td> </td>
<td><input type="checkbox" name="agreement" value="on" class="formTextField">
I agree with <div>
<b>Terms and Conditions</b></div>
</td>
</tr>
</table>
<p><input type="submit" value="Login" class="FPFormFieldB"></p>
<p>Have you forgotten the password?</p>
<p>New user registration</p>
</form>
I managed to login using JSOUP.
The key is that you need to do a GET first, then a POST with the cookies (which include the session ID and other data).
Here is the code that worked for me, hopefully it will assist others:
import org.jsoup.Jsoup;
import org.jsoup.Connection;
import org.jsoup.nodes.Document;
public class Webbing {
public static void Open() throws Exception {
Connection.Response loginForm = Jsoup.connect("http://your website")
.method(Connection.Method.GET)
.execute();
Document document = Jsoup.connect("http://your website")
.data("username", "XXX")
.data("password", "YYY")
.data("agreement", "on")
.timeout(5000)
.cookies(loginForm.cookies())
.post();
String url = "http://a page you want to load after login";
Document fpl = Jsoup.connect(url)
.timeout(5000)
.cookies(loginForm.cookies())
.get();
String body = fpl.body().toString();
// ExtFile is the author's own helper class (not shown here) that writes the page to a file
ExtFile.write(body);
}
}

How can i get element from <td> in html with jsoup Android

How can I get the bolded element "THIS" from the HTML using Jsoup? My problem is that I don't know how to reach this element, because I first need to check that it is in the <tr> labelled "Ulica". What do I need to put in document.select(...)? Any ideas? Thanks.
<table class="InfoTable">
<tr>
<td class="Name">Ulica:</td>
<td class="Value"><span id="ctl00_RightContentPlaceholder_lbAregStreet">**THIS**</span></td>
</tr>
<tr>
<td class="Name">Mesto:</td>
<td class="Value"><span id="ctl00_RightContentPlaceholder_lbAregCity">XXXXX</span></td>
</tr>
<tr>
<td class="Name">PSČ:</td>
<td class="Value"><span id="ctl00_RightContentPlaceholder_lbAregZip">XXXX</span></td>
</tr>
<tr>
<td class="Name">Štát:</td>
<td class="Value"><span id="ctl00_RightContentPlaceholder_lbAregCountry">XXXXX</span></td>
</tr>
</table>
You can put all that into a single selector
Example:
// html is your posted html code here, you can connect to a website too.
final String html = ...
Document doc = Jsoup.parse(html); // Parse into document
// Select the element and print it
for( Element element : doc.select("td:contains(Ulica:) ~ td") )
{
System.out.println(element);
}
Explanation:
td:contains(Ulica:) ~ td: Selects the td element containing the text Ulica:, then matches every following sibling element that is a td. Each row here has only one such sibling, so a single cell is returned.
Output:
<td class="Value"><span id="ctl00_RightContentPlaceholder_lbAregStreet">**THIS**</span></td>
Now you can get the values you need from that element.
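As a rough sketch of that last step, the helper below returns the text of the value cell for a given label. It uses + (adjacent sibling) instead of ~ so that only the cell directly after the label can match; the class name, method name, and the sample values are assumptions for illustration, and a label containing parentheses would need escaping before being placed inside :contains(...).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class InfoTableValue {
    // Returns the text of the Value cell that directly follows the Name cell
    // containing the given label, or null if there is no such cell.
    static String valueFor(String html, String label) {
        Document doc = Jsoup.parse(html);
        // "+" matches only the immediately following sibling element.
        Element cell = doc.selectFirst("td.Name:contains(" + label + ") + td.Value");
        return cell == null ? null : cell.text();
    }

    public static void main(String[] args) {
        // Sample values are made up; the structure follows the table in the question.
        String html = "<table class=\"InfoTable\">"
                + "<tr><td class=\"Name\">Ulica:</td>"
                + "<td class=\"Value\"><span>Main Street</span></td></tr>"
                + "<tr><td class=\"Name\">Mesto:</td>"
                + "<td class=\"Value\"><span>Town</span></td></tr>"
                + "</table>";
        System.out.println(valueFor(html, "Ulica:"));
    }
}
```

Since the span in the question has its own id, selecting it directly with #ctl00_RightContentPlaceholder_lbAregStreet would also work when that id is stable.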
Take a look at this; it is a nice way to do HTML parsing in Java.
String html="<table class=\"InfoTable\"><tr><td class=\"Name\">Ulica:</td> <td class=\"Value\"><span id=\"ctl00_RightContentPlaceholder_lbAregStreet\">**THIS**</span></td></tr><tr><td class=\"Name\">Mesto:</td><td class=\"Value\"><span id=\"ctl00_RightContentPlaceholder_lbAregCity\">XXXXX</span></td></tr></table>";
org.jsoup.nodes.Document doc = Jsoup.parse(html);
Iterator<Element> productList = doc.select("table[class=InfoTable]").iterator();
while (productList.hasNext()) {
//Do some processing
Element descLi = productList.next().select( "td:eq(1)").first();
String rr = descLi.text();
Log.d("TESTTT",rr );
}

Parse HTML table using JSOUP and display it to listview

I am new to Android programming. I need to get values from HTML and display them in a list.
Here is the link: http://www.hak.hr/info/cijene-goriva/
So I need the values (10,41 and 10,51).
<div id="div_eurosuper95">
<table class="nowrapper fuel_segmented">
<thead>
<tr>
<th>
Gorivo
</th>
<th>
Cijena (kn)
</th>
</tr>
</thead>
<tbody>
<tr>
<td class="fuel_name"><span class="vendorName">Tifon</span></br>euroSUPER 95 BS</td>
<td class="fuel_segmented">10,41</td>
</tr>
<tr>
<td class="fuel_name"><span class="vendorName">Tifon</span></br>EUROSUPER 95
BS CLASS</td>
<td class="fuel_segmented">10,51</td>
</tr>
<tr>
<td class="fuel_name"><span class="vendorName">Crodux derivati</span></br>EUROSUPER 95 BS</td>
<td class="fuel_segmented">10,41</td>
</tr>
<tr>
<td class="fuel_name"><span class="vendorName">AdriaOil</span></br>Euro Super 95 BS TOP</td>
<td class="fuel_segmented">10,51</td>
</tr>
</tbody>
</table>
</div>
You can use a Jsoup selector to select all the <td> tags that are of class fuel_segmented.
Document doc = Jsoup.parse(html);
Elements fuel = doc.select("td.fuel_segmented");
This is basic CSS selector syntax, where td specifies the tag and the . specifies that what follows is a class. If it were a specific td with an id, you could specify it as td#fuel_segmented.
This will return a collection of Element objects, represented by an Elements object.
To make it a bit easier to see what is what, you can loop through the elements and display the corresponding fuel name.
Elements fuel = doc.select("td.fuel_segmented");
for (Element element : fuel) {
System.out.println(element.previousElementSibling().text()
+ ": " + element.text());
}
which will output
Tifon euroSUPER 95 BS: 10,41
Tifon EUROSUPER 95 BS CLASS: 10,51
Crodux derivati EUROSUPER 95 BS: 10,41
AdriaOil Euro Super 95 BS TOP: 10,51
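For a list view, the same loop can collect the lines into a List<String> instead of printing them. This is only a sketch: the class and method names are made up, and the sample HTML is a shortened version of the table in the question.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class FuelPrices {
    // Returns one "fuel name: price" line per price cell in the given HTML.
    static List<String> lines(String html) {
        Document doc = Jsoup.parse(html);
        List<String> lines = new ArrayList<>();
        for (Element price : doc.select("td.fuel_segmented")) {
            Element name = price.previousElementSibling();
            if (name == null) {
                continue; // price cell without a name cell in front of it
            }
            lines.add(name.text() + ": " + price.text());
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<table class=\"nowrapper fuel_segmented\"><tbody><tr>"
                + "<td class=\"fuel_name\"><span class=\"vendorName\">Tifon</span> euroSUPER 95 BS</td>"
                + "<td class=\"fuel_segmented\">10,41</td>"
                + "</tr></tbody></table>";
        for (String line : lines(html)) {
            System.out.println(line);
        }
    }
}
```

On Android, the returned list could then be handed to something like an ArrayAdapter<String> that backs the ListView; that wiring is not shown here.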
I suggest that you read more about how to use selectors in Jsoup to parse the data that you need; the Jsoup cookbook has a section on this.
To display your data in a ListView, there are good tutorials that explain how it works.
I don't know where you got these prices from; the Jsoup site has all the necessary "Cookbooks", with examples for parsing an HTML document.

Create Table from HTML code

<table style="width: 560px; border: 2px solid #fee3cc; font-size: 1em;" rules="all" border="1" cellpadding="5" cellspacing="0">
<tbody>
<tr>
<td>
<p>Bharatiya Jana Sangh</p>
</td>
<td>
<p>1951 - 1977</p>
</td>
</tr>
<tr>
<td>
<p>Janata Party</p>
</td>
<td>
<p>1977 - 1979</p>
</td>
</tr>
<tr>
<td>
<p>Bharatiya Janata Party</p>
</td>
<td>
<p>1980</p>
</td>
</tr>
</tbody>
</table>
I have the HTML code of the table above, and I want to show it directly as a table in my layout. How can I do this?
In your activity:
WebView webview = new WebView(this);
setContentView(webview);
String yourHtml = "<html><body><table>...</table></body></html>";
webview.loadData(yourHtml , "text/html", "utf-8");
