HTML parsing using Jsoup selector combinations in Android

I want to extract the values that follow <dt>Seeders:</dt> and <dt>Leechers:</dt> from an HTML page using Jsoup.
The full HTML is below.
<div id="details">
<dl class="col1">
<dt>Type:</dt>
<dd>Audio > Music</dd>
<dt>Files:</dt>
<dd><a href="/torrent/8682317/" title="Files" onclick="
if (filelist < 1) {
new Ajax.Updater('filelistContainer', '/ajax_details_filelist.php', {method: 'get', parameters: 'id=8682317'});
filelist=1;
}; toggleFilelist(); return false;">28</a></dd>
<dt>Size:</dt>
<dd>222.65 MiB (233468815 Bytes)</dd>
<br />
<dt>Tag(s):</dt>
<dd>markus schulz dakota things trance armada 2011 inspiron </dd>
<br />
<dt>Uploaded:</dt>
<dd>2013-07-13 15:30:25 GMT</dd>
<dt>By:</dt>
<dd>
-inspiron- <img src="/static/img/vip.gif" alt="VIP" title="VIP" style="width:11px;" border='0' /></dd>
<br />
<dt>Seeders:</dt>
<dd>16</dd>
<dt>Leechers:</dt>
<dd>1</dd>
<dt>Comments</dt>
<dd><span id="NumComments">0</span>
</dd>
<br />
<dt>Info Hash:</dt><dd> </dd>
01DD6B7325C3DB5F0DF5BBE510FD3FD9738D1C88 </dl>
<div class="torpicture">
<img src="//image.bayimg.com/345b5b11734bb9973863359cc52929f3ddc45205.jpg" title="picture" alt="picture" />
</div>
<dl class="col2">
</dl>
<div id="CommentDiv" style="display:none;">
<form method="post" id="commentsform" name="commentsform" onsubmit="new Ajax.Updater('NumComments', '/ajax_post_comment.php', {evalScripts:true, asynchronous:true, parameters:Form.serialize(this)}); return false;" action="/ajax_post_comment.php">
<p class="info">
<textarea name="add_comment" id="add_comment" rows="8" cols="50"></textarea><br/>
<input type="hidden" name="id" value="8682317"/>
<input type="submit" value="Submit" /><input type="button" value="Hide" onclick="document.getElementById('CommentDiv').style.display = 'none'" />
</p>
</form>
</div>
<br/>
<br/>
<div id="social">
</div>
<iframe src="http://cdn1.adexprt.com/dl/dl.php?b=bar&r=75&n=Markus_Schulz_-_Global_DJ_Broadcast_%282013-07-11%29_%28Inspiron%29&m=magnet%3A%3Fxt%3Durn%3Abtih%3A01dd6b7325c3db5f0df5bbe510fd3fd9738d1c88%26dn%3DMarkus%2BSchulz%2B-%2BGlobal%2BDJ%2BBroadcast%2B%25282013-07-11%2529%2B%2528Inspiron%2529%26tr%3Dudp%253A%252F%252Ftracker.openbittorrent.com%253A80%26tr%3Dudp%253A%252F%252Ftracker.publicbt.com%253A80%26tr%3Dudp%253A%252F%252Ftracker.istole.it%253A6969%26tr%3Dudp%253A%252F%252Ftracker.ccc.de%253A80%26tr%3Dudp%253A%252F%252Fopen.demonii.com%253A1337" width="622" height="51" frameborder="0" scrolling="no"></iframe>
<br /><br /> <div class="download">
<a style='background-image: url("/static/img/icons/icon-magnet.gif");' href="magnet:?xt=urn:btih:01dd6b7325c3db5f0df5bbe510fd3fd9738d1c88&dn=Markus+Schulz+-+Global+DJ+Broadcast+%282013-07-11%29+%28Inspiron%29&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.istole.it%3A6969&tr=udp%3A%2F%2Ftracker.ccc.de%3A80&tr=udp%3A%2F%2Fopen.demonii.com%3A1337" title="Get this torrent"> Get this torrent</a>
<a style='background-image: url("/static/img/icon-https.gif");' href="http://adexprt.me/get/Markus_Schulz_-_Global_DJ_Broadcast_%282013-07-11%29_%28Inspiron%29?tag=bal" title="Anonymous Download"> Anonymous Download</a>
</div>
<div>(Problems with magnets links are fixed by upgrading your torrent client!)</div>
<div class="nfo">
<pre>=======================================================
Site: http://www.inspirontrance.com/
=======================================================
=======================================================
F B Page: Inspiron Trance
=======================================================
=======================================================
TWITTER : inspiron22
=======================================================
Markus Schulz
01. Mobil - One Morning (Aleksey Sladkov Remix)
02. Store N Forward - Nuts
03. Alter Future vs. Holbrook & SkyKeeper - Megapolis
04. Danilo Ercole - Cruzer
05. Aaron Camz - Emission
06. Markus Schulz Featuring Sarah Howells - Tempted
07. M.I.K.E. Presents Caromax - Inner Thoughts
08. Ruffault - Progressive Dream
09. Styller - What We Left Behind
10. Meridian - Exit
11. Lange - A Different Shade of Crazy
12. Tucandeo Featuring Natalie Gioia - Disappear (Xtigma Remix)
13. Sebastian Weikum - Sky is the Limit
14. Markus Schulz - Don't Leave Until the Sunrise
Guy J
01. Roger Martinez & Secret Cinema - Menthol Raga (Guy J Remix)
02. Ambassador - The Fade (Guy J Remix)
03. Guy J - Seven
04. Echomen – Perpetual (Guy J Remix)
Back with Markus Schulz
15. Mauro Picotto & Riccardo Ferri - New Time, New Place (New World Punx Remix)
16. Grube & Hovsepian - Trickster
17. Nifra - Waves
18. Markus Schulz featuring Dauby - Perfect (Digital X Remix) [Global Selection]
19. Basil O'Glue - Gilgamesh
20. Skytech - The Other Side
21. ID
Enjoy
(Inspiron) </pre>
</div>
I've used this code, which returns the text of the whole details block instead of only the seeders and leechers values:
try {
document = Jsoup.connect(BLOG_URL).get();
title = document.title();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
// selector query
Elements nodeBlogStats = document.select("div#details");
// check results
if (nodeBlogStats.size() > 0) {
// get value
result = nodeBlogStats.get(0).text();
}

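According to the Jsoup selector documentation (http://jsoup.org/apidocs/org/jsoup/select/Selector.html), you need to combine two selectors:
E ~ F: an F element preceded by sibling E
and
:contains(text): elements that contain the specified text.
I would try:
Element seeders = document.select("dt:contains(Seeders) ~ dd").get(0);
Element leechers = document.select("dt:contains(Leechers) ~ dd").get(0);
The first match in each case is the <dd> that directly follows the matching <dt>, so seeders.text() and leechers.text() should give "16" and "1" for the HTML above.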

Related

Submitting a search in website with Jsoup

I want to create an app that lets me search for a city on this weather site.
I would like it to return the page that corresponds to the search.
I tried to do this with an EditText and a Button, but the search does not seem to work: the page returned is the same as the initial one.
How can I solve this problem?
This is my code:
final EditText editText = findViewById(R.id.edit);
final TextView textView = findViewById(R.id.testo);
Button button = findViewById(R.id.clicca);
button.setOnClickListener(new View.OnClickListener() {
Connection.Response res = null;
Document doc;
@Override
public void onClick(View v) {
try {
doc = Jsoup.connect("https://www.ilmeteo.it/meteo/cerca")
.data("citta", "bari")
.post();
} catch (IOException e) {
e.printStackTrace();
}
textView.setText(doc.location());
}
});
This is the site's HTML code:
<div id="search">
<a id="search-logo" href="https://www.ilmeteo.it" title="IL Meteo - Home Page"></a>
<a id="search-arrow" href="javascript:;" onclick="toggleSearchMenu('main');"></a>
<form id="form-search0" name="search0" action="https://www.ilmeteo.it/meteo/cerca" method="get" onsubmit="return CheckSearchForm0()">
<input id="search-main" name="citta" value="" size="17" maxlength="64" class="txtSearch" onfocus="this.className='txtSearch';openSearchMenu('main');virginSearch=false;" onblur="if(this.value=='')this.className='txtSearch txtSearchE'" title="Cerca comune o località" autocomplete="Off" tabindex="1" onkeyup="ajax_showOptions(this,'type=IT&sort=smart',event)" type="text">
</form>
<a id="search-button" href="javascript:;" onclick="$('#form-search0').submit()"></a>
<div id="fav-search-cont"><span id="fav-search"></span></div>
</div>
EDIT
Thank you all! Your answers have solved my problem :)
But I have a problem with another weather site. How can I perform the same search on this other site?
P.S.: the problem with this site is that you must click a suggested city to run the search, and the URL contains a code next to the city name, like this: "http://www.meteo.it/meteo/roma-58091".
This is the second site's HTML code:
<div class="pksrc">
<form class="search-form" onsubmit="return false">
<fieldset class="icon-lens">
<input type="hidden" id="searchid" disabled="" value="">
<input type="hidden" id="searchtarget" value="_blank">
<input type="text" class="query " id="searchinput" name="search" value="" placeholder="Cerca località" autocomplete="off">
<input type="submit" value="submit">
</fieldset>
</form>
<div id="search-menu"></div>
<ul id="search-option">
<li>Milano</li>
<li>Roma</li>
<li>Napoli</li>
</ul>
</div>
Answer to your second question:
This is the approach I found for navigating to the page of the city you want to search for.
Step 1:
Pass the first letter of the city name in the request and get a JSON response.
Ex: to search for "Milano", get the results for the letter "m" using this URL: http://www.meteo.it/autosuggest/m.json?
A sample JSON response is:
{
"url": [
{
"ita": "meteo",
"sea": "meteo-mare",
"ski": "meteo-montagna",
"eur": "meteo",
"wor": "meteo"
}
],
"results": [
{
"code": "15146",
"value": "Milano (MI)",
"value_it": "milano",
"value_en": "milan",
"url": "ita"
},
{
"code": "20030",
"value": "Mantova (MN)",
"value_it": "mantova",
"value_en": "mantua",
"url": "ita"
}
]
}
From the JSON response, take the code and value_it of the entry for Milano.
Ex: code=15146 and value_it=milano
Step 2:
Construct the URL using the retrieved values.
Ex: http://www.meteo.it/meteo/value_it-code
http://www.meteo.it/meteo/milano-15146
Example for Comacchio city:
Request URL : http://www.meteo.it/autosuggest/c.json
JSON Response:
{
"code": "38006",
"value": "Comacchio (FE)",
"value_it": "comacchio",
"value_en": "comacchio",
"url": "ita"
}
Construct URL using JSON values:
http://www.meteo.it/meteo/comacchio-38006
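The two steps above can be sketched in plain Java. This is only a minimal sketch, assuming the autosuggest endpoint and the URL pattern behave as described in this answer. The names MeteoUrls, autosuggestUrl and cityUrl are hypothetical, and downloading and parsing the JSON is left out.

```java
import java.util.Locale;

/** Hypothetical helper that builds the meteo.it URLs described in the steps above. */
final class MeteoUrls {
    private static final String BASE = "http://www.meteo.it";

    private MeteoUrls() {}

    /** Step 1: autosuggest URL for the first letter of the city name. */
    static String autosuggestUrl(String city) {
        String trimmed = city.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("city must not be empty");
        }
        String firstLetter = trimmed.substring(0, 1).toLowerCase(Locale.ROOT);
        return BASE + "/autosuggest/" + firstLetter + ".json";
    }

    /** Step 2: city page URL from the "value_it" and "code" fields of one JSON result. */
    static String cityUrl(String valueIt, String code) {
        return BASE + "/meteo/" + valueIt + "-" + code;
    }

    public static void main(String[] args) {
        System.out.println(autosuggestUrl("Milano"));
        System.out.println(cityUrl("milano", "15146"));
    }
}
```

In an app, the string returned by cityUrl would then be passed to Jsoup.connect(...).get() in the usual way.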
String city = "Bari";
String url = "https://www.ilmeteo.it/meteo/cerca?citta="+ city;
Document doc = Jsoup.connect(url).get();
List<Element> rows = doc.select("table[class=datatable] > tbody > tr[id*='']");
for (Element row : rows) {
System.out.println(row.text());
}
Sample Output:
13 pioggia e schiarite 25.3° NW 35 / 36 forte51% 0.1 mm modeste0% 3070m1008mb 27°>10km buona 51 7.9
14 pioggia e schiarite 25.5° NW 34 / 35 forte50% 0.1 mm modeste0% 3050m1008mb 27°>10km buona 50 7.6
15 poco nuvoloso 25.5° NW 32 / 33 forte50% - assenti -0% 3070m1008mb 27°>10km buona 50 6.5
16 poco nuvoloso 25.4° NW 31 / 32 forte49% - assenti -0% 3100m1008mb 27°>10km buona 49 5.1
17 sereno 25° NW 30 / 31 moderato50% - assenti -0% 3130m1008mb 26°>10km buona 50 3.3
18 sereno 24.5° NNW 28 / 29 moderato52% - assenti -0% 3180m1008mb 25°>10km buona 52 1.6
19 sereno 23.7° NW 26 / 27 moderato54% - assenti -0% 3220m1009mb 24°>10km buona 54 0.4
20 sereno 22.6° NW 23 / 26 moderato57% - assenti -0% 3190m1009mb 23°>10km buona 57 0
21 sereno 21° NW 21 / 24 moderato68% - assenti -0% 3160m1009mb 21°>10km buona 68 0
22 poco nuvoloso 20° NW 19 / 24 moderato76% - assenti -0% 3130m1009mb 20°>10km buona 76 0
23 nubi sparse 19.4° WNW 18 / 24 moderato80% - assenti -0% 3130m1009mb 20°>10km buona 80 0
24 nubi sparse 19.2° WNW 18 / 23 moderato81% - assenti -0% 3130m1009mb 20°>10km buona 81 0
01 poco nuvoloso 19° WNW 18 / 22 moderato82% - assenti -0% 3140m1009mb 20°>10km buona 82 0
02 poco nuvoloso 18.6° WNW 17 / 21 moderato84% - assenti -0% 3160m1009mb 19°>10km buona 84 0
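The search URL in this answer is built by plain string concatenation, which may break for city names that contain spaces or accented characters. Here is a minimal sketch of encoding the citta parameter with the standard library. SearchUrls and searchUrl are hypothetical names, and the sketch only builds the URL; whether the site accepts every encoded form is an assumption.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds the ilmeteo.it search URL with an encoded city name. */
final class SearchUrls {
    private static final String SEARCH = "https://www.ilmeteo.it/meteo/cerca?citta=";

    private SearchUrls() {}

    static String searchUrl(String city) {
        try {
            // URLEncoder turns spaces into "+" and non-ASCII characters into %XX sequences.
            return SEARCH + URLEncoder.encode(city, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(searchUrl("Bari"));
        System.out.println(searchUrl("San Marino"));
    }
}
```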

jsoup basic scraping technique

I have searched the forums but I don't understand where I am going wrong. I am trying to scrape the text "Sun, 04 Feb 2018". Any help on which concept I got wrong would be appreciated; I keep getting null returned.
HTML from the .aspx page:
<div class="divLatestDraws slider" data-min-width="282">
<div class="slide-wrapper four-d">
<ul class="slide-container ulDraws" style="width: 1808px; margin-left: 0px;">
<li style="width: 301.33px;"><div class="tables-wrap">
<table class="table table-striped orange-header">
<thead>
<tr>
<th class="drawDate">Sun, 04 Feb 2018</th>
My Jsoup code:
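try {
// Connect to the web site
Document document = Jsoup.connect(url).get();
// Select the header cell inside the latest-draws slider
Elements xxx = document.select("div.divLatestDraws.slider");
Elements zzz = xxx.select("th.drawDate");
// Elements has no body() method; text() returns the combined text of the matches
desc = zzz.text();
} catch (IOException e) {
e.printStackTrace();
}
// Note: this method still returns null; the scraped text is stored in desc
return null;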

I want to fetch some info from a webpage in android studio

I want to fetch SomeTitle and its link (/somelink.html) from the HTML below for my Android app.
<div class="proper-list list-group page-cat-wrap">
<figure class="col-md-12 thumb-vertical">
<div class="col-xs-4 thumb-image">
<a href="/somelink.html" class="image-hover">
<img alt="SomeTag" src="/storage/images/100/2382.jpg">
</a>
</div>
<figcaption class="col-xs-8">
<h3>
<a href="/somelink.html">
SomeTitle
</a>
</h3>
<p>
<a href="/secondlink.html">
SomeText
</a>
</p>
</figcaption>
<div class="clearfix"></div>
<div class="mobile-only icon-right">
<a href="/somelink.html">
<i class="fa fa-chevron-right" aria-hidden="true"></i>
</a>
</div>
I have heard of Jsoup, but I wasn't able to get the links with it.
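Jsoup is a well-suited library for parsing HTML content or documents.
Here is the link and an example:
http://jsoup.org/
Example:
private void parseHtmlPage() throws IOException {
// Parse a local copy of the page; the base URI is used to resolve relative links
File input = new File("/yourFolder/home.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
// "elementId" is a placeholder for the id of the container element
Element elementId = doc.getElementById("elementId");
Elements anchorLinks = elementId.getElementsByTag("a");
for (Element link : anchorLinks) {
// Raw attribute value, e.g. "/somelink.html"; link.absUrl("href") gives the absolute URL
String linkHref = link.attr("href");
// Visible text of the link, e.g. "SomeTitle"
String linkText = link.text();
}
}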

Jsoup selector basic Android

I'm trying to learn Jsoup for Android and I'm having a hard time with the selectors. I've already set up the application with simple buttons and TextViews that can retrieve basic info such as the title. Now I'm trying to get the text marked below (TEXT I NEED TO PARSE). I've tried multiple times and cannot get the syntax right.
<li class="info info info">
<script>clicked = false</script>
<div class="simple">
<p class="name">TEXT I NEED TO PARSE </p>
<ul class="Type">
<li>Normal</li> </ul>
<p class="address">120 Hollywood Blvd.</p>
</div>
<div class="sortables">
<p class="inches"></p>
</div>
<div class="action_links">
</div>
Document doc = null;
try {
doc = Jsoup.connect("http://example.com/index.html").get();
} catch (IOException e) {
// TODO Throws exception
}
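Element simple = doc.getElementsByClass("simple").first();
Element p = simple.getElementsByClass("name").first();
// The text sits directly inside <p class="name">, so read it from the <p> itself;
// selecting an <a> inside it returns null for the HTML shown above
String text = p.text();
System.out.println(text);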

Use geo coordinates from google maps link/url/ in Android device

I have an intent filter to intercept URLs from Google Maps.
My problem is shortened links: if I can't read the coordinates, the links are useless. Unfortunately, Google Maps for Android produces shortened links. On the web, the user can choose the type of Google Maps link, short or not, and with the long form I have a chance to use the coordinates.
In short: if I have a link like this
http://maps.google.com/maps?q=45.089036,+-106.347656&num=1&t=h&vpsrc=0&ie=UTF8&z=4&iwloc=A
there is no problem reading the coordinates. But if the link is:
http://m.google.com/u/m/zIOcsV
oops... Google has an internal way to resolve this link.
Has anybody found a way to get the coordinates from the second link?
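For the long form of the link, reading the coordinates from the q parameter can be sketched as follows. This is a minimal sketch that only handles links where q holds a numeric "latitude,longitude" pair, as in the first link of the question. MapsLinkParser and parseCoordinates are hypothetical names, and the short link form is not resolved here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that extracts coordinates from a long-form Google Maps link. */
final class MapsLinkParser {
    // Matches q=<lat>,<lon>; the comma may be followed by "+" or "%20" (encoded spaces).
    private static final Pattern Q_COORDS = Pattern.compile(
            "[?&]q=(-?\\d+(?:\\.\\d+)?),(?:\\+|%20)*(-?\\d+(?:\\.\\d+)?)");

    private MapsLinkParser() {}

    /** Returns {latitude, longitude}, or null if the link has no numeric q parameter. */
    static double[] parseCoordinates(String url) {
        Matcher m = Q_COORDS.matcher(url);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        double[] c = parseCoordinates(
                "http://maps.google.com/maps?q=45.089036,+-106.347656&num=1&t=h&vpsrc=0&ie=UTF8&z=4&iwloc=A");
        System.out.println(c[0] + ", " + c[1]);
    }
}
```

A null result signals a link without readable coordinates, such as the shortened form, so the caller can handle that case separately.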
You can use this example of a jQuery Mobile app that uses Google Maps geo coordinates:
jsfiddle-google-maps-geo-coordinates
or, with custom images, here: jsfiddle-google-map-with-images
The head is:
<link rel="stylesheet" href="http://code.jquery.com/mobile/1.2.0/jquery.mobile-1.2.0.min.css" />
<!--link rel="stylesheet" href="/Content/jquery-mobile-responsive.css" /-->
<script src="http://code.jquery.com/jquery-1.8.3.min.js"></script>
<script src="http://code.jquery.com/mobile/1.2.0/jquery.mobile-1.2.0.min.js"></script>
<script src="http://maps.google.com/maps/api/js?sensor=true&language=en"></script>
<script>
$(document).bind("mobileinit", function () {
$.mobile.ajaxEnabled = false;
});
$(Start);
function Start() {
$("#google_map .ui-collapsible-heading").click(function () {
var locations = [
['<br/>Main Office', 31.590496, 34.561981, 4],
['Sensor 16<br/>User Name: 16_DE-1R', 31.590595, 34.562980, 5],
['Sensor 17<br/>User Name: 17_TEN-2MP', 31.590694, 34.563979, 3],
['Sensor 18<br/>User Name: 18_TEN-2MP', 31.590793, 34.564978, 2],
];
var map = new google.maps.Map(document.getElementById('googleMap'), {
zoom: 17,
center: new google.maps.LatLng(31.590892, 34.561977),
mapTypeId: google.maps.MapTypeId.ROADMAP
});
var infowindow = new google.maps.InfoWindow();
var marker, i;
for (i = 0; i < locations.length; i++) {
marker = new google.maps.Marker({
position: new google.maps.LatLng(locations[i][1], locations[i][2]),
map: map
});
google.maps.event.addListener(marker, 'click', (function (marker, i) {
return function () {
infowindow.setContent(locations[i][0]);
infowindow.open(map, marker);
}
})(marker, i));
}
});
}
</script>
and body:
<div data-role="page" class="type-interior">
<div data-role="header" data-theme="a">
<a data-icon="back" href="#" rel="external">Back</a>
<h1>Sensor</h1>
</div>
<div data-role="content">
<div class="content-primary">
<h2>Sensor: 16</h2>
<div data-role="collapsible-set">
<div id="google_map" data-role='collapsible' data-collapsed=true data-theme="b" data-content-theme="d">
<h1>Sensor Map</h1>
<div id="googleMap" style="width:100%; height:300px;"></div>
</div>
</div>
<ul data-role="listview" data-inset="true" data-theme="b">
<li data-role="list-divider"></li>
<li>Configure Comm</li>
<li>Measurements</li>
<li>Users</li>
</ul>
</div>
<div class="content-secondary">
<div data-role="collapsible" data-collapsed="true" data-theme="b" data-content-theme="d">
<h3>More Options</h3>
<ul data-role="listview" data-theme="c" data-dividertheme="d">
<li data-role="list-divider">Actions</li>
<li>Add User</li>
<li>Edit Sensor</li>
<li>Delete Sensor</li>
</ul>
</div>
</div>
</div>
<div data-role="footer" data-theme="a">
<h4>Mobile</h4>
</div>
