I'm pretty new to jsoup. For days now I've been trying to read a simple number out of a span, without any success.
I hope to find help here. My HTML:
<div class="navi">
<div class="tab mail">
<a href="/comm.php/indexNew/" accesskey="8" title="Messages">
<span class="tabCount">1 </span>
<img src="/b2/message.png" alt="Messages" class="moIcon i24" />
</a>
</div>
The class tabCount exists three times in the whole document, though, and I am interested in the first span with this class.
Now I am trying in onCreate() of a service to create a thread with:
Thread downloadThread = new Thread() {
public void run() {
Document doc;
try {
doc = Jsoup.connect("https://www.bla.com").get();
String count = doc.select("div.navi").select("div.tab.mail").select("a[href]").first().select("tabCount").text();
Log.d("SOMETHING", "test"+(count));
} catch (IOException e) {
e.printStackTrace();
}
}
};
downloadThread.start();
This makes my app crash. The same happens if I change text() to ownText(). If I remove text(), the app starts but gives me null.
What am I doing wrong? By the way, besides the service, a WebView is loading the same URL. Might that be a problem?
You only need to select the element you're interested in; you don't need to select every enclosing element first. In your example you could try
String count = doc.select("span.tabCount").text();
where "span" is the element type and ".tabCount" is the class name.
For an example that might help you, look at this link
Edit:
Try this code instead; it gets the value of the first span.
Elements elements = doc.select("span.tabCount");
String count = elements.first().text();
And if you want to print all the elements, you could do it like this:
Elements elements = doc.select("span.tabCount");
for (Element e : elements) {
Log.d("Something", e.text());
}
Didn't you mean .select(".tabCount")?
BTW, on Android AsyncTasks are more convenient than Threads. Also, empty catch blocks are bad practice.
Your select statement is wrong. You can insert the whole selection string in one line. Furthermore you have to prefix "tabCount" with a dot as it is a class.
String count = doc.select("div.navi div.tab.mail a").first().select(".tabCount").text();
I'm writing an Android app and trying to figure out how should I construct my call to get table data from this webpage: http://uk.soccerway.com/teams/scotland/saint-mirren-fc/1916/squad/
I've read the cookbook on the jsoup website, but because I haven't used this library before I am a bit stuck. I came up with something like this:
doc = Jsoup.connect("http://uk.soccerway.com/teams/scotland/saint-mirrenfc/1916/squad/").get();
Element squad = doc.select("div.squad-container").first();
Elements table = squad.select("table squad sortable");
As you can see, I'm nowhere near getting the players' statistics yet. I think the next step should be to point a new Element object at the "tbody" tag inside the "table squad sortable"?
I know I will have to use a for loop once I manage to read the table, and then read each row inside the loop.
Unfortunately, the table structure is a bit complex for someone with no experience, so I would really appreciate some advice!
Basically each row has the following selector -
#page_team_1_block_team_squad_3-table > tbody:nth-child(2) > tr:nth-child(X) where X is the row's number (starting at 1).
One way is to iterate over the rows and extract the info:
String url = "http://uk.soccerway.com/teams/scotland/saint-mirren-fc/1916/squad/";
String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0";
Document doc = null;
try {
doc = Jsoup.connect(url)
.userAgent(userAgent)
.get();
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
int i = 1;
Elements row;
do {
// select() never returns null; it returns an empty Elements when nothing matches
row = doc.select("#page_team_1_block_team_squad_3-table > tbody:nth-child(2) > tr:nth-child(" + i + ")");
for (Element el : row) {
System.out.print(el.select(".shirtnumber").text() + " ");
System.out.println(el.select(".name").text());
}
i++;
} while (!row.isEmpty());
This will print the number and name of each player. Since I don't want to count the number of rows (and want to keep the program flexible for changes), I prefer to use a do...while loop: it iterates as long as the row exists (that is, as long as the selection is not empty).
The output I get:
1 J. Langfield
21 B. O'Brien
28 R. Willison
2 S. Demetriou
3 G. Irvine
4 A. Webster
...
Use your browser's developer tools to get the names of the other columns, and use them to get all the info you need.
I'm using this code to select a URL in a div tag:
Elements mElements = doc.select("a[class^=titr]");
Element linkElement = mElements.select("a").first();
linkElement.attr("href");
But with this code I can only see the first item, because the method is first().
How can I specify that I want to select, for example, items 0 to 20 instead of just the first?
mElements is an Elements object, which is a List<Element>, so mElements.get(i) gives you the item at index i. To iterate over all of them:
print("\nElements: (%d)", mElements.size());
for (Element link : mElements) {
print(" * %s <%s> (%s)", link.tagName(),link.attr("abs:href"), link.attr("rel"));
}
http://jsoup.org/apidocs/org/jsoup/select/Elements.html
Relying on fixed positions is probably not recommended, however, since the contents of the page change over time; perhaps you want a more specific selector.
http://jsoup.org/cookbook/extracting-data/selector-syntax
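Since Elements is a List<Element>, the standard List.subList method can take items 0 to 20 without running past the end of a shorter list. The following is only a minimal sketch: firstN is a hypothetical helper name, and plain strings stand in for the parsed links so that the example runs without jsoup.

```java
import java.util.ArrayList;
import java.util.List;

class FirstItems {
    // Hypothetical helper: returns a view of at most n leading items of any List.
    static <T> List<T> firstN(List<T> items, int n) {
        return items.subList(0, Math.min(n, items.size()));
    }

    public static void main(String[] args) {
        // Stand-in data; with jsoup this would be the Elements from doc.select(...)
        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hrefs.add("/item/" + i);
        }
        // Prints only the first 20 entries
        for (String href : firstN(hrefs, 20)) {
            System.out.println(href);
        }
    }
}
```

With jsoup the call would presumably look like firstN(mElements, 20), and each returned Element still supports attr("href").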
I seem to be getting some extra line breaks after using this method to set the text of a TextView:
message.setText(Html.fromHtml( message ));
How can I remove these? They cause my layout to get warped since it adds two extra lines to the output.
The string was saved to my sqlite database via Html.toHtml( editText.getText() ).trim();
Initial string input : hello
Log output of the message variable: <p dir="ltr">hello</p>
You can use these lines; they work for me.
I know your problem is already solved, but maybe someone will find this useful.
string = replaceLast(string, "<p dir=\"ltr\">", "");
string = replaceLast(string, "</p>", "");
And here is replaceLast:
public String replaceLast(String yourString, String first, String second)
{
int index = yourString.lastIndexOf(first);
// Nothing to replace if the target does not occur
if (index < 0) return yourString;
StringBuilder b = new StringBuilder(yourString);
b.replace(index, index + first.length(), second);
return b.toString();
}
For Kotlin you can use
html.trim('\n')
Looks like toHtml assumes everything should be in a <p> tag. I'd strip off the beginning and ending <p> and </p> tags before writing to the database.
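Stripping the wrapping tags before saving could be sketched in plain Java as below. This is an illustration under assumptions, not the Android API: stripOuterParagraph is a hypothetical helper, and it only handles the single-paragraph case shown in the question, leaving anything else untouched apart from trimming.

```java
class ParagraphStripper {
    // Hypothetical helper: drops a single wrapping <p ...>...</p> pair, if present.
    // The input is trimmed in every case.
    static String stripOuterParagraph(String html) {
        String s = html.trim();
        boolean opens = s.startsWith("<p>") || s.startsWith("<p ");
        if (!opens || !s.endsWith("</p>")) {
            return s; // not wrapped in a paragraph, nothing to strip
        }
        int openEnd = s.indexOf('>');
        int closeStart = s.length() - "</p>".length();
        if (openEnd >= closeStart) {
            return s; // malformed opening tag, leave it alone
        }
        String inner = s.substring(openEnd + 1, closeStart);
        // Leave multi-paragraph input alone; this sketch only handles one paragraph
        return inner.contains("</p>") ? s : inner;
    }

    public static void main(String[] args) {
        // Stand-in for the string produced by Html.toHtml(...)
        System.out.println(stripOuterParagraph("<p dir=\"ltr\">hello</p>\n"));
    }
}
```

The stripped string is what would be written to the database, so that Html.fromHtml no longer sees a paragraph wrapper when the text is loaded again.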
The following works in Kotlin:
val myHtmlString = "<p>Test</p>"
HtmlCompat.fromHtml(myHtmlString.trim(), FROM_HTML_MODE_COMPACT).trim('\n')
I'm writing an app that fetches SMS texts from the HTML of the website smsmaza.in, for which I'm using jsoup to parse the HTML. The following code is giving me trouble:
BLOG_URL="http://www.smsmaza.in/";
Document document;
document = Jsoup.connect(BLOG_URL).timeout(12000).get();
Elements texts=document.getElementsByClass("sms");
When I print the value of texts.size(), it is zero, which means nothing is selected. What is the problem?
Thanks in advance.
Here is the complete program :- http://pastecode.org/index.php/view/20317090
From your code I have used:
Document document=Jsoup.connect("http://www.smsmaza.in/").timeout(12000).get();
Elements texts=document.getElementsByClass("sms");
Log.e("sms", Integer.toString(texts.size()));
and logcat shows me that 10 elements with the class sms are selected, so the selection is working well.
You should not block setContentView. And in your code below:
if(texts.size()>0){
int i=0;
while(i<texts.size()){
result[i]=texts.get(i).text();
//you should increase your i here
}
}
you should increment i (i++) inside the while loop.
If that doesn't help, try this:
int i = 0;
for(Element element : texts){
result[i] = element.text();
i++;
}
Could anybody post some code showing how I can read a file word by word? I only know how to read a file line by line using BufferedReader. I'd prefer an answer that uses BufferedReader.
I solved it with this code:
StringBuilder word = new StringBuilder();
int i=0;
Scanner input = new Scanner(new InputStreamReader(a.getInputStream()));
while(input.hasNext()) {
i++;
if(i==prefNamePosition){
word.append(prefName);
word.append(" ");
input.next();
}
else{
word.append(input.next());
word.append(" ");
}
}
There's no good way other than to read() and get a character at a time until you get a space or whatever criteria you want for determining what a "word" is.
If you're trying to replace the nth token with a special value, try this:
while (input.hasNext()) {
String currentWord = input.next();
if(++i == prefNamePosition) {
currentWord = prefName;
}
word.append(currentWord);
word.append(" ");
}
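The character-at-a-time approach mentioned at the start of this answer could look roughly like the sketch below. It assumes that a "word" is any run of non-whitespace characters; the class and method names are made up for illustration, and a StringReader stands in for the file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CharWordReader {
    // Reads one character at a time and cuts a word at every whitespace character.
    static List<String> readWords(Reader reader) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    if (current.length() > 0) {
                        words.add(current.toString());
                        current.setLength(0); // start collecting the next word
                    }
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Flush the last word if the input did not end with whitespace
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        // For a real file, pass a BufferedReader wrapping a FileReader instead
        System.out.println(readWords(new StringReader("one two\nthree")));
    }
}
```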
Another way is to employ a tokenizer (e.g. StringTokenizer in Java) with the space character (' ') as the delimiter. Then just iterate through the tokens to read each word from your file.
You can read lines and then split them. There is no single definition of a word, but if you want the ones separated by blank spaces, splitting on whitespace will do.
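A minimal sketch of the tokenizer approach with java.util.StringTokenizer follows. The class and method names are illustrative only, and the input line would normally come from BufferedReader.readLine().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerWords {
    // Splits one line into words, using the space character as the only delimiter.
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(line, " ");
        while (tokenizer.hasMoreTokens()) {
            words.add(tokenizer.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        for (String word : tokenize("read this line word by word")) {
            System.out.println(word);
        }
    }
}
```

StringTokenizer treats consecutive delimiters as one, so repeated spaces do not produce empty tokens.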
You could also use regular expressions to do this.
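Combining the last two suggestions, one possible sketch reads lines with BufferedReader and splits each one on the regular expression \s+ (one or more whitespace characters). The names are illustrative, and a StringReader stands in for the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LineSplitWords {
    // Reads line by line and splits each line on runs of whitespace.
    static List<String> readWords(BufferedReader reader) {
        List<String> words = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // splitting an empty string would yield one empty word
                }
                words.addAll(Arrays.asList(trimmed.split("\\s+")));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // For a real file: new BufferedReader(new FileReader(path))
        BufferedReader reader = new BufferedReader(new StringReader("hello world\nfoo\tbar"));
        for (String word : readWords(reader)) {
            System.out.println(word);
        }
    }
}
```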