I've been using Jsoup to fetch certain words from a Google search, but as far as I can tell it fails during the Jsoup query.
Execution reaches the doInBackground method, but the title and body of each search result are never printed.
My guess is that the list returned by doc.select (links) is empty, which points to a problem with the query syntax.
value is the search keyword; in my case it's a barcode, and the search itself works. Here's the link
Here is the async call, made from another class:
String url = "https://www.google.com/search?q=";
if (!value.isEmpty())
{
url = url + value + " price" + "&num10";
Scrape_Asynctasks task = new Scrape_Asynctasks();
task.execute(url);
}
and here is the async task itself:
public class Scrape_Asynctasks extends AsyncTask<String, Integer, String>
{
@Override
protected void onPreExecute() {
super.onPreExecute();
}
@Override
protected String doInBackground(String... strings) {
try
{
Log.i("IN", "ASYNC");
final Document doc = Jsoup
.connect(strings[0])
.userAgent("Jsoup client")
.timeout(5000).get();
Elements links = doc.select("li[class=g]");
for (Element link : links)
{
Elements titles = link.select("h3[class=r]");
String title = titles.text();
Elements bodies = link.select("span[class=st]");
String body = bodies.text();
Log.i("Title: ", title + "\n");
Log.i("Body: ", body);
}
}
catch (IOException e)
{
Log.i("ERROR", "ASYNC");
}
return "finished";
}
@Override
protected void onProgressUpdate(Integer... values) {
super.onProgressUpdate(values);
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
}
}
Don't use "Jsoup client" as your user agent string; some sites (including Google) don't like it. Use the same string as your browser, e.g. "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0".
Your first selector should be .g: Elements links = doc.select(".g");
The site uses JavaScript, so Jsoup will not get all the results you see in your browser.
You can disable JS in your browser to see the difference.
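Separately from the points above, the URL in the question is built by plain concatenation, so the space in " price" is sent unencoded and "&num10" has no "=". Below is a minimal sketch of building the query with java.net.URLEncoder. SearchUrlBuilder and build are hypothetical names, the barcode is an arbitrary example value, and the sketch assumes that num=10 was the intended parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class SearchUrlBuilder {
    private static final String BASE = "https://www.google.com/search?q=";

    // Builds the search URL for a keyword, URL-encoding the query text.
    // Assumption: "&num10" in the question was meant to be "&num=10".
    public static String build(String value) {
        try {
            // URLEncoder turns the space in "<value> price" into '+'
            // and escapes reserved characters such as '&'.
            String query = URLEncoder.encode(value + " price", "UTF-8");
            return BASE + query + "&num=10";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this branch should be unreachable.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary example barcode, not taken from the question.
        System.out.println(build("5000112637922"));
    }
}
```

The resulting string can be passed to task.execute(...) in place of the hand-built url.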
I am pretty new to the concept of AsyncTask. I have an AsyncTask that fetches JSON from an API using a parameter and then, in onPostExecute, puts the content into TextViews to be shown (they are set visible after the text is set). I am trying to validate that the JSON isn't empty, and my code does that, but even when the parameter I use is correct, the validation still reports it as empty. If I try again (by pressing the button that triggers the AsyncTask), after two or three tries it actually gets the data. I think it's because the work is done in the background. Here is the AsyncTask:
private class ConsultarDatos extends AsyncTask<String, Void, String> {
@Override
protected String doInBackground(String... urls) {
// params comes from the execute() call: params[0] is the url.
try {
return downloadUrl(urls[0]);
} catch (IOException e) {
return "Unable to retrieve web page. URL may be invalid.";
}
}
// onPostExecute displays the results of the AsyncTask.
@Override
protected void onPostExecute(String result) {
JSONArray ja = null;
try {
ja = new JSONArray(result);
txtNombre.setText(ja.getString(0) +" " + ja.getString(1));
txtCategoria.setText(ja.getString(2));
txtDNI.setText(ja.getString(3));
txtEstado.setText(ja.getString(4));
//working=false;
} catch (JSONException e) {
e.printStackTrace();
}
}
}
And here is what I am trying to do:
btnGenerar.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View view) {
new ConsultarDatos().execute("https://api-adress/file.php?DNI=" + etDNI.getText().toString());
//while(working)
//{
//}
if (txtCategoria.getText()!="") {
btnGenerar.setVisibility(View.INVISIBLE);
etDNI.setVisibility(View.INVISIBLE);
txtCategoria.setVisibility(View.VISIBLE);
txtDNI.setVisibility(View.VISIBLE);
txtEstado.setVisibility(View.VISIBLE);
txtNombre.setVisibility(View.VISIBLE);
imgTarjeta.setVisibility(View.VISIBLE);
}
else
{
Toast.makeText(getApplicationContext(),"DNI Incorrecto",Toast.LENGTH_LONG).show();
}
}
});
As I commented, I tried a while loop that would wait until the TextViews were all set, but that just crashed my app.
I resolved it: I moved the visibility changes and the validation to the end of onPostExecute and, so that the user always gets some feedback, I also show the toast in the exception handler.
protected void onPostExecute(String result) {
JSONArray ja = null;
try {
ja = new JSONArray(result);
txtNombre.setText(ja.getString(0) +" " + ja.getString(1));
txtCategoria.setText(ja.getString(2));
txtDNI.setText(ja.getString(3));
txtEstado.setText(ja.getString(4));
// Compare by content: != "" only compares object references
if (txtCategoria.getText().length() > 0) {
btnGenerar.setVisibility(View.INVISIBLE);
etDNI.setVisibility(View.INVISIBLE);
txtCategoria.setVisibility(View.VISIBLE);
txtDNI.setVisibility(View.VISIBLE);
txtEstado.setVisibility(View.VISIBLE);
txtNombre.setVisibility(View.VISIBLE);
imgTarjeta.setVisibility(View.VISIBLE);
}
else
{
Toast.makeText(getApplicationContext(),"DNI Incorrecto",Toast.LENGTH_LONG).show();
}
} catch (JSONException e) {
e.printStackTrace();
Toast.makeText(getApplicationContext(),"DNI Incorrecto",Toast.LENGTH_LONG).show();
}
}
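The original check failed because execute() returns immediately, before onPostExecute has filled the TextViews. Here is a plain-Java sketch of the same timing, using CompletableFuture in place of AsyncTask so it runs outside Android. AsyncCheckDemo, the field stand-in, and the "Categoria A" value are all made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class AsyncCheckDemo {
    // Returns { value seen right after starting the task, value seen in the callback }.
    public static String[] run() {
        AtomicReference<String> field = new AtomicReference<>("");  // stands in for txtCategoria
        AtomicReference<String> seenInCallback = new AtomicReference<>();
        CountDownLatch release = new CountDownLatch(1);              // holds the "download" back

        CompletableFuture<Void> task = CompletableFuture
                .supplyAsync(() -> {
                    // Plays the role of doInBackground: wait, then return a result.
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    return "Categoria A";
                })
                .thenAccept(result -> {
                    // Plays the role of onPostExecute: set the text, then validate it here.
                    field.set(result);
                    seenInCallback.set(field.get());
                });

        // This is where the click listener did its check: the task has not finished yet.
        String seenImmediately = field.get();

        release.countDown();  // let the background work complete
        task.join();          // only so the demo can report both values
        return new String[] { seenImmediately, seenInCallback.get() };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("immediately: '" + seen[0] + "', in callback: '" + seen[1] + "'");
    }
}
```

The check placed right after starting the task sees the empty value, while the one inside the completion callback sees the result, which is why moving the validation into onPostExecute works.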
Use something like https://www.getpostman.com/ to see what the result of your API call is. Right now it seems like you don't know what you're getting back, and how consistently it comes back. You need to validate that your server is sending you back valid data.
Using a JSON library such as Gson or Moshi to parse the response would help as well. Right now you're reading the values by arbitrary array indices. The exception could come from a single missing field, but there's not enough info to tell. Gson is fairly easy to set up in my experience: https://github.com/google/gson
Aim
In a fragment, I have a search bar which looks for online news about what the user typed. I want to display these news items (title, description, publication date, etc.) in the GUI as vertical blocks.
Implementation
Explanations
In the fragment, in the search event handler, I instantiate an asynchronous task and execute it with the URL of the REST API I use for the search.
In the asynchronous task, I call this REST API (using the URL and some required parameters such as an authorization key). When the task receives the response, it must update the fragment's GUI (i.e. it must vertically stack GUI blocks containing the titles, descriptions, etc. of the retrieved news).
Sources
You will find sources in the last part of this question.
My question
In the asynchronous task (more precisely, in the method that runs after the response has been received), I don't know how to get the calling fragment. How can I do this?
Sources
Fragment part
private void getAndDisplayNewsForThisKeywords(CharSequence keywords) {
keywords = Normalizer.normalize(keywords, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
new NetworkUseWorldNews().execute("https://api.currentsapi.services/v1/search?keyword=" + keywords + "&language=en&country=US");
}
Asynchronous task part
public class NetworkUseWorldNews extends AsyncTask<String, Void, String> {
@Override
protected String doInBackground(String[] urls) {
StringBuilder string_builder = new StringBuilder();
try {
URL url = new URL(urls[0]);
HttpsURLConnection https_url_connection = (HttpsURLConnection) url.openConnection();
https_url_connection.setRequestMethod("GET");
https_url_connection.setDoOutput(false);
https_url_connection.setUseCaches(false);
https_url_connection.addRequestProperty("Authorization", "XXX");
InputStream input_stream = https_url_connection.getInputStream();
BufferedReader buffered_reader = new BufferedReader(new InputStreamReader(input_stream));
String line;
while((line = buffered_reader.readLine()) != null) {
string_builder.append(line);
}
buffered_reader.close();
} catch (IOException e) {
e.printStackTrace();
}
return string_builder.toString();
}
@Override
protected void onPostExecute(String result) {
try {
JSONObject news_response_http_call = new JSONObject(result);
switch(news_response_http_call.getString("status")) {
case "ok":
JSONArray news = news_response_http_call.getJSONArray("news");
for(int i = 0; i < news.length(); i++) {
JSONObject a_news = news.getJSONObject(i);
String title = a_news.getString("title");
String description = a_news.getString("description");
String date_of_publication = a_news.getString("published");
String url = a_news.getString("url");
String image = a_news.getString("image");
System.out.println(title + ": " + date_of_publication + "\n" + image + "\n" + url + "\n" + description);
WorldNewsFragment world_news_fragment = ...;
}
break;
}
} catch (JSONException e) {
e.printStackTrace();
}
}
}
If I understand correctly, you want to update the view of the calling Fragment: if FragmentA started the task, then FragmentA should be updated.
However, the approach you are asking about is wrong. Instead of getting the calling Fragment inside your AsyncTask's response handling, you should use a callback.
So you will need to pass a callback to the AsyncTask. Instead of posting the full code, here are existing answers to this problem.
In the end, your calling code will look like this:
NetworkUseWorldNews task = new NetworkUseWorldNews(new OnResponseListener() {
@Override
public void onResponse(String result) {
// Either get raw response, or get response model
}
});
task.execute();
Actually, your question is still not entirely clear to me. Let me know in the comments if you have more questions.
Worth checking out:
Retrofit or Volley for calling REST APIs
Gson for automatically parsing JSON responses into models
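To make the callback idea above a bit more concrete, here is a plain-Java sketch of its shape. It is not the Android code itself: CompletableFuture stands in for AsyncTask so the example runs anywhere, and CallbackDemo, NetworkTask, and the injected fetcher function are hypothetical names introduced only for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class CallbackDemo {
    // Callback the caller (e.g. the fragment) implements to receive the result.
    public interface OnResponseListener {
        void onResponse(String result);
    }

    // Stand-in for the AsyncTask: does the work off-thread, then notifies the listener.
    public static class NetworkTask {
        private final Function<String, String> fetcher;  // hypothetical: performs the HTTP call
        private final OnResponseListener listener;

        public NetworkTask(Function<String, String> fetcher, OnResponseListener listener) {
            this.fetcher = fetcher;
            this.listener = listener;
        }

        public CompletableFuture<Void> execute(String url) {
            return CompletableFuture
                    .supplyAsync(() -> fetcher.apply(url))  // plays the role of doInBackground
                    .thenAccept(listener::onResponse);      // plays the role of onPostExecute
        }
    }

    public static void main(String[] args) {
        NetworkTask task = new NetworkTask(
                url -> "{\"status\":\"ok\"}",               // fake response, no network access
                result -> System.out.println("Caller received: " + result));
        task.execute("https://example.invalid/search").join();
    }
}
```

In the real AsyncTask, the listener would be stored in a constructor field and invoked from onPostExecute, which already runs on the main thread, so the fragment can update its views directly inside onResponse. The task itself never needs a reference to the fragment.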
Hi, I have searched the internet and Stack Overflow all day without finding a solution! :(
My problem is:
I have to show some schedules in an app, and the data is stored on a web page. Normally, on a PC, you visit the page, enter your ID, and it shows the schedule...
But how do I get to the schedule without interacting with a WebView or something like that? I have to save some specific HTML data after login...
I have tried Jsoup, but after login the URL changes and I don't know how to get it. I therefore tried a WebView, but that didn't work either.
Please help :)
public class getHTML extends AsyncTask<Void, Void, Void> {
String words;
@Override
protected Void doInBackground(Void... params) {
try {
final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36";
final String FORM_URL = "http://timetable.scitech.au.dk/apps/skema/VaelgElevskema.asp?webnavn=skema";
final String SCHEDULE_URL = "http://timetable.scitech.au.dk/apps/skema/ElevSkema.asp";
final String USERID = "201506426";
// # Go to search page
Connection.Response loginFormResponse = Jsoup.connect(FORM_URL)
.method(Connection.Method.GET)
.userAgent(USER_AGENT)
.execute();
// # Fill the search form
FormElement loginForm = (FormElement)loginFormResponse.parse()
.select("form").first();
// ## ... then "type" the id ...
Element loginField = loginForm.select("input[name=aarskort]").first();
loginField.val(USERID);
// # Now send the form
Connection.Response loginActionResponse = loginForm.submit()
.cookies(loginFormResponse.cookies())
.userAgent(USER_AGENT)
.execute();
// # go to the schedule
Connection.Response someResponse = Jsoup.connect(SCHEDULE_URL)
.method(Connection.Method.GET)
.userAgent(USER_AGENT)
.execute();
// # print out the body
Element el = someResponse.parse()
.select("body").first();
words = el.text();
System.out.println(loginActionResponse.parse().html());
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void aVoid) {
super.onPostExecute(aVoid);
tv.setText(words);
Toast.makeText(MainActivity.this, "DET VIRKER!", Toast.LENGTH_SHORT).show();
}
}
I have a JPG image on a site (I suppose the page is HTML, but I am not sure). When I open the page source, I can see the image I need written this way.
If I open the link, it shows me the image.
But I don't know how to get from the page URL to this link. It doesn't look like it is written in JSON format.
How can I get it?
Thanks
Bar.
After some experimenting I got to this:
The meta tags are the elements, and property (here og:image) and content are attributes of those elements.
So I do the following to get the image URL string:
String imageLink=null;
try {
Log.d(TAG, "Connecting to [" + strings[0] + "]");
Document doc = Jsoup.connect(strings[0]).get(); // put all the HTML page in Document
// Get meta info
Elements metaElems = doc.select("meta");
for (Element metaElem : metaElems) {
String property = metaElem.attr("property");
if(property.equals("og:image"))// if find the line with the image
{
imageLink = metaElem.attr("content");
Log.d(TAG, "Image URL" + imageLink );
}
}
} catch (Exception e) {
e.printStackTrace();
exception =e;
return null;
}
Here is a small code snippet for integrating this kind of functionality; I hope it helps.
Step 1: Add the Gradle dependency below
compile 'org.jsoup:jsoup:1.10.2'
Step 2:
Use the AsyncTask below to get all the meta information from any URL.
public class MainActivity extends AppCompatActivity {
private ImageView imgOgImage;
private TextView text;
String URL = "https://www.youtube.com/watch?v=ufaK_Hd6BpI";
String UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36";
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
text = (TextView) findViewById(R.id.text);
imgOgImage = (ImageView) findViewById(R.id.imgOgImage);
new FetchMetadataFromURL().execute();
}
private class FetchMetadataFromURL extends AsyncTask<Void, Void, Void> {
String websiteTitle, websiteDescription, imgurl;
@Override
protected void onPreExecute() {
super.onPreExecute();
}
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to website
Document document = Jsoup.connect(URL).get();
// Get the html document title
websiteTitle = document.title();
//Here It's just print whole property of URL
Elements metaElems = document.select("meta");
for (Element metaElem : metaElems) {
String property = metaElem.attr("property");
Log.e("Property", "Property =" + property + " \n Value =" + metaElem.attr("content"));
}
// Locate the content attribute
websiteDescription = metaElems.attr("content");
Elements metaOgImage = document.select("meta[property=og:image]");
// select() never returns null; check for an empty result instead
if (!metaOgImage.isEmpty()) {
imgurl = metaOgImage.first().attr("content");
System.out.println("src :<<<------>>> " + imgurl);
}
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void result) {
text.setText("Title : " + websiteTitle + "\n\nImage Url :: " + imgurl);
//t2.setText(websiteDescription);
Picasso.with(getApplicationContext()).load(imgurl).into(imgOgImage);
}
}
}
Note: this is only a rough demo that follows no particular coding standard, so please take care when you integrate this code into your application. I made it for learning purposes only.
Here I have just used a YouTube URL to display the metadata; you can use any URL you need.
I hope the logic is clear.
Good luck
I want to display in a TextView the snowfall over the past 24 hours at a ski resort. I used the CSS path and tried other ways, but nothing happens: the TextView doesn't display anything.
The web page: http://www.arizonasnowbowl.com/resort/snow_report.php
The CSS path: #container > div.right > table.interior > tbody > tr:nth-child(2) > td.infoalt
private class Description extends AsyncTask<Void, Void, Void> {
String desc;
@Override
protected void onPreExecute() {
super.onPreExecute();
mProgressDialog = new ProgressDialog(Snowreport.this);
mProgressDialog.setTitle("Snow Report");
mProgressDialog.setMessage("Loading...");
mProgressDialog.setIndeterminate(false);
mProgressDialog.show();
}
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to the web site
Document document = Jsoup.connect(url).get();
Elements elms = document.select("td.infoalt");
for(Element e:elms)
if(e.className().trim().equals("infoalt"))
//^^^<--trim is required as,
// their can be leading and trailing space
{
TextView txtdesc = (TextView) findViewById(R.id.snowp24);
txtdesc.setText((CharSequence) e);
}
mProgressDialog.dismiss();
} catch (IOException e1) {
e1.printStackTrace();
}
return null;
}
The code:
Element div = doc.getElementById("contentinterior");
Elements tables = div.getElementsByTag("table");
Element table = tables.get(1);
String mSnow = table.getElementsByTag("tr").get(1).getElementsByTag("td").get(1).text();
You may have the wrong String for the selection parameter. The correct selector to pass to Document.select() can be found by inspecting the element on the web page, most easily by right-clicking it in the Chrome browser.
The following code may produce a better result for you:
final Elements tableElements = response.parse()
.getElementsByClass("info")
.select("td");
for (Element element : tableElements) {
String string = element.getElementsByClass("infoalt").text().trim();
Log.d("Jsoup", string);
}
Good luck and happy coding!