Android - Get hidden input value of html page using DefaultHttpClient

I already asked a similar question before, but wasn't providing enough details. So here is the "same" question, but this time more in-depth.
I've got a webpage with html, and somewhere on that page I have: <input name="__RequestVerificationToken" type="hidden" value="the_value_I_want" />
So, my question is: How can I get the value (the_value_I_want) of the hidden text field in Android, using HttpGet of an already open DefaultHttpClient connection?
My current code:
// Method to get the hidden-input value of the Token
private String getToken(){
    String url = "http://myhost/Account/Login";
    String hidden_token = "";
    String response = "";
    HttpGet get = new HttpGet(url);
    try{
        // Send the GET-request
        HttpResponse execute = MainActivity.HttpClient.execute(get);
        // Get the response of the GET-request
        InputStream content = execute.getEntity().getContent();
        BufferedReader buffer = new BufferedReader(new InputStreamReader(content));
        String s = "";
        while((s = buffer.readLine()) != null)
            response += s;
        content.close();
        buffer.close();
        // Get the value of the hidden input-field with the name __RequestVerificationToken
        // TODO
    }
    catch(Exception ex){
        ex.printStackTrace();
    }
    return hidden_token;
}
So, what should I add on the TODO-line?
Because the Token and Cookie only remain as long as the Session stays open, I can't use the Jsoup library for finding the hidden field (which I did by using the code below). Instead I need to use the already open DefaultHttpClient.
Jsoup code:
Document doc = Jsoup.connect(url).get(); // <- this opens a new session
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
Thanks in advance for the responses.
PS: For those wondering: I need this token to be able to Log-in using a Google-account with a POST-request, combined with the token I got from a Cookie.

Ok, lucky for me this was very simple. I just replaced Document doc = Jsoup.connect(url).get(); with Document doc = Jsoup.parse(html);
So in the code of the main post I replaced //TODO with:
Document doc = Jsoup.parse(response);
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
Edit 1:
I thought this did the trick, but it doesn't.. It still thinks there are two different Sessions opened.. :S
Does Jsoup.parse(...) open a new Jsoup.get-session behind the scenes?
Edit 2:
It's even worse.. Every time another page is opened on the website, another session is created and therefore another token is needed.. So I need to discuss some things with the creator of the website/web api hybrid and figure some things out.. Perhaps create a different log-in just for the Web API..
All in all I'm kinda frustrated right now, even though all the problems I've encountered are "solved"..

Related

When is it necessary to specify the application/json Content-Type explicitly

Currently, I'm building an Android mobile app and a Python RESTful server service.
I found that it makes no difference whether or not I set
self.response.headers['Content-Type'] = "application/json"
The following code (which doesn't specify Content-Type explicitly) works fine for me. In what situations should I specify Content-Type explicitly?
Python restful server services code
import json
import webapp2

class DebugHandler(webapp2.RequestHandler):
    def get(self):
        response = {}
        response["key"] = "value"
        self.response.out.write(json.dumps(response))

application = webapp2.WSGIApplication([
    ('/debug', DebugHandler),
], debug = True)
Android mobile app client code
public static String getResponseBodyAsString(String request) {
    BufferedReader bufferedReader = null;
    try {
        URL url = new URL(request);
        HttpURLConnection httpURLConnection = (HttpURLConnection)url.openConnection();
        initHttpURLConnection(httpURLConnection);
        InputStream inputStream = httpURLConnection.getInputStream();
        bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
        int charRead = 0;
        char[] buffer = new char[8*1024];
        // Use StringBuilder instead of StringBuffer; thread safety is
        // not a concern here.
        StringBuilder stringBuilder = new StringBuilder();
        while ((charRead = bufferedReader.read(buffer)) > 0) {
            stringBuilder.append(buffer, 0, charRead);
        }
        return stringBuilder.toString();
    } catch (MalformedURLException e) {
        Log.e(TAG, "", e);
    } catch (IOException e) {
        Log.e(TAG, "", e);
    } finally {
        close(bufferedReader);
    }
    return null;
}

Content-Type specifies what's inside the response (i.e. how to interpret the body of the response). Is it JSON, an HTML document, a JPEG, etc.? It is useful when you have different representations of your resources, and together with Accept it is one of the headers involved in content negotiation between client and server.
Different clients might need different formats. A C# client might prefer XML, a JavaScript client might prefer JSON, another client could work with multiple representations but try to request the most efficient one first and then settle for others if the server can't serve the preferred one, etc.
Content-Type is very important in the browser so that the user agent knows how to display the response. If you don't specify one, the browser will try to guess, usually based on the extension, and may fall back to a Save as... dialog if that also fails. In a browser, the lack of a Content-Type might cause some HTML to open a Save as... dialog, or a PDF file to be rendered as gibberish in the page.
In an application client, a missing Content-Type might cause a parsing error or might simply be ignored. If your server only serves JSON and your client only expects JSON, then you can ignore the Content-Type; the client will just assume it's JSON because that's how it was built.
But what if at some point you want to add XML as a representation, or YAML or whatever? Then you have a problem because the client assumed it's always JSON and ignored the Content-Type. Now when it receives XML it will try to parse as JSON and fail. If instead the client was built with content types in mind and you always specify a Content-Type then your client will then take it into account and select an appropriate parser instead of blindly making assumptions.
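To make the last point concrete, here is a minimal sketch of a client that looks at the Content-Type before choosing a parser. The class name ContentTypeSketch, its methods, and the parser labels are hypothetical and only for illustration; in the Android client from the question, the header value could presumably be read with httpURLConnection.getContentType().

```java
import java.util.Locale;

class ContentTypeSketch {
    // Extracts the media type from a Content-Type header value such as
    // "application/json; charset=utf-8". Returns "" when the header is missing.
    static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String type = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Chooses a parser label based on the media type instead of assuming JSON.
    // The labels stand in for real parsers (hypothetical).
    static String chooseParser(String contentTypeHeader) {
        switch (mediaType(contentTypeHeader)) {
            case "application/json":
                return "json";
            case "application/xml":
            case "text/xml":
                return "xml";
            default:
                return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseParser("application/json; charset=utf-8"));
        System.out.println(chooseParser("text/xml"));
        System.out.println(chooseParser(null));
    }
}
```

With this shape, adding a new representation later means adding one more case rather than hunting down every place that assumed JSON.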

Android - Regex-Pattern for hidden html-input field's value

I'm pretty new/bad with regex-patterns, but this is what I want:
I've got a webpage with html, and somewhere on that page I have: <input name="__RequestVerificationToken" type="hidden" value="the_value_I_want" />
So, my question is: How can I get the value (the_value_I_want) of the hidden text field in Android?
I did make the HttpGet already (see code below), I just need to know the correct Pattern for this.
Code:
// Method to get the hidden-input value of the Token
private String getToken(){
    String url = "http://myhost/Account/Login";
    String hidden_token = "";
    String response = "";
    HttpGet get = new HttpGet(url);
    try{
        // Send the GET-request
        HttpResponse execute = MainActivity.HttpClient.execute(get);
        // Get the response of the GET-request
        InputStream content = execute.getEntity().getContent();
        BufferedReader buffer = new BufferedReader(new InputStreamReader(content));
        String s = "";
        while((s = buffer.readLine()) != null)
            response += s;
    }
    catch(Exception ex){
        ex.printStackTrace();
    }
    // Get the value of the hidden input-field with the name __RequestVerificationToken
    Pattern pattern = Pattern.compile("<input name=\"" + TOKEN + "\" type=\"hidden\" value=\".\" />", Pattern.DOTALL);
    Matcher matcher = pattern.matcher(response);
    while(matcher.find())
        hidden_token = matcher.group();
    return hidden_token;
}
So, what should I replace the following line with?
Pattern pattern = Pattern.compile("<input name=\"" + TOKEN + "\" type=\"hidden\" value=\".\" />", Pattern.DOTALL);
Or should I also change something else?
Thanks in advance for the responses.
PS: For those wondering: I need this token to be able to Log-in using a Google-account with a POST-request, combined with the token I got from a Cookie.
Edit 1:
After reading the answer of this stackoverflow question I think it's better to not use a regex-pattern for the HTML page. Does anyone know a better solution (I would appreciate it if this better solution would be with a code sample).
Edit 2:
I tried using Illegal Argument's answer and added the Jsoup library. I did indeed manage to get the token by making the following changes to my code above:
Replace everything in the try { ... } with:
// Get the value of the hidden input-field with the name __RequestVerificationToken
Document doc = Jsoup.connect(url).get();
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
This does indeed get me the token of the hidden field, but now I have an entirely new problem: the token changed, because Jsoup opens a new session. So basically I can't use Jsoup.connect and am "forced" to use the already open DefaultHttpClient that I also use for the POST.
I will make a new question for this though, since my original question was just badly asked (I didn't provide all the details), and so I accept Illegal Argument's answer as the correct one (though it didn't solve my current problem, it might help others).

Try using the Jsoup library. It is an HTML parser built for this purpose.
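If adding a library is not an option, the pattern from the question can be made to work with a capture group. This is only a sketch: it assumes the attributes appear in exactly the order shown in the question (name, type, value), and it will break if the page markup changes, which is the usual reason to prefer a real HTML parser. The class name HiddenTokenSketch and the method extractToken are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenTokenSketch {
    // Returns the value attribute of the hidden input with the given name,
    // or "" if no such input is found. Assumes the attribute order
    // name, type, value, as in the question's HTML.
    static String extractToken(String html, String fieldName) {
        Pattern pattern = Pattern.compile(
                "<input\\s+name=\"" + Pattern.quote(fieldName)
                + "\"\\s+type=\"hidden\"\\s+value=\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(html);
        // group(1) is the text captured between the quotes of value="..."
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        String html = "<form><input name=\"__RequestVerificationToken\" "
                + "type=\"hidden\" value=\"the_value_I_want\" /></form>";
        System.out.println(extractToken(html, "__RequestVerificationToken"));
    }
}
```

The two changes compared with the original pattern are the capture group ([^"]*), which matches the whole value rather than a single character, and matcher.group(1), which returns only the captured value instead of the entire input tag.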

Website login and keep session cookies

I am trying to scrape some content from a website, but you must be logged in to view specific content. I want to log in using a user ID and password and keep the session cookies on m.amway.com. I tried using Jsoup; however, after using the code below I realized that Jsoup cannot run JavaScript, which is what the website is based on.
Does anyone have a method I could use to log in, keep the session cookie, and scrape content, using something other than Jsoup? Thanks in advance.
public String Jlogin(String User, String Pass) throws Exception{
    String title = "didnt work";
    Response logRes = Jsoup.connect(AmwayURL)
        .data("userid", User)
        .data("userpswd", Pass)
        .method(Method.POST)
        .execute();
    // get all cookies
    Map<String, String> cookies = logRes.cookies();
    Document doc1 = logRes.parse();
    String sessionId = logRes.cookie("JSESSIONID");
    Document doc2 = Jsoup
        .connect("https://m.amway.com/business/volume/pvbv/inquiry.ashx")
        .cookie("jsessionid", sessionId).get();
    System.out.println(doc2);
    title = doc2.toString() + "................." + sessionId;
    return title;
}

You can use a much larger API called HttpClient.
It has the following classes, among others:
- HttpGet
- HttpPost
- HttpEntity
- HttpResponse
The response entity gives you the raw body of any page, including its JavaScript source (it does not execute the scripts), as follows:
EntityUtils.toString(response.getEntity());
For more details on how to use the API, check this link (extremely helpful):
http://www.codeblues.in/blog/?p=5

Android: showing a captcha from server

For my application I need to use a captcha for verification. I am using this link for it: http://vocublablair.nl/webservices/send.php
It returns a string something like this: /webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461
Then that link should be called, being: http://vocublablair.nl/webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461
When I call this with the same HttpClient (so with the right session cookie) it gives the following error:
java.lang.IllegalArgumentException: Illegal character in query at index 94: http://vocublablair.nl/webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461

The best way is to generate your own captcha image, because relying on third-party resources may inconvenience your customers.
For generating a captcha in Android you could use this simple library:
http://simplecaptcha.sourceforge.net/

I solved the problem doing the following:
First of all, the link I got back had to be HTML-decoded (its HTML entities turned back into plain characters) using:
Html.fromHtml(String)
Calling the correct URL:
content = sb.toString(); //the fetched link
String myCookie = "";
//get session cookie from other call.
List<Cookie> cookies = ((DefaultHttpClient)httpClient).getCookieStore().getCookies();
for(Cookie c : cookies){
    if(c.getName().equals("PHPSESSID")){
        myCookie = c.getName() + "=" + c.getValue();
        break;
    }
}
URL url = new URL(Html.fromHtml("http://vocublablair.nl" + content).toString());
URLConnection connection = url.openConnection();
connection.setDoInput(true);
connection.setRequestProperty("Cookie", myCookie);
connection.connect();
Bitmap bit = BitmapFactory.decodeStream(new FlushedInputStream((InputStream) connection.getContent()));
I am using a FlushedInputStream. Read about it here: http://twigstechtips.blogspot.com/2011/10/android-loading-jpgpng-images-from-url.html
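For readers who hit the same "Illegal character in query" message: it is the text java.net.URI produces when a string contains a character that is not allowed in the query part. The sketch below assumes, purely for illustration, that the offending character is a raw space, and uses a made-up host (example.invalid) and path. The class name CaptchaUrlSketch and its methods are hypothetical. It shows that URI.create rejects such a string, while the multi-argument URI constructor percent-encodes the illegal character.

```java
import java.net.URI;
import java.net.URISyntaxException;

class CaptchaUrlSketch {
    // Returns true if URI.create accepts the string as-is.
    static boolean parses(String url) {
        try {
            URI.create(url);
            return true;
        } catch (IllegalArgumentException e) {
            // URI.create wraps the underlying URISyntaxException
            return false;
        }
    }

    // Builds a URI from its parts. The multi-argument constructor quotes
    // characters that are illegal in each component (e.g. a space -> %20).
    static URI build(String host, String path, String query) {
        try {
            return new URI("http", host, path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the query contains a raw space
        String raw = "http://example.invalid/captcha.php?_CAPTCHA&t=0.5 1338304461";
        System.out.println(parses(raw));
        System.out.println(build("example.invalid", "/captcha.php",
                "_CAPTCHA&t=0.5 1338304461"));
    }
}
```

Whether this applies to the captcha link above depends on what the server actually returned, so it is worth logging the exact string before building the request.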

Gain URL from google image search in android

I am trying to fetch the source of a webpage from a given URL in order to parse the text for a certain string which represents an image URL.
I found a post that does pretty much what I am trying to do, but I can't get it working.
This is my code below.
public String fetchImage() throws ClientProtocolException, IOException {
    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet("www.google.co.uk/images?q=songbird+oasis");
    HttpResponse response = client.execute(request);
    String html = "";
    InputStream in = response.getEntity().getContent();
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    StringBuilder str = new StringBuilder();
    String line = null;
    while((line = reader.readLine()) != null)
    {
        str.append(line);
    }
    in.close();
    html = str.toString();
    return html;
}
But for some reason it just does not work. It forces me to use a try/catch statement when calling the method. Once this works, I think it will be simple from here to use a regex to find the string "href="/imgres?imgurl=........jpg" and get the URL of a JPG image to show in an ImageView.
Please tell me if I'm going about this all wrong.

First, Google has a search API, which will be a better solution than the scraping you are going through, since the API will be reliable, and your solution will not be.
Second, use the BasicResponseHandler pattern for string responses, as it is much simpler.
Third, saying something "just does not work" is a pretty useless description for a support site like this one. If it crashes, as kgiannakakis pointed out, you will have an exception. Use adb logcat, DDMS, or the DDMS perspective in Eclipse to examine the stack trace and find out what the exception is. That will give you some clues for how to solve whatever problem you have.
