Android - Regex-Pattern for hidden html-input field's value

I'm pretty new/bad with regex-patterns, but this is what I want:
I've got a webpage with html, and somewhere on that page I have: <input name="__RequestVerificationToken" type="hidden" value="the_value_I_want" />
So, my question is: How can I get the value (the_value_I_want) of the hidden text field in Android?
I have already made the HttpGet request (see the code below); I just need to know the correct pattern for this.
Code:
// Method to get the hidden-input value of the Token
private String getToken(){
    String url = "http://myhost/Account/Login";
    String hidden_token = "";
    String response = "";
    HttpGet get = new HttpGet(url);
    try{
        // Send the GET-request
        HttpResponse execute = MainActivity.HttpClient.execute(get);
        // Get the response of the GET-request
        InputStream content = execute.getEntity().getContent();
        BufferedReader buffer = new BufferedReader(new InputStreamReader(content));
        String s = "";
        while((s = buffer.readLine()) != null)
            response += s;
    }
    catch(Exception ex){
        ex.printStackTrace();
    }
    // Get the value of the hidden input-field with the name __RequestVerificationToken
    Pattern pattern = Pattern.compile("<input name=\"" + TOKEN + "\" type=\"hidden\" value=\".\" />", Pattern.DOTALL);
    Matcher matcher = pattern.matcher(response);
    while(matcher.find())
        hidden_token = matcher.group();
    return hidden_token;
}
So, what should I replace the following line with?
Pattern pattern = Pattern.compile("<input name=\"" + TOKEN + "\" type=\"hidden\" value=\".\" />", Pattern.DOTALL);
Or should I also change something else?
Thanks in advance for the responses.
PS: For those wondering: I need this token to be able to Log-in using a Google-account with a POST-request, combined with the token I got from a Cookie.
Edit 1:
After reading the answer to this Stack Overflow question, I think it's better not to use a regex pattern for the HTML page. Does anyone know a better solution? (I would appreciate it if the better solution came with a code sample.)
Edit 2:
I tried using Illegal Argument's answer and added the Jsoup library. I did indeed manage to get the token by making the following changes to my code above:
Replace everything in the try { ... } with:
// Get the value of the hidden input-field with the name __RequestVerificationToken
Document doc = Jsoup.connect(url).get();
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
This does indeed get me the token of the hidden field, but now I have an entirely new problem: the token changed, because Jsoup opens a new session. So basically I can't use Jsoup and am "forced" to use the already open DefaultHttpClient that I also use for the POST.
I will make a new question for this though, since my original question was just poorly asked (I did not provide all the details), so I accept Illegal Argument's answer as the correct one (though it didn't solve my current problem, it might help others).

Try using the Jsoup library. It is an HTML parser built for this purpose.
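For readers who still want the regex the question asked for, here is a minimal sketch using only java.util.regex. The pattern in the question has no capture group and its "." matches a single character only; a group such as ([^"]*) captures the whole value. The class and method names are made up for illustration, and the pattern assumes the attributes appear in exactly the order shown in the question (name, type, value), which is fragile compared to a real HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenTokenRegex {
    // Returns the value of <input name="fieldName" type="hidden" value="...">,
    // or null when no such field is found.
    // Assumption: the attributes appear in this exact order, as in the question.
    public static String extractHiddenValue(String html, String fieldName) {
        Pattern pattern = Pattern.compile(
                "<input\\s+name=\"" + Pattern.quote(fieldName)
                + "\"\\s+type=\"hidden\"\\s+value=\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(html);
        // group(1) is the text captured by the parentheses: the value only
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<form><input name=\"__RequestVerificationToken\" "
                + "type=\"hidden\" value=\"the_value_I_want\" /></form>";
        System.out.println(extractHiddenValue(html, "__RequestVerificationToken"));
    }
}
```

In the getToken() method from the question, the response string would be passed as html and TOKEN as fieldName.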

Related

When is it necessary to specify application/json Content-Type explicitly

Currently, I'm building an Android mobile app and a Python RESTful server service.
I found that it makes no difference whether or not I set
self.response.headers['Content-Type'] = "application/json"
The following code (which doesn't specify Content-Type explicitly) works fine for me. I was wondering: in what situation should I specify Content-Type explicitly?
Python RESTful server service code
import json
import webapp2

class DebugHandler(webapp2.RequestHandler):
    def get(self):
        response = {}
        response["key"] = "value"
        self.response.out.write(json.dumps(response))

application = webapp2.WSGIApplication([
    ('/debug', DebugHandler),
], debug = True)
Android mobile app client code
public static String getResponseBodyAsString(String request) {
    BufferedReader bufferedReader = null;
    try {
        URL url = new URL(request);
        HttpURLConnection httpURLConnection = (HttpURLConnection)url.openConnection();
        initHttpURLConnection(httpURLConnection);
        InputStream inputStream = httpURLConnection.getInputStream();
        bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
        int charRead = 0;
        char[] buffer = new char[8*1024];
        // Use StringBuilder instead of StringBuffer; thread safety is not
        // a concern here.
        StringBuilder stringBuilder = new StringBuilder();
        while ((charRead = bufferedReader.read(buffer)) > 0) {
            stringBuilder.append(buffer, 0, charRead);
        }
        return stringBuilder.toString();
    } catch (MalformedURLException e) {
        Log.e(TAG, "", e);
    } catch (IOException e) {
        Log.e(TAG, "", e);
    } finally {
        close(bufferedReader);
    }
    return null;
}

Content-Type specifies what's inside the response (i.e. how to interpret the body of the response). Is it JSON, an HTML document, a JPEG, etc.? It is useful when you have different representations of your resources, and together with Accept it is one of the headers involved in content negotiation between client and server.
Different clients might need different formats. A C# client might prefer XML, a Javascript client might prefer JSON, another client could work with multiple representations but try to request the most efficient one first and then settle for others if the server can't serve the preferred one, etc.
Content-Type is very important in the browser so that the user agent knows how to display the response. If you don't specify one, the browser will try to guess, usually based on the extension, and may fall back to a Save as... dialog if that also fails. In a browser, the lack of a Content-Type might cause some HTML to open a Save as... dialog, or a PDF file to be rendered as gibberish in the page.
In an application client, not having a Content-Type might cause a parsing error or might be ignored. If your server only serves JSON and your client only expects JSON, then you can ignore the Content-Type; the client will just assume it's JSON because that's how it was built.
But what if at some point you want to add XML as a representation, or YAML or whatever? Then you have a problem because the client assumed it's always JSON and ignored the Content-Type. Now when it receives XML it will try to parse as JSON and fail. If instead the client was built with content types in mind and you always specify a Content-Type then your client will then take it into account and select an appropriate parser instead of blindly making assumptions.
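As a rough sketch of a client "built with content types in mind", the parser can be chosen from the header value instead of being assumed. The class name, method names, and the returned parser labels are hypothetical; in the Android client from the question the header value could be read with httpURLConnection.getContentType().

```java
import java.util.Locale;

class ContentTypeDispatch {
    // Strips parameters such as "; charset=utf-8" and normalizes case:
    // "Application/JSON; charset=utf-8" -> "application/json".
    public static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String type = semicolon >= 0
                ? contentTypeHeader.substring(0, semicolon)
                : contentTypeHeader;
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Picks a parser label from the header instead of blindly assuming JSON.
    public static String chooseParser(String contentTypeHeader) {
        switch (mediaType(contentTypeHeader)) {
            case "application/json":
                return "json";
            case "application/xml":
            case "text/xml":
                return "xml";
            default:
                // Missing or unknown type: the caller decides what to do
                return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(chooseParser("application/json; charset=utf-8"));
        System.out.println(chooseParser("text/xml"));
        System.out.println(chooseParser(null));
    }
}
```

With this in place, adding a new representation later means adding one case rather than changing every caller that assumed JSON.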

Android - Get hidden input value of html page using DefaultHttpClient

I already asked a similar question before, but wasn't providing enough details. So here is the "same" question, but this time more in-depth.
I've got a webpage with html, and somewhere on that page I have: <input name="__RequestVerificationToken" type="hidden" value="the_value_I_want" />
So, my question is: How can I get the value (the_value_I_want) of the hidden text field in Android, using HttpGet of an already open DefaultHttpClient connection?
My current code:
// Method to get the hidden-input value of the Token
private String getToken(){
    String url = "http://myhost/Account/Login";
    String hidden_token = "";
    String response = "";
    HttpGet get = new HttpGet(url);
    try{
        // Send the GET-request
        HttpResponse execute = MainActivity.HttpClient.execute(get);
        // Get the response of the GET-request
        InputStream content = execute.getEntity().getContent();
        BufferedReader buffer = new BufferedReader(new InputStreamReader(content));
        String s = "";
        while((s = buffer.readLine()) != null)
            response += s;
        content.close();
        buffer.close();
        // Get the value of the hidden input-field with the name __RequestVerificationToken
        // TODO
    }
    catch(Exception ex){
        ex.printStackTrace();
    }
    return hidden_token;
}
So, what should I add on the TODO-line?
Because the Token and Cookie only remain as long as the Session stays open, I can't use the Jsoup library for finding the hidden field (which I did by using the code below). Instead I need to use the already open DefaultHttpClient.
Jsoup code:
Document doc = Jsoup.connect(url).get(); // <- this opens a new session
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
Thanks in advance for the responses.
PS: For those wondering: I need this token to be able to Log-in using a Google-account with a POST-request, combined with the token I got from a Cookie.

Ok, lucky for me this was very simple. I just replaced Document doc = Jsoup.connect(url).get(); with Document doc = Jsoup.parse(html);
So in the code of the main post I replaced //TODO with:
Document doc = Jsoup.parse(response);
org.jsoup.nodes.Element el = doc.select("input[name*=" + TOKEN + "]").first();
hidden_token = el.attr("value");
Edit 1:
I thought this did the trick, but it doesn't.. It still thinks there are two different Sessions opened.. :S
Does Jsoup.parse(...) open a new Jsoup.get-session behind the scenes?
Edit 2:
It's even worse.. Every time another page is opened on the website, another session is created and therefore another token is needed.. So I need to discuss some things with the creator of the website/web api hybrid and figure some things out.. Perhaps create a different log-in just for the Web API..
All in all I'm kinda frustrated right now, even though all the problems I've encountered are "solved"..

Android: showing a captcha from server

For my application I need to use a captcha for verification. I am using this link for it: http://vocublablair.nl/webservices/send.php
It returns a string something like this: /webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461
Then that link should be called, being: http://vocublablair.nl/webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461
When I call this with the same HttpClient (so with the right session cookie) it gives the following error:
java.lang.IllegalArgumentException: Illegal character in query at index 94: http://vocublablair.nl/webservices/simple-php-captcha.php?_CAPTCHA&t=0.59145200+1338304461

The best way is to generate your own captcha image, because relying on third-party resources may inconvenience your customers.
To generate a captcha in Android you could use this simple library:
http://simplecaptcha.sourceforge.net/

I solved the problem doing the following:
First of all, the link I got back had to be HTML-decoded using:
Html.fromHtml(String)
Calling the correct URL:
content = sb.toString(); //the fetched link
String myCookie = "";
//get session cookie from other call.
List<Cookie> cookies = ((DefaultHttpClient)httpClient).getCookieStore().getCookies();
for(Cookie c : cookies){
    if(c.getName().equals("PHPSESSID")){
        myCookie = c.getName() + "=" + c.getValue();
        break;
    }
}
URL url = new URL(Html.fromHtml("http://vocublablair.nl" + content).toString());
URLConnection connection = url.openConnection();
connection.setDoInput(true);
connection.setRequestProperty("Cookie", myCookie);
connection.connect();
Bitmap bit = BitmapFactory.decodeStream(new FlushedInputStream((InputStream) connection.getContent()));
I am using a FlushedInputStream. Read about it here: http://twigstechtips.blogspot.com/2011/10/android-loading-jpgpng-images-from-url.html

Gain URL from google image search in android

I am trying to view the source code of a webpage, given a URL, in order to parse the text for a certain string which represents an image URL.
I found a post that does pretty much what I am trying to do, but I can't get it working.
This is my code below.
public String fetchImage() throws ClientProtocolException, IOException {
    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet("www.google.co.uk/images?q=songbird+oasis");
    HttpResponse response = client.execute(request);
    String html = "";
    InputStream in = response.getEntity().getContent();
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    StringBuilder str = new StringBuilder();
    String line = null;
    while((line = reader.readLine()) != null)
    {
        str.append(line);
    }
    in.close();
    html = str.toString();
    return html;
}
but for some reason it just does not work. It forces me to use a try/catch statement when calling the method. Once this works, I think it will be simple from here to use a regex to find the string "href="/imgres?imgurl=........jpg" and get the URL of a JPG image, which is then shown in an ImageView.
Please tell me if I'm going about this all wrong.

First, Google has a search API, which will be a better solution than the scraping you are going through, since the API will be reliable, and your solution will not be.
Second, use the BasicResponseHandler pattern for string responses, as it is much simpler.
Third, saying something "just does not work" is a pretty useless description for a support site like this one. If it crashes, as kgiannakakis pointed out, you will have an exception. Use adb logcat, DDMS, or the DDMS perspective in Eclipse to examine the stack trace and find out what the exception is. That will give you some clues for how to solve whatever problem you have.
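One thing worth checking that the answers above do not mention, offered as an assumption rather than a confirmed diagnosis: the string passed to HttpGet in the question has no scheme ("http://"), so it is not an absolute URL. The sketch below uses only java.net.URI to test for that before a request is made; the class and method names are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

class SchemeCheck {
    // True only when the string parses as a URI with an http/https scheme
    // and a host. "www.example.com/x" parses as a relative path, so both
    // the scheme and the host come back null for it.
    public static boolean isAbsoluteHttpUrl(String candidate) {
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && ("http".equalsIgnoreCase(scheme)
                        || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            // Illegal characters (for example a raw space) end up here
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAbsoluteHttpUrl("www.google.co.uk/images?q=songbird+oasis"));
        System.out.println(isAbsoluteHttpUrl("http://www.google.co.uk/images?q=songbird+oasis"));
    }
}
```

The first call prints false and the second prints true, so prefixing the URL in the question with "http://" is a cheap thing to try before digging through the stack trace.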

how to url encode in android?

I am using a GridView to display images obtained by XML parsing, and I get an exception like:
java.lang.IllegalArgumentException: Illegal character in path at index 80: http://www.theblacksheeponline.com/party_img/thumbspps/912big_361999096_Flicking Off Douchebag.jpg
How can I solve this problem? I want to be able to display all kinds of URLs. If anybody knows, please give me some sample code.
Thanks All

URL encoding is done in the same way on Android as in Java SE:
try {
    String url = "http://www.example.com/?id=123&art=abc";
    String encodedurl = URLEncoder.encode(url,"UTF-8");
    Log.d("TEST", encodedurl);
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}

You can also use this:
private static final String ALLOWED_URI_CHARS = "@#&=*+-_.,:!?()/~'%";
String urlEncoded = Uri.encode(path, ALLOWED_URI_CHARS);
It's the simplest method.

As Ben says in his comment, you should not use URLEncoder.encode on full URLs, because you will change the semantics of the URL, per the following example from the W3C:
The URIs
http://www.w3.org/albert/bertram/marie-claude
and
http://www.w3.org/albert/bertram%2Fmarie-claude
are NOT identical, as in the second case the encoded slash does not have hierarchical significance.
Instead, you should encode component parts of a URL independently, per the following from RFC 3986 Section 2.4:
Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. This is when an implementation determines which of the reserved characters are to be used as subcomponent delimiters and which can be safely used as data. Once produced, a URI is always in its percent-encoded form.
So, in short, for your case you should encode/escape your filename and then assemble the URL.
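A minimal sketch of "encode the filename, then assemble the URL", using only the standard library. The class name, method names, base URL, and file name are placeholders, not taken from the question. Note that URLEncoder targets form data and writes a space as "+", so this sketch rewrites that to "%20" for use in a path; a literal "+" in the input is already encoded as "%2B" and is not affected.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class PathSegmentEncoding {
    // Percent-encodes a single path segment (for example a file name).
    public static String encodeSegment(String segment) {
        try {
            // URLEncoder emits '+' for spaces; a path needs "%20" instead
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform
            throw new IllegalStateException(e);
        }
    }

    // Assembles the URL from an already valid base and a raw file name.
    public static String buildUrl(String base, String fileName) {
        return base + "/" + encodeSegment(fileName);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Encoding only the file name keeps the URL structure intact
        System.out.println(buildUrl("http://www.example.com/img", "my photo.jpg"));
        // Encoding the whole URL also escapes ':' and '/', which breaks it
        System.out.println(URLEncoder.encode("http://www.example.com/img/my photo.jpg", "UTF-8"));
    }
}
```

The first line printed is a usable URL ending in my%20photo.jpg; the second starts with http%3A%2F%2F and is no longer a URL at all, which is the problem the W3C example describes.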

You don't encode the entire URL, only the parts of it that come from "unreliable sources", like this:
String query = URLEncoder.encode("Hare Krishna ", "utf-8");
String url = "http://stackoverflow.com/search?q=" + query;

URLEncoder should be used only to encode queries. Use the java.net.URI class instead:
URI uri = new URI(
    "http",
    "www.theblacksheeponline.com",
    "/party_img/thumbspps/912big_361999096_Flicking Off Douchebag.jpg",
    null);
String request = uri.toASCIIString();

You can use the method below:
public String parseURL(String url, Map<String, String> params)
{
    // Uri.Builder escapes each query parameter as it is appended
    Uri.Builder builder = Uri.parse(url).buildUpon();
    for (String key : params.keySet())
    {
        builder.appendQueryParameter(key, params.get(key));
    }
    return builder.build().toString();
}

I tried URLEncoder, which replaced the space (" ") with a plus sign (+), but that did not work and I got a 404 (URL not found) error.
Then I searched for a better answer, found this, and it works great.
String urlStr = "http://www.example.com/test/file name.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();
This way of encoding the URL is very useful because the URL class splits the URL into its different parts, so there is no need to perform any string operations.
The URI class then takes over: this approach takes advantage of the URI class properly escaping components when you construct a URI from components rather than from a single string.

I recently wrote a quick URI encoder for this purpose. It even handles unicode characters.
http://www.dmurph.com/2011/01/java-uri-encoder/
