Hello, I have part of a web page, shown below:
<div id="insertA">
<form class="MultiFile-intercepted" enctype="multipart/form-data"
method="post" onsubmit="return checkAnomalyFields();"
action="dodajN.html">
<table style="border-weight: 0px">
<tbody>
<tr>
<tr>
<td id="wybory"><select id="typ" onchange="typeSelected()" size="1"
name="typuId">
</td>
<td>
</tr>
<tr>
<td>Szerokość: <input id="szer" type="text" onchange="setMarker()"
value="" name="szer">
<div id="szerErr" class="err">Proszę podać szerokość na terenie
Polski (49-55).</div>
</td>
<td>Długość: <input id="dlug" type="text" onchange="setMarker()"
value="" name="dlug">
<div id="dlugErr" class="err">Proszę podać długość na terenie
Polski (14-25).</div> <input id="id" type="hidden" value=""
name="id">
</td>
</tr>
I want to make an HTTP POST request that sends data from my client and submits it to this form.
I am doing this as follows:
try {
HttpClient client = new MyHttpClient(Send.this);
String postURL = "url";
HttpPost post = new HttpPost(postURL);
//FileBody bin = new FileBody(file);
MultipartEntity reqEntity = new MultipartEntity(HttpMultipartMode.BROWSER_COMPATIBLE);
//reqEntity.addPart("myFile", bin);
reqEntity.addPart("typuId", new StringBody("1"));
reqEntity.addPart("statusuId", new StringBody("2"));
reqEntity.addPart("szer", new StringBody("52.321911"));
reqEntity.addPart("dlug", new StringBody("19.464111000000003"));
reqEntity.addPart("opis", new StringBody("jakis opis"));
post.setEntity(reqEntity);
HttpResponse response = client.execute(post);
HttpEntity resEntity = response.getEntity();
if (resEntity != null) {
// The entity stream can be consumed only once, so read it into a String a single time.
String body = EntityUtils.toString(resEntity);
Log.i("RESPONSE", body);
AlertDialog.Builder alert = new AlertDialog.Builder(Send.this);
// "Niepoprawne dane" means "Invalid data".
alert.setTitle("Niepoprawne dane").setMessage(body).setNeutralButton("OK", null).show();
}
} catch (Exception e) {
e.printStackTrace();
}
The problem is that when I read the response, I get the HTML of the page I am requesting, without a success code or anything similar. It looks as if I am requesting the page content rather than submitting the form. Any idea what I am doing wrong?
You're submitting to a .html file. Generally servers aren't configured to treat those files as scripts, which means the data you're submitting is simply ignored and discarded. To handle a form submission, you have to submit to a script or other program specifically designed to handle that submission, e.g. a PHP script.
OK, to clarify what Marc B said: action="dodajN.html" is almost certainly wrong. I've never seen a web server that lets you do this (of course, anything is possible). It should probably be action="cgi-bin/something" or something like that.
That's actually not so important, however, since your app isn't using the action attribute anyway. It is writing to "url" instead, which is even more wrong. If you tell us exactly which URL you're really writing to, it might help.
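As a rough illustration of which URL the app would have to post to: a browser resolves the form's action against the address of the page that contains the form. The sketch below does the same with java.net.URI. The page address http://example.com/app/form.html is a made-up placeholder (the question does not give the real one), and FormActionUrl is a hypothetical helper name.

```java
import java.net.URI;

public class FormActionUrl {
    /** Resolves a form's action attribute against the URL of the page containing the form. */
    public static String resolve(String pageUrl, String action) {
        return URI.create(pageUrl).resolve(action).toString();
    }

    public static void main(String[] args) {
        // Placeholder page URL (assumption); "dodajN.html" is the action from the question.
        System.out.println(resolve("http://example.com/app/form.html", "dodajN.html"));
    }
}
```

Whatever this resolves to for the real page is the address the POST should go to, provided the server actually handles submissions there.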
But ultimately, the way you debug this is to look at the server logs and see what's happening at that end.
As a general rule, when I'm developing something like this, I first write the server-side cgi script and a web page to use it. Once my API is working through the web page, only then do I start trying to call the cgi script from an Android app.
My debugging process consists of: 1) Reading the server logs. 2) Having my cgi scripts write their own debug logs. 3) Having my android app dump the response code and headers to logcat.
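Step 3 could look roughly like the sketch below. It uses the standard java.net.HttpURLConnection rather than the Apache HttpClient from the question, and ResponseDump is a hypothetical helper name. On Android you would pass the resulting string to Log.i instead of printing it.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.util.List;
import java.util.Map;

public class ResponseDump {
    /** Formats a status code and response headers as one line per header. */
    public static String formatResponse(int status, Map<String, List<String>> headers) {
        StringBuilder sb = new StringBuilder("HTTP ").append(status).append('\n');
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            // HttpURLConnection stores the status line under a null key; skip it.
            if (e.getKey() == null) continue;
            sb.append(e.getKey()).append(": ")
              .append(String.join(", ", e.getValue())).append('\n');
        }
        return sb.toString();
    }

    /** Reads the status code and headers from a connection whose request has already been sent. */
    public static String describe(HttpURLConnection conn) throws IOException {
        return formatResponse(conn.getResponseCode(), conn.getHeaderFields());
    }
}
```

A 200 with a Content-Type of text/html and no redirect is a strong hint that the server simply served the page instead of processing the submission.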
Related
I'm having trouble getting all the html code under the tags. Here is my current code:
Document document = Jsoup.connect("http://stackoverflow.com/questions/2971155/what-is-the-fastest-way-to-scrape-html-webpage-in-android").get();
Elements desc = document.select("tr");
System.out.println(desc.toString());
It's for that question, and I'm trying to get the text from the question's description. But I'm not getting certain tr or td tags, like the ones for the question. Here is the td tag I'm trying to get:
<td class="postcell">
Under that tag is the actual post. Now when I print out what I'm actually getting, I'm getting a ton of empty td tags and some comments, but not the actual post.
<tr id="comment-37956942" class="comment ">
<td>
<table>
<tbody>
<tr>
<td class=" comment-score"> </td>
<td> </td>
</tr>
</tbody>
</table> </td>
<td class="comment-text">
<div style="display: block;" class="comment-body">
<span class="comment-copy">You shouldn't parse HTML with regexes: blog.codinghorror.com/parsing-html-the-cthulhu-way</span> –
﹕ motobói
And it keeps on going with empty td and tr tags. I can't find the actual question. Anyone know why this is happening?
Essentially, I just want the text from the question's post, and I don't know how to get it, so it would be nice if someone could show me how to get the text.
Jsoup is a parser. That means it can't execute any JavaScript code that might generate HTML. When you run into this problem, the usual way to retrieve that content is through a headless browser that includes a JavaScript engine. A popular library is Selenium WebDriver.
To determine whether the content you are trying to parse is generated on the server (static content) or on the client (dynamic, JavaScript-generated content), you can do the following:
1) Visit the page you want to parse.
2) Press Ctrl + U.
The steps above will open a new tab that contains the content that jsoup receives. If the content you need is not there, then it's generated by javascript.
Follow the steps and search for the content. If it's there, but jsoup still has problems, then most probably the case is that the site considers you a bot or a mobile device. Try setting the userAgent of a desktop browser and see what happens.
Document document = Jsoup.connect("http://stackoverflow.com/questions/2971155/what-is-the-fastest-way-to-scrape-html-webpage-in-android").userAgent("USER_AGENT_HERE").get();
Most importantly, when the site exposes an API for users to extract information programmatically, it's better to just use that.
Stack Overflow has an API available.
I am new to Android development, and in my Android app I want to retrieve HTML table values from a website. I will use the values that I get from the HTML table in the background.
I want to get (td) values.
HTML table is like this.
<table width="482" height="187" cellspacing="1" cellpadding="1" border="1">
<tbody>
<tr>
<td> </td>
<td colspan="3" style="text-align:center;>
<strong>"UKOME KARARI</strong>
</td>
</tr>
<tr>
<td>Elektronik Bilet</td>
.....
</table>
Can I use the Jsoup? How?
You can use the jsoup library inside your Android application to extract an HTML table from a website.
Using this library you can parse HTML pages on Android.
Download the jsoup library from http://jsoup.org/download.
Happy Coding...
In a browser, if I want to submit a form containing a username and a password input, I only need to add an "action" attribute and set the "method" attribute to "post":
<form method="post" name="form" action="https://www.xxx">
<input id="username" type="text" value="xxxxxx" name="username">
<input id="password" type="password" autocomplete="off" name="password">
<input type="submit" value="submit" >
</form>
Then the browser handles the post: it combines the username and password into the request and sends it to the server.
My question is: in WebKit (I am mainly concerned with Android's WebKit, but I think other ports would do as well), where is the code that handles this process? Can I find the code that gets the text from the input elements, concatenates it, and then sends it to the server?
Thanks
Since there was no answer, I'm posting what I finally found.
For WebKit on Android, there are mainly four directories to consider:
For the Java part:
frameworks/base/core/java/android/webkit
For the native C/C++ part:
external/webkit/Source/WebCore
external/webkit/Source/WebKit
external/webkit/Source/JavaScriptCore
For a form submit, the Java part of WebKit sends a key event, which is handled by the native WebCore.
The relevant code is in:
WebCore/html/HTMLFormElement.cpp
WebCore/loader/FormSubmission.cpp
WebCore/loader/FrameLoader.cpp
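For orientation, the result of that process for the form in the question can be approximated in plain Java. This is not WebKit's code, only a sketch of what ends up in the request body when the form uses the default application/x-www-form-urlencoded encoding (the form in the question sets no enctype). FormEncoder is a hypothetical name, the password value is made up, and the Charset overload of URLEncoder.encode assumes Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FormEncoder {
    /** Joins name=value pairs with '&', percent-encoding each name and value. */
    public static String encode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in document order, as a browser would.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", "xxxxxx");
        fields.put("password", "p@ss word"); // made-up value for illustration
        System.out.println(encode(fields)); // username=xxxxxx&password=p%40ss+word
    }
}
```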
I am making an Android application that can fetch the new announcements from the website of my university.
This is the HTML code in the website:
sample_html_code http://img690.imageshack.us/img690/1079/88210050.png
Text version:
<table border="1" width="90%" class="duyuru">
<tbody>
<tr>
<td>
<h3 class="duyuru">Additional Quotas for the Technical Electives</h3>
"19/09/2012"
<h4 class="duyuru">"Additional Quotas for Technical Electives offered in...</h4>
<span class="duyuru"></span>
<br>
Download
</td>
</tr>
</tbody>
</table>
I can get the first and third lines "Additional Quotas for Technical Electives" and "Additional Quotas for ..." by using the piece of code below. However, I cannot get the date information (19/09/2012) located between h3 and h4 lines.
String patternStr ="\\<h3 class=\"duyuru\".*?\\>(.*?)\\</h3\\>";
patternStr+="(.*?)"; // This line is problematic
patternStr+=".*?\\<h4 class=\"duyuru\".*?\\>(.*?)\\</h4\\>";
Pattern pattern = Pattern.compile(patternStr, Pattern.DOTALL);
Matcher matcher = pattern.matcher(content);
String name = "";
String date = "";
String details = "";
while (matcher.find()){
name = matcher.group(1);
date = matcher.group(2);
details = matcher.group(3);
Announcement announcement = new Announcement();
announcement.setName(name);
announcement.setDate(date);
announcement.setDetails(details);
announcements.add(announcement);
}
I tried using
.*?\"(.*?)\"
but it didn't work. When I do this, it gets the string "duyuru" from the line starting with h4 tag instead of the date information.
Does anyone have an idea how I can grab the date information?
Thanks in advance.
Your regular expression misses the newlines and whitespace in the input.
The simplest possible match I could come up with is:
"\\<h3 class=\"duyuru\".*?\\>\\n?\\s*(.*?)\\n?\\s*\\</h3\\>"
But keep in mind that such a regular expression is highly specific to your HTML.
My advice would be to have a look at a real HTML parser for Java, such as TagSoup. Once you start using one of those, parsing this type of HTML document becomes a breeze.
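If you do stay with a regular expression, one possible approach is sketched below, assuming the markup looks exactly like the sample in the question. The second group is pinned between the closing h3 tag and the opening h4 tag, so the lazy quantifier cannot settle for an empty match, and the surrounding whitespace and quotes are stripped afterwards. AnnouncementRegex and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnouncementRegex {
    // Group 1: h3 text, group 2: everything between </h3> and <h4 ...>, group 3: h4 text.
    private static final Pattern ANNOUNCEMENT = Pattern.compile(
        "<h3 class=\"duyuru\"[^>]*>(.*?)</h3>(.*?)<h4 class=\"duyuru\"[^>]*>(.*?)</h4>",
        Pattern.DOTALL);

    /** Returns {name, date, details} for the first announcement, or null if nothing matches. */
    public static String[] firstAnnouncement(String html) {
        Matcher m = ANNOUNCEMENT.matcher(html);
        if (!m.find()) return null;
        return new String[] { clean(m.group(1)), clean(m.group(2)), clean(m.group(3)) };
    }

    // Trims whitespace, then removes one leading and one trailing double quote if present.
    private static String clean(String s) {
        return s.trim().replaceAll("^\"|\"$", "");
    }

    public static void main(String[] args) {
        String html = "<td>\n<h3 class=\"duyuru\">Additional Quotas for the Technical Electives</h3>\n"
            + "\"19/09/2012\"\n<h4 class=\"duyuru\">\"Additional Quotas...</h4>\n</td>";
        System.out.println(firstAnnouncement(html)[1]); // 19/09/2012
    }
}
```

Like any regex over HTML, this breaks as soon as the page layout changes, which is why a parser remains the more robust choice.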
When I post my events' XML file to Google's server, I sometimes receive the HTML below, and at other times the request works fine. I am confused about why this happens. Can anyone help me?
Is it caused by a connection error, an invalid token, or something else?
<html><head><meta http-equiv="content-type" content="text/html;charset=UTF-8">
<title>Error</title>
<style type="text/css">body {font-family: arial,sans-serif}</style></head>
<body text="#000000" bgcolor="#ffffff"><table border="0" cellpadding="2" cellspacing="0" width="100%"><tr><td rowspan="3" width="1%" nowrap><b><font face="times" size="10"><font color="#0039b6">G</font> <font color="#c41200">o</font> <font color="#f3c518">o</font> <font color="#0039b6">g</font> <font color="#30a72f">l</font> <font color="#c41200">e</font></font> </b></td>
<td> </td></tr>
<tr><td bgcolor="#3366cc"><font face="arial,sans-serif" color="#ffffff"><b>Error</b></font></td></tr>
<tr><td> </td></tr></table>
<blockquote>Cannot access the calendar you requested</blockquote>
<p></p>
<div style="background:#3366cc; width:1px; height:4px"></div></body></html>
Well, I can't say I really like the answer to this question, but I was having the same issue and found an answer after a bit of finagling.
Google has its own session ID that it uses for these kinds of requests. The first time you make a request, it starts the session and gives you a redirect; it also causes the error you saw above. From what I can gather, if you try the request again after the session ID has been set, the request will go through.
In other words, you have to send the request and check the response from Google to see if you're being redirected. If you are, you have a couple of options to grab the URL that has Google's session ID (gsessionid) included; I chose to parse the Location header out of the response, which shows the URL to which the data should be posted. Try your request again (and any subsequent requests) by posting to that new URL, and it should work like a charm. Just takes a bit to get there.
For more info about this, check the Google documentation on redirects and this somewhat related StackOverflow question.
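The retry described above might be sketched as follows with the standard java.net.HttpURLConnection. This is only an outline under assumptions: it presumes the Location header holds an absolute URL, it follows at most one redirect, it leaves out authentication headers, and RedirectingPost is a hypothetical name. It does not treat gsessionid specially; it simply re-posts to whatever URL the server hands back.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;

public class RedirectingPost {
    /** True for the HTTP status codes that signal a redirect with a Location header. */
    public static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307 || status == 308;
    }

    // Sends one POST without letting the connection follow redirects on its own.
    private static HttpURLConnection post(String url, byte[] body, String contentType)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setInstanceFollowRedirects(false);
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", contentType);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn;
    }

    /** Posts the body and, if redirected, posts it once more to the Location URL. */
    public static int postFollowingOneRedirect(String url, byte[] body, String contentType)
            throws IOException {
        HttpURLConnection first = post(url, body, contentType);
        int status = first.getResponseCode();
        String location = first.getHeaderField("Location");
        first.disconnect();
        if (isRedirect(status) && location != null) {
            // Assumption: Location is absolute and already carries the session ID.
            HttpURLConnection second = post(location, body, contentType);
            status = second.getResponseCode();
            second.disconnect();
        }
        return status;
    }
}
```

Keeping the redirected URL around for subsequent requests avoids paying for the extra round trip every time.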