JSoup get absolute url of an image with special characters - android

I am working with JSoup on Android to get image URLs from a site, but some URLs contain special characters (é, è, à...), for example:
http://www.mysite.com/détail du jour.jpg
element.attr("abs:src") returns the same URL as above.
So far there is no problem retrieving the URL, but when I pass it to the code below, it fails with "file not found" (I took this function from an example on the internet):
public Object fetch(String address) throws MalformedURLException,IOException {
    try {
        URL url = new URL(address);
        Object content = url.getContent();
        return content;
    } catch (Exception e) {
        return null;
    }
}
I think the problem is the URL format, because when I copy the real address of the image from Google Chrome:
http://www.mysite.com/d%C3%A9tail%20du%20jour.jpg
and use it in the code like this:
URL url = new URL("http://www.mysite.com/d%C3%A9tail%20du%20jour.jpg");
the image loads correctly. So how can I get this encoded URL from JSoup?
Thanks

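You need to percent-encode the URL extracted by JSoup before opening it. Do not pass the whole address through URLEncoder.encode, though: that also escapes the ":" and "/" characters, so new URL(...) then fails because it can no longer find the protocol. Encode the components instead, for example by rebuilding the address with the multi-argument java.net.URI constructor.
Something like:
URL parsed = new URL(address);
URL url = new URL(new URI(parsed.getProtocol(), parsed.getUserInfo(), parsed.getHost(), parsed.getPort(), parsed.getPath(), parsed.getQuery(), parsed.getRef()).toASCIIString());
The spaces will be replaced with %20 and non-ASCII characters such as é with their UTF-8 percent escapes (%C3%A9).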
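For the example address in the question, one way to obtain the percent-encoded form is to split the address with java.net.URL and rebuild it with the multi-argument java.net.URI constructor. The sketch below is plain Java SE, and the class and method names (ImageUrlEncoder, toEncodedUrl) are made up for illustration. It assumes the input is not already percent-encoded.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

class ImageUrlEncoder {
    // Hypothetical helper: percent-encodes spaces and non-ASCII characters
    // while leaving the URL structure (scheme, host, slashes) untouched.
    static String toEncodedUrl(String address) {
        try {
            // java.net.URL parses leniently, so it can split the raw address.
            URL parsed = new URL(address);
            // The multi-argument URI constructor quotes illegal characters
            // in each component, e.g. a space in the path becomes %20.
            URI uri = new URI(parsed.getProtocol(), parsed.getUserInfo(),
                    parsed.getHost(), parsed.getPort(), parsed.getPath(),
                    parsed.getQuery(), parsed.getRef());
            // toASCIIString() additionally escapes non-ASCII characters as UTF-8.
            return uri.toASCIIString();
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot encode: " + address, e);
        }
    }

    public static void main(String[] args) {
        // \u00e9 is the letter é, written as an escape to avoid source-encoding issues.
        System.out.println(toEncodedUrl("http://www.mysite.com/d\u00e9tail du jour.jpg"));
    }
}
```

Note that these URI constructors always quote the "%" character, so an address that is already encoded would be encoded a second time.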

Related

Android - large data json download

I'm facing a problem in my Android app: it fails to retrieve large JSON data from a web service. The JSON is large because it contains images that are read from a table in my database.
A few days ago everything was working fine, but the number of records in this table grew quickly and then the problem appeared.
This problem does not happen on iOS or on the Android emulator, only on a real Android device.
The code stops the download suddenly and throws the error
org.json.JSONException: Unterminated string at character <cccc> of <json data>
That is, the code is not downloading the entire JSON data. Besides, the download stops at a different point every time I run the code.
Does anyone know why this is happening?
This is the function that tries to retrieve the data from the web service:
protected static JSONObject executeJSONQuery(ContentValues values,
        WebServiceResolverCode code, Context context)
        throws ContratoInativoException, JSONException {
    URL url;
    HttpURLConnection conn;
    JSONObject jArray = null;
    // build the string to store the response text from the server
    String response = "";
    try {
        url = new URL(WebServiceResolver.getWebServiceSufix(code, context));
        String param = "";
        for (String key : values.keySet()) {
            Object value = values.get(key);
            param += key+"="+ URLEncoder.encode(value!=null?values.getAsString(key):"", "UTF-8")+"&";
        }
        conn = (HttpURLConnection) url.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setFixedLengthStreamingMode(param.getBytes().length);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        // send the POST out
        OutputStreamWriter out = new OutputStreamWriter(conn.getOutputStream());
        out.write(param);
        out.flush();
        out.close();
        // start listening to the stream
        Scanner inStream = new Scanner(conn.getInputStream());
        while (inStream.hasNextLine()){
            response += (inStream.nextLine());
        }
        inStream.close();
        // process the stream and store it in StringBuilder
        jArray = new JSONObject(response);
    } catch (IOException ex) {
        ex.printStackTrace();
        Log.d("ERROR JSON STRING", response);
    }
    return jArray;
}
Firstly, do not keep images in a table. Keep them in your server's file space and store the URL of each image in the table. When you need to load an image, load it from the URL stored in the table.
Secondly, JSON is a lightweight data interchange format, so you should limit the number of items it carries. It's advisable not to let it carry as many as a hundred rows of your table (when converting to JSON). If you have to, load a small batch first, then another, until you reach the end. That's the idea behind infinite scrolling.
That way, your data loads efficiently.
Looks like your JSON string is not formed correctly. It may be missing a double quote at the end of a string. Check the param string.
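One possible contributor, offered as a guess rather than a diagnosis: java.util.Scanner treats an IOException from the underlying stream as the end of input (the exception is only available through ioException()), so a connection that drops midway would silently produce a truncated string. Below is a minimal sketch of reading the whole response with a Reader instead, so that such a failure surfaces as an exception. The class and method names (ResponseReader, readFully) are hypothetical, and UTF-8 is an assumption about the web service's response encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ResponseReader {
    // Hypothetical helper: reads the stream to the end into one String.
    // Unlike Scanner, a failed read propagates as an IOException.
    static String readFully(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        try {
            char[] buffer = new char[8192];
            int n;
            // read() returns -1 only at a genuine end of stream.
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } finally {
            reader.close();
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for conn.getInputStream() so the sketch runs offline.
        byte[] body = "{\"ok\":true}".getBytes(StandardCharsets.UTF_8);
        System.out.println(readFully(new ByteArrayInputStream(body)));
    }
}
```

Appending to a StringBuilder also avoids the repeated copying caused by response += ... in a loop, which matters for large payloads.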

How To pass a Value in Html webview url?

I want to load an HTML file from the raw folder into a WebView, and this works fine:
url = "file:///android_res/raw/a1.html";
webView.loadUrl(url);
But I want to pass a value into the URL like this:
String s = "1";
url = "file:///android_res/raw/a"+s+".html";
but it's not working. Please help, how can I achieve this?
First, you cannot be sure that url = "file:///android_res/raw/a"+s+".html"; is a valid file path, so this method may not work as you planned.
You can use
webview.loadUrl("javascript:xxxx");
to pass a parameter to the HTML page,
or use url = "file:///android_res/raw/a.html?action=go";
This is done in the same way on Android as in Java SE.
Put your complete URL inside URLEncoder:
try {
    String url = "http://www.example.com/?id=123&art=abc";
    String encodedurl = URLEncoder.encode(url,"UTF-8");
    Log.d("TEST", encodedurl);
}
catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}

URL Malformed Exception error in Android (java.net.MalformedURLException: Protocol not found)

One of my URLs looks like this: "h--p://www.test.com///rss.xml"
When I run the following code:
private String RSSFEEDURL = Uri.encode("h--p://www.test.com/path/*/*/rss.xml");
URL url = null;
try {
url = new URL(xml);
} catch (MalformedURLException e1) {
e1.printStackTrace();
}
I am getting "java.net.MalformedURLException: Protocol not found: http%3A%2F%2Ftest.com%2Fpath%2F*%2F*%2Frss.xml"
I have already applied Uri.encode as shown above. Any idea what is causing this issue and how I could resolve it?
If you call new URL, what you pass in should be a valid URL.
You're putting this in there: http%3A%2F%2Ftest.com%2Fpath%2F*%2F*%2Frss.xml, and that's not a valid URL, so the exception is expected.
You shouldn't encode your whole URL.
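The effect can be reproduced in plain Java SE. The sketch below uses java.net.URLEncoder as a stand-in for Android's Uri.encode (both escape ":" and "/" by default) and a placeholder address, since the one in the question is obfuscated. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

class WholeUrlEncodingDemo {
    // Encodes the complete string, including the "://" and "/" delimiters.
    static String encodeWhole(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Returns true if java.net.URL accepts the string.
    static boolean isParsable(String candidate) {
        try {
            new URL(candidate);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String plain = "http://www.test.com/path/rss.xml"; // placeholder address
        String encoded = encodeWhole(plain);
        System.out.println(encoded);             // the colon and slashes are escaped
        System.out.println(isParsable(plain));   // the plain address parses
        System.out.println(isParsable(encoded)); // no protocol can be found any more
    }
}
```

Once the colon after the scheme has been escaped, the parser has nothing to recognise as a protocol, which matches the exception in the question.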

android: parse html from page

I would like to parse out some text from a page.
Is there an easy way to save the product info into a string, for example? Example URL: http://upcdata.info/upc/7310870008741
Thanks
Jsoup is excellent at parsing simple HTML from Android applications:
http://jsoup.org/
To get the page, just do this:
URL url = new URL("http://upcdata.info/upc/7310870008741");
Document document = Jsoup.parse(url, 5000);
Then you can parse out whatever you need from the Document. Check out this link for a brief description of how to extract parts of the page:
http://jsoup.org/cookbook/extracting-data/dom-navigation
If you want to read from a URL into a String:
StringBuilder myString = new StringBuilder();
try {
    URL u = new URL("http://www.google.com");
    // BufferedReader replaces the deprecated DataInputStream.readLine()
    BufferedReader theHTML = new BufferedReader(new InputStreamReader(u.openStream(), "UTF-8"));
    String thisLine;
    while ((thisLine = theHTML.readLine()) != null) {
        myString.append(thisLine).append('\n');
    }
    theHTML.close();
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
// call toString() on myString to get the contents of the file your URL is
// pointing to.
This will give you a plain old string, HTML markup and all.
String tmpHtml = "<html>a whole bunch of html stuff</html>";
String htmlTextStr = Html.fromHtml(tmpHtml).toString();

how to url encode in android?

I am using a grid view to display images obtained by XML parsing, and I get an exception like
java.lang.IllegalArgumentException: Illegal character in path at index 80:
http://www.theblacksheeponline.com/party_img/thumbspps/912big_361999096_Flicking Off Douchebag.jpg
How can I solve this problem? I want to be able to display any kind of URL. If anybody knows, please give me some sample code.
Thanks all
URL encoding is done in the same way on Android as in Java SE:
try {
    String url = "http://www.example.com/?id=123&art=abc";
    String encodedurl = URLEncoder.encode(url,"UTF-8");
    Log.d("TEST", encodedurl);
} catch (UnsupportedEncodingException e) {
    e.printStackTrace();
}
You can also use this:
private static final String ALLOWED_URI_CHARS = "##&=*+-_.,:!?()/~'%";
String urlEncoded = Uri.encode(path, ALLOWED_URI_CHARS);
It's the simplest method.
As Ben says in his comment, you should not apply URLEncoder.encode to full URLs, because you will change the semantics of the URL, per the following example from the W3C:
The URIs
http://www.w3.org/albert/bertram/marie-claude
and
http://www.w3.org/albert/bertram%2Fmarie-claude
are NOT identical, as in the second case the encoded slash does not have hierarchical significance.
Instead, you should encode the component parts of a URL independently, per the following from RFC 3986 Section 2.4:
Under normal circumstances, the only time when octets within a URI are percent-encoded is during the process of producing the URI from its component parts. This is when an implementation determines which of the reserved characters are to be used as subcomponent delimiters and which can be safely used as data. Once produced, a URI is always in its percent-encoded form.
So, in short, for your case you should encode/escape your filename and then assemble the URL.
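A minimal sketch of that advice in plain Java SE, assuming the base address is already valid and only the file name needs escaping. The class and method names (FileNameEncoder, encodeSegment) are hypothetical. URLEncoder implements form encoding, so it turns a space into "+"; the sketch converts that to %20, which is what a path segment needs.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FileNameEncoder {
    // Hypothetical helper: percent-encodes a single path segment.
    static String encodeSegment(String segment) {
        try {
            // A literal '+' in the input is already escaped as %2B at this
            // point, so every remaining '+' stands for a space.
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String base = "http://www.theblacksheeponline.com/party_img/thumbspps/";
        String fileName = "912big_361999096_Flicking Off Douchebag.jpg";
        // Encode only the file name, then assemble the URL.
        System.out.println(base + encodeSegment(fileName));
    }
}
```

URLEncoder escapes a few characters that are legal in a path, which is harmless here; the slashes in the base address stay untouched because they never pass through the encoder.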
You don't encode the entire URL, only the parts of it that come from "unreliable sources", like this:
String query = URLEncoder.encode("Hare Krishna ", "utf-8");
String url = "http://stackoverflow.com/search?q=" + query;
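The same idea extends to several parameters: encode each key and value on its own and add the "=" and "&" delimiters afterwards. Here is a small plain-Java sketch; the class and method names (QueryBuilder, buildQuery) and the sample parameters are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryBuilder {
    // Hypothetical helper: joins already-separated parameters into a query
    // string, encoding every key and value individually.
    static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&'); // delimiter is added after encoding
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always supported
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("q", "Hare Krishna ");
        params.put("tab", "a&b");
        System.out.println("http://stackoverflow.com/search?" + buildQuery(params));
    }
}
```

Because the "&" inside a value is escaped before the delimiters are added, it cannot be mistaken for a parameter separator.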
URLEncoder should be used only to encode query strings. For the path, use the java.net.URI class instead:
URI uri = new URI(
    "http",
    "www.theblacksheeponline.com",
    "/party_img/thumbspps/912big_361999096_Flicking Off Douchebag.jpg",
    null);
String request = uri.toASCIIString();
You can use the method below:
public String parseURL(String url, Map<String, String> params)
{
    // Uri.Builder is android.net.Uri.Builder
    Uri.Builder builder = Uri.parse(url).buildUpon();
    for (String key : params.keySet())
    {
        builder.appendQueryParameter(key, params.get(key));
    }
    return builder.build().toString();
}
I tried URLEncoder, which replaced each space (" ") with a plus sign (+), but that did not work and I got a 404 "URL not found" error.
Then I searched for a better answer and found this, which works very well:
String urlStr = "http://www.example.com/test/file name.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();
This way of encoding a URL is very useful because the URL class separates the address into its parts, so there is no need for any string manipulation.
The URI class then does the rest: this approach takes advantage of the fact that URI properly escapes each component when you construct it from components rather than from a single string.
I recently wrote a quick URI encoder for this purpose. It even handles unicode characters.
http://www.dmurph.com/2011/01/java-uri-encoder/
