I am parsing some values from a website with JSoup; some of them are URL links (href).
I store each link in a String, but the string is sometimes not a valid URL because it contains a special character such as ' ! ? ( or ).
Example: https://somelink.com/King's+Beak (the ' makes the link invalid).
For now I work around this by replacing each such character with its percent-encoded UTF-8 form, which works as it should.
Example code:
String test = arTD.select("a.wiki_link").get(0).attr("href").replaceAll("'", "%27");
I also set JSoup to UTF-8 but that does not seem to work.
Document document = Jsoup.parse(response.body().string(), "UTF-8");
My question: is there a more convenient way to handle this? I need to escape more characters, such as ' ! ? ( and ).
Thank you in advance.
One way to solve this issue is to use the URLEncoder.encode() method to encode the URL string. This method replaces special characters with their percent-encoded form (for example, ' becomes %27).
String test = arTD.select("a.wiki_link").get(0).attr("href");
String encodedUrl = URLEncoder.encode(test, StandardCharsets.UTF_8);
Another way to solve this issue is to use the Uri.Builder class on Android to build and encode the URL.
Uri.Builder builder = new Uri.Builder()
.scheme("https")
.authority("link.com")
.appendPath("King's+Beak");
Uri uri = builder.build();
String encodedUrl = uri.toString();
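The first method encodes the whole string it is given, including the :// and / separators, so apply it only to the part that needs escaping. The second method encodes only the path segment passed to appendPath().
Choose whichever suits your case.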
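To make the difference between encoding the whole URL and encoding one part of it concrete, here is a minimal sketch in plain Java. The class and method names (HrefEncoder, encodeLastSegment) are made up for illustration, and it assumes the characters that need escaping only occur after the last '/' of the href. It uses the URLEncoder.encode(String, Charset) overload, which needs Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class HrefEncoder {

    // Percent-encodes only the text after the last '/', leaving the scheme,
    // host and earlier path separators untouched (hypothetical helper).
    static String encodeLastSegment(String href) {
        int slash = href.lastIndexOf('/');
        String prefix = href.substring(0, slash + 1);
        String segment = href.substring(slash + 1);
        return prefix + URLEncoder.encode(segment, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String href = "https://somelink.com/King's+Beak";
        // Encoding everything also escapes ':' and '/', so the result is no longer a usable URL.
        System.out.println(URLEncoder.encode(href, StandardCharsets.UTF_8));
        // Encoding only the last segment keeps the URL structure intact.
        System.out.println(encodeLastSegment(href));
    }
}
```

Note that URLEncoder produces form encoding: it turns a literal + into %2B and a space into +. If the site uses + in its paths to mean a space, that may not be what you want, so check the result against a link you know works.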
Related
I have a strange issue using Retrofit2 in my Android project. The server returns an error because the request looks like this:
https://www.example.com/api/v1/skills?q=Good%00
Our server does not accept the invalid value "%00", so an error is shown in my activity.
API service
@GET("skills")
Observable<SearchItem> getSkills(@Query("q") String keyword);
In my fragment, I just get the text with the following simple statement:
String keyword = editText.getText().toString();
api.getSkills(keyword);
What I want to know is the following:
Is it possible for a word to be converted to "%00"?
How can I avoid "Good%00" before I pass the keyword to the getSkills function?
To enable compile-time nullity checks, add the @NonNull annotation:
@GET("skills")
Observable<SearchItem> getSkills(@NonNull @Query("q") String keyword);
Another way is to remove each "%00" from your string using .replace().
Replace it with "" if the string contains %00:
if (text.toString().contains("%00")){
text = text.replace("%00", "");
}
Then call getSkills(text) with the updated value.
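One possible explanation for the "%00", offered as an assumption rather than a diagnosis: %00 is the percent-encoded form of the NUL character (U+0000). If the text field somehow contains that character, the raw string never contains the literal text "%00", so replacing "%00" before encoding would not find anything. The sketch below strips control characters from the raw text instead. The class and method names (KeywordSanitizer, sanitize, encode) and the sample input are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class KeywordSanitizer {

    // Drops ISO control characters (U+0000..U+001F and U+007F..U+009F),
    // then trims surrounding whitespace.
    static String sanitize(String raw) {
        StringBuilder cleaned = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (!Character.isISOControl(c)) {
                cleaned.append(c);
            }
        }
        return cleaned.toString().trim();
    }

    // Form-encodes a value, similar to what happens to a query parameter.
    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "Good\u0000"; // hypothetical input ending in a NUL character
        System.out.println(encode(typed));           // the NUL shows up as %00
        System.out.println(encode(sanitize(typed))); // the NUL is removed before encoding
    }
}
```

In the question's code this would be applied as api.getSkills(sanitize(keyword)), so the keyword is cleaned before Retrofit encodes it.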
Try this
Use trim()
The Java String trim() method removes leading and trailing whitespace. More precisely, it strips every leading and trailing character whose code is less than or equal to '\u0020' (the space character), which also covers control characters such as '\u0000', and returns the trimmed string.
SAMPLE CODE
String keyword = editText.getText().toString().trim();
api.getSkills(keyword);
Or you can use replace().
The Java String replace() method returns a new string in which every occurrence of the old char or CharSequence is replaced by the new char or CharSequence.
SAMPLE CODE
url = url.replace("%00", "");
Or you can use URLEncoder.
Utility class for HTML form encoding. This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format. For more information about HTML form encoding, consult the HTML specification.
String encodedurl = URLEncoder.encode(yourURL,"UTF-8");
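There are two ways to solve this.
1. Restrict what can be entered by adding the attribute below to the EditText:
android:digits="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
You can add any other characters you want to allow.
2. Remove "%00" from the text before calling the API: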
String keyword = editText.getText().toString().replace("%00", "");
api.getSkills(keyword);
Using the below url I got an error:
java.lang.IllegalArgumentException: Illegal character in path at index 47: http://safetracker-threetinker.rhcloud.com/api/{userid}/locations?lat={latitude}&lng={longitude}.
URL:
URL=http://safetracker-threetinker.rhcloud.com/api/{userid}/locations?lat={latitude}&lng={longitude}
How can I solve this error? I don't have good knowledge of URL encoding. Please help me find a solution.
The problem is that a long/integer value is expected and you are passing a literal {. Put a $ in front of each placeholder so that it is replaced by the actual value of the variable, for example:
http://safetracker-threetinker.rhcloud.com/api/1/locations?lat=5&lng=5
Your address is like this:
URL=http://safetracker-threetinker.rhcloud.com/api/{userid}/locations?lat={latitude}&lng={longitude}
It should be like this:
URL=http://safetracker-threetinker.rhcloud.com/api/${userid}/locations?lat=${latitude}&lng=${longitude}
Some special characters cannot be used in a URL, so they have to be replaced with their encoded form.
Replace your URL with the following URL:
URL=http://safetracker-threetinker.rhcloud.com/api/%7Buserid%7D/locations?lat=%7Blatitude%7D&lng=%7Blongitude%7D
May this help you.
URLEncoder should be the way to go. You only need to keep in mind to encode only the individual query string parameter name and/or value, not the entire URL, for sure not the query string parameter separator character & nor the parameter name-value separator character =.
String q = "replace_with user_id/locations?lat=replace with latitude&lng=replace with longitude";
String url = "http://safetracker-threetinker.rhcloud.com/api/=" + URLEncoder.encode(q, "UTF-8");
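The point about encoding only the individual values, and never the & or = separators, can be sketched like this. The class name, method name and parameters are hypothetical; the host and path layout are taken from the question. It uses URLEncoder.encode(String, Charset), which needs Java 10 or newer.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class LocationUrlBuilder {

    // Encodes a single value; separators are never passed through here.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Substitutes real values for the {userid}, {latitude} and {longitude}
    // placeholders and encodes each value on its own.
    static String build(String userId, String latitude, String longitude) {
        return "http://safetracker-threetinker.rhcloud.com/api/" + enc(userId)
                + "/locations?lat=" + enc(latitude)
                + "&lng=" + enc(longitude);
    }

    public static void main(String[] args) {
        System.out.println(build("1", "5", "5"));
    }
}
```

URLEncoder is meant for query values; it is used for the path segment here only because a numeric id contains nothing that needs escaping. For path segments with spaces or other special characters, a URI-aware builder is the safer choice.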
I have a builtUri that is appending a String that contains a special character and when I log the final built string it appears wrong.
String signature = "D662636E84CD1A4%26";
...
.appendQueryParameter(SIGNATURE, signature)
The problem in the final built Uri that is used to connect is that at the end, instead of "%26" it shows "%2526"
Does anyone know how to fix this?
cheers
The character '%' is being URL-escaped to '%25'.
This is intended behavior, as that is how the character is represented in a URL.
If you want to prevent it, you might want to check out How to avoid getting URL encoded paths from URL.getFile()?
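Since appendQueryParameter encodes the value it is given, one way to end up with "%26" instead of "%2526" is to hand it the raw, unencoded signature. The sketch below reproduces the effect in plain Java with URLEncoder and URLDecoder as stand-ins for the Android Uri class. It assumes the "%26" in the signature is an already-encoded '&'; the class and method names are hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SignatureEncoding {

    static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String alreadyEncoded = "D662636E84CD1A4%26";

        // Encoding an already-encoded value escapes the '%' itself.
        System.out.println(encode(alreadyEncoded)); // D662636E84CD1A4%2526

        // Decoding first recovers the raw value, which can then be encoded once.
        String raw = decode(alreadyEncoded);        // D662636E84CD1A4&
        System.out.println(encode(raw));            // D662636E84CD1A4%26
    }
}
```

With the Android builder, this would mean passing the decoded value, for example Uri.decode(signature), to appendQueryParameter.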
I'm writing an asynchronous image downloader for Android and was just wondering, given an arbitrary URL such as:
http://www.android.com/images/brand/droid.gif
What would be the best way to convert the unique URL to a filename? I thought about simply splitting the URL and grabbing the last section, but I want the filename to be representative of the whole URL. The other alternatives I thought of were replacing all the forward slashes with underscores, or simply hashing the whole URL and storing that.
If anyone has any ideas I'd love to hear them!
Thanks
An MD5 hash is usually used in this case, but I suggest using the 'aquery' library. With it you can simply download an image asynchronously and put it into a view. It also supports a disk cache and a memory cache.
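For the MD5 approach, a minimal sketch using only the standard library could look like this. The class and method names are hypothetical, and the file extension is not preserved; it would have to be appended separately if you need it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class UrlFileNames {

    // Returns the MD5 digest of the URL as 32 lowercase hex characters,
    // which is safe to use as a file name.
    static String md5Hex(String url) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(url.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name, so this is not expected to happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("http://www.android.com/images/brand/droid.gif"));
    }
}
```

The same URL always maps to the same name, and the name contains no characters that are illegal in file names, at the cost of no longer being human-readable.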
This method will fulfill your requirements. It generates a name that represents the original URL. You can call the generateNameFromUrl(String url) method like this:
String url = "http://www.android.com/images/brand/droid.gif";
String uniqueName = generateNameFromUrl(url);
Method is given below:
public static String generateNameFromUrl(String url){
// Replace characters that are not wanted in a file name with UNDERSCORE
String uniqueName = url.replace("://", "_").replace(".", "_").replace("/", "_");
// Replace last UNDERSCORE with a DOT
uniqueName = uniqueName.substring(0,uniqueName.lastIndexOf('_'))
+"."+uniqueName.substring(uniqueName.lastIndexOf('_')+1,uniqueName.length());
return uniqueName;
}
Input: "http://www.android.com/images/brand/droid.gif"
Output: "http_www_android_com_images_brand_droid.gif"
I have a String that is displayed in a WebView as "Siwy & Para Wino".
When I fetch it from the URL, I get the string "Siwy%2B%2526%2BPara%2BWino".
Now I'm trying to use URLDecoder to solve this problem:
String decoded_result = URLDecoder.decode(url); // url is "Siwy%2B%2526%2BPara%2BWino"
When I print it out, I still see "Siwy+%26+Para+Wino".
Could anyone tell me why?
From the documentation (of URLDecoder):
This class is used to decode a string which is encoded in the application/x-www-form-urlencoded MIME content type.
We can look at the specification to see what a form-urlencoded MIME type is:
The form field names and values are escaped: space characters are replaced by '+', and then reserved characters are escaped as per [URL]; that is, non-alphanumeric characters are replaced by '%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks, as in multi-line text field values, are represented as CR LF pairs, i.e. '%0D%0A'.
Since the specification calls for a percent sign followed by two hexadecimal digits for the ASCII code, the first time you call the decode(String s) method, it converts those into single characters, leaving the two additional characters 26 intact. The value %25 translates to % so the result after the first decoding is %26. Running decode one more time simply translates %26 back into &.
String decoded_result = URLDecoder.decode(URLDecoder.decode(url));
You can also use the Uri class if you have UTF-8-encoded strings:
Decodes '%'-escaped octets in the given string using the UTF-8 scheme.
Then use:
String decoded_result = Uri.decode(Uri.decode(url));
Thanks for all the answers, I finally solved it.
Solution:
After I used URLDecoder.decode twice (oh my god), I got what I wanted.
String temp = URLDecoder.decode(url); // url = "Siwy%2B%2526%2BPara%2BWino"
String result = URLDecoder.decode(temp); // temp = "Siwy+%26+Para+Wino"
// result = "Siwy & Para Wino". Oh, good job!
But I still don't know why. Could someone tell me?
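As for why two passes are needed: the string was encoded twice, so each decode undoes one layer. The short sketch below prints the intermediate value so the two layers are visible. The class name and helper are hypothetical, and it uses the URLDecoder.decode(String, Charset) overload from Java 10 or newer rather than the deprecated one-argument version.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class DoubleDecodeDemo {

    static String decode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url = "Siwy%2B%2526%2BPara%2BWino";

        // First pass: %2B becomes '+', %25 becomes '%', and the "26" after it is left as text.
        String once = decode(url);
        System.out.println(once);  // Siwy+%26+Para+Wino

        // Second pass: '+' becomes a space and %26 becomes '&'.
        String twice = decode(once);
        System.out.println(twice); // Siwy & Para Wino
    }
}
```

If you control the code that produces the string, removing the extra encoding step there is cleaner than decoding twice on the receiving side.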