How to work with JSON URL - android

I've got a JSON object that looks something like this: (the following links are fake)
"results": {
"urlStuff": [
{"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"},
{"other_pic_url": "http:\/\/www.youtube.com\/outside\/kslkjfldkf\/234.jpg?v=7475646"}
]
}
or something to that effect. My question is, why do the urls have escape characters if they are already strings? I am having to get rid of them to make the method calls on the URL to get the pics. Am I missing something? I am using Android to make this call.
Thanks,
Matt

why do the urls have escape characters if they are already strings?
They have escape characters because they are strings -- specifically, because they are JSON strings they have JSON string escape characters, and the entity that sent them to you decided to use the option to escape the solidus. For more information on why the sending entity may have made that choice, see the Why does the Groovy JSONBuilder escape slashes in URLs? post.
I am having to get rid of them to make the method calls on the URL to get the pics. Am I missing something?
Take the easy route and just use a decent JSON parsing API to take care of automatically removing the JSON escape characters for you, when translating the JSON string into a Java String. Android has such a built-in JSON library available.
package com.stackoverflow.q6564078;

import org.json.JSONObject;

import android.app.Activity;
import android.os.Bundle;
import android.util.Log;

public class Foo extends Activity
{
    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        // The raw JSON text, with each solidus escaped as \/ :
        // {"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"}
        String jsonInput = "{\"pic_url\": \"http:\\/\\/www.youtube.com\\/inside\\/kslkjfldkf\\/234.jpg?v=7475646\"}";
        Log.d("JSON INPUT", jsonInput);
        // output: {"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"}

        try
        {
            // The parser removes the JSON escapes when it builds the Java String.
            JSONObject jsonObject = new JSONObject(jsonInput);
            String javaUrlString = jsonObject.getString("pic_url");
            Log.d("JAVA URL STRING", javaUrlString);
            // output: http://www.youtube.com/inside/kslkjfldkf/234.jpg?v=7475646
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}

I couldn't see any escape character in the urls you provided but, nevertheless, URLs are encoded. I suggest you have a look at URLEncoder. This class offers different ways to encode a URL.
Normally the standard implies that URLs are encoded using UTF-8. But for some languages the encoding and charset can be different. Recently I had to deal with URLs containing Asian characters, and they were encoded in other charsets (EUC-KR for Korean, for instance).
I used this site to manually decode/encode a few URLs and find out the charset.
Once you have found the right charset to use, you can use the Charset class of the Java SDK to transform the URLs into normal UTF-16 Java strings. Tutorial here.
Regards,
Stéphane
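To illustrate why the charset matters when decoding a percent-encoded URL, here is a minimal sketch using java.net.URLDecoder. The class name, the helper method and the sample input are made up for illustration; the same bytes are decoded once as UTF-8 and once as ISO-8859-1, and only the first gives the intended character.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PercentDecodeDemo {
    // Decodes a percent-encoded string, interpreting the decoded bytes in the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical input: "belgië" with the ë percent-encoded as UTF-8 (bytes C3 AB).
        String encoded = "belgi%C3%AB";
        // Correct charset: the two bytes become the single character ë.
        System.out.println(decode(encoded, "UTF-8"));
        // Wrong charset: each byte becomes its own character, giving two garbage characters.
        System.out.println(decode(encoded, "ISO-8859-1"));
    }
}
```

If the decoded text looks like the second line, the charset guess is probably wrong and another one is worth trying.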

why do the urls have escape characters if they are already strings?
Since:
it is not uncommon to use JSON generating functions to produce JavaScript literals for embedding inside <script> elements
HTML is often embedded in JSON and
The sequence </ will terminate script blocks in HTML 4 (and </script> will in all browsers)
… escaping / characters ensures the data will be safe to drop into a <script> element.
I am having to get rid of them to make the method calls on the URL to get the pics.
Your JSON library should do that for you. Err … you are using a JSON library and not trying something crazy involving regular expressions, aren't you?
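As a rough sketch of the producer side (not how any particular JSON library does it), the following hypothetical helper rewrites every "</" in already generated JSON text as "<\/". Since "\/" is a legal JSON escape for "/", a parser reads the result back as the same data, but the text can no longer contain "</script>".

```java
class ScriptSafeJson {
    // Rewrites "</" as "<\/" so the JSON text cannot close a surrounding <script> element.
    // The character before the slash is always "<", so this never touches an existing escape.
    static String escapeForScript(String json) {
        return json.replace("</", "<\\/");
    }

    public static void main(String[] args) {
        // Hypothetical payload containing HTML inside a JSON string.
        String json = "{\"html\":\"<b>hi</b></script>\"}";
        System.out.println(escapeForScript(json));
    }
}
```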

'/' characters can be escaped, as per JSON syntax: http://www.json.org/. Normally, whatever JSON API you are using should properly restore the escaped characters.
Edit: Correction as per comments

Related

Retrofit 2.1 Posting Cyrillic Field Error

I am using Retrofit 2.1. But when I post a field that contains cyrillic word, it gives an empty response, however it should return 2-3 items. Here is the api:
@FormUrlEncoded
@POST("my_awesome_base_url")
Call<Questions> getQuestions(@Field(value = "rowsdata", encoded = false) String rowsdata);
And the rowsdata contains some cyrillic word that db should search and respond similar results. Here is an example rowsdata:
rowsdata = {"code":"-4","start":"1","where":"where short_question like 'Вақт' ","end":"2"}
In the rowsdata, Вақт is in cyrillic, but it is somehow encoding it to some chars so that server is giving me an empty list.
I checked this on Postman, and it gave me the desired results, but when I send a request using Retrofit, it is responding like nothing is found...
Probably an encoding issue.
From developers site :
A String represents a string in the UTF-16 format in which
supplementary characters are represented by surrogate pairs (see the
section Unicode Character Representations in the Character class for
more information). Index values refer to char code units, so a
supplementary character uses two positions in a String.
Try encoding the string into UTF-8, make sure your file is UTF-8 as well (default in Android Studio I think).
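One way to check this is to look at what the UTF-8 form encoding of the Cyrillic word should be and compare it with what actually goes over the wire. The sketch below uses only java.net.URLEncoder; the class and method names are hypothetical, and it does not show what Retrofit does internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class FormFieldEncoding {
    // Percent-encodes a form field value using UTF-8 bytes.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The word from the question, written with \\u escapes so the
        // source file encoding cannot alter it.
        String word = "\u0412\u0430\u049B\u0442";
        // prints %D0%92%D0%B0%D2%9B%D1%82
        System.out.println(encodeUtf8(word));
    }
}
```

If the request body captured from the app shows different bytes for that word, the string was most likely converted with another charset somewhere before it reached the HTTP layer.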

Android JSON escape unicode chars

In android I create JSON String like this
jsonObject.toString();
But the problem is that Unicode chars aren't encoded, for example:
{"number":123456,"name":"בית"}
Should be:
{"number":123456,"name":"\u05D1\u05D9\u05EA"}
How do I do that?
I tried different JSON converting libs but couldn't find one that does it.
One solution that I found is to escape it using StringEscapeUtils, and remove double slash in the string, like this:
jsonArray.put("name", StringEscapeUtils.escapeJson(name));
jsonArray.toString().replace("\\\\u", "\\u");
But I don't think this is the best solution for this case..
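One possible alternative, sketched here under the assumption that it is acceptable to post-process the text returned by toString(): walk over the generated JSON and replace every non-ASCII char with its \uXXXX escape. In valid JSON such characters can only appear inside strings, where this escape is legal. The class and method names are made up for illustration.

```java
class JsonUnicodeEscaper {
    // Replaces every char above 0x7F with a \\uXXXX escape; ASCII is copied unchanged.
    // Characters outside the BMP are two chars in Java and become two escapes
    // (a surrogate pair), which is how JSON represents them.
    static String escapeNonAscii(String json) {
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The example from the question, with the Hebrew letters written as Java escapes.
        String json = "{\"number\":123456,\"name\":\"\u05D1\u05D9\u05EA\"}";
        System.out.println(escapeNonAscii(json));
    }
}
```

This avoids the double-backslash cleanup step, because the escapes are added after the JSON library has finished its own escaping.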

How to parse HTML-formatted String to plain string?

I'm getting a JSON response string similar to this:
<strong>B.<\/strong> Because there is no indication of Miss Manette&rsquo;s feelings
The string text that I'm receiving is full of tags like <strong> and <em>, and entities like &rsquo;, &ldquo;, &rdquo; etc. How can I parse it to a plain String with the same features?
The only way I could think of is replacing such characters and using Html.fromHtml() method. Is there a built-in parser available? How could I parse such HTML text?
Just use Html.fromHtml. It will parse most of the tags supplied and give you the formatted output. The point to note here is that not all HTML tags are supported by this method. Check out this link for more information about which tags are supported. Also check this, though it's a bit old.
If you know what text you'll be parsing, and you have tags that aren't parsed by fromHtml, your best bet would be to replace them with empty string and then use this method.

XML parse error on ë (crash with accented letters)

I have a crash while parsing an XML file. It occurs on an ë, in this case in belgië (Dutch for Belgium).
I have been searching for an answer but I just can't find a solution.
I'm using the SAX parser under Android.
error: org.apache.harmony.xml.ExpatParser$ParseException: At line 2, column 204: not well-formed
xml source: http://biohorma.weatheronyoursite.com/villadm_hooikoortsverwachting_be.xml
Side note: I get the data via a stream. Is the only option to read this stream into a temporary value, replace the illegal character with a valid one and make a new stream from it, or can something be added to the stream to do this?
It seems you should use the String (byte[] bytes, String enc) constructor, assuming what server sends you is encoded in UTF-8:
String properXml = new String(byteArrayIReceivedFromServer, "UTF-8");
The issue is not with the parser - it's acting correctly - but with whatever code is sending the XML. ë needs to be encoded and passed as &#235;. The same also must be done to other accented characters, ampersands and angle brackets.
I think you should replace the special characters in the XML.
See a comprehensive list of chars here: http://www.w3schools.com/tags/ref_entities.asp
It lists your umlaut e like this: Ë &#203; &Euml; capital e, umlaut mark
There is also a brief explanation, if you feel like reading.
Hope it helps.
The server sends these headers:
Content-Type: text/xml
Content-Length: 124512
Since no charset is specified for content type, the normally correct assumption is US_ASCII. However, the XML payload seems to be encoded in ISO-8859-1
<?xml version="1.0" encoding="iso-8859-1"?>
and the 'ë' is encoded as 0xEB (235). It is very common for servers to encode text payload in ISO-8859-1, so this is something that one simply has to deal with.
My guess is that if you serve the parser with a byte stream directly, it will detect the encoding and act accordingly. If you use a character stream (not recommended), make sure to specify the correct encoding.
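A small sketch of the byte stream approach, using the standard javax.xml SAX API on a made-up one-element document (the class name, the element name and the sample bytes are assumptions, not the real feed). The bytes are ISO-8859-1, with ë stored as 0xEB, and the parser picks the charset up from the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class XmlEncodingDemo {
    // Parses the raw bytes and returns all character data found in the document.
    static String readText(InputStream in) {
        final StringBuilder text = new StringBuilder();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    text.append(ch, start, length);
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return text.toString();
    }

    // Hypothetical payload: ISO-8859-1 bytes with a matching XML declaration.
    static byte[] sampleBytes() {
        String xml = "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?><land>belgi\u00EB</land>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(readText(new ByteArrayInputStream(sampleBytes())));
    }
}
```

No manual byte-to-String conversion is involved here, so there is no place where a wrong charset could be applied before the parser sees the data.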

extracting strings from KML file

I am extracting strings from a KML file. If a string contains a special character like !, @, #, ', " etc., the file uses codes like &#39;
I am not able to extract the entire string in that case by calling getNodeValue(). It terminates the string at the special character.
<name>Continue onto Royal&#39;s Market</name>
If I extract the string I get only "Continue onto Royal". I want the entire string:
Continue onto Royal's Market.
How can I achieve this? If anybody is familiar with this, please reply.
Thanks
Your problem has nothing to do with KML but is general for XML parsing:
Don't use getNodeValue(), as there is no guarantee in DOM that text isn't actually split over several nodes.
Try using getTextContent() instead.
You might also have to replace entities, as in: node.getTextContent().replaceAll("&#39;","'");
In general I wouldn't use DOM at all for extracting data.
I'd use the XmlPullParser, as it's simpler to work with and parses faster.
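For the getTextContent() suggestion, here is a minimal sketch with the standard javax.xml DOM API. The class name and helper are hypothetical, and the sample element assumes the apostrophe is written as the character reference &#39; in the file. In this sketch the parser resolves the reference itself, so the whole text comes back in one piece.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class KmlNameDemo {
    // Returns the complete text of the first <name> element,
    // even if the DOM stores it as several text nodes.
    static String nameText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getElementsByTagName("name").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical fragment modelled on the element in the question.
        String xml = "<name>Continue onto Royal&#39;s Market</name>";
        System.out.println(nameText(xml));
    }
}
```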
