On Android I create a JSON string like this:
jsonObject.toString();
But the problem is that Unicode chars aren't encoded, for example:
{"number":123456,"name":"בית"}
Should be:
{"number":123456,"name":"\u05D1\u05D9\u05EA"}
How do I do that?
I tried different JSON conversion libraries but couldn't find one that does it.
One solution that I found is to escape the value using StringEscapeUtils and then remove the doubled backslash from the resulting string, like this:
jsonObject.put("name", StringEscapeUtils.escapeJson(name));
jsonObject.toString().replace("\\\\u", "\\u");
But I don't think this is the best solution for this case.
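A minimal sketch of another approach, assuming it is acceptable to post-process the serialized string. In valid JSON every syntax character is ASCII, so any character above U+007F in the output of toString() can only sit inside a string literal and can be replaced with its escape directly. The class and method names (JsonAsciiEscaper, escapeNonAscii) are made up for illustration and are not part of any library.

```java
public class JsonAsciiEscaper {

    // Replaces every char above U+007F with a backslash-u escape of four hex digits.
    // Characters outside the BMP are stored as two surrogate chars in Java, so they
    // come out as two escapes, which is how JSON represents them.
    static String escapeNonAscii(String json) {
        StringBuilder sb = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c > 0x7F) {
                sb.append(String.format("\\u%04X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same data as in the question; the name is written with escapes in the source.
        String json = "{\"number\":123456,\"name\":\"\u05D1\u05D9\u05EA\"}";
        System.out.println(escapeNonAscii(json));
    }
}
```

With this, the call site would be escapeNonAscii(jsonObject.toString()), and no second replace pass is needed because nothing is escaped twice.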
I have two strings retrieved from JSON files (Arabic JSON files previously parsed from txt files). I use Kotlin's trim() function to remove leading and trailing newlines after parsing the JSON. The problem is that one of them, say file1, is successfully trimmed while the other, say file2, is not.
I suspected the encoding, but never managed to work it out. All I know is that the JSON files are most likely encoded from a UTF-8 source. So I converted both strings with Kotlin's toByteArray(Charsets.UTF_8).contentToString():
file1 always has [32, 10] as the last elements of its byte array (where the newline character should be).
file2 always has [32, 10, -30, -128, -113] as the last elements of its byte array (where the newline character should be).
It looks like there are three additional bytes at the end of the problematic file (I have no idea what the minus signs stand for).
This is how I fetch the JSON and create the JSONObject:
val file: String = applicationContext.assets.open("poets/${poetID}.txt").bufferedReader().use {
    it.readText()
}
val json = JSONObject(file)
Here, ${poetID}.txt is actually a JSON file in the assets folder poets/.
I have the same application written in Swift with no such problems.
My question is: what are these additional bytes at the end? Is there a way to check the encoding of a string parsed from a JSON file? Or a way to change the encoding programmatically?
I have found the answer. The additional character is the Right-to-Left Mark (U+200F), a common Unicode character in Arabic text.
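For anyone hitting the same thing, here is a rough sketch in Java (the same calls are available from Kotlin). The minus signs appear because bytes are signed on the JVM: -30, -128, -113 are 0xE2, 0x80, 0x8F, which is the UTF-8 encoding of U+200F. That character is not whitespace, so a plain trim leaves it in place. The helper names (TrailingMarkTrimmer, trimInvisible, isTrimmable) are hypothetical, and treating every Unicode format character (category Cf) as trimmable is my assumption; narrow it to U+200F alone if that is too broad.

```java
import java.nio.charset.StandardCharsets;

public class TrailingMarkTrimmer {

    // Removes leading and trailing whitespace plus invisible format characters
    // (Unicode category Cf, which includes U+200F RIGHT-TO-LEFT MARK).
    static String trimInvisible(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isTrimmable(s.charAt(start))) {
            start++;
        }
        while (end > start && isTrimmable(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    static boolean isTrimmable(char c) {
        return Character.isWhitespace(c) || Character.getType(c) == Character.FORMAT;
    }

    public static void main(String[] args) {
        // The trailing bytes reported for file2: 32, 10, -30, -128, -113
        byte[] tail = {32, 10, (byte) 0xE2, (byte) 0x80, (byte) 0x8F};
        String decoded = new String(tail, StandardCharsets.UTF_8);
        System.out.println(Integer.toHexString(decoded.charAt(2)));   // prints 200f
        System.out.println(trimInvisible("text" + decoded).length()); // prints 4
    }
}
```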
I am getting these characters as a JSON response:
These characters should translate to the Ukrainian word Активная.
How can I decode this set of characters? I tried java.net.URLDecoder with no luck so far. Any ideas?
The encoding is XML entity encoding. Use an XML parser or Html.fromHtml() to decode it.
Also, consider fixing the server side to use JSON \uNNNN encoding for character literals instead.
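If pulling in an XML parser or Html.fromHtml() is not an option, a small hand-rolled decoder may be enough. This is only a sketch under the assumption that the response contains numeric character references such as &#1040; (decimal) or &#x410; (hexadecimal); named entities like &amp; are not handled. The class name NumericEntityDecoder and the sample input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumericEntityDecoder {

    // Matches hexadecimal (group 1) and decimal (group 2) character references.
    private static final Pattern REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));");

    // Replaces each numeric reference with the character it names.
    // Throws an unchecked exception if a reference is out of the Unicode range.
    static String decode(String input) {
        Matcher m = REF.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(input, last, m.start());
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            out.appendCodePoint(codePoint);
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical input: the word from the question as decimal references.
        String encoded = "&#1040;&#1082;&#1090;&#1080;&#1074;&#1085;&#1072;&#1103;";
        System.out.println(decode(encoded));
    }
}
```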
I'm reading a file from my assets folder into a JSONObject. The file contains a JSON string.
Some of the strings contain the "'" (apostrophe) character. The problem is that the TextView shows "?" in place of these apostrophes. Why is this happening? When I print the JSON string to logcat using mJsonObject.toString(), it shows the proper character.
How can I get rid of this "?" and show the actual character?
The apostrophe probably isn't a plain ' apostrophe, but a typographic apostrophe that is missing from your font and/or gets mangled during charset conversion. Preferably, replace the typographic apostrophes with plain apostrophes in the JSON file.
If you don't want to do so, escape them using the \u escape. This makes sure that the correct character ends up in the JsonObject. If you still get the question mark, make sure your font supports the character and that you don't break it in other charset conversions.
If you cannot use \u escapes for some reason, make sure you read the file with the correct charset.
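A minimal sketch of reading a stream with an explicit charset, assuming the file is stored as UTF-8 (check your editor or build settings to confirm). The class and method names (CharsetReader, readUtf8) are hypothetical; on Android the stream would come from something like getAssets().open(...).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class CharsetReader {

    // Reads the whole stream as UTF-8 instead of relying on the platform default.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+2019 is the typographic apostrophe; it takes three bytes in UTF-8.
        byte[] fileBytes = "it\u2019s".getBytes(StandardCharsets.UTF_8);
        String text = readUtf8(new ByteArrayInputStream(fileBytes));
        System.out.println(text.length()); // prints 4
    }
}
```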
I have an XML file on a server. I am parsing this XML using DOM (the XML is not big). In one node there is a string with double quotes.
<NODE1>hello "world"</NODE1>
When I open this XML URL in a browser and check its source, it looks like this:
<NODE1>hello "world"</NODE1>
So when I parse this value I get the string only up to the double quotes. It seems the parser doesn't go any further after the double quotes. Any help? I want to use DOM only in my current situation. This XML is also used by other platforms apart from Android; on the iPhone, for example, it works perfectly. What should I do to read the whole value on Android using DOM?
Thanks.
I'm not sure if I understand your question, but have a look at using CDATA
http://www.globalguideline.com/xml/XML_CDATA.php
An XML document that contains a double quote in a value is not valid according to the specification: http://www.devx.com/tips/Tip/14068. I would suggest that you escape the special characters server side, otherwise you will not be able to parse the XML.
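Another thing worth checking on the client side, offered as a guess since the question doesn't show the parsing code: some DOM parsers deliver the text of an element as several adjacent text nodes, split around entity references such as &quot;. If the code reads only getFirstChild().getNodeValue(), it would then stop at the first quote. The sketch below concatenates all text children instead. The class and method names (NodeTextReader, readNodeText) are hypothetical, and it assumes the tag exists in the document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NodeTextReader {

    // Returns the concatenated text of the first element with the given tag name.
    static String readNodeText(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Node node = doc.getElementsByTagName(tagName).item(0);
            StringBuilder text = new StringBuilder();
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Collect every text or CDATA child, not just the first one.
                if (child.getNodeType() == Node.TEXT_NODE
                        || child.getNodeType() == Node.CDATA_SECTION_NODE) {
                    text.append(child.getNodeValue());
                }
            }
            return text.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ROOT><NODE1>hello &quot;world&quot;</NODE1></ROOT>";
        System.out.println(readNodeText(xml, "NODE1"));
    }
}
```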
I've got a JSON object that looks something like this: (the following links are fake)
"results": {
"urlStuff": [
{"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"},
{"other_pic_url": "http:\/\/www.youtube.com\/outside\/kslkjfldkf\/234.jpg?v=7475646"}
]
}
or something to that effect. My question is, why do the URLs have escape characters if they are already strings? I have to strip them out before I can make the method calls on the URL to get the pics. Am I missing something? I am using Android to make this call.
Thanks,
Matt
why do the urls have escape characters if they are already strings?
They have escape characters because they are strings -- specifically, because they are JSON strings they have JSON string escape characters, and the entity that sent them to you decided to use the option to escape the solidus. For more information on why the sending entity may have made that choice, see the Why does the Groovy JSONBuilder escape slashes in URLs? post.
I am having to get rid of them to make the method calls on the URL to get the pics. Am I missing something?
Take the easy route and just use a decent JSON parsing API to take care of automatically removing the JSON escape characters for you, when translating the JSON string into a Java String. Android has such a built-in JSON library available.
package com.stackoverflow.q6564078;

import org.json.JSONObject;

import android.app.Activity;
import android.os.Bundle;
import android.util.Log;

public class Foo extends Activity
{
    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        // {"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"}
        String jsonInput = "{\"pic_url\": \"http:\\/\\/www.youtube.com\\/inside\\/kslkjfldkf\\/234.jpg?v=7475646\"}";
        Log.d("JSON INPUT", jsonInput);
        // output: {"pic_url": "http:\/\/www.youtube.com\/inside\/kslkjfldkf\/234.jpg?v=7475646"}

        try
        {
            // The parser removes the JSON escapes while building the Java String.
            JSONObject jsonObject = new JSONObject(jsonInput);
            String javaUrlString = jsonObject.getString("pic_url");
            Log.d("JAVA URL STRING", javaUrlString);
            // output: http://www.youtube.com/inside/kslkjfldkf/234.jpg?v=7475646
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
I couldn't see any escape characters in the URLs you provided but, nevertheless, URLs are encoded. I suggest you have a look at URLEncoder. This class offers different ways to encode a URL.
Normally the standard implies that URLs are encoded using UTF-8. But, for some languages, the encoding and charset can differ. Recently I had to deal with URLs containing Asian characters, and they were encoded in other charsets (EUC-KR for Korean, for instance).
I used this site to manually decode/encode a few URLs and find out the charset.
Once you have found the right charset, you can use the Charset class of the Java SDK to transform the URLs into normal UTF-16 Java strings. Tutorial here.
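As a rough illustration of that last step, here is a sketch that percent-decodes a URL component with a named charset. EUC-KR is used only as an example of a legacy Korean charset and has to be available in the runtime; the wrapper class and method names (UrlCharsetDemo, decode, encode) are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class UrlCharsetDemo {

    // Percent-decodes a URL component, interpreting the bytes in the given charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Percent-encodes a string using the bytes it has in the given charset.
    static String encode(String text, String charsetName) {
        try {
            return URLEncoder.encode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String original = "\uD55C\uAD6D"; // two Hangul syllables
        String encoded = encode(original, "EUC-KR");
        System.out.println(encoded);
        // Decoding with the charset used for encoding restores the text;
        // decoding the same bytes as UTF-8 does not.
        System.out.println(decode(encoded, "EUC-KR").equals(original));
        System.out.println(decode(encoded, "UTF-8").equals(original));
    }
}
```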
Regards,
Stéphane
why do the urls have escape characters if they are already strings?
Since:
it is not uncommon to use JSON generating functions to produce JavaScript literals for embedding inside <script> elements
HTML is often embedded in JSON and
The sequence </ will terminate script blocks in HTML 4 (and </script> will in all browsers)
… escaping / characters ensures the data will be safe to drop into a <script> element.
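To make the point concrete, here is a small sketch of what such a generator effectively does. Since < cannot occur outside a string literal in valid JSON, rewriting every </ as <\/ changes only string contents, and \/ is a legal JSON escape for /, so a parser reads back the same value. The class and method names (ScriptSafeJson, escapeForScript) are hypothetical.

```java
public class ScriptSafeJson {

    // Rewrites every "</" as "<\/" so the text cannot close a <script> element.
    static String escapeForScript(String json) {
        return json.replace("</", "<\\/");
    }

    public static void main(String[] args) {
        String json = "{\"html\":\"</script><b>hi</b>\"}";
        System.out.println(escapeForScript(json));
    }
}
```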
I am having to get rid of them to make the method calls on the URL to get the pics.
Your JSON library should do that for you. Err … you are using a JSON library and not trying something crazy involving regular expressions, aren't you?
'/' characters can be escaped, as per JSON syntax: http://www.json.org/. Normally, whatever JSON API you are using should properly restore the escaped characters.
Edit: Correction as per comments