Android: Character encoding raw resource files - android

I'm translating one of my apps to Spanish, and I've hit a character encoding problem with a raw HTML file that I load into a WebView. The Spanish translation of the file is in my raw-es folder, and I read it with the following function:
private CharSequence getHtmlText(Activity activity) {
    BufferedReader in = null;
    try {
        in = new BufferedReader(new InputStreamReader(getResources().openRawResource(R.raw.help), "utf-8"));
        String line;
        StringBuilder buffer = new StringBuilder();
        while ((line = in.readLine()) != null) buffer.append(line).append('\n');
        return buffer;
    } catch (IOException e) {
        return "";
    } finally {
        closeStream(in);
    }
}
But wherever the file contains a Spanish character, a diamond with a question mark inside it appears when I run the app and open the activity that displays the HTML. I load the text into the WebView like this:
mWebView.loadData(text, "text/html", "utf-8");
I originally created the file in Microsoft Word, so I'm sure there is some sort of character encoding issue going on, but I'm not really sure how to fix it, and a Google search isn't helping. Any ideas?

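Don't use loadData(); use loadDataWithBaseURL() instead. The last argument of loadData() describes how the data string itself is encoded (base64 or URL-escaped), not the charset of the page, so non-ASCII text is easily mangled. You would say: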
mWebView.loadDataWithBaseURL( null, text, "text/html", "utf-8", null );

I had a similar issue with a French translation, where diamond symbols with question marks appeared in place of certain characters, including those I had escaped. I got around it by opening the file properties in Eclipse and changing the encoding to "ISO-8859-1". I don't know whether this would work for Spanish, though.
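
The diamond with a question mark is U+FFFD, the Unicode replacement character, which a decoder emits when the bytes it reads are not valid in the charset it was told to use. That would be consistent with both answers above: the file's real encoding and the declared one have to match. The following is a minimal sketch of the mismatch in plain Java; the class name EncodingMismatchDemo and the sample word are illustrative assumptions, not taken from the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // Decodes raw bytes with the given charset; malformed input becomes U+FFFD.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The Spanish word for "year" as a Latin-1 editor would save it:
        // the n-with-tilde is stored as the single byte 0xF1.
        byte[] latin1Bytes = "a\u00f1o".getBytes(StandardCharsets.ISO_8859_1);

        // Wrong charset: 0xF1 followed by 'o' is not a valid UTF-8 sequence.
        String wrong = decode(latin1Bytes, StandardCharsets.UTF_8);
        // Matching charset: the original text comes back unchanged.
        String right = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("contains U+FFFD: " + (wrong.indexOf('\uFFFD') >= 0));
        System.out.println("round trip ok: " + right.equals("a\u00f1o"));
    }
}
```

If the file really is Latin-1 or Windows-1252 (plausible for text that came out of Word), either re-save it as UTF-8 or pass the matching charset name to InputStreamReader.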

Related

Android espresso compare strings on the screen with string from assets

I would like to compare the content of a .txt file in my assets folder with some text on the screen.
Usually, when I assert text on the screen, I use:
onView(withId(R.id.someId)).check(matches(withText("String")));
Is there an easy way to assert against the text from the file instead?
Also, if you want to shorten your assertions and actions when using Espresso, check this library: https://github.com/SchibstedSpain/Barista (disclaimer: I'm a contributor).
It contains a set of quick actions and assertions that make the tests much more readable.
Here is some code to read text from a file in assets.
StringBuilder buf = new StringBuilder();
InputStream json = getAssets().open("book/contents.json");
BufferedReader in = new BufferedReader(new InputStreamReader(json, "UTF-8"));
String str;
// readLine() strips line terminators, so the lines are joined without newlines
while ((str = in.readLine()) != null) {
    buf.append(str);
}
in.close();
Now compare the string from assets with your on-screen text.
buf.toString().equals("your text here");

Load strings.xml from sd card to application android

Is it possible to load strings.xml from the SD card instead of from the application's res/values/...? I searched the web but didn't find any tutorials. My idea is to download the XML to the SD card and then save the string elements to an array.
public void stringsxml(){
// The two-argument constructor inserts the path separator between directory and file name
File file = new File(Environment.getExternalStorageDirectory(), ".strings.xml");
StringBuilder contents = new StringBuilder();
try {
//use buffering, reading one line at a time
//FileReader always assumes default encoding is OK!
BufferedReader input = new BufferedReader(new FileReader(file));
try {
String line = null; //not declared within while loop
/*
* readLine is a bit quirky :
* it returns the content of a line MINUS the newline.
* it returns null only for the END of the stream.
* it returns an empty String if two newlines appear in a row.
*/
while (( line = input.readLine()) != null){
contents.append(line);
contents.append(System.getProperty("line.separator"));
}
}
finally {
input.close();
}
}
catch (IOException ex){
ex.printStackTrace();
}
String data= contents.toString();
}
Well, actually it is semi-possible, but you have to create a derived LayoutInflater that replaces string codes with the strings read from the file.
I have documented my attempts and failures, together with an initial implementation, here.
Summary: simple strings work, string arrays do not.
No, this is not possible. Check the Android documentation about resources:
The Android SDK tools compile your application's resources into the application binary at build time. To use a resource, you must install it correctly in the source tree (inside your project's res/ directory) and build your application.
Resources are built into the application binary, so you can't load them from an external file.
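
If the goal is only to look strings up by name at run time, rather than to make them available through R.string, the downloaded file can be parsed into a map. The following is a rough sketch under that assumption; the class ExternalStrings and its parse method are hypothetical names, and the sketch handles only simple <string name="..."> elements, not string arrays or plurals.

```java
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ExternalStrings {
    // Hypothetical helper: maps each <string name="..."> element to its text content.
    static Map<String, String> parse(InputStream xml) {
        try {
            // The XML parser honors the encoding declared in the XML prolog.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(xml);
            NodeList nodes = doc.getElementsByTagName("string");
            Map<String, String> strings = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                strings.put(element.getAttribute("name"), element.getTextContent());
            }
            return strings;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse strings file", e);
        }
    }
}
```

Passing the parser an InputStream (for example a FileInputStream on the downloaded file) instead of a FileReader leaves charset detection to the XML prolog, which avoids the default-encoding caveat noted in the question's code.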

Opening a raw text resource displays line breaks incorrectly

InputStream inputStream = getResources().openRawResource(idtext);
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
String myText = "";
int in;
try {
in = inputStream.read();
while (in != -1) {
byteArrayOutputStream.write(in);
in = inputStream.read();
}
inputStream.close();
myText = byteArrayOutputStream.toString();
} catch (IOException e) {
e.printStackTrace();
}
myTextView.setText(myText);
My code displays a long text file from res/raw. I don't know why, but some of the text files are displayed with wrong line breaks. Any help?
It would help if you could share a sample of the actual display versus your expected output.
As an initial guess, there could be a couple of reasons:
The encoding used in the text file. If the file was saved in one encoding (for example, Windows-1252) but is decoded as another (for example, UTF-8), some characters will be garbled. The encoding should be consistent.
How line breaks are encoded in the file, for example \r\n versus just \n.
You can also try wrapping your InputStream in an InputStreamReader and a BufferedReader, which are designed for reading text directly, rather than converting the bytes yourself.
You can probably use a library such as Apache Commons IO to handle all of this for you.
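
For the line-break possibility, one option is to normalize the text before handing it to the TextView. This is a small sketch, assuming the file mixes Windows (\r\n) or old Mac (\r) endings with Unix ones; the class name LineEndings is made up for illustration.

```java
class LineEndings {
    // Converts \r\n pairs first, then any remaining lone \r, to a single \n.
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String mixed = "first\r\nsecond\rthird\nfourth";
        // Each line ends up separated by exactly one \n.
        System.out.println(normalize(mixed));
    }
}
```

The order of the two replacements matters: handling the \r\n pair first keeps it from turning into two line breaks.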

Why are foreign characters not read using inputStream?

I have a text file which contains data I need to preload into a SQLite database. I saved it in res/raw.
I read the whole file using readTxtFromRaw(), then I use the StringTokenizer class to process the file line by line.
However the String returned by readTxtFromRaw does not show foreign characters that are in the file. I need these as some of the text is Spanish or French. Am I missing something?
Code:
String fileCont = new String(readTxtFromRaw(R.raw.wordstext));
StringTokenizer myToken = new StringTokenizer(fileCont , "\t\n\r\f");
The readTxtFromRaw method is:
private String readTxtFromRaw(Integer rawResource) throws IOException
{
InputStream inputStream = mCtx.getResources().openRawResource(rawResource);
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
int i = inputStream.read();
while (i != -1)
{
byteArrayOutputStream.write(i);
i = inputStream.read();
}
inputStream.close();
return byteArrayOutputStream.toString();
}
The file was created using Eclipse, and all characters appear fine in Eclipse.
Could this have something to do with Eclipse itself? I set a breakpoint and checked out myToken in the Watch window. I tried to manually replace the weird character for the correct one (for example í, or é), and it would not let me.
Have you checked the encodings involved?
What is the encoding of your source file?
What is the encoding of your output stream?
byteArrayOutputStream.toString() converts the bytes using the platform's default character encoding. So I guess it either strips the foreign characters or converts them in a way that keeps them from being displayed correctly.
Have you already tried byteArrayOutputStream.toString(String enc)? Try "UTF-8", "ISO-8859-1", or "UTF-16" as the encoding.
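
As a sketch of that suggestion, the helper below reads a stream fully and decodes it with a charset chosen by the caller instead of the platform default. The class name RawText and the method readAll are assumptions for illustration; it also reads in chunks rather than one byte at a time, which is a separate, optional change.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

class RawText {
    // Reads the whole stream, closes it, and decodes the bytes with an explicit charset.
    static String readAll(InputStream in, Charset charset) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int count;
            while ((count = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, count);
            }
            // Decode once, after all bytes are collected, so multi-byte
            // characters are never split across chunk boundaries.
            return new String(buffer.toByteArray(), charset);
        } catch (IOException e) {
            throw new IllegalStateException("Could not read stream", e);
        }
    }
}
```

In the question's code this would presumably be called as readAll(mCtx.getResources().openRawResource(rawResource), Charset.forName("UTF-8")), provided the file in res/raw was actually saved as UTF-8.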

Android: How to read a txt file which contains Chinese characters?

I have a txt file that contains many Chinese characters, located at res/raw/test.txt. I want to read the file, but somehow I can't get the Chinese characters to display correctly. Here is my code:
try {
InputStream inputstream = getResources().openRawResource(R.raw.test);
BufferedReader bReader = new BufferedReader(
new InputStreamReader(inputstream,Charset.forName("UTF-8")));
String line = null;
while ((line= bReader.readLine())!= null) {
Log.i("lolo", line);
System.out.println("here is some chinese character 这是一些中文字");
}
} catch (IOException e) {
e.printStackTrace();
}
Neither Log.i("lolo", line); nor System.out.println("here is some chinese character 这是一些中文字") shows the characters correctly; I can't even see the Chinese characters printed by the println() call.
What can I do to fix this problem? Can anybody help me?
In order to correctly handle non-ASCII characters such as UTF-8 multi-byte characters, it's important to understand how these characters are encoded and displayed.
Your console (output screen) may not support the display of non-ASCII characters. If that's the case, your UTF-8 characters will be displayed as garbage. Sometimes, you will be able to change the character encoding on the console. Sometimes not.
Even if the console correctly displayed UTF-8 characters, it's possible that your string does not correctly store the Chinese characters. You may think that it's correct because your editor displays them, but ensure that the character encoding of your editor also supports UTF-8.
I was also trying to figure that out. First, open the .txt file in Notepad and click File->Save As. There you will see a drop-down menu labeled Encoding; change it to UTF-8. After saving the file, remove the .txt extension from the file name, add the file to res/raw, and then you can refer to it from code as R.raw.txtFileName.
That's all. Below is my code, in which I used Chinese characters and could display them in the emulator.
If you have any other questions, let me know, because I am also developing something related to characters. Here is the code:
public List<String> getWords() {
List<String> contents = new ArrayList<String>();
try {
InputStream inputStream = getResources().openRawResource(R.raw.chardb);
BufferedReader input = new BufferedReader(new InputStreamReader(inputStream,Charset.forName("UTF-8")));
try {
String line = null;
while (( line = input.readLine()) != null){
contents.add(line);
}
}
finally {
input.close();
}
}
catch (IOException ex){
ex.printStackTrace();
}
return contents;
}
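
One caveat with the Notepad step: some Notepad versions write a byte order mark (BOM) at the start of a UTF-8 file, and Java's UTF-8 decoder keeps it as the character U+FEFF at the beginning of the first line. If the first entry returned by getWords() looks right but fails comparisons, that may be the cause. A minimal sketch of removing it, using a hypothetical BomStripper helper:

```java
class BomStripper {
    // Removes a leading U+FEFF (byte order mark) if one is present.
    static String stripBom(String text) {
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }
}
```

Applying it to the first line only is enough, since a BOM is written only at the very start of the file.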
