When I display bullet points, copyright symbols, and trademark signs in a web browser, they look fine.
// bullets: http://losangeles.craigslist.org/wst/acc/2900906683.html
// bullets: http://losangeles.craigslist.org/lac/acc/2902059059.html
// bullets: http://indianapolis.craigslist.org/acc/2867115357.html
// bullets: http://indianapolis.craigslist.org/ofc/2885697780.html
// bullets: http://indianapolis.craigslist.org/ofc/2887554512.html
// copyright: http://chicago.craigslist.org/nwc/acc/2854640931.html
But I get "question marks inside triangles" when I use an Android WebView with:
web.loadDataWithBaseURL(null, myHtml, null, "UTF-8", null);
Should I be using a different encoding?
Should I be searching/replacing certain characters myself... 1-by-1?
Try setting the default text encoding through the WebView settings:
myWebView = (WebView)findViewById(R.id.mywebView);
WebSettings settings = myWebView.getSettings();
settings.setDefaultTextEncodingName("UTF-8");
I've run into this problem before. I would make sure that your myHtml String is already correctly decoded before you load it into your WebView. You can check that by logging it with Log.d(). If the encoding is wrong in that String, then it won't show properly in the WebView either, and you'll see the same weird characters in LogCat.
If that is the case, make sure that when you read the data into your myHtml String you use something like an InputStreamReader and pass it "UTF-8" as the character encoding.
I would change the line of code that you're using from:
BufferedReader buffer = new BufferedReader(new InputStreamReader(content), 1000);
to:
BufferedReader buffer = new BufferedReader(new InputStreamReader(content, "UTF-8"), 1000);
The documentation for this version of the constructor says:
Constructs a new InputStreamReader on the InputStream in. The character converter that is used to decode bytes into characters is identified by name by enc. If the encoding cannot be found, an UnsupportedEncodingException error is thrown.
See http://developer.android.com/reference/java/io/InputStreamReader.html and look at the second constructor listed there.
EDIT: If that doesn't work, you could try using:
String s = EntityUtils.toString(entity, HTTP.UTF_8);
which is from Android Java UTF-8 HttpClient Problem
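To see why the charset passed to InputStreamReader matters, here is a minimal sketch in plain Java (no Android classes). The class name DecodeDemo and the helper readAll are made up for illustration; the byte arrays stand in for whatever your HTTP response actually contains.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Hypothetical helper: reads the whole stream as text using the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u2022 item \u00A9 \u2122"; // bullet, copyright, trademark
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Bytes decoded with the charset they were written in: text survives.
        String good = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        // UTF-8 bytes decoded as ISO-8859-1: each symbol turns into several wrong characters.
        String bad = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);
        // A lone ISO-8859-1 copyright byte (0xA9) is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        String replaced = readAll(new ByteArrayInputStream(new byte[] {(byte) 0xA9}),
                StandardCharsets.UTF_8);

        System.out.println(good.equals(original));      // true
        System.out.println(bad.equals(original));       // false
        System.out.println(replaced.equals("\uFFFD"));  // true
    }
}
```

U+FFFD is the glyph that usually renders as a question mark inside a shape, so seeing it suggests the bytes were not in the encoding the reader assumed.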
My Android application uses a web service that returns its response in JSON format (UTF-8 encoded). I decode the JSON data as UTF-8 as well, but some special symbols (e.g. the degree Celsius symbol) are still displayed as question marks:
InputStream is = con.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(is,"UTF-8"));
JSON:
{
"option1":"109.5?",
"option2":"109?",
"option3":"120?",
"option4":"180?",
"ans_option":"",
"qd_id":76,
"questions":"In alkanes the bond angle is"
}
You have to specify the "UTF-8" charset for this issue:
http://developer.android.com/reference/java/nio/charset/Charset.html
You have to encode the characters you expect, like this:
URLEncoder.encode("Your Special Character", "UTF-8");
Check this question as well:
Android: Parsing special characters (ä,ö,ü) in JSON
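One thing worth checking first: the JSON in the question contains plain "?" characters. That can happen (though it is only one possible explanation) when the text was encoded on the sending side with a charset that has no degree sign, in which case no client-side decoding can bring the symbol back. The sketch below is plain Java; QuestionMarkDemo and roundTrip are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QuestionMarkDemo {
    // Hypothetical helper: encodes text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset); // unmappable characters become the charset's replacement byte
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String angle = "109.5\u00B0"; // 109.5 followed by the degree sign

        // US-ASCII has no degree sign, so it is replaced by '?' at encoding time.
        System.out.println(roundTrip(angle, StandardCharsets.US_ASCII)); // 109.5?
        // UTF-8 can represent it, so the text survives unchanged.
        System.out.println(roundTrip(angle, StandardCharsets.UTF_8).equals(angle)); // true
    }
}
```

If the raw response bytes already contain 0x3F ("?") where the symbol should be, the fix belongs on the server rather than in the Android reader.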
I am using this httpclient: http://loopj.com/android-async-http/
I am getting a json with this httpclient.
I want to set the character encoding of this HTTP client. The JSONObject that the client returns contains Turkish characters such as şğöü, but they come back corrupted and I can't read them.
How can I set the character encoding of this HTTP client?
The correct approach is for the server to declare the encoding of the returned page.
If it does, the response will be decoded correctly.
But if it doesn't declare an encoding, async-http seems to assume UTF-8, and judging by the code it doesn't seem to support supplying a different default.
Relevant code in AsyncHttpResponseHandler:
// Interface to AsyncHttpRequest
void sendResponseMessage(HttpResponse response) {
...
responseBody = EntityUtils.toString(entity, "UTF-8");
If you want a different default, you will need to use your own version of AsyncHttpResponseHandler or submit a patch that makes the default encoding configurable.
I resolved this problem by modifying the loopj source file "AsyncHttpResponseHandler.java":
void sendResponseMessage(HttpResponse response){
.........
//responseBody = EntityUtils.toString(entity, "UTF-8");
responseBody = EntityUtils.toString(entity, "ISO-8859-1");
}
Decoding as ISO-8859-1 gave me the correct characters. Note that this only helps when the server really sends ISO-8859-1 without declaring it.
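Rather than hard-coding one charset, a patched handler could read the charset from the Content-Type header and fall back to a default only when none is declared. The following is a rough sketch in plain Java, not part of the loopj library; the class ContentTypeCharset and the method charsetFrom are hypothetical, and the header value is assumed to be passed in as a String.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class ContentTypeCharset {
    // Hypothetical helper: extracts the charset parameter from a Content-Type value,
    // returning the fallback when it is missing, malformed, or unsupported.
    static Charset charsetFrom(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers both illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFrom("application/json; charset=ISO-8859-1", StandardCharsets.UTF_8)); // ISO-8859-1
        System.out.println(charsetFrom("application/json", StandardCharsets.UTF_8));                     // UTF-8
    }
}
```

The returned charset name could then be passed to EntityUtils.toString(entity, ...) in place of the fixed "UTF-8". For Turkish text specifically, keep in mind that ş and ğ are not part of ISO-8859-1, so the right fallback depends on what the server actually sends.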
How do I get text from a basic HTML page and show it in a TextView?
I want to do it this way because it will look better than having a WebView showing the text.
There is only one line on the HTML page. I can change it to a .txt file if needed.
Could you also provide a quick example?
You would need to download the HTML first using something like HttpClient to retrieve the data from the Internet (assuming you need to get it from the Internet and not a local file). Once you've done that, you can either display the HTML in a WebView, like you said, or, if the HTML is not complex and contains nothing other than some basic tags (<a>, <img>, <strong>, <em>, <br>, <p>, etc), you can pass it straight to the TextView since it supports some basic HTML display.
To do this, you simply call Html.fromHtml, and pass it your downloaded HTML string. For example:
TextView tv = (TextView) findViewById(R.id.MyTextview);
tv.setText(Html.fromHtml(myHtmlString));
The fromHtml method will parse the HTML and apply some basic formatting, returning a Spanned object which can then be passed straight to TextView's setText method. It even supports links and image tags (for images, though, you'll need to implement an ImageGetter to actually provide the respective Drawables). But I don't believe it supports CSS or inline styles.
How to download the HTML:
myHtmlString in the snippet above needs to contain the actual HTML markup, which of course you must obtain from somewhere. You can do this using HttpClient.
private String getHtml(String url)
{
    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet(url);
    try
    {
        HttpResponse response = client.execute(request);
        BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
        String line;
        StringBuilder builder = new StringBuilder();
        while((line = reader.readLine()) != null) {
            builder.append(line + '\n');
        }
        return builder.toString();
    }
    catch(Exception e)
    {
        //Handle exception (no data connectivity, 404, etc)
        return "Error: " + e.toString();
    }
}
It's not enough to just use that code, however, since it should really be done on a separate thread (in fact, Android might flat out refuse to make a network connection on the UI thread). Take a look at AsyncTask for more information on that. You can find some documentation here (scroll down a bit to "Using AsyncTask").
I'm downloading a website's source code using HttpClient and then I want to extract some data using regular expressions. Unfortunately, the website is encoded in iso-8859-1, which seems to be causing problems. Here's the sample code to download the website:
HttpGet query = new HttpGet(url);
HttpResponse queryResponse = httpClient.execute(query);
String queryText = EntityUtils.toString(queryResponse.getEntity()).replaceAll("\r", " ").replaceAll("\n", " ");
And then the expression:
Pattern pattern = Pattern.compile("<p class=\"qt\">(.*?)</p>");
Matcher matcher = pattern.matcher(queryText);
while (matcher.find()) { /* do something */ }
The problem is that it misses some occurrences when there are special iso-8859-1 characters. (.*?) doesn't seem to match them. What's the reason for this problem? How do I fix it?
Are you sure this has to do with "special iso-8859-1 characters" and not newlines? The dot (.) does not match line terminators by default. You can use the DOTALL flag to make it match line terminators as well, e.g.:
Pattern pattern = Pattern.compile("<p class=\"qt\">(.*?)</p>", Pattern.DOTALL);
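A small, self-contained sketch of the difference, using made-up sample HTML. DotAllDemo and extract are hypothetical names for this illustration; the regular expression is the one from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotAllDemo {
    // Hypothetical helper: returns the contents of every <p class="qt"> element
    // found with the question's pattern, compiled with the given flags.
    static List<String> extract(String html, int flags) {
        Pattern pattern = Pattern.compile("<p class=\"qt\">(.*?)</p>", flags);
        Matcher matcher = pattern.matcher(html);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        // The second paragraph contains a line break inside the element.
        String html = "<p class=\"qt\">one line</p>\n<p class=\"qt\">two\nlines</p>";

        System.out.println(extract(html, 0).size());              // 1
        System.out.println(extract(html, Pattern.DOTALL).size()); // 2
    }
}
```

Without the flag, the paragraph that spans two lines is silently skipped, which looks exactly like "missing occurrences".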
I need to use a WebView to load certain webpages and dynamically change the css before showing them to the user (which means I have to delete all <link> tags and append the one with my css). (Why? Because I want to adapt the look of a particular site - which is not mine - for smartphones)
Now, I've seen that similar questions have been answered that the only way to modify the html before showing it to the user is by executing some javascript in the onPageFinished method; this could be a solution, but I'd like to consider other possibilities as well.
So, my questions are:
1) If I go deeper in the source of the WebView class, is it possible to find where the html is loaded from the site, so that I have direct access to the html and I can modify it as I want?
2) If yes, is WebView the class that handles the communication and retrieves the html? If not, which one is it?
3) Assuming that what I asked is possible, do you think that the application would perform better if the modifications to the html were made this way instead of using javascript?
You can use HttpClient to perform an HTTP GET and retrieve the HTML response, something like this:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet(url);
HttpResponse response = client.execute(request);
String html = "";
InputStream in = response.getEntity().getContent();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
StringBuilder str = new StringBuilder();
String line = null;
while((line = reader.readLine()) != null)
{
str.append(line);
}
in.close();
html = str.toString();
You can now modify your html String as needed and load the result (myModifiedHtml below) into the WebView:
WebView webview=(WebView)findViewById(R.id.mywebview);
webview.loadData(myModifiedHtml, "text/html", "UTF-8");
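The modification step itself (dropping the page's <link> tags and adding your own stylesheet) could look roughly like the sketch below. It is plain Java string handling; StylesheetSwap, swapStylesheets and the file name mobile.css are hypothetical, and a regex-based approach like this is only an approximation that may break on unusual markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StylesheetSwap {
    // Hypothetical helper: removes every <link ...> tag and inserts a single
    // stylesheet link just before </head> (or at the very start if there is no head).
    static String swapStylesheets(String html, String cssHref) {
        String stripped = html.replaceAll("(?is)<link\\b[^>]*>", "");
        String myLink = "<link rel=\"stylesheet\" href=\"" + cssHref + "\">";
        Matcher head = Pattern.compile("(?i)</head>").matcher(stripped);
        if (head.find()) {
            return stripped.substring(0, head.start()) + myLink + stripped.substring(head.start());
        }
        return myLink + stripped;
    }

    public static void main(String[] args) {
        String html = "<html><head><LINK rel=\"stylesheet\" href=\"a.css\"><title>t</title></head>"
                + "<body>x</body></html>";
        System.out.println(swapStylesheets(html, "mobile.css"));
    }
}
```

The returned String plays the role of myModifiedHtml in the snippet above.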
You can do it easily by enabling JavaScript on your WebView and executing:
document.getElementsByTagName('html')[0].innerHTML
Check this answer for the detailed procedure.