SpannableStringBuilder limited to 9,999 characters? - android

My app reads large amounts of data from text-file assets and displays them on-screen in a TextView. (The largest is ~450k.) I read each file line by line into a SpannableStringBuilder (since there is some metadata I remove, such as section names). This approach has worked without complaints in the two years that I've had the app on the market (over 7k active device installs), so I know that the code is reasonably correct.
However, I recently got a report from a user on an LG Lucid (LGE VS840 4G, Android 2.3.6) that the text is truncated. According to the log entries, my app only got 9,999 characters in the buffer. Is this a known issue with SpannableStringBuilder? Are there other recommended ways to build a large Spannable buffer? Any suggested workarounds?
Other than keeping a separate expected length that I update each time I append to the SpannableStringBuilder, I don't even have a good way to detect the error, since the append interface returns the object, not an error!
My code that reads in the data is:
currentOffset = 0;
try {
InputStream is = getAssets().open(filename);
BufferedReader br = new BufferedReader(new InputStreamReader(is));
ssb.clear();
jumpOffsets.clear();
ArrayList<String> sectionNamesList = new ArrayList<String>();
sectionOffsets.clear();
int offset = 0;
while (br.ready()) {
String s = br.readLine();
if (s.length() == 0) {
ssb.append("\n");
++offset;
} else if (s.charAt(0) == '\013') {
jumpOffsets.add(offset);
String name = s.substring(1);
if (name.length() > 0) {
sectionNamesList.add(name);
sectionOffsets.add(offset);
if (showSectionNames) {
ssb.append(name);
ssb.append("\n");
offset += name.length() + 1;
}
}
} else {
if (!showNikud) {
// Remove nikud based on Unicode character ranges
// Does not replace combined characters (\ufb20-\ufb4f)
// See
// http://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
s = s.replaceAll("[\u05b0-\u05c7]", "");
}
if (!showMeteg) {
// Remove meteg based on Unicode character ranges
// Does not replace combined characters (\ufb20-\ufb4f)
// See
// http://en.wikipedia.org/wiki/Unicode_and_HTML_for_the_Hebrew_alphabet
s = s.replaceAll("\u05bd", "");
}
ssb.append(s);
ssb.append("\n");
offset += s.length() + 1;
}
}
sectionNames = sectionNamesList.toArray(new String[0]);
currentFilename = filename;
Log.v(TAG, "ssb.length()=" + ssb.length() +
", daavenText.getText().length()=" +
daavenText.getText().length() +
", showNikud=" + showNikud +
", showMeteg=" + showMeteg +
", showSectionNames=" + showSectionNames +
", currentFilename=" + currentFilename
);
After looking over the interface, I plan to replace the showNikud and showMeteg cases with InputFilters.

Is this a known issue with SpannableStringBuilder?
I see nothing in the source code to suggest a hard limit on the size of a SpannableStringBuilder. Given your experiences, my guess is that this is a problem particular to that device, due to a stupid decision by an engineer at the device manufacturer.
Any suggested workarounds?
If you are distributing through the Google Play Store, block this device in your console.
Or, don't use one massive TextView, but instead use several smaller TextView widgets in a ListView (so they can be recycled), perhaps one per paragraph. This should have the added benefit of reducing your memory footprint.
Or, generate HTML and display the content in a WebView.

After writing (and having the user run) a test app, it appears that his device has this arbitrary limit for SpannableStringBuilder, but not StringBuilder or StringBuffer. I tested a quick change to read into a StringBuilder and then create a SpannableString from the result. Unfortunately, that means that I can't create the spans until it is fully read in.
I have to consider using multiple TextView objects in a ListView, as well as using Html.fromHtml, to see if that works better for my app's long-term plans.
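One possible way around the "can't create the spans until it is fully read in" problem is to record the span ranges while reading into the StringBuilder and apply them afterwards. The following is only a sketch in plain Java, assuming (as in the question's code) that a line starting with '\013' carries a section name; the class DeferredSpans and its fields are hypothetical names, not part of the original app.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the text in a plain StringBuilder and remembers
// where each section name landed, so spans can be applied after reading.
final class DeferredSpans {
    final StringBuilder text = new StringBuilder();
    // Each entry is {start, end} of a section name inside 'text'.
    final List<int[]> sectionRanges = new ArrayList<>();

    static DeferredSpans read(BufferedReader br) throws IOException {
        DeferredSpans result = new DeferredSpans();
        String line;
        while ((line = br.readLine()) != null) {
            if (!line.isEmpty() && line.charAt(0) == '\013') {
                String name = line.substring(1);
                if (name.isEmpty()) {
                    continue; // jump marker only, nothing to display
                }
                int start = result.text.length();
                result.text.append(name);
                result.sectionRanges.add(new int[] {start, result.text.length()});
            } else {
                result.text.append(line);
            }
            result.text.append('\n');
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String sample = "\013Intro\nfirst line\n\n\013\nsecond line\n";
        DeferredSpans d = read(new BufferedReader(new StringReader(sample)));
        System.out.println("length=" + d.text.length()
                + ", sections=" + d.sectionRanges.size());
        // On Android, one would then wrap d.text in a SpannableString and
        // call setSpan(...) once for each {start, end} pair in sectionRanges.
    }
}
```

Whether this actually avoids the device's limit is untested here; it only shows that the offsets needed for the spans can be collected during the read rather than in a second pass.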

Related

Most efficient way of comparing long arrays of strings

I'm using the speech recognizer to get voice input from the user. It returns an array of 5 strings, which I pass to this method:
public int analyzeTag(ArrayList<String> voiceResults,Editor editor, Context context){
for (String match : voiceResults) {
Log.d(TAG, match);
if (match.equalsIgnoreCase(context.getResources().getString(R.string.first_tag))){
editor.append(context.getResources().getString(R.string.first_tag));
return 1;
}
else if (match.equalsIgnoreCase(context.getResources().getString(R.string.second_tag))){
editor.append(context.getResources().getString(R.string.second_tag));
return 1;
}
//etc....(huge list of tags)
//Some tags might also have acceptable variations, example:
else if (match.equalsIgnoreCase("img") || match.equalsIgnoreCase("image"))
{
editor.append("img"); //the string to append is always taken from the first variation
return 1;
}
}
return 0;
}
This method compares the results with a list of tags. The tag list will be pretty big, with hundreds of tags, so I would like to find the most efficient way to do this operation.
I need help with:
1. Is my way of comparing results the most efficient? Is there a better way? (From the user-experience perspective, I don't want users waiting a long time to get a result.)
The voice input will be a big part of my app, so this method will be called quite often.
2. I have a long list of tags, and the if()/else if() route is obviously going to be quite repetitive. Is there a way to iterate over them instead? Some tags might have variations (even more than one): variation 1 ("img") will be the same for everyone, but other variations will be locale/language sensitive, for example "image" for English users, "immagini" for Italian users, etc.
The text appended to the editor will always be taken from the first variation.
How about putting the tags in a string-array resource and then iterating through the array?
String[] tags = context.getResources().getStringArray(R.array.tags);
for (String match : voiceResults) {
for (int index = 0; index < tags.length; index++ ) {
if (match.equalsIgnoreCase(tags[index])) {
editor.append(tags[index]);
}
}
}
Here's the doc on StringArray
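For hundreds of tags with several accepted variations each, a hash map from every variation to its first variation avoids scanning the whole list for each recognizer result. This is only a sketch: TagLookup, register and match are hypothetical names, and the variations would presumably be loaded from locale-specific resources rather than hard-coded as they are here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup table: every accepted variation maps to the canonical
// tag (the "first variation") that should be appended to the editor.
final class TagLookup {
    private final Map<String, String> canonicalByVariation = new HashMap<>();

    // The first element of 'variations' is treated as the canonical tag.
    void register(String... variations) {
        for (String v : variations) {
            canonicalByVariation.put(v.toLowerCase(Locale.ROOT), variations[0]);
        }
    }

    // Returns the canonical tag for the first recognised result, or null.
    String match(List<String> voiceResults) {
        for (String candidate : voiceResults) {
            String tag = canonicalByVariation.get(candidate.toLowerCase(Locale.ROOT));
            if (tag != null) {
                return tag;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TagLookup lookup = new TagLookup();
        lookup.register("img", "image", "immagini");
        lookup.register("div");
        // "imagine" is unknown, "Image" matches the "image" variation
        System.out.println(lookup.match(Arrays.asList("imagine", "Image")));
    }
}
```

Lower-casing with Locale.ROOT is used here in place of equalsIgnoreCase; the two are not identical for every character, so this is an approximation of the original comparison. The table is built once, and each lookup is then a single map access per recognizer result.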

Android - How to filter emoji (emoticons) from a string?

I'm working on an Android app, and I do not want people to use emoji in the input.
How can I remove emoji characters from a string?
Emojis can be found in the following ranges (source) :
U+2190 to U+21FF
U+2600 to U+26FF
U+2700 to U+27BF
U+3000 to U+303F
U+1F300 to U+1F64F
U+1F680 to U+1F6FF
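You can use this line to filter them all at once. In Java the ranges go in a plain regex string (no /.../g delimiters), and code points above U+FFFF are written as \x{...} because \u takes exactly four hex digits:
text = text.replaceAll("[\u2190-\u21FF\u2600-\u26FF\u2700-\u27BF\u3000-\u303F\\x{1F300}-\\x{1F64F}\\x{1F680}-\\x{1F6FF}]", "");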
Latest emoji data can be found here:
http://unicode.org/Public/emoji/
There is a folder named for each emoji version.
As app developers, it is a good idea to use the latest version available.
When you look inside a folder, you'll see text files in it.
You should check emoji-data.txt. It contains all standard emoji codes.
There are a lot of small code ranges for emoji.
The best support comes from checking all of them in your app.
Some people ask why there are 5 digit codes when we can only specify 4 after \u.
Well these are codes made from surrogate pairs. Usually 2 symbols are used to encode one emoji.
For example, we have a string.
String s = ...;
UTF-16 representation
byte[] utf16 = s.getBytes("UTF-16BE");
Iterate over UTF-16
for(int i = 0; i < utf16.length; i += 2) {
Get one char
char c = (char)((char)(utf16[i] & 0xff) << 8 | (char)(utf16[i + 1] & 0xff));
Now check for surrogate pairs. Emoji are located on the first plane, so check first part of pair in range 0xd800..0xd83f.
if(c >= 0xd800 && c <= 0xd83f) {
high = c;
continue;
}
For second part of surrogate pair range is 0xdc00..0xdfff. And we can now convert a pair to one 5 digit code.
else if(c >= 0xdc00 && c <= 0xdfff) {
low = c;
long unicode = (((long)high - 0xd800) * 0x400) + ((long)low - 0xdc00) + 0x10000;
}
All other symbols are not pairs so process them as is.
else {
long unicode = c;
}
Now use data from emoji-data.txt to check if it's emoji.
If it is, then skip it. If not then copy bytes to output byte array.
Finally byte array is converted to String by
String out = new String(outarray, Charset.forName("UTF-16BE"));
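The byte-level walk-through above can also be written with String.codePointAt, which joins a surrogate pair into one code point for you. This is a sketch under stated assumptions: the ranges are the six listed earlier on this page, not the full contents of emoji-data.txt, and EmojiRangeFilter is a hypothetical name.

```java
// Sketch only: removes code points that fall inside the six ranges listed
// earlier on this page. That list is not a complete emoji definition;
// emoji-data.txt is the authoritative source.
final class EmojiRangeFilter {
    private static final int[][] RANGES = {
        {0x2190, 0x21FF}, {0x2600, 0x26FF}, {0x2700, 0x27BF},
        {0x3000, 0x303F}, {0x1F300, 0x1F64F}, {0x1F680, 0x1F6FF},
    };

    static boolean inListedRange(int codePoint) {
        for (int[] r : RANGES) {
            if (codePoint >= r[0] && codePoint <= r[1]) {
                return true;
            }
        }
        return false;
    }

    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);    // joins a surrogate pair into one code point
            if (!inListedRange(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp); // 2 chars for supplementary code points
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two escapes after 'a' form the surrogate pair for U+1F60A.
        System.out.println(strip("a\uD83D\uDE0Ab\u2600c"));
    }
}
```

Working on code points rather than bytes means no manual high/low bookkeeping is needed, and the output is built directly as a String.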
For those using Kotlin, Char.isSurrogate can help as well. Find and remove the indexes that are true from that.
Here is what I use to remove emojis. Note: this only works on API 24 and later.
public String remove_Emojis_For_Devices_API_24_Onwards(String name)
{
// we will store all the non emoji characters in this array list
ArrayList<Character> nonEmoji = new ArrayList<>();
// this is where we will store the reassembled name
String newName = "";
//Character.UnicodeScript.of () was not added till API 24 so this is a 24 up solution
if (Build.VERSION.SDK_INT > 23) {
/* we are going to cycle through the word checking each character
to find its unicode script to compare it against known alphabets*/
for (int i = 0; i < name.length(); i++) {
// most emoji are encoded as surrogate pairs, and each half of a pair reports the script UNKNOWN
if (Character.UnicodeScript.of(name.charAt(i)) != Character.UnicodeScript.UNKNOWN) {
nonEmoji.add(name.charAt(i));//its not an emoji so we add it
}
}
// we then cycle through rebuilding the string
for (int i = 0; i < nonEmoji.size(); i++) {
newName += nonEmoji.get(i);
}
}
return newName;
}
so if we pass in a string:
remove_Emojis_For_Devices_API_24_Onwards("😊 test 😊 Indic:ढ Japanese:な 😊 Korean:ㅂ");
it returns: test Indic:ढ Japanese:な Korean:ㅂ
Emoji placement or count doesn't matter

Separating the words after the last integer in a large String

I've seen many people do similar to this in order to get the last word of a String:
String test = "This is a sentence";
String lastWord = test.substring(test.lastIndexOf(" ")+1);
I would like to do similar but get the last few words after the last int, it can't be hard coded as the number could be anything and the amount of words after the last int could also be unlimited. I'm wondering whether there is a simple way to do this as I want to avoid using Patterns and Matchers again due to using them earlier on in this method to receive a similar effect.
Thanks in advance.
I would like to get the last few words after the last int.... as the number could be anything and the amount of words after the last int could also be unlimited.
Here's a possible suggestion, using String#split:
String str = "This is 1 and 2 and 3 some more words .... foo bar baz";
String[] parts = str.split("\\d+(?!.*\\d)\\s+");
And now parts[1] holds all words after the last number in the string.
some more words .... foo bar baz
What about this one:
String test = "a string with a large number 1312398741 and some words";
String[] parts = test.split(" ");
// walk backwards so that the first number found is the last one in the string
for (int i = parts.length - 1; i >= 0; i--)
{
try
{
Integer.parseInt(parts[i]);
}
catch (NumberFormatException e)
{
// this part is not a number, so let's go on...
continue;
}
// when parsing succeeds, the last number was reached and continue has
// not been called. Everything behind 'i' is what you are looking for
// DO YOUR STUFF with parts[i+1] to parts[parts.length-1] here
break;
}
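Since the question asks to avoid Patterns and Matchers, another option is to scan backwards for the last digit and take the substring after it. A minimal sketch, assuming that any digit counts as part of "the last int" (so a digit embedded in a word such as "abc3def" would also be treated as the cut point); the class and method names are made up for illustration.

```java
final class AfterLastNumber {
    // Returns the text following the last digit in 's', trimmed;
    // returns 's' unchanged when it contains no digit at all.
    static String wordsAfterLastNumber(String s) {
        for (int i = s.length() - 1; i >= 0; i--) {
            if (Character.isDigit(s.charAt(i))) {
                return s.substring(i + 1).trim();
            }
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(wordsAfterLastNumber(
                "a string with a large number 1312398741 and some words"));
    }
}
```

This also copes with numbers too large for Integer.parseInt, because the digits are never parsed.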

Trying to only do math functions on edittexts users have entered information in on android

I have a 10-field average lap calculator. However, in testing, someone said they normally only run X laps in practice, vs. 10 (let's say 7).
I think I could use an if statement, but there'd be at least 10 of them and a bunch of clumsy code, and I'm not sure on arrays/switch statements exactly. I think all of those might be possible, but my low level of experience has yet to fully comprehend these useful tools.
CURRENT CODE:
double tenLapAvgVar = ((lap1Var + lap2Var + lap3Var + lap4Var + lap5Var + lap6Var + lap7Var + lap8Var + lap9Var + lap10Var) / 10);
So essentially, if someone leaves a field or fields blank, I want to calculate the average based on the populated fields, not 10 (if they leave 3 fields blank, calculate based on 7, for instance). Any help you guys could provide would be much appreciated, thanks!
You could have an ArrayList<EditText> object and a method which iterates over it and adds up the values. Something like:
public double getLapAverage()
{
int noOfCompletedLaps = 0;
double lapAve = 0;
double lapsTotal = 0;
for(EditText text : textBoxes)
{
if(text.getText().toString().length() > 0)
{
//pseudocode, and assuming text is numerical
lapsTotal += Double.parseDouble(text.getText().toString());
noOfCompletedLaps++;
}
}
if( noOfCompletedLaps > 0)
{
lapAve = lapsTotal / noOfCompletedLaps;
}
return lapAve;
}
Maybe it would be better if you used an array instead of 10 different variables.
Then you can use a for statement and initialize them to 0, afterwards let the user fill the array and count how many are not zero.
Finally sum up all the array and divide by the count you previously calculated.
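A rough sketch of the array idea follows. It is a variant rather than exactly what is described above: it skips blank entries instead of counting non-zero ones, so a genuine value of 0 would still be included. The String array stands in for the text read from each EditText, and LapAverage is a hypothetical name.

```java
final class LapAverage {
    // 'inputs' stands in for the text of each EditText; blank entries are skipped.
    static double average(String[] inputs) {
        double total = 0;
        int count = 0;
        for (String raw : inputs) {
            String t = (raw == null) ? "" : raw.trim();
            if (t.isEmpty()) {
                continue; // field left blank, do not count it
            }
            total += Double.parseDouble(t); // throws NumberFormatException on non-numeric text
            count++;
        }
        return count == 0 ? 0 : total / count;
    }

    public static void main(String[] args) {
        String[] laps = {"10", "20", "", "30", "", "", "", "", "", ""};
        System.out.println(average(laps));
    }
}
```

With three of the ten fields filled in, the total is divided by 3 rather than 10.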

Processing large files on Android

I am writing an Android app which requires me to process a very large file (say 50MB). Each line of the file contains three entries: a userid, an artistid, and the number of times the user has listened to that artist. I wrote code to find the most popular artist based on the number of times the artist has been heard. I do this with a hash map from each artist id (key) to the number of times that artist has been heard (value).
The code works fine as long as the file is below 20MB (I run the app on a Nexus S 4G, so the heap is 32MB), but I get an 'Out of memory' error for larger files. I realize the code is very badly written. Any suggestions as to how I could get around this problem?
while ((text = inRd.readLine()) != null) {
StringTokenizer st = new StringTokenizer(text);
while (st.hasMoreTokens()) {
int userid = Integer.parseInt(st.nextToken());
artistid = Integer.parseInt(st.nextToken());
int no_times = Integer.parseInt(st.nextToken());
if (hm.containsKey(artistid)) {
Integer oldval = (Integer) hm.get(artistid);
Integer newval = no_times + oldval;
hm.remove(artistid);
hm.put(artistid, newval);
} else
hm.put(artistid, no_times);
}
}
A hashmapping is not the most efficient way to store things. Try a Vector or straight arrays.
Hope this Helps
Cliff
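To illustrate the "straight arrays" suggestion, here is a sketch that streams the file one line at a time and keeps the totals in a growable int[] indexed by artist id. It assumes artist ids are non-negative and small enough to index an array, which the question does not state; with sparse or very large ids this would waste memory. ArtistCounts and mostPopular are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.StringTokenizer;

final class ArtistCounts {
    // Returns the artist id with the highest total play count, or -1 for empty input.
    // Assumption: artist ids are non-negative and small enough to index an array.
    static int mostPopular(BufferedReader in) throws IOException {
        int[] plays = new int[1024];
        String line;
        while ((line = in.readLine()) != null) {
            StringTokenizer st = new StringTokenizer(line);
            if (st.countTokens() < 3) {
                continue; // skip blank or malformed lines
            }
            st.nextToken(); // the user id is not needed for the totals
            int artistId = Integer.parseInt(st.nextToken());
            int noTimes = Integer.parseInt(st.nextToken());
            if (artistId >= plays.length) {
                // grow the array so that artistId becomes a valid index
                plays = Arrays.copyOf(plays, Math.max(artistId + 1, plays.length * 2));
            }
            plays[artistId] += noTimes;
        }
        int best = -1;
        for (int id = 0; id < plays.length; id++) {
            if (plays[id] > 0 && (best < 0 || plays[id] > plays[best])) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        String sample = "1 7 3\n2 9 10\n3 7 8\n";
        System.out.println(mostPopular(new BufferedReader(new StringReader(sample))));
    }
}
```

A primitive array avoids allocating an Integer object per key and per value. Whether that is enough to stay inside a 32MB heap depends on how many distinct artist ids the file contains.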
