Android Java.Lang Locale Number Format I/O Asymmetry Problem

Historically the Android phones sold in South Africa provided English.US and English.UK locale support, but recently English.ZA (South Africa) has made an appearance, on Android 9.0 Samsung Galaxy A10, for example.
This particular Locale treats number formats asymmetrically: it uses the Locale.DE (German/Dutch) convention when converting Floats and Doubles into character strings[*1], but raises java.lang.NumberFormatException when reading back the self-same generated strings. For instance:
// on output
Float fltNum = 1.23456F;
System.out.println(String.format(Locale.getDefault(),"%f",fltNum)); // prints 1,234560 (%f defaults to six decimals)
// on Input
String fltStr = "1,23456";
Float fltVal;
fltVal = Float.valueOf(fltStr); // throws NumberFormatException
fltVal = Float.parseFloat(fltStr); // also throws NumberFormatException
// Giving the parser a Float hint, fltStr = "1,23456F", does not help
// Only fltStr = "1.23456" converts into a Float.
The temptation would be to swap decimal separators on reads, but that is the task of Float.parseFloat(), not of the programmer, because doing so would again break other Locale.DE-likes, such as Locale.ID (Indonesia), which my app supports.
My additional question, directed more at Locale arbitrators, is: does English.ZA not imply conformance with English conventions, just as, say, German.NA (Namibia) would conform to German ones? One would think the natural designation for this particular number conversion would be Dutch.ZA (colloquially 'Afrikaans'), for Dutch conformance, yet Android designates it as English.ZA.
NB (*1) This Android English.ZA conforms only partially as it does not produce either the German point group separator or the local clerical (pen-and-paper) space character group separator.

Apologies for using 'Answer' to respond to diogenesgg's comment suggestion:
"Hi, please take a look at this answer stackoverflow.com/questions/5233012/…. TL/DR."
In it I found a few gems -
(1)
NumberFormat f = NumberFormat.getInstance(Locale.getDefault());
if (f instanceof DecimalFormat) {
((DecimalFormat) f).setDecimalSeparatorAlwaysShown(true);
}
But this is neutral and not value-specific so I added after the above,
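(2) Given:
String axisValue = "some-float-value-rendered as string";
// The DecimalFormat constructor expects a pattern, not the value to parse,
// so obtain the locale's format instead.
NumberFormat nf = NumberFormat.getInstance(Locale.getDefault());
Which I incorporate sequentially:
NumberFormat nf = NumberFormat.getInstance(Locale.getDefault());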
Number n;
if(nf instanceof DecimalFormat){
try{
n = nf.parse (axisValue);
axisComponent = (Double) n;
} catch (java.text.ParseException jtpe) {
jtpe.printStackTrace();
}
}
Notice the need to cast the Number n to Double.
This worked mostly under the problematic Locale English.ZA, until the value 0,00000 showed up.
For the string value "0,00000", NumberFormat decides Number n is a Long, and the system throws a ClassCastException (Long cannot be cast to Double).
I tried to trick NumberFormat in all ways I can to view 0 as a Float or Double to no avail, so 0 is a border problem that Number (NumberFormat.DecimalFormat) does not tolerate.
But this NumberFormat workaround does not resolve the asymmetry problem of Android 9 Locale.English(ZA).DecimalFormat emitting Locale.DE (comma decimal separator) while the primitive parsers accept only the decimal dot separator.
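The cast failure on "0,00000" can probably be avoided by asking the parsed Number for its double value instead of casting it, since Number.doubleValue() works whether parse() hands back a Long or a Double. The following is only a sketch under that assumption; the class and method names (LocaleRoundTrip, format, parse) are made up for illustration, and the five fraction digits simply mirror the values shown above.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper: formats and parses with the SAME locale so output and input stay symmetric.
final class LocaleRoundTrip {

    // Renders the value with exactly five fraction digits and no grouping separator.
    static String format(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        nf.setGroupingUsed(false);
        nf.setMinimumFractionDigits(5);
        nf.setMaximumFractionDigits(5);
        return nf.format(value);
    }

    // Parses text produced by format(); doubleValue() avoids casting a Long to Double.
    static double parse(String text, Locale locale) {
        NumberFormat nf = NumberFormat.getInstance(locale);
        try {
            Number n = nf.parse(text);
            return n.doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in locale " + locale + ": " + text, e);
        }
    }
}
```

With a comma-decimal locale such as Locale.GERMANY standing in for the problematic one, LocaleRoundTrip.parse(LocaleRoundTrip.format(x, locale), locale) should return x to five decimals, and parsing "0,00000" yields 0.0 rather than an exception.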
Incidentally, getting past the DecimalFormat problem exposed a myriad of other problems under this new-fangled English.ZA, of my App assuming system primitives working equally well with native resources. Well semantics so used require string comparison to work between primitive and native!
For example system file Primitive path names rendered in Native generating 'file not found', or even more problematic, using primitive string keys semantically only to being rendered meaningless on Native lookup.
I'm not sure which is the lesser evil, the asymmetric locale English.ZA or my use of Primitives in semantics to thrust upon Natives. A futile exercise!
Right now I'm embarking on separating system primitives, including their semantic variants from ANY Native language resource strings ...
My lifetime of programming under system primitives needs an overall makeover.
Maybe I can keep an Assets repository for the primitives (resource or semantic) and have Natives look that up for system resources or for semantic Meaning.

Related

iOS get width of each character in a string for a given font

I would like a similar function to Android's:
TextPaint.getTextWidths(String text, int index, int count, float[] widths);
Basically what this does is - for a given font, it will return all the individual character widths of that string. I do not need the range (index/count) as I will always be doing it for the full string.
The following is actually Java RoboVM bindings of iOS functions, but it should be fairly easy to understand the classes I am using and how. I know how to get the bounding box of the string, like this;
NSString nsString = new NSString(string);
CGSize dim = nsString.getSize(uiFont);
Unfortunately the solution is not as simple as iterating through the characters in the string, because this gives invalid results for complex scripts such as Arabic, where a character's appearance and size also depend on its neighbors. Android's getTextWidths() deals with this though, so I am hoping there is similar functionality available on iOS.
Thanks!

How to describe duration in Android?

I'm writing a small app and I need to write the duration of a sport event with i18n support. I'm using the PrettyTime library for dates, but when I attempt to use DateUtils or PrettyTime, I have issues.
For example, I want to say that the duration is 2 minutes. I need some way to pass it to a library which supports i18n, accepts milliseconds and returns text.
In Android we have:
com.android.internal.R.plurals.duration_minutes
But I can't access it from my app. Is there a correct way to do this without writing my own plurals for all languages?
Thank you

I am not sure which issues you are talking about in the context of Android's DateUtils and the PrettyTime library. But I know for sure that Android's DateUtils does not perfectly manage the plural rules of various languages (especially not Slavic languages or Arabic, because it only knows a singular and one plural form, which is too simple). See for example this Android issue. About the PrettyTime library, the same objection is valid if you consider Arabic - see the source.
My recommendation:
Try out my library Time4A (a new AAR-library). Then you can use this code to process a millisecond-input and to produce a localized minute-string:
// input
long millis = 1770123;
// create a duration
Duration<ClockUnit> duration = Duration.of(millis, ClockUnit.MILLIS);
// normalization to (rounded) minutes
duration = duration.with(ClockUnit.MINUTES.rounded());
String s = PrettyTime.of(Locale.ENGLISH).print(duration, TextWidth.WIDE);
System.out.println(s); // 30 minutes
Example for Korean (answer to the comment of @Gabe Sechan):
String s = PrettyTime.of(new Locale("ko")).print(duration, TextWidth.WIDE);
System.out.println(s); // 30분 (Korean translation of "30 minutes")
Example for Arabic (right to left):
String s = PrettyTime.of(new Locale("ar")).print(duration, TextWidth.WIDE);
System.out.println(s); // ٣٠ دقيقة
This solution currently supports ~90 languages (more than in PrettyTime-library) and three text widths (full, abbreviated or narrow). Accurate pluralization handling is automatically included. Time4A uses its own language resources based on CLDR-data (independent from Android). But you are free to override those resources by defining your own assets (in UTF-8).
About normalization: I just showed the variant which you have described in your question. However, there are many more ways to normalize durations in Time4A(J). This page will give you more ideas on how to use that feature.
If you still miss some languages then just tell me, and I will support them in the next versions of Time4A. Currently supported languages can be found in the tutorial.

Parsing HLS m3u8 file using regular expressions

I want to parse an HLS master m3u8 file and get the bandwidth, resolution and file name from it. Currently I am using string parsing: I search the string for some patterns and take substrings to get the values.
Example File:
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234
Stream1/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=763319,RESOLUTION=480x270
Stream2/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1050224,RESOLUTION=640x360
Stream3/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1910937,RESOLUTION=640x360
Stream4/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3775816,RESOLUTION=1280x720
Stream5/index.m3u8
But I found that we can parse it using regular expressions, as mentioned in this question:
Problem matching regex pattern in Android
I don't have any knowledge of regular expressions, so can someone please guide me in parsing this using a regular expression?
Or can someone help me write a regexp for parsing out the BANDWIDTH and RESOLUTION values from the string below?
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234

You could try something like this:
final Pattern pattern = Pattern.compile("^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+).*RESOLUTION=([\\dx]+).*");
Matcher matcher = pattern.matcher("#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234");
String bandwidth = "";
String resolution = "";
if (matcher.find()) {
bandwidth = matcher.group(1);
resolution = matcher.group(2);
}
This would set bandwidth and resolution to the correct (String) values.
I haven't tried this on an Android device or emulator, but judging from the link you sent and the Android API it should work the same as the above plain old Java.
The regex matches strings starting with #EXT-X-STREAM-INF: and containing BANDWIDTH and RESOLUTION followed by the correct value formats. The values are captured in groups 1 and 2 so we can extract them.
Edit:
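If RESOLUTION isn't always present then you can make that portion optional as such (the .* in front of RESOLUTION has to sit inside the optional group, otherwise the greedy .* consumes RESOLUTION and group 2 is always null):
"^#EXT-X-STREAM-INF:.*BANDWIDTH=(\\d+)(?:.*RESOLUTION=([\\dx]+))?.*"
The resolution string would be null in cases where only BANDWIDTH is present.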
Edit2:
? makes things optional, and (?:___) means a passive (non-capturing) group, as opposed to a capturing group (___). So it's basically an optional passive group. So yes, anything inside it will be optional.
A . matches a single character, and a * means it will be repeated zero or more times. So .* will match zero or more characters. The reason we need this is to consume anything between what we are matching, e.g. anything between #EXT-X-STREAM-INF: and BANDWIDTH. There are many ways of doing this but .* is the most generic/broad one.
\d is basically a set of characters that represent digits (0-9), but since we define the pattern as a Java string literal, we need the double \\, otherwise the Java compiler will fail because it does not recognize \d as an escape sequence. Instead it will turn \\ into \ so that we get \d in the final string passed to Pattern.compile.
[\dx]+ means one or more characters (+) out of the characters 0-9 and x. [\dx] would be a single character (no +) out of the same set of characters.
If you are interested in regex you could check out regular-expressions.info and/or regexone.com; there you will find much more in-depth answers to all your questions.
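To go from a single line to the whole master playlist, the same idea can be applied line by line, pairing each #EXT-X-STREAM-INF tag with the URI on the line after it. The sketch below is only one way to do it and varies the approach slightly: it uses two small patterns instead of one, so the attribute order does not matter and RESOLUTION may be missing. The names MasterPlaylistParser and Variant are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the variant entries of an HLS master playlist.
final class MasterPlaylistParser {

    // One variant stream: attributes from the tag line plus the URI that follows it.
    static final class Variant {
        final long bandwidth;
        final String resolution; // null when the tag has no RESOLUTION attribute
        final String uri;        // null when no URI line follows the tag

        Variant(long bandwidth, String resolution, String uri) {
            this.bandwidth = bandwidth;
            this.resolution = resolution;
            this.uri = uri;
        }
    }

    // The leading [:,] keeps BANDWIDTH from matching inside AVERAGE-BANDWIDTH.
    private static final Pattern BANDWIDTH = Pattern.compile("[:,]BANDWIDTH=(\\d+)");
    private static final Pattern RESOLUTION = Pattern.compile("[:,]RESOLUTION=(\\d+x\\d+)");

    static List<Variant> parse(String playlist) {
        List<Variant> variants = new ArrayList<>();
        String[] lines = playlist.split("\\r?\\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            if (!line.startsWith("#EXT-X-STREAM-INF:")) {
                continue;
            }
            Matcher bw = BANDWIDTH.matcher(line);
            if (!bw.find()) {
                continue; // skip tags without a BANDWIDTH attribute
            }
            Matcher res = RESOLUTION.matcher(line);
            String resolution = res.find() ? res.group(1) : null;
            // The variant URI is expected on the next line, unless that line is another tag.
            String uri = null;
            if (i + 1 < lines.length && !lines[i + 1].trim().startsWith("#")) {
                uri = lines[i + 1].trim();
            }
            variants.add(new Variant(Long.parseLong(bw.group(1)), resolution, uri));
        }
        return variants;
    }
}
```

Calling MasterPlaylistParser.parse(playlistText) on the example file from the question should give five entries, one per stream.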

You could just split the string; here's what I mean in Python.
fu = "#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234"
for chunk in fu.split(':')[1].split(','):
    if chunk.startswith('BANDWIDTH'):
        bandwidth = int(chunk.split('=')[1])
    if chunk.startswith('RESOLUTION'):
        resolution = chunk.split('=')[1]
For Jorr-el:
>>> fu = '#EXT-X-STREAM-INF:BANDWIDTH=5857392,RESOLUTION=1980x1080,CODECS="avc1.42c02a,mp4a.40.2"'
>>> for chunk in fu.split(':')[1].split(','):
...     if chunk.startswith('BANDWIDTH'):
...         bandwidth = int(chunk.split('=')[1])
...     if chunk.startswith('RESOLUTION'):
...         resolution = chunk.split('=')[1]
...
>>> bandwidth
5857392
>>> resolution
'1980x1080'

I found this one, which might help:
http://sourceforge.net/projects/m3u8parser/
(License: LGPLv3)

You can also use: Python m3u8 parser.
Example below:
import m3u8
playlist = """
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=476416,RESOLUTION=416x234
Stream1/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=763319,RESOLUTION=480x270
Stream2/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1050224,RESOLUTION=640x360
Stream3/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1910937,RESOLUTION=640x360
Stream4/index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3775816,RESOLUTION=1280x720
Stream5/index.m3u8
"""
_playlist = m3u8.loads(playlist).playlists
for item in _playlist:
    item_uri = item.uri
    resolution = item.stream_info.resolution
    bandwidth = item.stream_info.bandwidth
    print(item_uri, resolution, bandwidth)
The result will be:
Stream1/index.m3u8 (416, 234) 476416
Stream2/index.m3u8 (480, 270) 763319
Stream3/index.m3u8 (640, 360) 1050224
Stream4/index.m3u8 (640, 360) 1910937
Stream5/index.m3u8 (1280, 720) 3775816

String.format uses comma instead of point

My app is working on many devices without problems so far. But now I got my new Galaxy Tab with Android 3.2 where it crashes all the time. I found out that the problem was a float in an EditText.
I am using myEditText.setText(String.format("%.1f", fMyFloat)); to put the float in the EditText. But somehow the float on my 3.2 Galaxy Tab is generated with a comma instead of a point. When I read the EditText back the app crashes of course, telling me that this is no valid float because of the comma...
What is going wrong here?

Convert float to string..
From the documentation of String.format:
String.format(String format, Object... args)
Returns a localized formatted string, using the supplied format and arguments, using the user's default locale.
The quoted text above means that the output of String.format will match the default locale the user uses.
As an example, a comma would be used as the decimal separator if the user has a Swedish locale, but a dot if the user has an American one.
If you'd like to force what locale is going to be used, use the overload of String.format that accepts three parameters:
String.format (Locale locale, String format, Object... args)
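As a rough illustration of the three-parameter overload, the sketch below pins the locale to Locale.US when producing the text, so that Float.parseFloat (which only understands a dot) can read it back. The helper names FixedLocaleFloat, toText and fromText are assumptions made for this example, and it presumes a dot is acceptable to the user in the field.

```java
import java.util.Locale;

// Hypothetical helper: always writes a dot decimal separator, whatever the device locale is.
final class FixedLocaleFloat {

    // Formats with one decimal using Locale.US instead of the user's default locale.
    static String toText(float value) {
        return String.format(Locale.US, "%.1f", value);
    }

    // Reads back text written by toText(); Float.parseFloat is not locale-aware.
    static float fromText(String text) {
        return Float.parseFloat(text.trim());
    }
}
```

In the question's terms this would be used as myEditText.setText(FixedLocaleFloat.toText(fMyFloat)) and FixedLocaleFloat.fromText(myEditText.getText().toString()).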
Convert string to float..
Parsing an arbitrary string into a float using the default locale is quite easy: obtain the default locale's format with NumberFormat.getInstance() (in practice a DecimalFormat).
Then use .parse to get a Number and call floatValue on the returned object.

The format call on your Galaxy Tab uses a default Locale which in turn uses , as the decimal separator for floats. You could use the String.format(Locale,String,...) version with a specific locale to make things work.
Alternatively, use the same locale for both parsing and formatting the number. In that case you should probably go with NumberFormat to format and parse your floats.

String.format uses the locale you are in. You should do something like this if you want a dot:
NumberFormat formatter = NumberFormat.getInstance(Locale.US);
myEditText.setText(formatter.format(fMyFloat));
Have a look at NumberFormat for more formatting options.

Use the code below; it works for me:
// pattern (for example "0.0") and value are supplied by the caller
NumberFormat nf = NumberFormat.getNumberInstance(Locale.US);
DecimalFormat df = (DecimalFormat)nf;
df.applyPattern(pattern);
String output = df.format(value);
System.out.println(pattern + " " + output + " " + Locale.US.toString());

Summing up the previous answers, an easy way to get the dot instead of the comma in every country is this:
myEditText.setText(String.format(Locale.CANADA, "%.1f", fMyFloat));
And you will have your String formatted with the dot.

How to call mbstowcs properly?

size_t mbstowcs(wchar_t *dest, const char *src, size_t n);
I have some information encoded in GB2312 which needs to be converted to Unicode on the Android platform.
1. Before calling this method, is it right to call setlocale(LC_ALL, "zh_CN.UTF-8")?
2. How large does dest need to be?
3. What should be passed as n; is it strlen(src)?
Thank you very much.

mbstowcs() will convert a string from the current locale's multibyte encoding into a wide character string. Wide character strings are not necessarily Unicode, but on Linux they are (UCS-4, i.e. UTF-32).
If you set the locale to zh_CN.UTF-8 then the current locale's multibyte encoding will be UTF-8, not GB2312. You would need to set a GB2312 locale for the input to be treated using that multibyte encoding.
The C standard implies that a single multibyte character will produce at most one wide character, so you can use strlen(src) as the upper bound on the number of wide characters required:
size_t n = strlen(src) + 1;
wchar_t *dest = malloc(n * sizeof dest[0]);
glibc has an extension to the standard mbstowcs() interface, which allows you to pass it a NULL pointer to find out exactly how many wide characters will be produced by the conversion, but that won't help you on Android. It works like this:
size_t n = mbstowcs(NULL, src, 0) + 1;
The value of n that should be passed is the maximum number of wide characters that should be written through the dest pointer, including the terminating null wide character.
However, you should instead look into using libiconv, which has been successfully compiled for Android. It allows you to explicitly choose the source and destination character sets you are interested in, and is a much better fit for this problem.
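If the bytes can be handed to the Java side of the app (for instance through JNI), the same "name the source character set explicitly" idea is available without any native library. This is a swapped-in technique, not libiconv, and it is only a sketch: it assumes the runtime ships a charset registered under the name "GB2312", and the class name Gb2312Decoder is invented for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical helper: decodes GB2312-encoded bytes into a Java (UTF-16) String.
final class Gb2312Decoder {

    // Charset.forName throws UnsupportedCharsetException if "GB2312" is not available.
    static String decode(byte[] gb2312Bytes) {
        return new String(gb2312Bytes, Charset.forName("GB2312"));
    }
}
```

Charset.isSupported("GB2312") can be checked first if availability on the target device is in doubt.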
