Android Extract url with specific domain name from String - android

I am developing a JSON application. I am able to download all of the data but I'm running into an interesting issue. I am trying to grab a string with the domain name:
http://www.prindlepost.org/
When grabbing all of the JSON, I get an extremely large string which I am unable to paste in here. The part I am trying to parse out is:
<p>The road through Belgrade was quiet at 4 A.M. Besides the occasional whir of another car speeding by, my taxi was largely alone on the road. Through the windshield I could see the last traces of apartment blocks pass by as we left the outskirts of the city. Somewhere beyond the limits of my vision, I knew the airport waited, its converging neon runway lines already lighting up the pre-dawn darkness.</p>
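<div class="more-link-wrap wpb_button"><a href="http://www.prindlepost.org/2015/06/this-is-a-self-portrait/" class="more-link">Read more</a></div>
where I am focusing on:
<a href="http://www.prindlepost.org/2015/06/this-is-a-self-portrait/" class="more-link">Read more</a></div>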
I'm unfamiliar with extracting strings like this. In the end, I want to be able to save the URL as its own string. For example, the above would be converted into:
String url = "http://www.prindlepost.org/2015/06/this-is-a-self-portrait/";
One thing to note: there are A LOT of URLs, so narrowing down by class name may help me a bunch.
My initial guess was:
// <READ MORE>
Pattern p = Pattern.compile("href=\"(.*?)\"");
Matcher m = p.matcher(content);
String urlTemp = null;
if (m.find()) {
urlTemp = m.group(1); // this variable should contain the link URL
}
Log.d("LINK WITHIN TEXT", ""+urlTemp);
// </READ MORE>
Any help is appreciated!

It may be worth trying something like jsoup: http://jsoup.org/
Here is their link-parsing example, adapted to your HTML:
String html = "<p>The road through Belgrade was quiet at 4 A.M. Besides the occasional whir of another car speeding by, my taxi was largely alone on the road. Through the windshield I could see the last traces of apartment blocks pass by as we left the outskirts of the city. Somewhere beyond the limits of my vision, I knew the airport waited, its converging neon runway lines already lighting up the pre-dawn darkness.</p>"
+ "<div class=\"more-link-wrap wpb_button\">"
+ "<a href=\"http://www.prindlepost.org/2015/06/this-is-a-self-portrait/\" class=\"more-link\">"
+ "Read more</a></div>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a.more-link").first(); // narrow by class, since the page contains many links
String relHref = link.attr("href"); // the attribute as written; here it is already absolute
String absHref = link.attr("abs:href"); // "http://www.prindlepost.org/2015/06/this-is-a-self-portrait/"
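If pulling in a library is not an option, the regex idea from the question can be narrowed by tag, class and domain. What follows is only a rough sketch using java.util.regex. The names ReadMoreLinks and extract are made up for illustration, and it assumes double-quoted attributes. Regexes are brittle on real-world HTML, so a parser such as jsoup remains the safer choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReadMoreLinks {
    // Matches a whole opening <a ...> tag (but not <abbr>, <article>, ...).
    private static final Pattern ANCHOR_TAG =
            Pattern.compile("<a\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // Assumption: attribute values are wrapped in double quotes.
    private static final Pattern HREF =
            Pattern.compile("\\bhref=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLASS =
            Pattern.compile("\\bclass=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Collects hrefs of <a> tags that carry cssClass and start with urlPrefix. */
    static List<String> extract(String html, String cssClass, String urlPrefix) {
        List<String> urls = new ArrayList<>();
        Matcher tag = ANCHOR_TAG.matcher(html);
        while (tag.find()) {
            String attrs = tag.group();
            Matcher cls = CLASS.matcher(attrs);
            Matcher href = HREF.matcher(attrs);
            if (!cls.find() || !href.find()) {
                continue; // tag lacks a class or an href attribute
            }
            // Compare whole class tokens so "more-link-wrap" does not count as "more-link".
            boolean hasClass =
                    Arrays.asList(cls.group(1).trim().split("\\s+")).contains(cssClass);
            if (hasClass && href.group(1).startsWith(urlPrefix)) {
                urls.add(href.group(1));
            }
        }
        return urls;
    }
}
```

Calling ReadMoreLinks.extract(content, "more-link", "http://www.prindlepost.org/") on the HTML from the question should return a list holding just the "Read more" URL.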


Convert country code 2 letter to 3 letter in Objective-C

I'm working on a function that gets the country code from the phone, but when I get the country code it consists of 2 letters, but I want it to return three letters.
For example US -> USA
On Android, Java supports converting from the 2-letter to the 3-letter code with the following code:
Locale locale = new Locale("en", countryCode);
return locale.getISO3Country();
But on iOS with Objective-C I don't know how to do the conversion. Can anyone help me solve this problem?
For the sake of standardisation, Apple platforms provide no ISO 3166-1 alpha-3 code to convert to. It is more the other way around: you could use a 3-letter code and still find the 2-letter code.
If you want to keep at least some consistency with your Android code, you need to implement a look-up table (LUT) for this feature yourself. The available list is not very long anyway (256 codes).
NSArray *isoCountrys = [NSLocale ISOCountryCodes];
for (NSString *code in isoCountrys) {
NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:code];
// country name in native language
NSString *country = [locale localizedStringForCountryCode:code];
NSString *iso3 = LUTisoA3counterpartCodes[code];
NSLog(@"%@ %@ %@ %@",code, iso3, country, locale.localeIdentifier);
}
See the documentation for NSLocale -localizedStringForCountryCode: and NSLocale -countryCode.
The LUT could look like the following, although it is much better stored in a plist than in this runtime-allocated dictionary.
NSDictionary *LUTisoA3counterpartCodes = @{
@"AC":@"SHN",@"AW":@"ABW",@"AF":@"AFG",@"AO":@"AGO",@"AI":@"AIA",@"AX":@"ALA",
@"AL":@"ALB",@"AD":@"AND",@"AE":@"ARE",@"AR":@"ARG",@"AM":@"ARM",@"AS":@"ASM",
@"AQ":@"ATA",@"TF":@"ATF",@"AG":@"ATG",@"AU":@"AUS",@"AT":@"AUT",@"AZ":@"AZE",
@"BI":@"BDI",@"BE":@"BEL",@"BJ":@"BEN",@"BQ":@"BES",@"BF":@"BFA",@"BD":@"BGD",
@"BG":@"BGR",@"BH":@"BHR",@"BS":@"BHS",@"BA":@"BIH",@"BL":@"BLM",@"BY":@"BLR",
@"BZ":@"BLZ",@"BM":@"BMU",@"BO":@"BOL",@"BR":@"BRA",@"BB":@"BRB",@"BN":@"BRN",
@"BT":@"BTN",@"BV":@"BVT",@"BW":@"BWA",@"CF":@"CAF",@"CA":@"CAN",@"CC":@"CCK",
@"CH":@"CHE",@"CL":@"CHL",@"CN":@"CHN",@"CI":@"CIV",@"CM":@"CMR",@"CD":@"COD",
@"CG":@"COG",@"CK":@"COK",@"CO":@"COL",@"KM":@"COM",@"CV":@"CPV",@"CR":@"CRI",
@"CU":@"CUB",@"CW":@"CUW",@"CX":@"CXR",@"KY":@"CYM",@"CY":@"CYP",@"CZ":@"CZE",
@"DE":@"DEU",@"DJ":@"DJI",@"DM":@"DMA",@"DK":@"DNK",@"DO":@"DOM",@"DZ":@"DZA",
@"EC":@"ECU",@"EG":@"EGY",@"ER":@"ERI",@"EH":@"ESH",@"ES":@"ESP",@"EE":@"EST",
@"ET":@"ETH",@"FI":@"FIN",@"FJ":@"FJI",@"FK":@"FLK",@"FR":@"FRA",@"FO":@"FRO",
@"FM":@"FSM",@"GA":@"GAB",@"GB":@"GBR",@"GE":@"GEO",@"GG":@"GGY",@"GH":@"GHA",
@"GI":@"GIB",@"GN":@"GIN",@"GP":@"GLP",@"GM":@"GMB",@"GW":@"GNB",@"GQ":@"GNQ",
@"GR":@"GRC",@"GD":@"GRD",@"GL":@"GRL",@"GT":@"GTM",@"GF":@"GUF",@"GU":@"GUM",
@"GY":@"GUY",@"HK":@"HKG",@"HM":@"HMD",@"HN":@"HND",@"HR":@"HRV",@"HT":@"HTI",
@"HU":@"HUN",@"ID":@"IDN",@"IM":@"IMN",@"IN":@"IND",@"IO":@"IOT",@"IE":@"IRL",
@"IR":@"IRN",@"IQ":@"IRQ",@"IS":@"ISL",@"IL":@"ISR",@"IT":@"ITA",@"JM":@"JAM",
@"JE":@"JEY",@"JO":@"JOR",@"JP":@"JPN",@"KZ":@"KAZ",@"KE":@"KEN",@"KG":@"KGZ",
@"KH":@"KHM",@"KI":@"KIR",@"KN":@"KNA",@"KR":@"KOR",@"KW":@"KWT",@"LA":@"LAO",
@"LB":@"LBN",@"LR":@"LBR",@"LY":@"LBY",@"LC":@"LCA",@"LI":@"LIE",@"LK":@"LKA",
@"LS":@"LSO",@"LT":@"LTU",@"LU":@"LUX",@"LV":@"LVA",@"MO":@"MAC",@"MF":@"MAF",
@"MA":@"MAR",@"MC":@"MCO",@"MD":@"MDA",@"MG":@"MDG",@"MV":@"MDV",@"MX":@"MEX",
@"MH":@"MHL",@"MK":@"MKD",@"ML":@"MLI",@"MT":@"MLT",@"MM":@"MMR",@"ME":@"MNE",
@"MN":@"MNG",@"MP":@"MNP",@"MZ":@"MOZ",@"MR":@"MRT",@"MS":@"MSR",@"MQ":@"MTQ",
@"MU":@"MUS",@"MW":@"MWI",@"MY":@"MYS",@"YT":@"MYT",@"NA":@"NAM",@"NC":@"NCL",
@"NE":@"NER",@"NF":@"NFK",@"NG":@"NGA",@"NI":@"NIC",@"NU":@"NIU",@"NL":@"NLD",
@"NO":@"NOR",@"NP":@"NPL",@"NR":@"NRU",@"NZ":@"NZL",@"OM":@"OMN",@"PK":@"PAK",
@"PA":@"PAN",@"PN":@"PCN",@"PE":@"PER",@"PH":@"PHL",@"PW":@"PLW",@"PG":@"PNG",
@"PL":@"POL",@"PR":@"PRI",@"KP":@"PRK",@"PT":@"PRT",@"PY":@"PRY",@"PS":@"PSE",
@"PF":@"PYF",@"QA":@"QAT",@"RE":@"REU",@"RO":@"ROU",@"RU":@"RUS",@"RW":@"RWA",
@"SA":@"SAU",@"SD":@"SDN",@"SN":@"SEN",@"SG":@"SGP",@"GS":@"SGS",@"SH":@"SHN",
@"SJ":@"SJM",@"SB":@"SLB",@"SL":@"SLE",@"SV":@"SLV",@"SM":@"SMR",@"SO":@"SOM",
@"PM":@"SPM",@"RS":@"SRB",@"SS":@"SSD",@"ST":@"STP",@"SR":@"SUR",@"SK":@"SVK",
@"SI":@"SVN",@"SE":@"SWE",@"SZ":@"SWZ",@"SX":@"SXM",@"SC":@"SYC",@"SY":@"SYR",
@"TC":@"TCA",@"TD":@"TCD",@"TG":@"TGO",@"TH":@"THA",@"TJ":@"TJK",@"TK":@"TKL",
@"TM":@"TKM",@"TL":@"TLS",@"TO":@"TON",@"TT":@"TTO",@"TN":@"TUN",@"TR":@"TUR",
@"TV":@"TUV",@"TW":@"TWN",@"TZ":@"TZA",@"UG":@"UGA",@"UA":@"UKR",@"UM":@"UMI",
@"UY":@"URY",@"US":@"USA",@"UZ":@"UZB",@"VA":@"VAT",@"VC":@"VCT",@"VE":@"VEN",
@"VG":@"VGB",@"VI":@"VIR",@"VN":@"VNM",@"VU":@"VUT",@"WF":@"WLF",@"WS":@"WSM",
@"XK":@"XKV",@"YE":@"YEM",@"ZA":@"ZAF",@"ZM":@"ZMB",@"ZW":@"ZWE",
//unknown status or codes, to be changed soon
@"DG":@"DGA" , //Diego Garcia
@"EA":@"EA_" , //Ceuta and Melilla
@"CP":@"CPT" , //Clipperton Island -> French Polynesia
@"IC":@"IC_" , //Canary Islands
@"TA":@"TAA" , //Tristan da Cunha -> St. Helena
};
This LUT makes it easy to look up the 3-letter code by the 2-letter code. In reality the list is longer and subject to constant change.
If you trust the sort order of Apple's API, you could use a static array instead of a plist or NSDictionary. The following prints one for use:
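Since the Android side already relies on Java's Locale data, one option is to generate the table from that data instead of typing it by hand. The helper below is hypothetical (Iso3TableGenerator is not part of any SDK). Its output depends on the locale data of the JDK or Android version that runs it, so it may not cover every code Apple returns.

```java
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.TreeMap;

class Iso3TableGenerator {
    /** Maps every alpha-2 code known to this runtime to its alpha-3 code. */
    static Map<String, String> build() {
        Map<String, String> table = new TreeMap<>();
        for (String alpha2 : Locale.getISOCountries()) {
            try {
                Locale region = new Locale.Builder().setRegion(alpha2).build();
                String alpha3 = region.getISO3Country();
                if (!alpha3.isEmpty()) {
                    table.put(alpha2, alpha3);
                }
            } catch (MissingResourceException e) {
                // No alpha-3 code in this runtime's data; leave the entry out.
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Prints the entries in Objective-C dictionary-literal form.
        build().forEach((a2, a3) -> System.out.printf("@\"%s\":@\"%s\",%n", a2, a3));
    }
}
```

The printed lines can be pasted into a dictionary literal or converted to a plist. Entries that Apple lists but Java does not know would still have to be added by hand.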
int i=1;
fprintf(stderr,"\nstatic NSString *isoA3accordingToAppleSorting[256] = {\n");
for (NSString *code in isoCountrys) {
if (i%20 == 19) fprintf(stderr,"\n");
NSString *iso3 = LUTisoA3counterpartCodes[code];
fprintf(stderr,"@\"%s\",",iso3.UTF8String);
i++;
}
fprintf(stderr,"};\n");
which looks like..
static NSString *countryCodeAsA3accordingToAppleSorting[256] = {
@"SHN",@"AND",@"ARE",@"AFG",@"ATG",@"AIA",@"ALB",@"ARM",@"AGO",@"ATA",@"ARG",@"ASM",@"AUT",@"AUS",@"ABW",@"ALA",@"AZE",@"BIH",
@"BRB",@"BGD",@"BEL",@"BFA",@"BGR",@"BHR",@"BDI",@"BEN",@"BLM",@"BMU",@"BRN",@"BOL",@"BES",@"BRA",@"BHS",@"BTN",@"BVT",@"BWA",@"BLR",@"BLZ",
@"CAN",@"CCK",@"COD",@"CAF",@"COG",@"CHE",@"CIV",@"COK",@"CHL",@"CMR",@"CHN",@"COL",@"CPT",@"CRI",@"CUB",@"CPV",@"CUW",@"CXR",@"CYP",@"CZE",
@"DEU",@"DGA",@"DJI",@"DNK",@"DMA",@"DOM",@"DZA",@"EA_",@"ECU",@"EST",@"EGY",@"ESH",@"ERI",@"ESP",@"ETH",@"FIN",@"FJI",@"FLK",@"FSM",@"FRO",
@"FRA",@"GAB",@"GBR",@"GRD",@"GEO",@"GUF",@"GGY",@"GHA",@"GIB",@"GRL",@"GMB",@"GIN",@"GLP",@"GNQ",@"GRC",@"SGS",@"GTM",@"GUM",@"GNB",@"GUY",
@"HKG",@"HMD",@"HND",@"HRV",@"HTI",@"HUN",@"IC_",@"IDN",@"IRL",@"ISR",@"IMN",@"IND",@"IOT",@"IRQ",@"IRN",@"ISL",@"ITA",@"JEY",@"JAM",@"JOR",
@"JPN",@"KEN",@"KGZ",@"KHM",@"KIR",@"COM",@"KNA",@"PRK",@"KOR",@"KWT",@"CYM",@"KAZ",@"LAO",@"LBN",@"LCA",@"LIE",@"LKA",@"LBR",@"LSO",@"LTU",
@"LUX",@"LVA",@"LBY",@"MAR",@"MCO",@"MDA",@"MNE",@"MAF",@"MDG",@"MHL",@"MKD",@"MLI",@"MMR",@"MNG",@"MAC",@"MNP",@"MTQ",@"MRT",@"MSR",@"MLT",
@"MUS",@"MDV",@"MWI",@"MEX",@"MYS",@"MOZ",@"NAM",@"NCL",@"NER",@"NFK",@"NGA",@"NIC",@"NLD",@"NOR",@"NPL",@"NRU",@"NIU",@"NZL",@"OMN",@"PAN",
@"PER",@"PYF",@"PNG",@"PHL",@"PAK",@"POL",@"SPM",@"PCN",@"PRI",@"PSE",@"PRT",@"PLW",@"PRY",@"QAT",@"REU",@"ROU",@"SRB",@"RUS",@"RWA",@"SAU",
@"SLB",@"SYC",@"SDN",@"SWE",@"SGP",@"SHN",@"SVN",@"SJM",@"SVK",@"SLE",@"SMR",@"SEN",@"SOM",@"SUR",@"SSD",@"STP",@"SLV",@"SXM",@"SYR",@"SWZ",
@"TAA",@"TCA",@"TCD",@"ATF",@"TGO",@"THA",@"TJK",@"TKL",@"TLS",@"TKM",@"TUN",@"TON",@"TUR",@"TTO",@"TUV",@"TWN",@"TZA",@"UKR",@"UGA",@"UMI",
@"USA",@"URY",@"UZB",@"VAT",@"VCT",@"VEN",@"VGB",@"VIR",@"VNM",@"VUT",@"WLF",@"WSM",@"XKV",@"YEM",@"MYT",@"ZAF",@"ZMB",@"ZWE",};
But then you have to find the index of your 2-letter code in Apple's ISOCountryCodes to look up the matching entry.
Reminder: not every code Apple returns has an officially assigned ISO 3166-1 alpha-3 counterpart, so some of the 3-letter values above are placeholders.

How to convert a base64 String to a ZPL label for print on a Zebra ZQ630 printer

For context, our application is an Android WebView that loads a URL (a web app written in React) with a print feature. When the print button is clicked, it triggers a print method on the Android side through a @JavascriptInterface, passing along a payload: a Base64 string that we convert on the Android side in order to print. Note: the printer is connected to the Android device.
The issue is that the converted output comes out scrambled instead of as clean ZPL.
To further complicate the issue, decoding on base64decode.net in Google Chrome presents no issues, but the same payload on the same site in Safari ends up scrambled, just as in our app.
I have tried the Zebra SDK Base64 API, and nothing has helped so far.
I've also tried converting the Base64 string on the React side of my app with atob, but even when it converts successfully and displays the code below, Labelary.com does not generate an image and throws an error.
I guess my question is whether anyone has experienced this before and knows a way around it: a good way to generate a ZPL string that works on Labelary.com, in either Java or JavaScript.
// This code is the result of atob conversion that wouldn't generate a ZPL on labelary.com
^XA
^PW812
^CI13
^FT0,510^GB809,0,2^FS
^FT0,423^GB809,0,20^FS
^FT244,402^GB0,215,2^FS
^FT0,187^GB809,0,2^FS
^FT20,20^A0N,18,22^FDJCPENNEY.COM^FS
^FT20,43^A0N,18,22^FD5555 SCARBOROUGH BLVD^FS
^FT20,65^A0N,18,22^FDCOLUMBUS OH 43232^FS
^FT447,30^A0N,23,29^FD1 LBS^FS
^FT630,30^A0N,23,29^FD1 OF 1^FS
^FT20,122^A0N,28,35^FDSHIP^FS
^FT20,150^A0N,28,35^FD TO:^FS
^FT122,118^A0N,23,29^FDUSPS 48182^FS
^FT122,144^A0N,23,29^FD8149 LEWIS AVE^FS
^FT122,177^A0N,28,35^FH^FDTEMPERANCE MI 48182_F09998^FS
^FT20,396^BD2^FH^FD988840481829998[)>_1E01_1D961Z00316075_1DUPSN_1DW2A813_1E07L$4Y29L'_1D+_1DH:ZGX/,ZX2&O#( *XZ6F+XD1A/*_0D:+GDI_0D_1E_04^FS
^FT284,252^A0N,65,81^FH^FD MI 482 0_F001 X^FS
^BY4,,102^FT330,382^BCN,,N^FD>;420481829998^FS
^FT20,467^A0N,42,52^FDUPS SUREPOST^FS
^FT20,500^A0N,23,29^FDTRACKING #: 1Z W2A 813 YW 0031 6075^FS
^FT687,508^GB122,0,85^FS
^BY3,,142^FT106,664^BCN,,N^FD>:1ZW2A813YW>500316075^FS
^FT0,695^GB809,0,14^FS
^FT20,721^A0N,28,35^FDUSPS DELIVER TO:^FS
^FT20,743^A0N,18,22^FDMARCIA SMOTHERMAN^FS
^FT20,765^A0N,18,22^FD268 HIGHLANDS^FS
^FT20,787^A0N,18,22^FH^FDTEMPERANCE MI 48182_F01189^FS
^FT356,721^A0N,18,22^FH^FDCarrier_F0Leave^FS
^FT356,746^A0N,18,22^FDIf No Response^FS
^FT569,813^GB213,112,2^FS
^FT603,723^A0N,18,22^FH^FDPARCEL SELECT^FS
^FT586,747^A0N,18,22^FH^FDU.S. POSTAGE PAID^FS
^FT658,771^A0N,18,22^FH^FDUPS^FS
^FT659,795^A0N,18,22^FH^FDeVS^FS
^FT0,839^GB809,0,14^FS
^FT221,883^A0N,32,40^FDUSPS TRACKING # eVS^FS
^BY3,,156^FT40,1079^BCN,,N^FD>;>842048182>892612909859896551001000113^FS
^FT156,1135^A0N,28,35^FD9261 2909 8598 9655 1001 0001 13^FS
^FT0,1148^GB809,0,8^FS
^FT508,1193^A0N,23,29^FDREF1: 2020066410165651^FS
^FT508,1215^A0N,23,29^FDContainer ID: 307497242^FS
^BY2,,30^FT20,1189^BCN,,N^FD>;257977480900^FS
^FT20,1215^A0N,23,29^FD257977480900^FS
^XZ
Note: some other Base64 strings converted fine, but not all of them. Below is the same code, converted on base64decode.net in Chrome, which works well on Labelary.com:
^XA
^PW812
^CI13
^FT0,510^GB809,0,2^FS
^FT0,423^GB809,0,20^FS
^FT244,402^GB0,215,2^FS
^FT0,187^GB809,0,2^FS
^FT20,20^A0N,18,22^FDJCPENNEY.COM^FS
^FT20,43^A0N,18,22^FD5555 SCARBOROUGH BLVD^FS
^FT20,65^A0N,18,22^FDCOLUMBUS OH 43232^FS
^FT447,30^A0N,23,29^FD1 LBS^FS
^FT630,30^A0N,23,29^FD1 OF 1^FS
^FT20,122^A0N,28,35^FDSHIP^FS
^FT20,150^A0N,28,35^FD TO:^FS
^FT122,118^A0N,23,29^FDUSPS 48182^FS
^FT122,144^A0N,23,29^FD8149 LEWIS AVE^FS
^FT122,177^A0N,28,35^FH^FDTEMPERANCE MI 48182_F09998^FS
^FT20,396^BD2^FH^FD988840481829998[)>_1E01_1D961Z00316075_1DUPSN_1DW2A813_1E07L$4Y29L'_1D+_1DH:ZGX/,ZX2&O#( *XZ6F+XD1A/*_0D:+GDI_0D_1E_04^FS
^FT284,252^A0N,65,81^FH^FD MI 482 0_F001 X^FS
^BY4,,102^FT330,382^BCN,,N^FD>;420481829998^FS
^FT20,467^A0N,42,52^FDUPS SUREPOST^FS
^FT20,500^A0N,23,29^FDTRACKING #: 1Z W2A 813 YW 0031 6075^FS
^FT687,508^GB122,0,85^FS
^BY3,,142^FT106,664^BCN,,N^FD>:1ZW2A813YW>500316075^FS
^FT0,695^GB809,0,14^FS
^FT20,721^A0N,28,35^FDUSPS DELIVER TO:^FS
^FT20,743^A0N,18,22^FDMARCIA SMOTHERMAN^FS
^FT20,765^A0N,18,22^FD268 HIGHLANDS^FS
^FT20,787^A0N,18,22^FH^FDTEMPERANCE MI 48182_F01189^FS
^FT356,721^A0N,18,22^FH^FDCarrier_F0Leave^FS
^FT356,746^A0N,18,22^FDIf No Response^FS
^FT569,813^GB213,112,2^FS
^FT603,723^A0N,18,22^FH^FDPARCEL SELECT^FS
^FT586,747^A0N,18,22^FH^FDU.S. POSTAGE PAID^FS
^FT658,771^A0N,18,22^FH^FDUPS^FS
^FT659,795^A0N,18,22^FH^FDeVS^FS
^FT0,839^GB809,0,14^FS
^FT221,883^A0N,32,40^FDUSPS TRACKING # eVS^FS
^BY3,,156^FT40,1079^BCN,,N^FD>;>842048182>892612909859896551001000113^FS
^FT156,1135^A0N,28,35^FD9261 2909 8598 9655 1001 0001 13^FS
^FT0,1148^GB809,0,8^FS
^FT508,1193^A0N,23,29^FDREF1: 2020066410165651^FS
^FT508,1215^A0N,23,29^FDContainer ID: 307497242^FS
^BY2,,30^FT20,1189^BCN,,N^FD>;257977480900^FS
^FT20,1215^A0N,23,29^FD257977480900^FS
^XZ
Finally, this is the base64 String in question:
XgBYAEEADQBeAFAAVwA4ADEAMgANAF4AQwBJADEAMwANAF4ARgBUADAALAA1ADEAMABeAEcAQgA4ADAAOQAsADAALAAyAF4ARgBTAA0AXgBGAFQAMAAsADQAMgAzAF4ARwBCADgAMAA5ACwAMAAsADIAMABeAEYAUwANAF4ARgBUADIANAA0ACwANAAwADIAXgBHAEIAMAAsADIAMQA1ACwAMgBeAEYAUwANAF4ARgBUADAALAAxADgANwBeAEcAQgA4ADAAOQAsADAALAAyAF4ARgBTAA0AXgBGAFQAMgAwACwAMgAwAF4AQQAwAE4ALAAxADgALAAyADIAXgBGAEQASgBDAFAARQBOAE4ARQBZAC4AQwBPAE0AXgBGAFMADQBeAEYAVAAyADAALAA0ADMAXgBBADAATgAsADEAOAAsADIAMgBeAEYARAA1ADUANQA1ACAAUwBDAEEAUgBCAE8AUgBPAFUARwBIACAAQgBMAFYARABeAEYAUwANAF4ARgBUADIAMAAsADYANQBeAEEAMABOACwAMQA4ACwAMgAyAF4ARgBEAEMATwBMAFUATQBCAFUAUwAgAE8ASAAgADQAMwAyADMAMgBeAEYAUwANAF4ARgBUADQANAA3ACwAMwAwAF4AQQAwAE4ALAAyADMALAAyADkAXgBGAEQAMQAgAEwAQgBTAF4ARgBTAA0AXgBGAFQANgAzADAALAAzADAAXgBBADAATgAsADIAMwAsADIAOQBeAEYARAAxACAATwBGACAAMQBeAEYAUwANAF4ARgBUADIAMAAsADEAMgAyAF4AQQAwAE4ALAAyADgALAAzADUAXgBGAEQAUwBIAEkAUABeAEYAUwANAF4ARgBUADIAMAAsADEANQAwAF4AQQAwAE4ALAAyADgALAAzADUAXgBGAEQAIABUAE8AOgBeAEYAUwANAF4ARgBUADEAMgAyACwAMQAxADgAXgBBADAATgAsADIAMwAsADIAOQBeAEYARABVAFMAUABTACAANAA4ADEAOAAyAF4ARgBTAA0AXgBGAFQAMQAyADIALAAxADQANABeAEEAMABOACwAMgAzACwAMgA5AF4ARgBEADgAMQA0ADkAIABMAEUAVwBJAFMAIABBAFYARQBeAEYAUwANAF4ARgBUADEAMgAyACwAMQA3ADcAXgBBADAATgAsADIAOAAsADMANQBeAEYASABeAEYARABUAEUATQBQAEUAUgBBAE4AQwBFACAATQBJACAANAA4ADEAOAAyAF8ARgAwADkAOQA5ADgAXgBGAFMADQBeAEYAVAAyADAALAAzADkANgBeAEIARAAyAF4ARgBIAF4ARgBEADkAOAA4ADgANAAwADQAOAAxADgAMgA5ADkAOQA4AFsAKQA+AF8AMQBFADAAMQBfADEARAA5ADYAMQBaADAAMAAzADEANgAwADcANQBfADEARABVAFAAUwBOAF8AMQBEAFcAMgBBADgAMQAzAF8AMQBFADAANwBMACQANABZADIAOQBMACcAXwAxAEQAKwBfADEARABIADoAWgBHAFgALwAsAFoAWAAyACYATwAjACgAIAAqAFgAWgA2AEYAKwBYAEQAMQBBAC8AKgBfADAARAA6ACsARwBEAEkAXwAwAEQAXwAxAEUAXwAwADQAXgBGAFMADQBeAEYAVAAyADgANAAsADIANQAyAF4AQQAwAE4ALAA2ADUALAA4ADEAXgBGAEgAXgBGAEQAIABNAEkAIAA0ADgAMgAgADAAXwBGADAAMAAxACAAWABeAEYAUwANAF4AQgBZADQALAAsADEAMAAyAF4ARgBUADMAMwAwACwAMwA4ADIAXgBCAEMATgAsACwATgBeAEYARAA+ADsANAAyADAANAA4ADEAOAAyADkAOQA5ADgAXgBGAFMADQBeAEYAVAAyADAALAA0ADYANwBeAEEAMABOACwANAAyACwANQAyAF4ARgBEAFUAUABTACAA
UwBVAFIARQBQAE8AUwBUAF4ARgBTAA0AXgBGAFQAMgAwACwANQAwADAAXgBBADAATgAsADIAMwAsADIAOQBeAEYARABUAFIAQQBDAEsASQBOAEcAIAAjADoAIAAxAFoAIABXADIAQQAgADgAMQAzACAAWQBXACAAMAAwADMAMQAgADYAMAA3ADUAXgBGAFMADQBeAEYAVAA2ADgANwAsADUAMAA4AF4ARwBCADEAMgAyACwAMAAsADgANQBeAEYAUwANAF4AQgBZADMALAAsADEANAAyAF4ARgBUADEAMAA2ACwANgA2ADQAXgBCAEMATgAsACwATgBeAEYARAA+ADoAMQBaAFcAMgBBADgAMQAzAFkAVwA+ADUAMAAwADMAMQA2ADAANwA1AF4ARgBTAA0AXgBGAFQAMAAsADYAOQA1AF4ARwBCADgAMAA5ACwAMAAsADEANABeAEYAUwANAF4ARgBUADIAMAAsADcAMgAxAF4AQQAwAE4ALAAyADgALAAzADUAXgBGAEQAVQBTAFAAUwAgAEQARQBMAEkAVgBFAFIAIABUAE8AOgBeAEYAUwANAF4ARgBUADIAMAAsADcANAAzAF4AQQAwAE4ALAAxADgALAAyADIAXgBGAEQATQBBAFIAQwBJAEEAIABTAE0ATwBUAEgARQBSAE0AQQBOAF4ARgBTAA0AXgBGAFQAMgAwACwANwA2ADUAXgBBADAATgAsADEAOAAsADIAMgBeAEYARAAyADYAOAAgAEgASQBHAEgATABBAE4ARABTAF4ARgBTAA0AXgBGAFQAMgAwACwANwA4ADcAXgBBADAATgAsADEAOAAsADIAMgBeAEYASABeAEYARABUAEUATQBQAEUAUgBBAE4AQwBFACAATQBJACAANAA4ADEAOAAyAF8ARgAwADEAMQA4ADkAXgBGAFMADQBeAEYAVAAzADUANgAsADcAMgAxAF4AQQAwAE4ALAAxADgALAAyADIAXgBGAEgAXgBGAEQAQwBhAHIAcgBpAGUAcgBfAEYAMABMAGUAYQB2AGUAXgBGAFMADQBeAEYAVAAzADUANgAsADcANAA2AF4AQQAwAE4ALAAxADgALAAyADIAXgBGAEQASQBmACAATgBvACAAUgBlAHMAcABvAG4AcwBlAF4ARgBTAA0AXgBGAFQANQA2ADkALAA4ADEAMwBeAEcAQgAyADEAMwAsADEAMQAyACwAMgBeAEYAUwANAF4ARgBUADYAMAAzACwANwAyADMAXgBBADAATgAsADEAOAAsADIAMgBeAEYASABeAEYARABQAEEAUgBDAEUATAAgAFMARQBMAEUAQwBUAF4ARgBTAA0AXgBGAFQANQA4ADYALAA3ADQANwBeAEEAMABOACwAMQA4ACwAMgAyAF4ARgBIAF4ARgBEAFUALgBTAC4AIABQAE8AUwBUAEEARwBFACAAUABBAEkARABeAEYAUwANAF4ARgBUADYANQA4ACwANwA3ADEAXgBBADAATgAsADEAOAAsADIAMgBeAEYASABeAEYARABVAFAAUwBeAEYAUwANAF4ARgBUADYANQA5ACwANwA5ADUAXgBBADAATgAsADEAOAAsADIAMgBeAEYASABeAEYARABlAFYAUwBeAEYAUwANAF4ARgBUADAALAA4ADMAOQBeAEcAQgA4ADAAOQAsADAALAAxADQAXgBGAFMADQBeAEYAVAAyADIAMQAsADgAOAAzAF4AQQAwAE4ALAAzADIALAA0ADAAXgBGAEQAVQBTAFAAUwAgAFQAUgBBAEMASwBJAE4ARwAgACMAIABlAFYAUwBeAEYAUwANAF4AQgBZADMALAAsADEANQA2AF4ARgBUADQAMAAsADEAMAA3ADkAXgBCAEMATgAsACwATgBeAEYARAA+ADsAPgA4ADQAMgAwADQAOAAxADgAMgA+ADgAOQAyADYAMQAyADkAMAA5ADgANQA5ADgAOQA2ADUA
NQAxADAAMAAxADAAMAAwADEAMQAzAF4ARgBTAA0AXgBGAFQAMQA1ADYALAAxADEAMwA1AF4AQQAwAE4ALAAyADgALAAzADUAXgBGAEQAOQAyADYAMQAgADIAOQAwADkAIAA4ADUAOQA4ACAAOQA2ADUANQAgADEAMAAwADEAIAAwADAAMAAxACAAMQAzAF4ARgBTAA0AXgBGAFQAMAAsADEAMQA0ADgAXgBHAEIAOAAwADkALAAwACwAOABeAEYAUwANAF4ARgBUADUAMAA4ACwAMQAxADkAMwBeAEEAMABOACwAMgAzACwAMgA5AF4ARgBEAFIARQBGADEAOgAgADIAMAAyADAAMAA2ADYANAAxADAAMQA2ADUANgA1ADEAXgBGAFMADQBeAEYAVAA1ADAAOAAsADEAMgAxADUAXgBBADAATgAsADIAMwAsADIAOQBeAEYARABDAG8AbgB0AGEAaQBuAGUAcgAgAEkARAA6ACAAMwAwADcANAA5ADcAMgA0ADIAXgBGAFMADQBeAEIAWQAyACwALAAzADAAXgBGAFQAMgAwACwAMQAxADgAOQBeAEIAQwBOACwALABOAF4ARgBEAD4AOwAyADUANwA5ADcANwA0ADgAMAA5ADAAMABeAEYAUwANAF4ARgBUADIAMAAsADEAMgAxADUAXgBBADAATgAsADIAMwAsADIAOQBeAEYARAAyADUANwA5ADcANwA0ADgAMAA5ADAAMABeAEYAUwANAF4AWABaAA==
Convert your Base64 string to UTF-8 using this code:
String Mybase64 = "dGVjaFBhC3M=";
// 1. Decode to bytes (android.util.Base64 requires a flags argument)
byte[] X = Base64.decode(Mybase64, Base64.DEFAULT);
// 2. Interpret the bytes as UTF-8
String ZPL_Result = new String(X, StandardCharsets.UTF_8);
Update:
String b64 = "XgBYAEEADQBe..."; // the full Base64 payload from the question
byte[] data = Base64.decode(b64, Base64.DEFAULT);
String ZPL_Result = new String(data, StandardCharsets.UTF_8);
I figured I'd leave an answer here on the approach I employed. I resorted to a regular expression to filter out the Unicode characters that were appearing in the conversion. That way I had a clean string to print.
The precise Unicode character is U+FFFD.
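Rather than stripping U+FFFD afterwards, it may be worth checking the encoding of the payload. The Base64 string in the question begins with XgBYAEEA, which decodes to the bytes 5E 00 58 00 41 00, that is, the characters ^XA with a zero byte after each one. That pattern suggests UTF-16LE text rather than UTF-8, which would explain the stray characters. Below is a minimal sketch under that assumption. It uses java.util.Base64 (available on Android from API 26; android.util.Base64 can be swapped in on older versions), and ZplDecoder is an illustrative name.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class ZplDecoder {
    /**
     * Decodes a Base64 payload and interprets the bytes as UTF-16LE.
     * Assumption: the sender encoded the ZPL text as UTF-16LE before Base64.
     */
    static String decodeUtf16Le(String base64) {
        // The MIME decoder tolerates line breaks inside the Base64 text.
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // First eight characters of the payload from the question.
        System.out.println(decodeUtf16Le("XgBYAEEA"));
    }
}
```

If the decoded text starts with ^XA and contains no replacement characters, the encoding guess was probably right and no regex clean-up should be needed.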

Uri and WebView classes parsing URLs containing backslashes in authority (host or user information) differently

When using the URIs
String myUri = "https://evil.example.com\\.good.example.org/";
// or
String myUri = "https://evil.example.com\\@good.example.org/";
in Java on Android, the backslash in the host or user information of the authority part of the URI causes a mismatch between how Android’s android.net.Uri and android.webkit.WebView parse the URI with regard to its host.
The Uri class (and cURL) treats evil.example.com\.good.example.org (first example) or even good.example.org (second example) as the URI’s host.
The WebView class (and Firefox and Chrome) treats evil.example.com (both examples) as the URI’s host.
Is this known, expected or correct behavior? Do the two classes simply follow different standards?
Looking at the specification, it seems neither RFC 2396 nor RFC 3986 allows for a backslash in the user information or authority.
Is there any workaround to ensure a consistent behavior here, especially for validation purposes? Does the following patch look reasonable (to be used with WebView and for general correctness)?
Uri myParsedUri = Uri.parse(myUri);
if ((myParsedUri.getHost() == null || !myParsedUri.getHost().contains("\\")) && (myParsedUri.getUserInfo() == null || !myParsedUri.getUserInfo().contains("\\"))) {
// valid URI
}
else {
// invalid URI
}
One possible flaw is that this workaround may not catch all the cases that cause inconsistent hosts to be parsed. Do you know of anything else (apart from a backslash) that causes a mismatch between the two classes?
It is known that the WebView in Android 4.4 rewrites some URLs, and the linked issue describes steps to prevent that. It is not completely clear from your question whether your need stems from that issue or from something else.
You can mask backslashes and other characters with their corresponding number in the character table. In URLs that number is written in hexadecimal.
Hexadecimal: 5C
Decimal: 92
Character: \
The code is then prefixed with a % for each character in the URL. Note that \\ in a Java string literal denotes a single backslash, so after replacement your code looks like this:
String myUri = "https://evil.example.com%5C.good.example.org/";
// or
String myUri = "https://evil.example.com%5C@good.example.org/";
It might still be necessary to add a slash to separate the domain from the path:
String myUri = "https://evil.example.com/%5C.good.example.org/";
// or
String myUri = "https://evil.example.com/%5C@good.example.org/";
Is it possible that the backslashes are never meant to be sent over the network at all, and only serve as escaping for steps such as regular expressions or output in JavaScript (JSON)?
Bonus ;-)
Below is a PHP script that prints table rows for all four-digit hexadecimal code points together with the corresponding number in hex and decimal (it should still be wrapped in an HTML template with a table element, perhaps including CSS):
<?php
$chs = array('0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F');
$chs2 = $chs;
$chs3 = $chs;
$chs4 = $chs;
foreach ($chs as $ch){
foreach ($chs2 as $ch2){
foreach ($chs3 as $ch3){
foreach ($chs4 as $ch4){
echo '<tr>';
echo '<td>';
echo $ch.$ch2.$ch3.$ch4;
echo '</td>';
echo '<td>';
echo hexdec($ch.$ch2.$ch3.$ch4);
echo '</td>';
echo '<td>';
echo '&#x'.$ch.$ch2.$ch3.$ch4.';';
echo '</td>';
echo '</tr>';
}
}
}
}
?>
Is this known, expected or correct behavior?
IMO, it is not, for either Uri or WebView. Because the RFCs do not allow a backslash, they could have warned about it. However, this is less important, because it does not affect normal operation as long as the input is as expected.
Do the two classes simply follow different standards?
The URI class and WebView follow the same standards. But because they are different implementations, they may behave differently on unexpected input.
For example, "^(([^:/?#]+):)?((//([^/?#]*))?([^?#]*)(\\?([^#]*))?)?(#(.*))?" is the regular expression that URI uses to parse URIs, whereas the URI parsing of WebView is done by native C++ methods. Even though they follow the same standards, there is a chance that they give different results (at least for unexpected input).
Does the following patch look reasonable?
Not really (see the answer to the next question).
Do you know of anything else (apart from a backslash) that causes a mismatch between the two classes?
Because you are so concerned about consistent behavior, I wouldn't suggest manual validation. Even the programmers who wrote these classes can't list all such scenarios.
The solution
If I understand correctly, you need to load URLs supplied by untrusted external sources (which attackers can exploit if there is a loophole), and you need to identify their host correctly.
In that case, you can parse the URL with the URI class itself and use URI#getHost() to identify the host. For WebView, instead of passing the original URL string, pass URI#toString().
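One way to read this suggestion is sketched below with java.net.URI rather than android.net.Uri. That choice is an assumption, since the answer does not say which class it means. java.net.URI is a strict parser and rejects a backslash anywhere in the authority with a URISyntaxException, so both URLs from the question would be refused before they reach the WebView. The class name StrictHost is hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

class StrictHost {
    /**
     * Returns the URL as re-serialized by java.net.URI if it parses strictly
     * and its host equals expectedHost; returns null otherwise.
     */
    static String validate(String url, String expectedHost) {
        try {
            URI uri = new URI(url);
            String host = uri.getHost(); // null when no server-based host was parsed
            if (host != null && host.equalsIgnoreCase(expectedHost)) {
                return uri.toString();
            }
        } catch (URISyntaxException e) {
            // Illegal characters such as '\' end up here.
        }
        return null;
    }
}
```

Only a non-null result would then be handed to WebView.loadUrl, so the string the WebView sees is the same one whose host was checked. This narrows the gap between the two parsers but is not a proof that they agree on every input.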

Proper way to handle dumping and reloading of JSON data containing special characters on Android?

Not sure if this has been answered already but a quick search didn't turn up a satisfying result..
I'm stuck with the following scenario:
web service with REST API and JSON formatted data blobs
android client app talking to this service and locally caching / processing the data
The web service is run by a German company, so some of the strings in the result data contain special characters like German umlauts:
// example response
[
{
"title" : "reward 1",
"description" : "Ein gro\u00dfer Kaffee f\u00fcr dich!"
},
{
"title" : "reward 2",
"description" : "Eine Pizza f\u00fcr dich!"
},
...
]
Locally the app is parsing the data using a set of classes which mirror the response objects (e.g. Reward and RewardResponse classes for the upper example). Each of these classes can read and dump itself from / to JSON - however this is where things get ugly.
Taking the example above org.json will correctly parse the data and the resulting strings will contain proper Unicode versions of the special characters 'ß' (\u00df) and 'ü' (\u00fc).
final RewardResponse response = new RewardResponse(jsonData);
final Reward reward = response.get(0);
// this will print "Ein großer Kaffee für dich!"
Log.d("dump server data", reward.getDescription());
final Reward reward2 = new Reward(reward.toJSON());
// this will print "Ein gro�er Kaffee f�r dich!"
Log.d("dump reloaded data", reward2.getDescription());
As you can see, there is a problem with loading the data generated by JSONObject.toString().
What seems to be happening is that JSONObject parses escapes in the form of "\uXXXX" but dumps them as plain UTF-8 text.
In turn, when parsing that dump, it doesn't read the Unicode characters properly and instead inserts a replacement character into the result string (shown as � above, with \uffff as the code point).
My current workaround consists of a look-up table containing the Unicode Latin-1 Supplement characters and their respective escaped versions (\u00a0 up to \u00ff). But this also means I have to go over every dumped JSON text and replace the characters with their escaped versions each time I dump something.
Please tell me there is a better way for this!
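For illustration, the look-up-table workaround can be generalized so that no table is needed. The following is a minimal sketch under stated assumptions: the class name JsonAsciiEscaper and the method escapeNonAscii are hypothetical, and it only post-processes an already dumped JSON string. It does not explain or fix the underlying problem; it just replaces every character above 0x7F with its backslash-u escape.

```java
final class JsonAsciiEscaper {
    /**
     * Returns a copy of the dumped JSON text in which every char above 0x7F
     * is written as a backslash-u escape with four hex digits.
     * ASCII characters are copied unchanged.
     */
    public static String escapeNonAscii(String json) {
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains the real characters 'ß' and 'ü'.
        String dumped = "{\"description\":\"Ein gro\u00dfer Kaffee f\u00fcr dich!\"}";
        System.out.println(escapeNonAscii(dumped));
    }
}
```

Because non-ASCII characters in valid JSON only occur inside strings, escaping them across the whole dumped text should leave the structure intact. Characters outside the Basic Multilingual Plane come out as two escapes (a surrogate pair), which JSON allows.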
(Note: there is a similar question, but its author had problems with local file encoding on disk.
My problem above, as you can see, is reproducible without ever writing to disk.)
EDIT: As requested in the comments here's the toJSON() method:
public final String toJSON() {
JSONObject obj = new JSONObject();
// mTitle and mDescription contain the unmodified
// strings received from parsing.
obj.put("title", mTitle);
obj.put("description", mDescription);
return obj.toString();
}
As a side note it makes no difference if I use JSONObject.toString() or a JSONStringer.
(The documentation advises to use .toString())
EDIT: just to remove Reward from the equation, this reproduces the problem:
final JSONObject inputData = new JSONObject("{\"description\":\"Ein gro\\u00dfer Kaffee\"}");
final JSONObject parsedData = new JSONObject(inputData.toString());
Log.d("inputData", inputData.getString("description"));
Log.d("parsedData", parsedData.getString("description"));
[Note: posted as an answer for better formatting]
I just tried the example
final JSONObject inputData = new JSONObject("{\"description\":\"Ein gro\\u00dfer Kaffee\"}");
final JSONObject parsedData = new JSONObject(inputData.toString());
Log.d("inputData", inputData.getString("description"));
Log.d("parsedData", parsedData.getString("description"));
on my Nexus 7 running Android 4.2.1, and on Nexus S running 4.1.2, and it works as intended:
D/inputData(17281): Ein großer Kaffee
D/parsedData(17281): Ein großer Kaffee
In which Android version did you see the problem?

Weird behaviour when parsing HTML source code (WLAN vs. mobile Internet (3G))

I have a curious problem, or rather a strange effect, when using an Android app I wrote myself.
My app reads the HTML source code of a website and parses it for the information I want. And it works... well, not really consistently.
Scenario 1: I use my WLAN at home and run my app -> Everything works fine. All desired items appear in my ListView.
Scenario 2: I use my mobile Internet, like EDGE or HSDPA -> My ListView shows only 1 item. All the others have vanished...
I don't know why. Could there be a timeout that prevents the app from reading the whole HTML page? But all the other items follow directly on the next lines of the HTML source code...
I have no idea how to fix it. On Google I didn't find anyone else with the same problem.
Regards, Julian
Here is some code:
// With this I get the HTML-source-code
URL url = new URL("http://www.area4.de");
URLConnection conn = url.openConnection();
DataInputStream dataIn = new DataInputStream(conn.getInputStream());
BufferedReader reader = new BufferedReader(new InputStreamReader(dataIn, "UTF-8"));
String line;
// Then I parse the code with
while ((line=reader.readLine()) != null)
{
if (line.contains(searchPattern))
al.add(line); //al is an ArrayList
}
That is all my app does so far (besides presenting the ArrayList in a ListView).
You can see the source code of the site in your browser (Ctrl + U). I search for these lines:
THIRTY SECONDS TO MARS //
DROPKICK MURPHYS //
With 3G I only get thirty-seconds-to-mars...
Ah, I solved it. As can be seen above, I searched with this code snippet:
while ((line=reader.readLine()) != null)
{
if (line.contains(searchPattern))
al.add(line); //al is an ArrayList
}
With WLAN (and in my emulator) I really do get a new line for each band, e.g.:
line1
line2
line3
....
But with EDGE or HSDPA, everything that arrives as separate lines over WLAN is written in one single line:
line1line2line3....
And with my regex I delete everything before and after the target when I find one, so only the first match on that long line was kept. Hope you understand; it's difficult to explain in a foreign language.
A simple
while (line.contains(searchPattern))
fixed it.
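The idea behind that fix, looking for every occurrence on a line instead of at most one, can be sketched as follows. This is only an illustration: the class name LineScanner, the method findAll, and the pattern used in main are assumptions, since the actual search pattern is not shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineScanner {
    /** Collects every match of pattern in line, not just the first one. */
    public static List<String> findAll(String line, Pattern pattern) {
        List<String> hits = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        // find() resumes after the previous match, so a line that holds
        // several items yields several hits.
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical pattern; the real one depends on the page markup.
        Pattern item = Pattern.compile("line\\d");
        System.out.println(findAll("line1line2line3", item));
    }
}
```

Written this way, the result should not depend on whether the server or the network delivers the items on separate lines or joined into one.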
You can always try reading the whole HTTP response before passing it on for parsing. That way you can check that the whole document was loaded properly.
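A minimal sketch of that suggestion, assuming a helper named readAll in a hypothetical class ResponseReader: it drains a Reader into one String, so the later search no longer depends on where the line breaks happen to fall. In the app, the Reader would be the InputStreamReader created from conn.getInputStream(); here a StringReader stands in for it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ResponseReader {
    /** Reads everything from source into a single String and closes it. */
    public static String readAll(Reader source) throws IOException {
        StringBuilder body = new StringBuilder();
        char[] buffer = new char[4096];
        try (BufferedReader reader = new BufferedReader(source)) {
            int n;
            // read() returns -1 only at the end of the stream, so the loop
            // keeps going across line breaks and partial network reads.
            while ((n = reader.read(buffer)) != -1) {
                body.append(buffer, 0, n);
            }
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for: new InputStreamReader(conn.getInputStream(), "UTF-8")
        String html = readAll(new StringReader("line1line2\nline3"));
        System.out.println(html);
    }
}
```

The resulting string can be logged or checked for its closing tag before it is searched, which makes a truncated download easier to spot.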
