How to embed Wiktionary for offline access in Android App?

I am currently developing an Android dictionary app that fetches meanings online through the Wiktionary API, using this request: http://en.wiktionary.org/w/api.php?action=query&prop=revisions&titles=overflow&rvprop=content&format=jsonfm
However, I want to download the Wiktionary database and embed it in my Android app for offline use.
Here is the Wiktionary Database Download Page:
1. Wiktionary
2. Wikimedia Downloads
From my research, the Wiktionary offline database is available as XML and SQL dumps, but these files are too big: embedding them would make the APK huge. Is there any easy way to embed this data in my app?

The developer of "English Dictionary - Offline" claims to be using Wiktionary. I am still wondering where they got a Wiktionary dump file >22 MB.
I'm not being paid enough to tell you that... (joke). The thing is, you need to extract the dictionary entries from the XML files; once you keep only those, the final content (text) file becomes smaller.
Alternatively...
You can try this TSV file (courtesy of: semisignal.com) which is a snapshot of November 2012 definitions. This contains most words your end-user checking English would need. The TSV is 54MB and is handled like a text file.
Try a definition, e.g. brushable. The TSV has the lines below (compare with Wiktionary's entry for brushable):
English brushable Adjective # Able to be [[brushed]]
English brushable Adjective # Able to be controlled by [[brushing]]
TIPS: To reduce the file size, you can trim off the leading "English", since you already know every entry is an English definition. Each trim saves 7 bytes (8 with the separator that follows it); multiply that by the total number of definitions.
Use a String.replace on "English " (with the trailing space) to clear it.
Also replace "Adjective", "Verb" and "Noun" with short codes that your app understands and translates back into the entry type in the user interface. For example, code 1 could mean the entry is an adjective.
Your trimmed text file could look like the example below. Each double full stop just means "next section of the entry", so the layout is basically entry..type..definition, where <xyz> is a link to another entry in the dictionary. The 54 bytes of the TSV entry become 35 bytes for that one line.
brushable..1..Able to be <brushed>.
Save the final edited (reduced) text file. Embed that into your APK.
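The trimming steps above could be run once on a desktop machine before packaging the APK. The following is a minimal sketch, not the answerer's actual tooling. It assumes the TSV has four tab-separated columns (language, word, part of speech, "# definition"); the code table `POS_CODES` and the function names are hypothetical.

```python
import re

# Hypothetical short codes; the app must ship the same table to decode them.
POS_CODES = {"Adjective": "1", "Noun": "2", "Verb": "3"}

# Matches wiki links such as [[brushed]] or [[target|label]] and keeps the target.
LINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def compact_line(line):
    """Turn one TSV row into 'word..code..definition', or None if it is skipped."""
    parts = line.rstrip("\n").split("\t")
    if len(parts) != 4 or parts[0] != "English":
        return None  # assumed layout not met, or not an English entry
    _, word, pos, definition = parts
    definition = definition.lstrip("# ").strip()  # drop the leading "# " marker
    definition = LINK.sub(r"<\1>", definition)    # [[brushed]] -> <brushed>
    code = POS_CODES.get(pos, pos)                # unknown types are kept verbatim
    return f"{word}..{code}..{definition}"


def compact_file(src_path, dst_path):
    """Stream the big TSV line by line so it never has to fit in memory."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            out = compact_line(line)
            if out is not None:
                dst.write(out + "\n")
```

Calling `compact_file("definitions.tsv", "dict.txt")` (both file names are placeholders) would produce the reduced file to embed.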

I suggest implementing the online API access so that a small app can be downloaded and used right away, and adding a button somewhere that downloads the offline part. Also check the network connection and, if it's not Wi-Fi, warn the user so that their mobile data plan is not used up downloading a 100 MB dictionary.

Related

Smartphone apps can't read the entirety of long text files? (Orgzly / Filenotes)

I keep a lot of .txt files synced across all my devices with dropbox, and up until recently I've had no problem accessing them on my phone with an app called Orgzly. However I seem to have hit some sort of character limit as I'm continuously adding to one of these files and while the file is perfectly readable on any PC, Orgzly now seems to cut out near the end of the file. (For reference, I'm now at 1921 words over 37 pages if I paste the contents of the file into a blank word document)
I just tried another app called Filenotes and the exact same thing happens. Perhaps I'm not searching for the correct keywords, but a Google search is not showing anything remotely useful within the top 50 hits.
So my question is this - Do smartphone operating systems or the app APIs have some sort of built in character limit for files? And is there any sort of workaround to get this file readable on my phone, short of accessing the .txt file via the dropbox app directly? (Which actually works, FYI)

How to convert the localisation in Android string resources files to Windows Phone 8 resources file?

I have the localised string file(s) of an internationalised Android app. Now I want to bring the string translations over to a new Windows Phone 8 ("WP8") app without having to manually copy every string individually.
I found several tools that can do iOS -> Android and/or Android -> iOS (e.g. LocalizedStrings2Android, stringsconvert, etc.), but there seems to be no tool out there that can transform the string files Android -> WP8 (or even iOS -> WP8).
Apple's iOS uses files with simple key-value pairs, Android has an XML file, and WP8 uses "XAML" that contains special binding clauses. WP8's format/content differs quite a bit from iOS's and Android's. Is that the reason no tool(s) exist?
I'd appreciate any pointers to existing tools or hints how to best approach this problem.
If you choose to downvote the question please be so kind to leave a comment.
And finally: No, web searches return nothing, unfortunately!

Use Microsoft Excel.
Open the Android strings.xml with it.
It will ask “How would you like to open?”; choose “As an XML table”. It will then say “Excel will create a schema”; press OK.
In Excel, use cut & paste to reorder the columns so that the first column is "name" and the second is "string". You can cut and paste complete columns by right-clicking on the headers.
Then you’ll be able to copy-paste the whole table, both names and strings, from Excel into Visual Studio’s *.resx editor. You might have some issues if, for example, you have many names containing spaces or values containing newlines, but it should still be much faster than copying all your strings individually.
If you want to automate (e.g. if you have dozens of languages), the .resx format is a simple XML as well. If you know XSLT, the transformation will only take a few lines, if you don't, use any scripting language instead.
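As a rough illustration of the scripted route, the sketch below maps each Android `<string name="...">` element to a .resx `<data name="..."><value>` element. It is only a sketch: the function name is hypothetical, it ignores plurals, string arrays and Android escape sequences, and it emits only the `<data>` entries, so the header elements that a Visual Studio-generated .resx contains would still have to come from a template file.

```python
import xml.etree.ElementTree as ET


def android_strings_to_resx_data(strings_xml):
    """Convert the text of an Android strings.xml into .resx <data> entries."""
    src = ET.fromstring(strings_xml)
    root = ET.Element("root")
    for s in src.findall("string"):
        data = ET.SubElement(
            root, "data", {"name": s.get("name"), "xml:space": "preserve"}
        )
        value = ET.SubElement(data, "value")
        # itertext() flattens inline markup such as <b>...</b> to plain text.
        value.text = "".join(s.itertext())
    return ET.tostring(root, encoding="unicode")
```

Running this once per language file keeps the names identical across platforms, which is what makes the translations reusable.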

Dynamic Android app languages

Many times I've seen Android apps that display a list of languages, where I can tap on any of these languages and download it for that specific app (the GO Weather widget has this functionality).
I'm interested in how this is implemented and what the best way is to load languages dynamically in Android apps. Adding 100 strings.xml resources to the app project is not a solution, and besides, if I wanted to provide some kind of "funny holiday language" pack or add a new language, I would need to upload the project to Google Play again and again...
Thanks!

While it's possible to use Expansion Files to add on to your app, they are limited in some ways. The main problem for you would be that you can only have a limited number of expansion files. If you wanted 100 languages, your only option would be to load them all in the expansion file, and download the whole thing. While that might not be a problem, since a list of translated strings probably isn't that large, you may want to go a different route.
The best option I see for downloading separate language add-ons is to forgo using strings.xml altogether. Just use a simple CSV file to hold your strings, mapping names to strings. When your program starts, read it in to a string array/map/whatever, and you have all your strings at the ready. This way, if you want to add a language, it's as easy as downloading a text file and saving it to your data directory.
Also, you can keep a file listing all the available languages on the same server, so you don't have to update the app if you want to add seasonal or limited-time-only languages, like you mentioned. Just read in the file to get the list.
Note, you'll need somewhere to host the files, but that's hardly a barrier in this day and age.
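The lookup logic behind this approach is small. The sketch below shows it in Python purely to illustrate the idea; an Android app would do the same thing in Java or Kotlin. The two-column CSV layout (name, text), the function names, and the fallback to a bundled default language are assumptions, not part of the answer above.

```python
import csv
import io


def load_strings(csv_text):
    """Parse 'name,text' rows into a dict; rows with fewer than 2 columns are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def translate(key, active, fallback):
    """Look a string up in the downloaded language, then in the bundled default.

    Returning the key itself as a last resort makes missing strings visible
    instead of crashing the UI.
    """
    return active.get(key, fallback.get(key, key))
```

With this shape, adding a language means downloading one more CSV file into the app's data directory and passing its parsed contents as `active`.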

Max number of file packaged with an app

Bit of an odd question, but...
I am currently building an app; it will essentially be a hotel listings directory with a few frills.
Having never made an app like this before, I have suddenly found myself with the following question and cannot find the answer...
Is there a limit to the number of files you can package the app with, i.e. submit to iTunes?
The reason I ask is that I will potentially want to submit my app with a minimum of 700+ images, each in its own directory, resulting in 1400+ files (assuming a directory counts as a file). I can get the size of the images to fit the 'over the air' maximum app download size, but I cannot find out whether there is a limit to the number of files you can submit...

There is no such limit on the number of files that can be uploaded. However, as you mentioned, it would be better to download your files from within the app after installation.
This would help you reduce the binary size.
This is for iOS.

Your app usually comes in one .apk file. Your resources included. Link: https://en.wikipedia.org/wiki/APK_%28file_format%29 So its size is what matters.
You may want to double check the architecture of your app, it sounds like you want a webservice.

Extract text from PDF in code

I'm making an app for my school with which people can check whether they've got a schedule change. All schedule changes are listed here: http://www.augustinianum.eu/roosterwijzigingen/14062012.pdf. I want to search that page for a keyword (the user's group, which is entered in an EditText). I've found out how to make the app check whether the EditText matches a certain string, so now I only need to download all of the text on that page into a string. The problem is that it's not a simple web page but a PDF page. I've heard that you need a special PDF library or something to extract the text from the PDF, put that text into a string, and then search the string for keywords using contains().
However I've got some questions about that:
This PDF is made with a PDF-creator, it's not a scanned page or so. You can actually for example select the text or search it for keywords using CTRL+F. So I wonder if it is actually required to extract the PDF and stuff or is there maybe an easier way.
I want the app to check for changes every, let's say hour. So it also has to download the PDF and extract the text every hour (about 8 pages), would that consume very much juice?
I've heard that there are many many libraries which do what I want. So which should I use? (If possible, I'd like one which is free :))
Could anyone explain to me how to use it in my code? (I'm not really experienced, so plz keep it a little easy :))
THANK YOU ALL SO MUCH!!!

Unfortunately, I have not worked with Java, so you will have to implement this in Java code yourself. Here is how I finally did it:
1) I fetched the file from your link. PHP does this with fopen("http://...").
2) I opened it as binary (this is important) and extracted two parts:
2.1) The data of the 3 0 obj part, which holds the creation and modification dates. I did it with a regex. It was simple, and I describe it in my other answer.
2.2) The data stream from 5 0 obj, which holds the deflated data. IMPORTANT! Microsoft Excel inserts the two bytes 0D 0A as a line break. Do not forget this when you filter the content with a regexp: these bytes at the start and at the end must not be included in the extracted string.
3) I inflated the encoded data with $uncompressed = gzuncompress($compressed) and wrote it to an external file. You can see the results there.
4) The funniest part: the raw data inside the file is in a textual format. It looks like [(V)-4(RI)16(J)] TJ, which means VRIJ. You can read about text in PDF in the PDF Reference v1.7, part 5.
5) I believe regular expressions can help you extract and/or transform the data.
IMPORTANT: I said "data stream from 5 0 obj", but the number of the object is subject to change. You must follow the reference to the object through the dictionary->pages->page->content chain. A description of these "bread crumbs" can be found in the manual I mentioned above.
Unfortunately, Excel does not embed any table structure in the PDF, but you can find the coordinates of the text portions and interpret them. Either way, it is a mess.
Do you think, dear Merlin, that it is hard? No, dear, it is not. It is not hard, because there are no Unicode symbols. Unicode in PDF is what REALLY SUCKS!
Good luck!
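Steps 2 to 4 can be sketched in a few lines. The code below is a rough illustration in Python rather than the PHP the answer used, and it assumes the simple case described above: Flate-compressed content streams delimited by a line break after `stream` and before `endstream`, and plain (non-Unicode) string literals without escaped parentheses. The function names are hypothetical. On Android, the inflate step could be done with `java.util.zip.Inflater`.

```python
import re
import zlib

# The line break after "stream" and before "endstream" is excluded from the data.
STREAM = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# A TJ operator shows an array of strings and kerning numbers: [(V)-4(RI)16(J)] TJ
TJ_ARRAY = re.compile(rb"\[(.*?)\]\s*TJ", re.DOTALL)
LITERAL = re.compile(rb"\((.*?)\)", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Return the decompressed bytes of every stream that zlib can inflate."""
    inflated = []
    for match in STREAM.finditer(pdf_bytes):
        try:
            inflated.append(zlib.decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate-encoded (e.g. an image or font), so skip it
    return inflated


def text_from_content(content):
    """Join the string literals of each TJ array, dropping the kerning numbers."""
    chunks = []
    for array in TJ_ARRAY.finditer(content):
        joined = b"".join(LITERAL.findall(array.group(1)))
        chunks.append(joined.decode("latin-1"))
    return chunks
```

Searching for the user's group then reduces to checking whether the keyword occurs in any of the returned chunks. This sketch scans every stream instead of following the object references, so it sidesteps the changing object number at the cost of also inflating streams that hold no text.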

This PDF was made by Microsoft Excel and has the following date stamps:
3 0 obj
<</Author(Janszen, Jan)
/CreationDate(D:20120613153635+02'00')
/ModDate(D:20120613153635+02'00')
/Producer(˛ˇMicrosoftÆ ExcelÆ 2010)
/Creator(˛ˇMicrosoftÆ ExcelÆ 2010)>>
endobj
You can use almost any programming language to fetch the file by URL and extract the "ModDate" content. A new ModDate means the information has been updated. You do not need any libraries to extract it: it is plain text in the file, on lines 9, 10 and 11.
Alternatively, ask Jan Janszen to add you to the distribution list. The rest of the data in the file is encoded, and you would have to use a lot of programming techniques to get at the source and restore the information.
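A minimal sketch of that check, in Python: the regular expression pulls the ModDate value out of the raw bytes, and comparing it with the value seen on the previous poll tells the app whether the schedule changed. The function names and the idea of storing the last seen value are assumptions for illustration.

```python
import re

# /ModDate(D:20120613153635+02'00') -> captures the part inside the parentheses
MOD_DATE = re.compile(rb"/ModDate\s*\((D:[^)]*)\)")


def mod_date(pdf_bytes):
    """Return the ModDate string from the raw PDF bytes, or None if absent."""
    match = MOD_DATE.search(pdf_bytes)
    return match.group(1).decode("ascii") if match else None


def has_changed(pdf_bytes, last_seen):
    """True when the file carries a ModDate that differs from the stored one."""
    current = mod_date(pdf_bytes)
    return current is not None and current != last_seen
```

Because only the date is compared, an hourly poll does no decompression or text parsing unless the file actually changed.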
