Best Tess-two configuration to get optimal recognition results? - android

I'm currently working on an Android app that uses the open source OCR library "Tesseract" for receipt recognition. I've gotten the library working with the "Tess-two" fork of Tesseract. The problem I'm having is that the recognition is very inconsistent. Even when provided with a good image that is cropped properly, the recognition isn't great. I'd say that in what I would consider ideal situations, the recognition is about 90% accurate. When provided with any number of sub-optimal conditions (dim lighting, blurry image, uncropped, etc.), I find that I'll often get virtually 0% accuracy.
For the purpose of my app, even 90% accuracy is pretty much unacceptable, as I need to be able to get the exact information and numbers from the receipt "perfectly" without needing to worry about improperly read information.
So my question: what is the best way to configure Tess-two to get the highest accuracy possible?
In a nutshell, this is what I have done to set up the library:
// Prior to running this code, I create the /tessdata directory and copy my eng.traineddata file into it from the app's assets folder.
baseApi = new TessBaseAPI(); // the instance must exist before any method is called on it
baseApi.init(DATA_PATH, "eng");
baseApi.setVariable("save_best_choices", "T");
baseApi.setVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$.!?/,+=-*\"'<:&"); // I was experimenting with this to try and improve accuracy; it didn't seem to help tremendously.
baseApi.setImage(photo); // photo is a Bitmap that is selected from the phone's gallery.
String tmp = baseApi.getUTF8Text();
Is there something here that I'm doing wrong, or that I could be doing better?
Are there any files other than eng.traineddata that I should be including? I know there are multiple files for each language, but honestly I couldn't figure out what was what, and what actually needed to be included. From what I could gather, I got the only file that was needed.
Are there any other settings that I could/should be modifying with the "setVariable" function?
Additionally, does Tess-two have any built in support for "deskewing" images, or adjusting contrast of provided images? I have not messed with either of these techniques much yet, but this would probably help out, right?
Any help is appreciated!

If your Android app expects dictionary words, have a look at the Minimum Edit Distance algorithm and apply it to the results returned by Tesseract.
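As a rough illustration of that suggestion, here is a minimal sketch of minimum edit distance (Levenshtein distance) used to snap an OCR token to the nearest word in a list. The class name, the closestWord helper, the word list and the maxDistance threshold are all hypothetical; nothing here is part of Tess-two.

```java
import java.util.Arrays;
import java.util.List;

final class EditDistanceSketch {

    /** Levenshtein distance between a and b, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + substitution);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    /**
     * Returns the dictionary word closest to the OCR token (case-insensitive),
     * or the token unchanged if no word is within maxDistance edits.
     */
    static String closestWord(String token, List<String> dictionary, int maxDistance) {
        String best = token;
        int bestDistance = maxDistance + 1;
        for (String word : dictionary) {
            int d = editDistance(token.toLowerCase(), word.toLowerCase());
            if (d < bestDistance) {
                bestDistance = d;
                best = word;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical receipt vocabulary; a real app would supply its own list.
        List<String> words = Arrays.asList("TOTAL", "TAX", "CASH");
        System.out.println(closestWord("T0TAL", words, 2));
    }
}
```

This only helps for tokens that are expected to be words. Amounts and other numbers on a receipt have no dictionary to snap to, so they would still need to be right at the OCR stage.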

Related

Is there anywhere I can upload two strings XML files to be translated?

My app is going to be translated by several amateur translators into several languages. I can send them the XML file with all the strings that need to be translated. But is there a cleaner way to have two files uploaded, the one in English and the one to be translated, to easily identify the strings that are still missing? Basically, it's like having the Translation Editor of Android Studio, but online.
Maybe using google docs? How do you do this?
You can use Google Docs, but that's quite an outdated way to handle this.
The major cons:
it would be cumbersome to update strings this way
no easy way to make sure the new ones have new translations, not the old ones, etc.
no good way to provide context, if needed (typically translators have questions). You can create a column with context and take any discussions into comments, but it can get messy
A few pros:
it's fast to create (although slow to keep up-to-date)
you cooperate online and have shared access
Most developers use localization platforms, which makes updating content and online cooperation much faster.
Main pros:
it's easy to identify strings that are missing
any number of translators can translate simultaneously
you can track the work done by each translator
you can add a review/proofreading step to the process to ensure the quality of translations
you can leverage machine translation and then just have translators review the output (saves lots of time)
content is easy to update, as most platforms support agile workflows
you can see who's the top translator (give some rewards, invite to other projects, etc.)
integrations (with your Git tool, Android Studio, etc.) let you automate content updates, with no manual copy-pasting
Cons:
some of them are paid (still, if you're open source, you can expect a free plan)
Regarding the tools, I can suggest looking at Crowdin or Poedit.
There are many alternatives you can research, some are listed on Wikipedia.
At my work we had to translate English into Norwegian. We did that with a Python script that generated a UI from a CSV file; after that, the file could be exported in several formats as well. But your question indicates that you would like to deploy only on Android, so this might be overkill.
A simple Python XML filter would fit your approach, and you could work with Git as well, as long as the lines stay in the same order.
If you need a quick example, please comment, and I'll edit this answer as soon as I get time.
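A minimal sketch of such a filter follows, written in Java rather than Python to match the Android code elsewhere on this page. It assumes the usual strings.xml layout (string elements with a name attribute) and ignores plurals and string-array entries; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class MissingStringsSketch {

    /** Collects the name attribute of every <string> element, in document order. */
    static Set<String> stringKeys(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("string");
            Set<String> keys = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                keys.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return keys;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not a parseable strings.xml document", e);
        }
    }

    /** Keys that exist in the source-language file but not in the translated file. */
    static Set<String> missingKeys(String sourceXml, String translatedXml) {
        Set<String> missing = stringKeys(sourceXml);
        missing.removeAll(stringKeys(translatedXml));
        return missing;
    }

    public static void main(String[] args) {
        // Inline sample content; a real run would read the two files from disk.
        String english = "<resources>"
                + "<string name=\"app_name\">Demo</string>"
                + "<string name=\"greeting\">Hello</string>"
                + "</resources>";
        String norwegian = "<resources>"
                + "<string name=\"app_name\">Demo</string>"
                + "</resources>";
        System.out.println("Still untranslated: " + missingKeys(english, norwegian));
    }
}
```

Because the comparison is by key rather than by line, it does not depend on the two files keeping their lines in the same order.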
At one point I had the same question. I needed translations for my vernacular app, and I also needed to maintain them in a way that let me easily compare them. Here are a few things that worked for me.
First, take the string XML file and convert it into an Excel sheet. You may end up with several Excel sheets; copy, paste and merge all the translations into a single sheet.
From then on it will be easy to maintain all the translations. Just share a single sheet that has a string key column and one column per language, so you can easily look over all the language translations.
In the long run, it will be helpful to you.
A few links for converting XML to Excel:
Convert string XML to Excel sheet
The online tool below works for me. It is free, open source and easy to use.
https://asrt.gluege.boerde.de/

How to scan a DPM datamatrix from a mobile app

I'm trying to leverage ZXing in an Android app to scan data matrixes. So far I'm successful with printed data matrixes such as this:
But other data matrixes printed by laser or punched have circle-looking marks instead of square-looking ones.
These present a problem. The only app I've found capable of scanning this is QRDroid. This article says that QRDroid uses ZXing so I'm thinking if they can, there must be a way. Unfortunately QRDroid is not an open source project so I don't know how.
There's the possibility of course that QRDroid is using an algorithm to somehow transform the circled marks in to squared ones before they attempt to read the data matrix. I don't know anything about image manipulation in Java, so I can't imagine how this is done.
My question is whether there's a way to tweak ZXing to read this type of data matrix, or if there's any library I can use to manipulate the image to make it readable by ZXing.
Edit:
If I use an image editor (e.g. I used https://www.befunky.com) to apply a blur of 10, then it looks like a normal printed data matrix and my scan works. How should I go about doing this in my Android app?
After some research I found out that this type of marking is not really considered a standard data matrix but rather referred to by the manufacturing industry as a DPM, which stands for "Direct Part Marking", although I've read other sources call it "Dot Peen Marking" or "Dot Peen Matrix".
I posted this same question on an already existing issue in the ZXing repository and this was the reply I got:
The problem is the WhiteRectDetector. It finds a white rectangle inside the code, similar to this issue. If you rotate the image slightly (say 10°) or you blur it as you did or you did a suitably sized pixel dilation followed by an erosion, you'll get something that should (mostly) be detectable.
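For anyone who wants to try the dilation-then-erosion route from that reply, here is a minimal sketch of the operation (a morphological closing) on a plain boolean mask where true means a dark pixel. Turning camera pixels into that mask and handing the result back to ZXing are left out, and the class name and the radius are assumptions that would need tuning to the dot spacing.

```java
final class DotClosingSketch {

    /**
     * One morphological pass with a (2r+1) x (2r+1) square neighbourhood.
     * Dilation marks a pixel dark if ANY in-bounds neighbour is dark;
     * erosion keeps a pixel dark only if ALL in-bounds neighbours are dark.
     */
    static boolean[][] morph(boolean[][] dark, int r, boolean dilate) {
        int h = dark.length;
        int w = dark[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean result = !dilate; // dilation starts light, erosion starts dark
                for (int dy = -r; dy <= r; dy++) {
                    for (int dx = -r; dx <= r; dx++) {
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny < 0 || ny >= h || nx < 0 || nx >= w) {
                            continue; // pixels outside the image are ignored
                        }
                        if (dark[ny][nx] == dilate) {
                            result = dilate;
                        }
                    }
                }
                out[y][x] = result;
            }
        }
        return out;
    }

    /** Closing: dilate the dark dots so neighbours merge, then erode back to size. */
    static boolean[][] close(boolean[][] dark, int r) {
        return morph(morph(dark, r, true), r, false);
    }

    public static void main(String[] args) {
        // Two dark dots separated by a one-pixel gap on an otherwise light image.
        boolean[][] dots = new boolean[5][7];
        dots[2][2] = true;
        dots[2][4] = true;
        for (boolean[] row : close(dots, 1)) {
            StringBuilder line = new StringBuilder();
            for (boolean d : row) {
                line.append(d ? '#' : '.');
            }
            System.out.println(line);
        }
    }
}
```

The idea is that gaps narrower than the neighbourhood get filled, so round dots that nearly touch merge into solid modules, while isolated light areas larger than the neighbourhood stay light.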
Modifying the WhiteRectDetector to allow for dots rather than squares was not really an option for me due to deadlines, so I ended up switching from ZXing to Scandit, which is proven to be able to scan this.
Scandit is a proprietary library, but I haven't really found any other alternatives. You can get a trial license, though. For those wanting to try it out to scan DPMs, the documentation is not very clear on how to enable scans for this symbology, so here's the trick.
In Android:
settings.getSymbologySettings(Barcode.SYMBOLOGY_DATA_MATRIX)
.setExtensionEnabled("direct_part_marking_mode", true);
In Objective-C:
[[settings settingsForSymbology:SBSSymbologyDatamatrix]
setExtension:@"direct_part_marking_mode" enabled:YES];

Reading ICO file in android along with all sub-images

I found "Is there a way to decode a .ICO file to a resolution bigger than 16x16?" from 2 years ago, where the best suggestion was to use image4j. Unfortunately, image4j does not work under Android, because the classes "IndexColorModel", "BufferedImage" and "WritableRaster" are not available.
Replacing "BufferedImage" with "Bitmap" might work, and so might dropping "WritableRaster" in favor of setting individual pixels (or groups of pixels) with setPixel, but I cannot manage to replace "IndexColorModel", because I cannot wrap my head around it.
I am currently downloading a favicon from a website, which usually stores more than one image, each of a different size. I read up on the structure of ICO files and analyzed image4j as much as I could, yet I have trouble refactoring the various classes so that they do not use AWT.
BitmapFactory is able to load ICO files; unfortunately it only loads the first image (this is my guess at least) and thus does not let me decide which image to load (let alone load them all and let me choose).
Does anyone know if anything has changed in the last 2 years, and/or would anyone be willing to help me refactor e.g. BMPDecoder from image4j? Or is there perhaps a totally different, easier approach?
I have created a library based on image4j that allows reading ICO files into a List of Bitmap objects. In contrast to image4j, ico4a does not use any AWT classes, but instead only makes use of Bitmap / Bitmap.createBitmap.
See https://github.com/divStar/ico4a .
While the library's performance might not be the best as it uses a Bitmap-object's setPixel method in a loop, it gets the work done and it's good enough for me.
In comparison to image4j, my library (ico4a) only decodes/reads files. While saving ICO files could probably be done relatively easily, I have not done so since I do not need it myself.
If you have further questions or issues with the library, post them on GitHub and I will see if I can help.
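For readers who only need to know which sub-images a favicon contains before picking one, here is a minimal sketch that reads just the ICO directory (the 6-byte header followed by one 16-byte entry per image). It is not taken from ico4a or image4j; the class and field names are made up, and decoding the image data itself is out of scope.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

final class IcoDirectorySketch {

    /** One directory entry: the size of a sub-image and where its data sits in the file. */
    static final class Entry {
        final int width;
        final int height;
        final int bitCount;
        final int byteSize;
        final int offset;

        Entry(int width, int height, int bitCount, int byteSize, int offset) {
            this.width = width;
            this.height = height;
            this.bitCount = bitCount;
            this.byteSize = byteSize;
            this.offset = offset;
        }
    }

    /** Parses the ICO header and directory; all multi-byte fields are little-endian. */
    static List<Entry> readDirectory(byte[] ico) {
        ByteBuffer buf = ByteBuffer.wrap(ico).order(ByteOrder.LITTLE_ENDIAN);
        int reserved = buf.getShort() & 0xFFFF; // must be 0
        int type = buf.getShort() & 0xFFFF;     // 1 = icon, 2 = cursor
        int count = buf.getShort() & 0xFFFF;    // number of sub-images
        if (reserved != 0 || type != 1) {
            throw new IllegalArgumentException("Not an ICO file");
        }
        List<Entry> entries = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int width = buf.get() & 0xFF;   // a stored 0 means 256 pixels
            int height = buf.get() & 0xFF;  // a stored 0 means 256 pixels
            buf.get();                      // palette colour count, unused here
            buf.get();                      // reserved byte
            buf.getShort();                 // colour planes, unused here
            int bitCount = buf.getShort() & 0xFFFF;
            int byteSize = buf.getInt();
            int offset = buf.getInt();
            entries.add(new Entry(width == 0 ? 256 : width,
                                  height == 0 ? 256 : height,
                                  bitCount, byteSize, offset));
        }
        return entries;
    }
}
```

Each entry's data runs from offset to offset + byteSize and is either a complete PNG or a BMP bitmap without its file header. PNG entries could probably be handed to BitmapFactory.decodeByteArray as they are; the BMP entries are the ones that need a decoder such as the one in ico4a.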

Data compression on Android (other than java.util.zip ?)

I have a lot of data (text format) to send from a device, which obviously means I should compress it. My question is whether there are any ways of doing it other than the zip algorithm (like this). The reason I am asking is over here: for a text file, for example, 7-zip compresses twice (!) as well as zip, which is a significant gain. And maybe there are even better algorithms.
So are there any effective ways of data compression (better than zip) available for Android?
You would need to compile another library into your code, since I doubt that compression algorithms other than zlib are available as part of the standard libraries on Android.
The 7-zip algorithm you refer to is actually called LZMA, which you can get in library form in the LZMA SDK. The source code is available in Java as well as C. If you can link C code into your application, that would be preferable for speed.
Since there's no such thing as a free lunch, the speed is important. LZMA will require much more memory and much more execution time to achieve the improved compression. You should experiment with LZMA and zlib on your data to see where you would like the tradeoff to fall between execution time and compression, both to choose a package and to pick compression levels within a package.
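The zlib half of that experiment can be sketched with java.util.zip.Deflater alone. The sample data below is made up and should be replaced with real payloads; the LZMA side would need the LZMA SDK and is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

final class DeflateLevelSketch {

    /** Compresses data with zlib at the given level (1 = fastest, 9 = smallest output). */
    static byte[] deflate(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        while (!deflater.finished()) {
            int written = deflater.deflate(chunk);
            out.write(chunk, 0, written);
        }
        deflater.end(); // release the native zlib resources
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up text data; substitute a sample of what the device really sends.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            sb.append("reading,").append(i % 60).append(",temperature,21.5\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        for (int level : new int[] {1, 6, 9}) {
            long start = System.nanoTime();
            byte[] packed = deflate(data, level);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.println("level " + level + ": " + data.length + " -> "
                    + packed.length + " bytes in " + micros + " us");
        }
    }
}
```

A single timed run like this is only a rough indication; repeating it on the target device gives a fairer picture of the time side of the tradeoff.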
If you find that you'd like to go the other way, to less compression and even higher speed than zlib, you can look at lz4.
Your question is too general.
You can use any library, as long as it is in Java or C/C++ (via the NDK). If you don't want to use external libraries, you have to stick to what's in the SDK. Depending on how you are sending the data, there might be standard ways to do this. For example, HTTP uses gzip and has the necessary headers already defined.
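If the data goes over HTTP, the gzip route needs nothing outside java.util.zip. The following is a minimal sketch of compressing a request body and reading it back; the class name is made up, and whether the receiving server accepts a body sent with a Content-Encoding: gzip header is an assumption to check against that server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipBodySketch {

    /** Wraps the body in a gzip stream, the format HTTP names "gzip". */
    static byte[] gzip(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(body);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the gzip stream above writes the trailer, so the bytes are complete here.
        return out.toByteArray();
    }

    /** Reverses gzip(); useful for checking the round trip before sending anything. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int read;
            while ((read = gz.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

gzip uses the same deflate algorithm as zip, so it adds convenience and standard headers rather than a better compression ratio.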
In short, test different things with your expected data format and size, find the best one and integrate it in your app.

Android game Image format

I have a problem with an image for an Android game. It is not a problem with the code, because I took the code from a book (Beginning Android 4 Games Development).
The problem is this: I know the format that I have to use in Android is png, but I don't know which settings I have to use for this format (like RGB565...). If I simply use png, the images do not look good when I run the game. So I need someone to explain which settings I need to use for images in Android games.
P.S. The software I used is Photoshop. If there is better software for this purpose, tell me.
I think there is a strong misconception in your understanding of Android and how it implements graphics. You are not constrained to .png for nearly any of your development. The .png and .9.png formats are only strictly enforced for managing drawable constants.
Android uses Java and has the capability to utilize nearly any graphical format. In particular, native support for .bmp, .png, and .jpg is present on every device and Android OS version. You may even create your graphics in real time, byte by byte.
As for a good image editor, there are a number out there. I often use a combination of GIMP and Photoshop, myself.
Hope this helps,
FuzzicalLogic
