Trouble recognizing digits in Tesseract - android

I was hoping someone could tell me why my Tesseract has trouble recognizing some images with digits, and whether there is something I can do about it.
Everything works according to my tests, and since it is only digits I need, I thought I could manage with the English training data until I had to start on the 7-segment display as well.
Since I am having a lot of trouble with the appended images, I'd like to know whether I should start working on my own recognition algorithms, or whether I could build my own datasets for Tesseract and it would then work. Does anyone know where the limitation lies with Tesseract?
Things tried:
I tried setting the page segmentation mode (psm) to one_line, one_word and one_char (and chopping up the picture).
With one_line and one_word there was no significant change.
With one_char it did recognize a bit better, but sometimes, due to big spacing, it attached an extra number, which then ruined the result; with the attached image it resulted in "04".
I have also tried doing the binarization myself; this resulted in poorer recognition and was very resource consuming.
I have tried inverting the pictures; this makes no difference at all to Tesseract.
I have attached the pictures I'd need, among others, to be processed.
Explanation of the images:
The first is an image that Tesseract has no trouble recognizing; it was made in Word for the convenience of building an app around a working image.
The second is a real-life image matching image_seven, but Tesseract cannot recognize it.
The third is another image I'd like it to recognize; yes, I know it is skewed, and I did deskew (straighten) it when testing.

I know of some options that might help you:
Add extra space between the image border and the text. Tesseract performs badly if text in the image is positioned right at the edge.
Duplicate your image. For example, if you're performing OCR on the word 'foobar', clone the image and send 'foobar foobar foobar foobar foobar' to Tesseract; the results will be better.
Google for font training and image binarization for Tesseract.
Keep in mind that the built-in cameras in mobile devices mostly produce low-quality images (blurry, noisy, skewed, etc.). OCR itself is a resource-consuming process, and if you add worthwhile image preprocessing on top of that, low-end and mid-range mobile devices (which are likely to run Android) could face unexpectedly slow performance or even run out of resources. That's OK for free/study projects, but if you're planning a commercial app, consider using a better SDK.
Have a look at this question for details: OCR for android
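The first suggestion (extra space around the text) can be sketched by copying the source onto a larger white canvas. This is a minimal illustration with `java.awt`; on Android you would do the equivalent with `Bitmap` and `Canvas`, and the 20 px margin used below is an arbitrary assumption, not a Tesseract requirement.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Copy the source image onto a larger white canvas so the text
 *  no longer touches the image border. */
public class BorderPadder {
    public static BufferedImage pad(BufferedImage src, int margin) {
        BufferedImage out = new BufferedImage(
                src.getWidth() + 2 * margin,
                src.getHeight() + 2 * margin,
                BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setColor(Color.WHITE);                    // blank white background
        g.fillRect(0, 0, out.getWidth(), out.getHeight());
        g.drawImage(src, margin, margin, null);     // original in the middle
        g.dispose();
        return out;
    }
}
```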

Tesseract doesn't do segmentation for you. It thresholds the image prior to the actual recognition algorithm, and after thresholding some edges and artefacts may remain in the image.
Try manually converting your images to black and white and see what Tesseract returns as output.
Try thresholding your images (automatically) and see what Tesseract returns as output. The thresholding output may be bad enough to make Tesseract give bad output.
Your 4th image will probably fail due to thresholding (you have 3 colors: black background, greyish background and white letters), since the threshold may fall between the black background and the greyish background.
Generally Tesseract wants nice black-and-white images, so preprocessing your images may be needed for better results.
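Doing the black-and-white conversion yourself usually means picking a threshold automatically. A common choice is Otsu's method, sketched here on a plain 8-bit grayscale buffer; this is generic preprocessing under that assumption, not Tesseract's own internal thresholder.

```java
/** Otsu's method: pick the threshold that maximizes between-class
 *  variance of the grayscale histogram, then map pixels to pure
 *  black (0) or white (255). */
public class OtsuBinarizer {
    public static int otsuThreshold(int[] gray) {
        int[] hist = new int[256];
        for (int v : gray) hist[v]++;
        int total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (long) i * hist[i];
        long sumB = 0;
        int wB = 0, best = 0;
        double bestVar = -1;
        for (int t = 0; t < 256; t++) {
            wB += hist[t];                 // background pixel count
            if (wB == 0) continue;
            int wF = total - wB;           // foreground pixel count
            if (wF == 0) break;
            sumB += (long) t * hist[t];
            double mB = (double) sumB / wB;
            double mF = (double) (sumAll - sumB) / wF;
            double between = (double) wB * wF * (mB - mF) * (mB - mF);
            if (between > bestVar) { bestVar = between; best = t; }
        }
        return best;
    }

    public static int[] binarize(int[] gray) {
        int t = otsuThreshold(gray);
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) out[i] = gray[i] > t ? 255 : 0;
        return out;
    }
}
```

On an image like the 4th one (three brightness levels), inspecting where this threshold lands tells you whether the greyish background ends up merged with the black background or with the white letters.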
For your first image (with the result "04"), look at the box result (each recognized char plus the coordinates of the box that contains it). The "0" may be a small artefact, such as a 4-by-4 blob of pixels.
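Tesseract's box output has one line per recognized character, in the form "char left bottom right top page". A quick sanity filter over those lines can flag suspiciously small boxes like that 4-by-4 blob; the 5 px cutoff below is an arbitrary assumption, not a Tesseract default.

```java
/** Flag box-file entries whose bounding box is tiny, since a blob of
 *  only a few pixels is more likely noise than a real digit. */
public class BoxFilter {
    // boxLine format: "c left bottom right top page" (origin bottom-left)
    public static boolean isLikelyArtifact(String boxLine, int minSize) {
        String[] p = boxLine.trim().split("\\s+");
        int left = Integer.parseInt(p[1]);
        int bottom = Integer.parseInt(p[2]);
        int right = Integer.parseInt(p[3]);
        int top = Integer.parseInt(p[4]);
        int w = right - left, h = top - bottom;
        return w < minSize || h < minSize;   // tiny blob -> probably noise
    }
}
```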

You may give javaocr a try ( http://sourceforge.net/projects/javaocr/ ; yes, I'm a developer ).
There is no official release though, and you will have to build from source (good news: there is a working Android sample including a sampler, an offline trainer and a recognizer application).
If you only have one font, you can get pretty good results with it (I reached recognition rates of up to 99.96% on digits of the same font).
PS: it is pure Java and uses invariant moments to perform matching (so no problems with scaling and rotation). There is also a pretty effective binarisation.
See it in action:
https://play.google.com/store/apps/details?id=de.pribluda.android.ocrcall&feature=search_result#?t=W251bGwsMSwxLDEsImRlLnByaWJsdWRhLmFuZHJvaWQub2NyY2FsbCJd
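To illustrate why moment-based matching tolerates position changes: central moments are computed relative to the shape's centroid, so translating the glyph leaves them unchanged. This is only a toy sketch of the principle; javaocr's actual feature set is richer than this.

```java
/** Central moment mu(p,q) of a set of foreground pixel coordinates,
 *  taken about the centroid. Shifting every point by the same offset
 *  does not change the result (translation invariance). */
public class CentralMoments {
    // pts: {x, y} pairs of foreground pixels
    public static double mu(int p, int q, int[][] pts) {
        double cx = 0, cy = 0;
        for (int[] pt : pts) { cx += pt[0]; cy += pt[1]; }
        cx /= pts.length;                 // centroid x
        cy /= pts.length;                 // centroid y
        double m = 0;
        for (int[] pt : pts)
            m += Math.pow(pt[0] - cx, p) * Math.pow(pt[1] - cy, q);
        return m;
    }
}
```

Full scale and rotation invariance additionally requires normalizing these moments (as in Hu's invariant moments), which is the part the library handles for you.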

Related

Junk results when using Tesseract OCR and tess-two

I have developed an OCR application using the Tesseract OCR library, referring to the following links:
android-ocr
tesseract
But I am sometimes getting junk data as results. Can anyone tell me what to do to get accurate results?
You should provide your test images (and any code you are using) if you want specific help for your case, but some general rules of thumb for getting accurate results are:
Use a high-resolution image (if needed); 300 DPI is the minimum
Make sure there are no shadows or bends in the image
If there is any skew, you will need to fix the image in code prior to OCR
Use a dictionary to help get good results
Adjust the text size (12 pt font is ideal)
Binarize the image and use image-processing algorithms to remove noise
On top of all this, there are a lot of image-processing functions out there that can help increase accuracy depending on your image, such as deskewing, perspective correction, line removal, border removal, dot removal, despeckling, and many more.
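For the resolution tip, a low-DPI capture can simply be upscaled before OCR. A minimal sketch with `java.awt` follows; the 2x factor is an assumption (it would take a ~150 DPI capture toward the 300 DPI guideline), and you should derive the real factor from your camera's output.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Upscale an image by an integer factor using bicubic interpolation,
 *  a common preprocessing step before running OCR on small captures. */
public class Upscaler {
    public static BufferedImage scale(BufferedImage src, int factor) {
        BufferedImage out = new BufferedImage(
                src.getWidth() * factor, src.getHeight() * factor,
                BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, out.getWidth(), out.getHeight(), null);
        g.dispose();
        return out;
    }
}
```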

How to equalize brightness, contrast and histograms between two images using EMGUCV

I am attempting to use EMGU to perform an AbsDiff of two images.
Given the following conditions:
User starts their webcam and with the webcam stationary takes a picture.
User moves into the frame and takes another picture (WebCam has NOT moved).
AbsDiff works well but what I'm finding is that the ISO adjustments and White Balance adjustments made by certain cameras (even on Android and iPhone) are uncontrollable to a degree.
Therefore instead of fighting a losing battle I'd like to attempt some image post processing to see if I can equalize the two.
I found the following thread but it's not helping me much: How do I equalize contrast & brightness of images using opencv?
Can anyone offer specific details of what functions/methods/approach to take using EMGUCV?
I've tried using things like _EqualizeHist(). This yields very poor results.
Instead of equalizing the histograms for each image individually, I'd like to compare the brightness/contrast values and come up with an average that gets applied to both.
I'm not looking for someone to do the work for me (although code example would CERTAINLY be appreciated). I'm looking for either exact guidance or some way to point the ship in the right direction.
Thanks for your time.
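The "common average" idea from the question can be sketched without any EMGU-specific API: compute each image's mean brightness, then shift both toward the shared mean. This operates on plain 8-bit grayscale arrays as an assumption; in EMGU/OpenCV the same arithmetic would apply to the image's pixel data.

```java
/** Shift two grayscale images' brightness toward their common mean,
 *  instead of equalizing each histogram independently. */
public class BrightnessMatcher {
    public static double mean(int[] px) {
        long s = 0;
        for (int v : px) s += v;
        return (double) s / px.length;
    }

    public static int[] shiftToMean(int[] px, double target) {
        double offset = target - mean(px);
        int[] out = new int[px.length];
        for (int i = 0; i < px.length; i++)   // clamp to valid 8-bit range
            out[i] = Math.min(255, Math.max(0, (int) Math.round(px[i] + offset)));
        return out;
    }

    public static int[][] match(int[] a, int[] b) {
        double target = (mean(a) + mean(b)) / 2.0;   // common average
        return new int[][] { shiftToMean(a, target), shiftToMean(b, target) };
    }
}
```

Note this only matches overall brightness; matching contrast as well would mean scaling each image's deviation from its mean by the ratio of standard deviations before applying the offset.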

To eliminate gray stripes when taking photos from computer displays using Android SDK

Maybe a comparison of pictures best illustrate the problem.
This is the original picture:
Using Android SDK, I managed to take this photo from my Android phone:
You may see that there are lots of gray stripes in the photo.
Although the main shapes are there, since I'm processing these photos in an image-recognition project, these gray stripes completely ruin the results.
It seemed that the built-in photo app would automatically eliminate them (Edit: it does not), but I don't know how to do it manually in my app. It seems this is caused by the display having a different refresh rate.
What you're seeing is happening because cameras have a small advantage over the human eye when taking the photo.
The refresh rate for most displays is 50Hz or 60Hz, which is too fast for our eyes to notice.
However, a camera sensor takes the image much faster than the human eye, and can see the scan lines created by the refreshing of the image on the display. You can work around this by using a longer exposure time, closer to the human eyes' speed, but you may not be able to control that on most Android devices.
I suggest you use your operating system's inbuilt screenshot utility instead.

OCR (tesseract), intelligent rotation for Image

I'm developing an Android app which uses Tesseract OCR to recognize text. The problem is that on different smartphones the image gets rotated differently, so on one it is in landscape mode right away and on another in portrait mode. I want to rotate the image intelligently so that Tesseract can recognize the text, which is possible in only one of the two orientations, but it might be in either, depending on how the user takes the picture. I don't want the user to have to take the picture in the same format every time; I want to rotate it to fit, if possible without too much of a performance loss.
The Tesseract lib's autorotate does not seem to work for me.
Does anybody have an idea how to solve that problem?
Thanks
If this question is still relevant for you: maybe you can extract the EXIF data of the image to get its orientation?
Otherwise this paper maybe can help you: Combined Orientation and Script Detection using the Tesseract OCR Engine.
If you don't mind rolling your sleeves up, http://www.leptonica.org/ is probably a good option to evaluate the glyphs (raw Pix that is not detected as text yet) and determine orientation. I've seen references to Android bindings for Leptonica.
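The EXIF route can be sketched as: read the orientation tag (on Android via `android.media.ExifInterface`), map it to a rotation, and apply that rotation before OCR. The mapping and a 90-degree rotation are shown below on a plain pixel grid, since the Android `Bitmap`/`Matrix` calls won't run off-device; values 1/3/6/8 are the common upright/180/90/270 orientation codes.

```java
/** Map the EXIF orientation tag to a rotation in degrees, and rotate
 *  a row-major pixel grid 90 degrees clockwise. */
public class ExifRotator {
    public static int orientationToDegrees(int exifOrientation) {
        switch (exifOrientation) {
            case 3:  return 180;
            case 6:  return 90;    // rotate clockwise
            case 8:  return 270;
            default: return 0;     // 1 = already upright
        }
    }

    public static int[][] rotate90(int[][] px) {
        int h = px.length, w = px[0].length;
        int[][] out = new int[w][h];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                out[x][h - 1 - y] = px[y][x];   // (x,y) -> (h-1-y) column
        return out;
    }
}
```

If the EXIF tag is missing or always 1 (which some camera apps produce), you would fall back to orientation detection such as the Tesseract OSD approach from the paper linked above.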

How to make fast operations on bitmaps in android

I need to perform some operations on bitmaps in Android, such as changing pixel colors, transparency, painting another bitmap onto the first one with transparency, zooming, cropping, applying masks and so on. Right now I do it in Java and it is really slow: it takes about 5 seconds to process a 600x600-pixel image on a Nexus 4.
What technology would be suitable for these operations?
I tried the GPUImage library, which uses OpenGL ES 2.0, but I hit a problem when trying to apply a mask to an image: the mask is 600x600 like the original image, so I would have to pass an array of 1,440,000 color values, and GLSL doesn't support such big arrays. 360,000 vec4 values is too large for it too.
Maybe I can pass the whole mask as a texture2d object, but I can't find out how to do it. Could someone please point me to a tutorial?
The second option is RenderScript. It is not ideal because I want phones with Android 2.3 to have fast bitmap operations, but improving speed on 3.0+ devices would still be a good thing. The problem here was how to pass it the data of the second bitmap: 360,000 float4 values, or the bitmap itself? I couldn't find information about that either, so I would be thankful for any advice or tutorial.
The third option is using the NDK. But I don't know whether it will be fast enough, and again I can't find good information for starting with the NDK in general or image processing with the NDK in particular. So again I need advice or a tutorial.
So, all in all, please tell me what other ways of fast image processing exist, and please give me some advice or tutorials on these 3 methods.
And what is the best way to make operations on images really fast?
If you find GLES suitable, divide the image into small tiles, and operate on one tile at a time. (Some drivers work this way.)
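The tiling advice can be sketched as follows. This uses `BufferedImage` for a runnable example (on Android, `Bitmap.createBitmap` with offsets plays the same role), and the 256 px tile size is an assumption chosen to keep each tile small enough for a GLES pass.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

/** Split an image into fixed-size tiles, processing one tile at a
 *  time instead of uploading the whole bitmap at once. */
public class Tiler {
    public static List<BufferedImage> tiles(BufferedImage src, int tile) {
        List<BufferedImage> out = new ArrayList<>();
        for (int y = 0; y < src.getHeight(); y += tile)
            for (int x = 0; x < src.getWidth(); x += tile) {
                // edge tiles may be smaller than the nominal tile size
                int w = Math.min(tile, src.getWidth() - x);
                int h = Math.min(tile, src.getHeight() - y);
                out.add(src.getSubimage(x, y, w, h));  // shares pixel data
            }
        return out;
    }
}
```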
