I have developed an OCR application using the Tesseract OCR library, with reference to the following links:
android-ocr
tesseract
But I sometimes get junk data as the result. Can anyone suggest what to do next to get accurate results?
You should provide your test images, as well as any code you are using, if you want specific help for your case. As a general rule of thumb for getting accurate results:
Use a high-resolution image (if needed); 300 DPI is the minimum
Make sure there are no shadows or bends in the image
If there is any skew, you will need to fix the image in code prior to OCR
Use a dictionary to help get good results
Adjust the text size (12 pt font is ideal)
Binarize the image and use image processing algorithms to remove noise
On top of all this, there are many image processing functions that can help increase accuracy, depending on your image: deskew, perspective correction, line removal, border removal, dot removal, despeckle, and many more.
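A minimal sketch of the "binarize the image" advice, assuming the image is already grayscale and held as a list of rows of 0..255 integers. Otsu's method is used here as one common way to pick a global threshold; it is my choice for illustration, not necessarily what the answer had in mind, and the function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the threshold t (0..255) that maximises between-class variance.

    Pixels <= t form the dark class, pixels > t the bright class.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_dark = 0
    sum_dark = 0
    for t in range(256):
        weight_dark += hist[t]
        if weight_dark == 0:
            continue  # no pixels in the dark class yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # every pixel is already in the dark class
        sum_dark += t * hist[t]
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_var:
            best_var, best_t = variance, t
    return best_t


def binarize(image):
    """Map a grayscale image (list of rows) to pure black (0) and white (255)."""
    threshold = otsu_threshold([p for row in image for p in row])
    return [[255 if p > threshold else 0 for p in row] for row in image]
```

A single global threshold like this tends to fail on images with shadows or uneven lighting, which is one reason the list above asks for shadow-free input.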
I've been trying to train the Tesseract engine to OCR images that contain numbers written in a seven-segment digital font.
After searching, it turned out that Tesseract won't OCR a segmented font unless the segments are somehow connected.
So I used erosion, an OpenCV function, on the images to connect the segments.
http://www.tutorialspoint.com/java_dip/eroding_dilating.htm
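To show why erosion connects dark segments, here is a minimal pure-Python sketch of grayscale erosion (the local minimum over a square window), assuming dark digits on a lighter background and an image stored as a list of rows. It stands in for the OpenCV call only for illustration; the function name and the `radius` parameter are hypothetical.

```python
def erode(image, radius=1):
    """Grayscale erosion: each output pixel is the minimum of its neighbourhood.

    Because the minimum is taken, dark regions grow, so small light gaps
    between dark segments are filled in.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = min(
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            )
    return out


# Two dark segments (0) separated by a one-pixel light gap (255).
row = [255, 255, 0, 0, 255, 0, 0, 255, 255]
eroded = erode([row, row, row])
```

In the example, the gap at index 4 is closed after one pass. If the digits were light on a dark background, dilation (the local maximum) would be the operation that joins them.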
After that, I used thresholding to convert the image to binary before handing it to Tesseract (this step is redundant, because Tesseract binarizes the image internally).
http://docs.opencv.org/2.4/doc/tutorials/imgproc/threshold/threshold.html
My main problem is that the numbers are written in black on a dark green background.
Here are the results
Original image:
Method 1:
After Erosion and binarization (I tried various threshold max values)
Method 2:
I tried to use k-means or c-means algorithms, but the results were not much better.
Method 3:
I also tried adaptive Gaussian thresholding
Method 4:
Adaptive Mean
Method 5:
Handing the original image to Tesseract without any image processing and outputting the result image (Tesseract uses Leptonica to do image processing internally).
I also tried various other samples, and tried GIMP to enhance the images using the steps in Gimp image processing, but nothing is working for me.
Any suggestions?
Thanks!
I am trying to use OpenCV (Android) to process an image taken with the camera and then pass it to Tesseract for text (digit) recognition, but I do not get good results unless the image is very clean (almost no noise).
Currently I apply the following processing to the captured images:
1. Applying Gaussian blur.
2. Adaptive threshold: to binarize the image.
3. Inverting colours to make background black.
Then I pass the processed image to Tesseract.
But I am not getting good results.
Please suggest what further steps I could take to process the image before passing it to Tesseract, or during recognition in Tesseract itself.
Also, are there any better libraries for this on Android?
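For reference, the three steps listed in the question can be sketched in pure Python as follows. This is only an illustration under assumptions: a fixed 3x3 Gaussian kernel, a 3x3 mean window, and an offset `c` of 5 are hypothetical choices, the image is a list of rows of 0..255 integers, and the function names are made up rather than taken from OpenCV.

```python
def _clamp(value, low, high):
    return max(low, min(high, value))


def gaussian_blur_3x3(image):
    """Blur with the 3x3 kernel [1 2 1; 2 4 2; 1 2 1] / 16, replicating edges."""
    kernel = ((1, 2, 1), (2, 4, 2), (1, 2, 1))
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = _clamp(y + dy, 0, height - 1)
                    xx = _clamp(x + dx, 0, width - 1)
                    acc += kernel[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc // 16
    return out


def adaptive_mean_threshold(image, radius=1, c=5):
    """White (255) where a pixel is brighter than its local mean minus c."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            local_mean = sum(window) / len(window)
            out[y][x] = 255 if image[y][x] > local_mean - c else 0
    return out


def invert(image):
    return [[255 - p for p in row] for row in image]


def preprocess(image):
    """Blur, adaptively threshold, then invert so the background is black."""
    return invert(adaptive_mean_threshold(gaussian_blur_3x3(image)))
```

The window size and `c` usually need tuning per image set; a window much smaller than the stroke width tends to hollow out thick digits.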
You can isolate/detect characters in images. This can be done with powerful algorithms such as the Stroke Width Transform.
The following steps worked well for me:
1. Obtain a grayscale version of the image.
2. Perform Canny edge detection on the grayscale image.
3. Apply Gaussian blur to the grayscale image (store the result in a separate matrix).
4. Input the matrices from steps 2 and 3 into the SWT algorithm.
5. Binarize (threshold) the resulting image.
6. Feed the image to Tesseract.
Please note that for step 4 you will need to build the C++ library in the link and then import it into your Android project with JNI wrappers. You will also need to fine-tune every step to get the best results. But this should at least get you started.
I'm developing an Android app that uses Tesseract OCR to recognize text. My problem is that different smartphones rotate the image differently: on one device it arrives in landscape orientation, on another in portrait. I want to rotate the image intelligently so that Tesseract can recognize the text. The text is readable in only one of the two orientations, but the image may arrive in either, depending on how the user takes the picture. I don't want to force the user to take the picture in the same orientation every time; I want to rotate it as needed, if possible without too much of a performance loss.
Tesseract's auto-rotate feature does not seem to work for me in this case.
Does anybody have an idea how to solve this problem?
Thanks
If this question is still relevant for you: maybe you can extract the EXIF data of the image to get its orientation?
Otherwise, this paper may help you: Combined Orientation and Script Detection using the Tesseract OCR Engine.
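A minimal sketch of the EXIF idea, assuming the Orientation tag value has already been read from the file by some EXIF reader (not shown) and the image is a list of pixel rows. The mapping covers only the pure-rotation values of the tag; the mirrored values (2, 4, 5, 7) are left untouched here as a simplification, and the function names are hypothetical.

```python
# EXIF Orientation tag values that describe pure rotations, mapped to the
# clockwise rotation needed to display the image upright.
EXIF_TO_CLOCKWISE_DEGREES = {1: 0, 3: 180, 6: 90, 8: 270}


def rotate_90_clockwise(image):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def upright(image, exif_orientation):
    """Return the image rotated so that it is upright.

    Unknown or mirrored orientation values are treated as 'no rotation'.
    """
    degrees = EXIF_TO_CLOCKWISE_DEGREES.get(exif_orientation, 0)
    for _ in range(degrees // 90):
        image = rotate_90_clockwise(image)
    return image
```

This only corrects what the camera recorded. If the EXIF data is missing or the user photographed a page that was itself sideways, a content-based method such as the one in the paper is still needed.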
If you don't mind rolling your sleeves up, http://www.leptonica.org/ is probably a good option to evaluate the glyphs (raw Pix that is not detected as text yet) and determine orientation. I've seen references to Android bindings for Leptonica.
I am looking for some kind of auto trim/crop functionality in Android that detects an object in a captured image and draws a square box around the object for cropping. I have found face detection APIs in Android, but my captured images are documents/pages, not human faces. How can I detect documents or any other object in a captured picture?
I am thinking of object detection algorithms or some kind of color detection. Are there any APIs or libraries available for this?
I have tried the following links but did not get the desired output.
Find and Crop relevant image area automatically (Java / Android)
https://github.com/biokys/cropimage
Any small hint would also help me a lot. Please help. Thanks in advance.
That depends on what you intend to capture and crop, but there are many ways to achieve this. Like littleimp suggested, you should use OpenCV for this.
I suggest you use edge-detection algorithms, such as Sobel, and perform image transformation on it with, for example, a Threshold function that will turn the image into a binary one (only black and white). Afterwards, you can search the image for the geometric shape you want, using what's suggested here. Filter the object you want by calculating the detected geometric figure's area and ratio.
It would help a lot to know what you're trying to detect in an image. Those methods I described were the ones I used for my specific case, which was developing an algorithm to detect and crop the license plate from a given vehicle image. It works close to perfect and it was all done by using OpenCV.
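The last part of that approach (find shapes in the binary image, then filter them by area and ratio) can be sketched as follows. This is not the license plate code described above; it is a pure-Python stand-in that uses a flood fill instead of OpenCV's shape search, and the function names and thresholds are hypothetical.

```python
from collections import deque


def bounding_boxes(binary):
    """Return (x, y, width, height) for every 4-connected blob of non-zero pixels."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def plausible_boxes(boxes, min_area, min_ratio, max_ratio):
    """Keep boxes that are large enough and whose width/height ratio is in range."""
    return [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if w * h >= min_area and min_ratio <= w / h <= max_ratio
    ]
```

For a document page you would pick a ratio range near the page's proportions and a minimum area that is a sizeable fraction of the frame; for a license plate, a wide and short range.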
If you have anything else you'd like to know, don't hesitate to ask. I'm watching this post :)
Use OpenCV for Android.
You can use the Watershed (Imgproc.watershed) function to segment the image into foreground and background. Then you can crop around the foreground (which will be the document).
The watershed algorithm needs some markers pre-defining the regions. You can for example assume the document to be in the middle of the image, so create a marked region in the middle of the image to get the watershed algorithm started.
I was hoping someone could tell me why Tesseract has trouble recognizing some images with digits, and whether there is something I can do about it.
Everything works according to my tests, and since I only need digits, I thought I could manage with the English pattern, until I had to start handling the 7-segment display as well.
Since I am having a lot of trouble with the attached images, I'd like to know whether I should start working on my own recognition algorithms, or whether building my own datasets for Tesseract would make it work. Does anyone know where the limitations of Tesseract lie?
Things tried:
I tried setting the PSM to one_line, one_word, and one_char (chopping up the picture).
With one_line and one_word there was no significant change.
With one_char it recognized a bit better, but sometimes, due to the large spacing, it attached an extra number, which spoiled the result; for the attached image the result was 04.
I have also tried doing the binarization myself; this resulted in poorer recognition and was very resource-consuming.
I have tried inverting the pictures; this makes no difference at all for Tesseract.
I have attached the pictures I need to process, among others.
Explanation of the images:
[image] is an image that Tesseract has no trouble recognizing, though it was made in Word for the convenience of building an app around a working image.
[image] is a real-life image matching image_seven, but Tesseract cannot recognize it.
[image] is another image I'd like it to recognize. Yes, I know it can't be skewed, and I did deskew (straighten) it when testing.
I know of some options that might help you:
Add extra space between the image border and the text. Tesseract works very poorly if the text is positioned at the edge of the image.
Duplicate your image. For example, if you're performing OCR on the word 'foobar', clone the image and send 'foobar foobar foobar foobar foobar' to Tesseract; the results will be better.
Search for font training and image binarization for Tesseract.
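The first two options above are simple pixel operations. Here is a minimal sketch, assuming a grayscale image stored as a list of rows with a white (255) background; the function names, margin, and gap sizes are hypothetical and would need tuning.

```python
def add_border(image, margin, background=255):
    """Surround the image with `margin` pixels of background on every side."""
    new_width = len(image[0]) + 2 * margin
    top = [[background] * new_width for _ in range(margin)]
    bottom = [[background] * new_width for _ in range(margin)]
    middle = [
        [background] * margin + list(row) + [background] * margin
        for row in image
    ]
    return top + middle + bottom


def repeat_horizontally(image, copies, gap, background=255):
    """Place `copies` of the image side by side, separated by `gap` background pixels."""
    out = []
    for row in image:
        new_row = []
        for i in range(copies):
            if i > 0:
                new_row.extend([background] * gap)
            new_row.extend(row)
        out.append(new_row)
    return out
```

Both can be combined, for example `add_border(repeat_horizontally(word_image, 5, 20), 10)`. If you duplicate the word, remember to take only one copy from the recognized text.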
Keep in mind that the built-in cameras in mobile devices mostly produce low-quality images (blurred, noisy, skewed, etc.). OCR itself is a resource-consuming process, and if you add worthwhile image preprocessing on top of that, low-end and mid-range mobile devices (which are likely to run Android) could face unexpectedly slow performance or even run out of resources. That's OK for free/study projects, but if you're planning a commercial app, consider using a better SDK.
Have a look at this question for details: OCR for android
Tesseract doesn't do segmentation for you. Tesseract thresholds the image before running the actual recognition algorithm. After thresholding, some edges and artefacts may remain in the image.
Try to manually modify your images to black and white colors and see what tesseract returns as output.
Try to threshold (automatically) your images and see what Tesseract returns as output. The thresholding result may be poor enough to cause Tesseract to give bad output.
Your 4th image will probably fail due to thresholding (you have 3 colors: black background, greyish background and white letters), and the threshold may fall between the black background and the greyish background.
Generally Tesseract wants nice black and white images. Preprocessing of your images may be needed for better results.
For your first image (with the result "04"), try to see the box result (char + coordinates of box that contains the recognized char). The "0" may be a small artefact - like a 4 by 4 blob of pixels.
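As a sketch of how the box result could be used to drop such artefacts, assume the boxes are available as text lines in the box-file layout `symbol left bottom right top page`. The minimum sizes and function names below are hypothetical, and the parsing assumes the symbol itself contains no spaces.

```python
def parse_box_line(line):
    """Split one box line into (symbol, left, bottom, right, top)."""
    parts = line.split()
    symbol = parts[0]
    left, bottom, right, top = (int(v) for v in parts[1:5])
    return symbol, left, bottom, right, top


def drop_tiny_boxes(lines, min_width, min_height):
    """Return the recognized text without characters whose box is too small."""
    kept = []
    for line in lines:
        symbol, left, bottom, right, top = parse_box_line(line)
        if right - left >= min_width and top - bottom >= min_height:
            kept.append(symbol)
    return "".join(kept)


# A 4 by 4 blob read as "0", followed by a full-size "4".
boxes = ["0 10 40 14 44 0", "4 30 10 60 70 0"]
text = drop_tiny_boxes(boxes, min_width=8, min_height=8)
```

In this example the tiny "0" is discarded and only "4" remains. A size limit relative to the median box height is usually more robust than fixed pixel values.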
You may give javaocr a try (http://sourceforge.net/projects/javaocr/; yes, I'm the developer).
There is no official release though, and you will have to look through the sources (good news: there is a working Android sample including a sampler, offline trainer and recognizer application).
If you only have one font, you can get pretty good results with it (I reached recognition rates of up to 99.96 on digits of the same font).
PS: it is pure Java and uses invariant moments to perform matching (so no problems with scaling and rotation). There is also pretty effective binarization.
See it in action:
https://play.google.com/store/apps/details?id=de.pribluda.android.ocrcall&feature=search_result#?t=W251bGwsMSwxLDEsImRlLnByaWJsdWRhLmFuZHJvaWQub2NyY2FsbCJd