I'm trying to create a camera activity for taking photos to be OCR'd. Here's what I wish to accomplish:
A resizable box in the middle of the camera preview to indicate which area will be cropped into a Bitmap and processed by the OCR engine.
Continuous autofocus (done)
I'm using Tesseract, btw.
If anyone would care to point me to some reference / examples / tutorials, that would be great.
There's a viewfinder rectangle here:
https://github.com/rmtheis/android-ocr/blob/master/android/src/edu/sfsu/cs/orange/ocr/CaptureActivity.java
I've been doing something similar. Right now, I'm just sending the whole photo to a webservice and processing it with OCRfeeder, which will perform segmentation on the image and send each part with text in it to tesseract. I've been getting much better accuracy that way. In addition, you might want to perform some preprocessing to clean up the image first.
There can be two general approaches.
You can resize (crop) the image before sending it to the OCR engine. Keep in mind that the Tesseract engine you use has a quirk: it requires some space between the characters and the image borders, sometimes more than expected.
The second approach is to use field-level recognition, where you specify the coordinates of the text block and send the full image to the OCR engine. Have a look at http://www.ocrsdk.com, it's a cloud OCR SDK with a web API recently launched by ABBYY. It's in beta, so for now it's free to use. It has field-level recognition methods and Android code samples. I work at ABBYY and can provide additional info on our products if necessary.
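For the first approach, the viewfinder box drawn on the preview has to be mapped to pixel coordinates in the captured photo, and a little padding helps keep characters away from the image border. Here is a minimal sketch in plain Java, assuming the preview and the photo share the same aspect ratio and orientation. The class name, method name and margin value are made up for illustration.

```java
final class CropRegion {
    /**
     * Maps a viewfinder box {left, top, right, bottom}, given in preview
     * coordinates, to photo pixel coordinates, pads it by `margin` pixels
     * on every side and clamps the result to the photo bounds.
     * Assumption: preview and photo have the same aspect ratio and rotation.
     */
    static int[] toPhotoRect(int[] box, int previewW, int previewH,
                             int photoW, int photoH, int margin) {
        double sx = (double) photoW / previewW; // horizontal scale factor
        double sy = (double) photoH / previewH; // vertical scale factor
        int left = (int) Math.round(box[0] * sx) - margin;
        int top = (int) Math.round(box[1] * sy) - margin;
        int right = (int) Math.round(box[2] * sx) + margin;
        int bottom = (int) Math.round(box[3] * sy) + margin;
        return new int[] {
            Math.max(0, left), Math.max(0, top),
            Math.min(photoW, right), Math.min(photoH, bottom)
        };
    }
}
```

On Android, the resulting rectangle could then be passed to `Bitmap.createBitmap(source, x, y, width, height)` to cut out the region before handing it to Tesseract.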
I'm working on an android project that requires Real time Image Recognition feature. I'm a newbie and don't have too much knowledge of image processing. I have to detect only one image by the application, that is nothing more than a logo. Logo is in the shape of circle.
Please suggest appropriate solution.
Thank you.
I recommend using the OpenCV library. It will allow you to teach your application to recognize different things. For example, I've made my application recognize cars based on the size and shape of the object.
There are a lot of OpenCV examples showing how to recognize a logo or similar things.
To detect a specific object, you can use a basic object-localization technique such as normalized cross-correlation, also known as template matching. Prepare a template of your logo and slide it over the input image like a convolution mask. Ideally, the response will be highest at the location of the desired object, so you can then fine-tune the process to localize it.
For how to use template matching in OpenCV, you can refer to its documentation page: http://docs.opencv.org/doc/tutorials/imgproc/histograms/template_matching/template_matching.html
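To make the idea concrete, here is a rough plain-Java sketch of zero-mean normalized cross-correlation on grayscale images stored as `int[row][column]` arrays. It is only an illustration of the technique, not OpenCV's implementation, and the class and method names are made up. In a real app you would more likely call OpenCV's `matchTemplate` with the `TM_CCOEFF_NORMED` method, which computes the same kind of score much faster.

```java
final class TemplateMatch {
    /** Zero-mean normalized cross-correlation of tpl placed at (ox, oy); result is in [-1, 1]. */
    static double ncc(int[][] img, int[][] tpl, int ox, int oy) {
        int th = tpl.length, tw = tpl[0].length;
        double n = th * tw, meanI = 0, meanT = 0;
        for (int y = 0; y < th; y++) {
            for (int x = 0; x < tw; x++) {
                meanI += img[oy + y][ox + x];
                meanT += tpl[y][x];
            }
        }
        meanI /= n;
        meanT /= n;
        double num = 0, varI = 0, varT = 0;
        for (int y = 0; y < th; y++) {
            for (int x = 0; x < tw; x++) {
                double a = img[oy + y][ox + x] - meanI;
                double b = tpl[y][x] - meanT;
                num += a * b;
                varI += a * a;
                varT += b * b;
            }
        }
        double denom = Math.sqrt(varI * varT);
        return denom == 0 ? 0 : num / denom; // flat window or flat template: no information
    }

    /** Brute-force search; returns {x, y} of the top-left corner with the highest score. */
    static int[] bestMatch(int[][] img, int[][] tpl) {
        int bestX = 0, bestY = 0;
        double best = Double.NEGATIVE_INFINITY;
        for (int y = 0; y + tpl.length <= img.length; y++) {
            for (int x = 0; x + tpl[0].length <= img[0].length; x++) {
                double score = ncc(img, tpl, x, y);
                if (score > best) {
                    best = score;
                    bestX = x;
                    bestY = y;
                }
            }
        }
        return new int[] {bestX, bestY};
    }
}
```

Because the score is normalized, a uniform brightness or contrast change in the scene does not change it, but the template still has to appear at roughly the same scale and rotation as in the image.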
OR
As you mentioned in your question, your region of interest is circular, so you can apply some shape measures after an initial segmentation of your image.
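One commonly used shape measure is circularity, 4πA/P², where A is the area of a segmented region and P is its perimeter. It is 1 for a perfect circle and smaller for any other shape. A small sketch follows, assuming you already have area and perimeter from your segmentation step. The 0.8 cut-off used in the test below is an arbitrary illustrative value, not a recommended setting.

```java
final class ShapeMeasure {
    /** Circularity 4*pi*A / P^2: 1.0 for a perfect circle, lower for other shapes. */
    static double circularity(double area, double perimeter) {
        if (perimeter <= 0) {
            return 0; // degenerate region
        }
        return 4 * Math.PI * area / (perimeter * perimeter);
    }

    /** True if the region's circularity reaches the given threshold (threshold is caller-chosen). */
    static boolean isRoughlyCircular(double area, double perimeter, double threshold) {
        return circularity(area, perimeter) >= threshold;
    }
}
```

For comparison, a square scores π/4 (about 0.785), so the measure separates circles from squares, though perimeters measured on a pixel grid are noisy and the threshold needs tuning on real data.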
I have successfully integrated Tesseract into my Android app and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text after capturing, because some text around the region of interest is also captured.
All I want is to read all the text from a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but still did not get a satisfactory result!
Following are the 2 posts that I made:
https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504
Extracting information from captured image in android
I am not sure whether to go ahead with Tesseract or use OpenCV.
Including the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):
Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.
Now, there are also two general settings in which OCR is applied:
Controlled: These are images taken from a scanner or similar in-nature where the target is a document and things like perspective, scale, font, orientation, background consistency, etc are pretty docile.
Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.
Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.
If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.
There's also a text detection project here specifically for Android too:
https://github.com/dreamdragon/text-detection
As many have noted, keep in mind that recognition is still an open research challenge.
The solution to improving the OCR output is to
either use more training data to train it better
or filter its input using some linear filter (grayscaling, high-contrasting, blurring)
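As a rough illustration of the filtering option, here is a plain-Java sketch that converts packed ARGB pixels (the layout `Bitmap.getPixels` returns on Android) to grayscale and then stretches the contrast to the full 0..255 range. The class and method names are made up, the luma weights are the common 0.299/0.587/0.114 ones, and min-max stretching is just one simple way to raise contrast.

```java
final class OcrPreprocess {
    /** Converts packed ARGB pixels to gray levels 0..255 using standard luma weights. */
    static int[] toGray(int[] argb) {
        int[] gray = new int[argb.length];
        for (int i = 0; i < argb.length; i++) {
            int r = (argb[i] >> 16) & 0xFF;
            int g = (argb[i] >> 8) & 0xFF;
            int b = argb[i] & 0xFF;
            gray[i] = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
        }
        return gray;
    }

    /** Linearly rescales gray levels so the darkest becomes 0 and the brightest 255. */
    static int[] stretchContrast(int[] gray) {
        int min = 255, max = 0;
        for (int v : gray) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        if (max == min) {
            return gray.clone(); // flat image: nothing to stretch
        }
        int[] out = new int[gray.length];
        for (int i = 0; i < gray.length; i++) {
            out[i] = (gray[i] - min) * 255 / (max - min);
        }
        return out;
    }
}
```

Whether this actually improves recognition depends on the images, so it is worth comparing OCR output with and without each step.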
In the chat we posted a number of links describing filtering techniques used in OCRing, but sample code wasn't posted.
Some of the links posted were
Improving input for OCR
How to train Tesseract
Text enhancement using asymmetric filters <-- this paper is easily found on google, and should be read fully as it quite clearly illustrates and demonstrates necessary steps before OCR-processing the image.
OCR Classification
I am looking for a blur algorithm for an Android app. The algorithm I found here, Fast Bitmap Blur For Android SDK, doesn't work in an AsyncTask.
I receive data from a sensor over a long time (one to two hours). Depending on the data, an image must be blurred more or less. All the pure Java code I found is not fast enough, which is why I want to access native C code over JNI. Could anybody give me a hint?
Thanks anko
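Before going native, it may be worth checking whether a running-sum box blur is fast enough, because its cost per pixel does not grow with the blur radius. The following is only a sketch in plain Java on a grayscale `int[]` image of size w*h. It is not the linked algorithm and not JNI code, and all names are made up. The same loop structure would port to C fairly directly if it still turns out to be too slow.

```java
final class BoxBlur {
    private static int clamp(int v, int lo, int hi) {
        return Math.max(lo, Math.min(hi, v));
    }

    /** One horizontal box-blur pass with radius r; edge pixels are repeated. */
    static int[] blurRows(int[] src, int w, int h, int r) {
        int[] dst = new int[src.length];
        int window = 2 * r + 1;
        for (int y = 0; y < h; y++) {
            int row = y * w;
            int sum = 0;
            // Prime the running sum for the window centred on x = 0.
            for (int k = -r; k <= r; k++) {
                sum += src[row + clamp(k, 0, w - 1)];
            }
            for (int x = 0; x < w; x++) {
                dst[row + x] = sum / window;
                // Slide the window: add the pixel entering on the right,
                // drop the pixel leaving on the left.
                sum += src[row + clamp(x + r + 1, 0, w - 1)]
                     - src[row + clamp(x - r, 0, w - 1)];
            }
        }
        return dst;
    }

    /** Transposes a w*h image into an h*w image. */
    static int[] transpose(int[] src, int w, int h) {
        int[] dst = new int[src.length];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                dst[x * h + y] = src[y * w + x];
            }
        }
        return dst;
    }

    /** Full 2D box blur: rows first, then columns via transpose. */
    static int[] blur(int[] src, int w, int h, int r) {
        int[] pass1 = blurRows(src, w, h, r);
        int[] pass2 = blurRows(transpose(pass1, w, h), h, w, r);
        return transpose(pass2, h, w);
    }
}
```

The radius `r` can be derived from the sensor value, so a stronger blur costs about the same as a weak one.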
I am developing a currency identification system for blind people. I need to check whether the full currency note has been captured, so I used square detection for that. It currently works when the background is pure black or white, but not when the background is more complex. What techniques can I use to solve this problem?
I am using OpenCV as my image processing framework. Can I use convolution? How?
need enhancement for square detection.
Result image of my code:
I am not sure whether rectangle detection is the best solution for what you want to do.
It will only work well if the picture is taken from directly above the note and, as you say, it will not be robust to cluttered backgrounds.
Is there a particular reason for not using a direct pattern recognition system?
I'd start with a picture of my currency and try to perform object recognition with it.
You will find loads of tutorials that can help you on the web, like for bottles or for bowls.
You might have a lot of possibilities, due to the number of currencies but you know it to be a finite number at least.
I've been thinking about working on an application. You take a picture of something at a yard sale and it compares it against an image database.
For example, say you take a picture of a spoon. The app compares the image taken against the images in the database and returns the top 5 possible matches to the user.
Is this possible with current Android?
If so, point me in the right direction for the stuff I'd need.
Thanks,
abolbridge
Look forward to your guys feedback.
That is possible in principle, but it consumes too much CPU to be practical on Android itself. You'd have to build a server-side application for that.
It is going to be hard though. Quite.
Take a look at Google Goggles for an idea. The image processing is done entirely on the server side.
Check out OpenCV, as it contains a lot of useful object recognition functions and can be used on Android. However, this approach will push the limits of the phone's CPU and, even more so, its memory when using higher-resolution images. A server-side implementation may be more appropriate.