Digital Numbers on Tesseract OCR - Android

SOLUTION:
I've had to train my own data to use with the OCR. It seems to work well, but I don't know why the trained data from arturaugusto doesn't work for me =(
https://github.com/adri1992/Tesseract_sevenSegmentsLetsGoDigital.git
With my trained data, to get good OCR results, I applied these preprocessing steps (all done with OpenCV):
First, convert the image to black & white.
Second, apply a Gaussian blur to the image.
Third, apply a threshold filter to the image.
With this, the seven-segment digits are recognized.
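For reference, a minimal sketch of that pipeline in OpenCV-Python (the filenames, kernel size, and Otsu thresholding are assumptions; tune them for your own images):

    import cv2

    # Load the captured display image and convert it to grayscale (black & white)
    img = cv2.imread("display.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Smooth out sensor noise before binarizing
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Binarize; Otsu's method picks the threshold value automatically
    _, binary = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    cv2.imwrite("display_processed.jpg", binary)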
QUESTION:
I'm trying to run OCR through Tesseract on Android, and I'm testing the app with this image (via Text detection on Seven Segment Display via Tesseract OCR):
I'm using the data trained by arturaugusto (https://github.com/arturaugusto/display_ocr), but the OCR returns the wrong result:
884288
The zero is recognized as an eight, and I don't know why.
I'm applying a Gaussian blur and a threshold filter to the image via OpenCV, and the processed image is this:
Is there any other data trained or do you know any way to solve the problem?

Try using erode to fill the gaps between the segments.
I think the problem is that Tesseract can't handle segmented fonts well.
With OpenCV-Python, I use cv2.erode(display, kernel, iterations=erosion_iters) to solve this problem.
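A sketch of that erosion step (the kernel size and iteration count are assumptions to tune; note that erosion shrinks the white regions, so this closes the segment gaps when the digits are dark on a light background):

    import cv2
    import numpy as np

    display = cv2.imread("display_processed.jpg", cv2.IMREAD_GRAYSCALE)

    # A small rectangular structuring element; size and iteration count
    # are assumptions to tune against your own images
    kernel = np.ones((3, 3), np.uint8)
    erosion_iters = 2

    # Erosion shrinks the white background, thickening the dark digit strokes
    # so the separate segments of each digit merge together
    eroded = cv2.erode(display, kernel, iterations=erosion_iters)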

Related

Find the longest straight line in a live camera image (after an edge detection filter) in Android

I want to make an app that shows something above the longest straight line in an image.
I know I should convert the RGB image to grayscale.
I also know I should use an edge detection algorithm (Sobel, Canny, ...):
Sobel Edge Detection in Android
But I don't know how to find the longest straight line in the image. The line may be part of a rectangle or any other shape; I just want to find the position of the longest line in the image with no gradient (or only a small level of gradient).
How can I implement this with no external library (or only lightweight libraries)?
The Hough Transform is the most commonly used algorithm to find lines in an image. Once you run the transform and find lines, it's just a matter of sorting them by length and then crawling along the lines to check for the constraints your application might have.
RANSAC is also a very quick and reliable solution for finding lines once you have the edge image.
Both these algorithms are fairly easy to implement on your own if you don't want to use an external library.
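As an illustration of the Hough approach, a sketch using OpenCV's probabilistic Hough transform to pick the longest detected segment (all parameter values are assumptions; since the question asks for a library-free solution, treat this only as a reference for what the algorithm produces):

    import cv2
    import numpy as np

    img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(img, 50, 150)

    # Probabilistic Hough transform returns line segments as (x1, y1, x2, y2)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                            minLineLength=30, maxLineGap=5)

    if lines is not None:
        # Sort the segments by Euclidean length and keep the longest one
        longest = max(lines[:, 0], key=lambda l: np.hypot(l[2] - l[0], l[3] - l[1]))
        x1, y1, x2, y2 = longest
        print("Longest line:", (x1, y1), "->", (x2, y2))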

How to get the center of a YUV image without processing the rest?

I want to get, in RGB (or anything I can convert to RGB later), the middle color value of a YUV image. That is, the color of the centre (X, Y) pixel.
There's nice code out there to convert the whole pixel array from an Android camera to RGB... but this seems a bit wasteful if I just want the center pixel.
Normally I'd just look at the loop and figure out where it's processing the middle pixel... but I don't understand YUV or the conversion code well enough to figure out where the data I need is.
Any help or pointers?
Cheers
-Thomas
Using this guide here:
stackoverflow.com/questions/5272388/extract-black-and-white-image-from-android-cameras-nv21-format
It explains the process fairly well.
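A sketch of the idea: for an NV21 frame you can index the center pixel directly instead of converting the whole frame. The layout assumed below is the standard NV21 one (a full-resolution Y plane followed by interleaved, half-resolution V/U bytes), and the conversion coefficients are the usual BT.601 ones:

    def center_pixel_rgb(nv21, width, height):
        """Return the RGB value of the center pixel of an NV21 frame."""
        cx, cy = width // 2, height // 2

        # The full-resolution luma (Y) plane comes first
        y = nv21[cy * width + cx]

        # Chroma is subsampled 2x2 and stored interleaved as V,U after the Y plane
        uv_offset = width * height + (cy // 2) * width + (cx // 2) * 2
        v = nv21[uv_offset]
        u = nv21[uv_offset + 1]

        # Standard YUV -> RGB conversion, clamped to the 0..255 range
        r = y + 1.402 * (v - 128)
        g = y - 0.344 * (u - 128) - 0.714 * (v - 128)
        b = y + 1.772 * (u - 128)
        clamp = lambda c: max(0, min(255, int(c)))
        return clamp(r), clamp(g), clamp(b)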
However, it seems I was having a different problem than I expected, but it's different enough to repost the question.

Android: How to detect these objects in images? (Image included). Tried OpenCV and the metaio SDK, but neither works well enough

I have recently been working on object detection/recognition in images captured from an Android device camera.
The objects I am trying to detect are all kinds of buttons that look like this:
Picture of buttons
So far I have been trying with OpenCV and also with the metaio SDK. Results:
OpenCV was always detecting something, but gave lots of false hits. It is also too much work to collect all the pictures for what I have in mind. I have tried three approaches with OpenCV:
Feature detection (SURF, ORB and so on) -> way too slow, and there are not enough features on my objects.
Template matching -> seems to work only when the template is exactly a part of the scene image.
Training classifiers -> this has worked the best so far, but it is too much work for my goal and still gives too many false detections.
The metaio SDK worked OK when I took my reference images (the icon part of each button) out of a picture like the one shown above, then printed the full image and pointed my Android device camera at the printed picture. But when I tried with the real buttons (not a picture of them), almost nothing was detected anymore. The metaio documentation says that the reference images need to have lots of features and color differences and should not consist only of white text. Well, as you can see, my reference images are exactly the opposite of what they should be. But that's just how the buttons look ;)
So, my question is: does anyone have a suggestion about what else I could try to detect and recognize each of those buttons when I point my Android camera at them?
As a suggestion, you can try the following approach:
Class-Specific Hough Forest for Object Detection
They provide a C code implementation. Compile and run it to see the results, then replace the positive and negative training images with your own according to the following rules:
In each training image, you will need to define the following 3 areas:
target region (the image you provided is a good representation of a target region)
nearby working area (this area carries information about your target's relative location); I would recommend an area 3-5 times the size of the target region, around the target, as a good working area
everything outside the above can be used as negative images
Then:
Use "many" positive images (100-1000) at different viewing angles (-30 to +30 degrees) and various distances.
You will have to make assumptions about the viewing angles and distances at which your users will use the application. The stricter these are, the better the performance you will get. A simple "hint" camera overlay can give people a good idea of what you expect the working area to be.
Use a negative image set a few times (3-5x) larger, containing pictures of things that might appear in the camera but should not contribute any target position information.
Do not use big images; somewhere around 100-300 px in width should be enough.
Assemble the database and modify the configuration file that comes with the code. Run the program and see if the performance is OK for your needs.
The program will return a voting map (a cloud of votes) for the object you are looking for. Apply a Gaussian blur to it, and then apply a threshold (you will have to make another assumption for this threshold value); see the sketch below.
The extracted mask will define the area you are looking for. The size of the masked region gives a good estimate of the object's scale. Given this information, it will be much easier to select a proper template and perform template matching.
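A sketch of that post-processing in OpenCV-Python, assuming the detector's voting map is available as a float array (the file name, blur kernel, and 0.6 threshold fraction are all assumptions to tune):

    import cv2
    import numpy as np

    # voting_map: float32 array produced by the Hough forest detector
    # (loading from a file here is just a placeholder)
    voting_map = np.load("voting_map.npy")

    # Smooth the vote cloud so that nearby votes reinforce each other
    smoothed = cv2.GaussianBlur(voting_map, (15, 15), 0)

    # Keep only the strong responses; the 0.6 fraction is an assumed threshold
    mask = (smoothed > 0.6 * smoothed.max()).astype(np.uint8) * 255

    # The bounding box of the mask gives a scale estimate for template matching
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
        print("Detected region:", (x, y, w, h))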
(Also, some thoughts.) You can try a small trick: use the goodFeaturesToTrack function with the mask you got to obtain a set of locations, and compare them with the corresponding locations on a template. Construct an SSD (sum of squared differences) and solve it for the rotation, scale, and translation parameters by minimizing the alignment error (though I am not sure this approach will work).

Optimize an image for text recognition using tesseract

I have used Tesseract OCR in my Android project to recognize text from an image taken with the camera, but the results are not accurate. I want to optimize the image using OpenCV. I want to achieve the following for the captured image, which is decoded in Bitmap.Config.ARGB_8888 format:
Detect the objects in the resized image.
Once an object is identified, compute its border w.r.t. the original image. (This is for removing the camera angle effect.)
Extract the object from the original image by applying a perspective transform (see the sketch after this list).
Apply white balance to remove lighting effects.
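For step 3, a minimal sketch of the perspective correction in OpenCV-Python (the corner coordinates below are placeholder assumptions; in practice they would come from the border computed in step 2):

    import cv2
    import numpy as np

    img = cv2.imread("captured.jpg")

    # The four detected border corners of the object in the original image,
    # ordered top-left, top-right, bottom-right, bottom-left (placeholder values)
    corners = np.float32([[120, 80], [520, 95], [540, 400], [100, 390]])

    # Target rectangle size, e.g. derived from the object's known aspect ratio
    w, h = 400, 300
    target = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    # Warp the object to a fronto-parallel view, removing the camera angle effect
    matrix = cv2.getPerspectiveTransform(corners, target)
    flattened = cv2.warpPerspective(img, matrix, (w, h))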
In the example provided with the tess-two API, they use Leptonica for image manipulations like drawing the bounding boxes around the words. But in my case I want to use OpenCV. Your guidance will be highly appreciated.
That's a lot you are asking for, and depending on the object it may be impossible. You should check out the tutorials on 2D feature detection and object detection (http://docs.opencv.org/doc/tutorials/features2d/table_of_content_features2d/table_of_content_features2d.html and http://docs.opencv.org/doc/tutorials/objdetect/table_of_content_objdetect/table_of_content_objdetect.html) to see if there is something you can use.
White balance does not do anything about lighting; you should use adaptive thresholding or some kind of high-pass filtering instead.
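A sketch of the adaptive thresholding suggestion (the block size and constant are assumptions to tune; each pixel is thresholded against its local neighborhood, which compensates for uneven lighting across the page):

    import cv2

    gray = cv2.cvtColor(cv2.imread("captured.jpg"), cv2.COLOR_BGR2GRAY)

    # Threshold each pixel against a Gaussian-weighted mean of its
    # neighborhood, which compensates for uneven lighting
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, blockSize=31, C=10)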

Image detection inside an image

I usually play a game called Burako.
It has colored playing pieces with numbers from 1-13.
After a match finishes you have to count your points.
For example:
1 == 15 points
2 == 20 points
I want to create an app that takes a picture and counts the pieces for me.
So I need something that recognizes an image inside an image.
I was about to read up on OpenCV, since there is an Android port, but it feels like there should be something simpler for this.
What do you think?
I have not used the Android port, but I think this is doable under good lighting conditions.
I would obtain the minimal bounding box of each piece and rotate it accordingly, so you can compare it with a model image.
Another approach could be to get the contours of the numbers written on the pieces (which I guess are colored) and do some contour matching against the numbers.
OpenCV is a big and complex framework, but it's also suitable for simple tasks like this.
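A sketch of the contour-matching idea, assuming you have already segmented a piece's digit contour and have one template contour per value (cv2.matchShapes compares contours via Hu moments and returns a dissimilarity score, lower meaning a better match):

    import cv2

    def best_digit_match(piece_contour, templates):
        """templates: dict mapping digit value -> template contour (assumed given)."""
        scores = {
            digit: cv2.matchShapes(piece_contour, tpl, cv2.CONTOURS_MATCH_I1, 0.0)
            for digit, tpl in templates.items()
        }
        # The lowest score is the most similar shape
        return min(scores, key=scores.get)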
