I am wondering how to do what the title says:
I want to count objects by reading an image from the camera of a portable device (such as an iPhone or Android phone).
I need only two specific functions:
Recognize and count the number of objects.
Recognize the color of each object (so I can count how many of each color I have).
A very simple example:
I have a stack of LEGO pieces, all of the same dimensions. I know they will always be aligned horizontally, but sometimes they are not vertically aligned. I need to count how many of each colour I have.
I know that all the pieces have the same dimensions; only the colour changes.
I think there are only 10 colours available.
I can process the image (blur it and so on), but I don't know how to read how many pieces I have.
Can you give me some ideas on how to do this (and what kind of libraries to use for both iOS and Android, Android first), or point me to some publications (free PDFs, books, or even published books that aren't free) that teach how to read data from images?
The program should act like this:
I start the program; when it recognizes (using the integrated camera) that it is looking at the specific objects, it takes a picture and processes it, telling me how many of each colour I have.
Thanks in advance; any kind of help will be appreciated.
I'll admit it has been 10 years since I last dabbled in computer vision, but back then I used the OpenCV libraries, and these still seem to be going strong, with support on Android:
http://opencv.willowgarage.com/wiki/Android
and iOS:
http://www.eosgarden.com/en/opensource/opencv-ios/overview/
Related
I'm reverse engineering an Android game made with the cocos2d-x library, editing the .so file (libcocos2dcpp.so) with IDA Pro and a hex editor. I've had some success changing small things, like swapping button functions (changing the jump addresses).
I think this game's action is too slow. Is there some way to increase the speed of the entire game, like changing some library variable to make it run 2x faster? I think it could be done by decreasing a specific delay, increasing the clock cycles, or even changing some FPS parameter.
I've read the cocos2d documentation and found some related functions like getDeltaTime, SetAnimationInterval, CCDirector, CCSpeed, CCAnimation, setTimeScale, etc., but there are too many to find the right one by trial and error.
Can anyone help me? I will be very thankful.
I am going to build a vote-counting program that reads optical MC sheets on the Android platform.
However, I find that there are not many SDKs for OMR on Android.
Is it possible for me to read the answers on an optical MC answer sheet using OCR?
OMR is very different from OCR. OMR on a form is normally template-based, so you may need a template designer. The technology may not be new or complex, but it is hard to make it accurate and robust. There are lots of OMR engines on the market; the most efficient way is to call an online OMR API service, such as http://ssomr.com/eng/video.asp?#api
This is a very tough project. OMR is based on templates, where each checkmark area has to be consistent and clearly defined. OMR is a comparison of the black-to-white threshold against a blank checkmark. For example, if there are 15% or more black pixels compared to the blank 'template' checkmark, it can be considered marked.
With mobile pictures, each picture is inconsistent in size, and lighting will affect how your binarization works, so thresholding will be very hard to standardize.
In general, on-device OCR (not even talking about OMR) is weak or too expensive, even for machine-printed text. I would consider server-based processing for OMR. It is not on-device, but it can be fast enough to seem like it is running right on your device.
I'm writing an Android app to extract a Sudoku puzzle from a picture. For each cell in the 9x9 Sudoku grid, I need to determine whether it contains one of the digits 1 through 9 or is blank. I start off with a Sudoku like this:
I pre-process the Sudoku using OpenCV to extract black-and-white images of the individual digits and then put them through Tesseract. There are a couple of limitations to Tesseract, though:
Tesseract is large, contains lots of functionality I don't need (i.e. full-text recognition), and requires English-language training data in order to function, which I think has to go onto the device's SD card. At least I can tell it to only look for digits using tesseract.setVariable("tessedit_char_whitelist", "123456789");
Tesseract often misinterprets a single digit as a string of digits, often containing newlines. It also sometimes just plain gets it wrong. Here are a few examples from the above Sudoku:
I have three questions:
Is there any way I can overcome the limitations of Tesseract?
If not, what is a useful, accurate method to detect individual digits (not k-nearest neighbours) that would be feasible to implement on Android - this could be a free library or a DIY solution.
How can I improve the pre-processing to target that method? One possibility I've considered is using a thinning algorithm, as suggested by this post, but I'm not going to bother implementing it unless it will make a difference.
I took a class with one of the computer vision superstars who was/is at the top of the digit recognition algorithm rankings. He was really adamant that the best way to do digit recognition is...
1. Get some hand-labeled training data.
2. Run Histogram of Oriented Gradients (HOG) on the training data, producing one long, concatenated feature vector per image.
3. Feed each image's HOG features and its label into an SVM.
4. For test data (digits on a Sudoku puzzle), run HOG on the digits, then ask the SVM to classify the HOG features from the Sudoku puzzle.
OpenCV has a HOGDescriptor object, which computes HOG features. Look at this paper for advice on how to tune your HOG feature parameters. Any SVM library should do the job...the CvSVM stuff that comes with OpenCV should be fine.
For training data, I recommend using the MNIST handwritten digit database, which has thousands of pictures of digits with ground-truth data.
A slightly harder problem is to draw a bounding box around digits that appear in nature. Fortunately, it looks like you've already found a strategy for doing bounding boxes. :)
The easiest approach is to use Normalized Central Moments for digit recognition.
This works well if you have one font (or very similar fonts).
See this solution: https://github.com/grzesiu/Sudoku-GUI
The core contains the parts responsible for digit recognition, extraction, and moment training.
The first time the application is run, the operator must indicate which number is being seen. The moments of the image (the extracted square ROI) are then assigned to that number (the operator's input). The application works by comparing moments.
The first YouTube video here shows how the application works: http://synergia.pwr.wroc.pl/2012/06/22/irb-komunikacja-pc/
Last week I chose my major project. It is a vision-based system to monitor cyclists passing certain points on the course in time-trial events. It should detect the bright yellow race number on a cyclist's back, extract the number from it, and also record the time.
I have done some research, and I decided to use the Tesseract Android Tools by Robert Theis, called Tess Two. To speed up the recognition, I want to use the fact that the number is meant to be extracted from a bright (yellow) rectangle on the cyclist's back, and to focus the actual OCR only on that region. I have not found any code or ideas on how to detect geometric figures with a specific color. Thank you for any help, and sorry if I made any mistakes; I am pretty new to this website.
Where are the images coming from? I ask because I was asked to provide some technical help for the design of a similar application (we were working with footballers' shirts), and I can tell you that you'll have a few problems:
Use a high-quality video feed rather than relying on a couple of digital camera images.
The number will almost certainly be 'curved' or distorted because of the movement of the rider, and being able to use a series of images will sometimes allow you to work out what number it really is, based on a series of 'false reads'.
Train for the font you're using, but also apply as much logic as you can (if the numbers are always two digits and never start with '9', use this information to help you get the right number).
If you have the luxury of being able to position the camera (we didn't!), I would have thought your ideal spot would be above the rider and looking slightly forward so you can capture their back with the minimum of distortions.
We found that merging several still-frames from the video into one image gave us the best overall image of the number - however, the technology that was used for this was developed by a third-party and they do not want to release it, I'm afraid :(
Good luck!
I have an application where I want to track 2 objects at a time that are rather small in the picture.
This application should be running on Android and iPhone, so the algorithm should be efficient.
For my customer it is perfectly fine if we deliver some patterns along with the software that are attached to the objects to be tracked to have a well-recognizable target.
This means that I can make up a pattern on my own.
As I am not that much into image processing yet, I don't know which objects are easiest to recognize in a picture even if they are rather small.
Color is also possible, although processing several planes separately is not desired because of the overhead it generates.
Thank you for any advice!!
Best,
guitarflow
If I understand this correctly, your pattern should:
Be printable on an A4 sheet
Be recognizable at up to 4 meters
Rotational invariance is not so important (I'm assuming the user will hold the phone more or less upright)
I recommend printing a large checkerboard and using a combination of color matching and corner detection. Try different combinations to see what's faster and more robust at different distances.
Color: if you only want to work on one channel, you can print in red/green/blue*, and then work only on that respective channel. This will already filter a lot and increase contrast "for free".
Otherwise, a histogram backprojection is in my experience quite fast. See here.
Also, let's say you have only 4 squares with RGB+black (see image): it would be easy to get all red contours, then check whether each has the correct neighbouring colors: a patch of blue to its right and a patch of green below it, both of roughly the same area. This alone might be robust enough, and it is equivalent to working on one channel, since at each step you're only accessing one specific channel (search for contours in red, check to the right in blue, check below in green).
If you're getting a lot of false-positives, you can then use corners to filter your hits. In the example image, you have 9 corners already, in fact even more if you separate channels, and if it isn't enough you can make a true checkerboard with several squares in order to have more corners. It will probably be sufficient to check how many corners are detected in the ROI in order to reject false-positives, otherwise you can also check that the spacing between detected corners in x and y direction is uniform (i.e. form a grid).
Corners: Detecting corners has been greatly explored and there are several methods here. I don't know how efficient each one is, but they are fast enough, and after you've reduced the ROIs based on color, this should not be an issue.
Perhaps the simplest is to erode/dilate with a cross-shaped kernel to find corners. See here.
You'll want to first threshold the image to create a binary map, probably based on color as mentioned above.
Other corner detectors such as Harris detector are well documented.
Oh, and I don't recommend using Haar classifiers. They seem unnecessarily complicated and not so fast (though very robust for complex objects, i.e. when you can't use your own pattern), not to mention the huge amount of work required for training.
Haar training is your friend mate.
This tutorial should get you started: http://note.sonots.com/SciSoftware/haartraining.html
Basically you train something called a classifier based on sample images (2000 or so of the object you want to track). OpenCV already has the tools required to build these classifiers and functions in the library to detect objects.