I am new to OpenCV, image processing, and native C/C++ development, and I would like some guidance on where to focus in order to complete my task. I am developing an Android application that can recognize bent pins and circle/square them, the way the OpenCV face-detection sample will "square" a human face once it is detected. The bent pins can be defective in various different forms. I am using Eclipse ADT. I have downloaded the face-detection sample for OpenCV on Android and am analyzing it; from what I have discovered so far, it relies on an XML file that was produced by training and is used by the system for detection. Now my questions are:
1) How can I train and generate the XML file?
2) What software should I use to train and generate the XML file?
3) What type of images do I need to collect to train the system, and what are the image requirements (e.g., images of the bent pins from multiple angles)?
4) What is the best algorithm to achieve this?
5) According to my research, face detection uses Haar-like features. What is the difference between Haar-like features, cascade classifiers, and artificial neural networks? I am confused about the distinction; are they the same thing?
Thank you
1) and 2): The PC version of OpenCV comes with a tool named opencv_traincascade, which is used to generate the Haar/HOG/LBP XML cascades offline. (No, you don't run that kind of training task on your smartphone.)
3) and 4): You need multiple (hundreds of) images of your object, plus even more negative (non-pin/background) images.
5) Haar cascades train on simple edge features (light/dark rectangle patterns).
So, here's the bummer: I seriously doubt that your "bent pins" come with enough edge features for this to work.
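For reference, here is a minimal sketch of how the trained XML is then used for detection with the OpenCV Java bindings (assuming the OpenCV 3.x+ API; the class name, sizes, and thresholds are illustrative and need tuning):

```java
import org.opencv.core.Mat;
import org.opencv.core.MatOfRect;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.core.Size;
import org.opencv.imgproc.Imgproc;
import org.opencv.objdetect.CascadeClassifier;

public class PinDetector {
    private final CascadeClassifier cascade;

    public PinDetector(String cascadePath) {
        // cascadePath points at the XML produced offline by opencv_traincascade
        // (on Android, typically copied from the app's resources to local storage).
        cascade = new CascadeClassifier(cascadePath);
    }

    /** Draws a rectangle around every detection in the frame. */
    public void detectAndMark(Mat gray, Mat rgba) {
        MatOfRect hits = new MatOfRect();
        // A 1.1 scale step and 3 min-neighbours are common starting values.
        cascade.detectMultiScale(gray, hits, 1.1, 3, 0,
                new Size(24, 24),  // minimum object size -- illustrative
                new Size());       // no maximum size
        for (Rect r : hits.toArray()) {
            Imgproc.rectangle(rgba, r.tl(), r.br(), new Scalar(0, 255, 0), 3);
        }
    }
}
```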
I would like to make an Android Application that captures an image and searches it for coins and paper notes and then determines the value of the money in the image.
Additionally, the output of the system will be such that it can be understood by a blind person.
What functions and techniques in OpenCV would suit these tasks?
What limitations and development hurdles can I expect?
Assuming you already know how to program Android apps, you need to do the following:
Download the OpenCV SDK and set it up with the IDE.
Recognising shapes will be a huge part of your project; see the contour-detection example that ships with the SDK samples. Your primary goal will be detecting circles (a minimal sketch follows this list). Later you will need to adapt your algorithm to the particular currency. This will be of particular interest to you.
Learn the different image-processing techniques, such as thresholding, for more accurate results. Understanding what a Mat object is and how it can be manipulated is important.
Finally, improve the robustness of your algorithm; lighting conditions can make the difference between a good review and a dissatisfied user.
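As a starting point for the circle-detection step, here is a minimal sketch using the OpenCV Java bindings (assuming OpenCV 3.x+; the file names and Hough parameters are illustrative assumptions and must be tuned for real coin photos):

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Point;
import org.opencv.core.Scalar;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class CoinDetector {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat src = Imgcodecs.imread("coins.jpg");  // hypothetical input image
        Mat gray = new Mat();
        Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);
        Imgproc.GaussianBlur(gray, gray, new Size(9, 9), 2);

        // Each detected circle is stored as (x, y, radius) in a column of 'circles'.
        Mat circles = new Mat();
        Imgproc.HoughCircles(gray, circles, Imgproc.HOUGH_GRADIENT,
                1.0,               // accumulator resolution (same as input)
                gray.rows() / 8.0, // minimum distance between circle centres
                100, 30,           // Canny high threshold, accumulator threshold
                10, 200);          // min/max radius in pixels -- tune these

        for (int i = 0; i < circles.cols(); i++) {
            double[] c = circles.get(0, i);
            Imgproc.circle(src, new Point(c[0], c[1]), (int) c[2],
                    new Scalar(0, 255, 0), 2);
        }
        Imgcodecs.imwrite("coins_marked.jpg", src);
    }
}
```

Distinguishing denominations would then come down to comparing the relative radii and the colour statistics inside each detected circle.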
I have successfully integrated Tesseract into my Android app, and it reads whatever image I capture, but with very low accuracy. Most of the time I do not get the correct text, because some text around the region of interest also gets captured.
All I want is to read all the text from a rectangular area, accurately, without capturing the edges of the rectangle. I have done some research and posted on Stack Overflow about this twice, but still have not got a satisfactory result!
Following are the 2 posts that I made:
https://stackoverflow.com/questions/16663504/extract-text-from-a-captured-image?noredirect=1#comment23973954_16663504
Extracting information from captured image in android
I am not sure whether to go ahead with Tesseract or use OpenCV.
Including the many links and answers from others, I think it's good to take a step back and note that there are actually two fundamental steps to optical character recognition (OCR):
Text Detection: This is the title and focus of your question, and it is concerned with localizing regions in an image that contain text.
Text Recognition: This is where the actual recognition happens, where the localized image regions from detection get segmented character-by-character and classified. This is also where tools like Tesseract come into play.
Now, there are also two general settings in which OCR is applied:
Controlled: These are images taken from a scanner or something similar in nature, where the target is a document and things like perspective, scale, font, orientation, and background consistency are pretty docile.
Uncontrolled/Scene: These are the more natural and in-the-wild photos, e.g. those taken from a camera, where you are trying to recognize a street sign, shop name, etc.
Tesseract as-is is most applicable to the "controlled" setting. And in general, but for scene OCR especially, "re-training" Tesseract will not directly improve detection, but may improve recognition.
If you are looking to improve scene text detection, see this work; and if you are looking at improving scene text recognition, see this work. Since you asked about detection, the detection reference uses maximally stable extremal regions (MSER), which has a plethora of implementation resources, e.g. see here.
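To make the detection side concrete, here is a minimal MSER sketch with the OpenCV Java bindings (assuming the OpenCV 3.x+ API, where MSER.create() and detectRegions() are available; the file names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.MatOfRect;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.features2d.MSER;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class TextRegionDetector {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat src = Imgcodecs.imread("scene.jpg");  // hypothetical scene photo
        Mat gray = new Mat();
        Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);

        // Character glyphs tend to be maximally stable extremal regions,
        // so MSER bounding boxes are reasonable text-candidate regions.
        MSER mser = MSER.create();
        List<MatOfPoint> regions = new ArrayList<>();
        MatOfRect boxes = new MatOfRect();
        mser.detectRegions(gray, regions, boxes);

        // Draw the candidates; a real pipeline filters them further
        // (aspect ratio, stroke width, grouping into lines) before OCR.
        for (Rect box : boxes.toArray()) {
            Imgproc.rectangle(src, box.tl(), box.br(), new Scalar(0, 255, 0), 2);
        }
        Imgcodecs.imwrite("text_candidates.jpg", src);
    }
}
```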
There's also a text-detection project specifically for Android:
https://github.com/dreamdragon/text-detection
As many have noted, keep in mind that recognition is still an open research challenge.
The solution to improving the OCR output is to:
either use more training data to train it better, or
filter its input using some linear filters (grayscaling, contrast enhancement, blurring); a pre-processing sketch follows the link list below.
In the chat we posted a number of links describing filtering techniques used in OCR, but no sample code was posted.
Some of the links posted were
Improving input for OCR
How to train Tesseract
Text enhancement using asymmetric filters <-- this paper is easily found on Google and should be read in full, as it clearly illustrates and demonstrates the steps needed before OCR-processing an image.
OCR Classification
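To illustrate the filtering idea, here is a minimal pre-processing sketch in Java with OpenCV (assuming OpenCV 3.x+; the threshold block size and offset are illustrative and should be tuned per capture source):

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class OcrPreprocess {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        Mat src = Imgcodecs.imread("capture.jpg");  // hypothetical camera capture

        // 1. Grayscale: OCR engines work on intensity, not colour.
        Mat gray = new Mat();
        Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);

        // 2. Mild blur: suppresses sensor noise that binarisation would amplify.
        Imgproc.GaussianBlur(gray, gray, new Size(3, 3), 0);

        // 3. Adaptive threshold: yields high-contrast black-on-white text even
        //    under uneven lighting, which is what Tesseract expects.
        Mat bw = new Mat();
        Imgproc.adaptiveThreshold(gray, bw, 255,
                Imgproc.ADAPTIVE_THRESH_GAUSSIAN_C,
                Imgproc.THRESH_BINARY, 15, 10);  // block size / offset: tune these

        Imgcodecs.imwrite("ocr_input.png", bw);  // feed this image to Tesseract
    }
}
```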
We're currently working on an Android OCR app using OpenCV. The pre-processing, segmentation, and feature-extraction steps are done; classification is the remaining step, and we're stuck. We're using a DB table that holds the features for each letter. At first we had only one feature per letter and used Euclidean distance, but the results weren't accurate, so more features were needed, and we extracted them. The problem now is that we have 7 features per letter and absolutely no idea how to classify input based on them. Some have recommended using kNN, but we can't figure out how, and the OpenCV documentation on that part isn't clear. If anybody can help, it would be great.
Thanks in advance
Briefly, and without going into the details: vector spaces come in handy here. You need to build a feature vector
<feature1, feature2, feature3, ..., featureN> for each of the instances in your training set.
From each of these images you extract the features that you think, or have read in research articles, are important for image classification: for example centroids, Gaussian-blur responses, histograms, etc.
Once you have these values, linear algebra comes into play with some classification algorithm (kNN, SVM, naive Bayes, etc.) that you run on your training set; that is, you build your model.
Once the model is ready, you run it on your test set.
Use cross validation for more comprehensive results.
For more details check the course notes:
http://www.inf.ed.ac.uk/teaching/courses/iaml/slides/knn-2x2.pdf
or
http://www.inf.ed.ac.uk/teaching/courses/inf2b/lectureSchedule.html
I would like to add that OpenCV may not have the sort of classifiers you might prefer.
There are several libraries out there, though you may have to see which works best on a mobile platform. Could you give some details on the features you are using?
The simplest kNN (k-nearest-neighbours) measure would be to find the Euclidean distance in n dimensions (for an n-dimensional feature vector) between the input sample's features and each of the vectors in your DB table. Also explore the Mahalanobis distance (used to measure the distance between a point and a dataset/class) if you have multiple classes and the input image is to be classified as one such 'type' or 'class' of image.
As @matcheek mentioned, more sophistication is possible using machine-learning techniques such as SVMs, neural nets, etc. However, first consider something simpler like kNN, given that a mobile platform may limit the acceptable computational complexity; a plain-Java sketch follows.
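Here is a minimal plain-Java kNN sketch over fixed-length feature vectors, such as the 7-features-per-letter table described above (all names and values are illustrative; no OpenCV dependency needed):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

public class KnnClassifier {

    static class Sample {
        final double[] features; // e.g. the 7 features extracted per letter
        final char label;        // the letter this training row represents
        Sample(double[] f, char l) { features = f; label = l; }
    }

    static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the majority label among the k training samples nearest to 'input'. */
    static char classify(Sample[] training, double[] input, int k) {
        Sample[] sorted = training.clone();
        Arrays.sort(sorted,
                Comparator.comparingDouble(s -> euclidean(s.features, input)));
        Map<Character, Integer> votes = new HashMap<>();
        char best = sorted[0].label;
        int bestCount = 0;
        for (int i = 0; i < k && i < sorted.length; i++) {
            int count = votes.merge(sorted[i].label, 1, Integer::sum);
            if (count > bestCount) {
                bestCount = count;
                best = sorted[i].label;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Illustrative 7-dimensional feature rows, standing in for the DB table.
        Sample[] train = {
            new Sample(new double[]{0.9, 0.1, 0.3, 0.5, 0.2, 0.7, 0.4}, 'A'),
            new Sample(new double[]{0.2, 0.8, 0.6, 0.1, 0.9, 0.3, 0.5}, 'B'),
            new Sample(new double[]{0.85, 0.15, 0.35, 0.45, 0.25, 0.65, 0.4}, 'A'),
        };
        double[] unknown = {0.88, 0.12, 0.32, 0.48, 0.22, 0.68, 0.41};
        System.out.println(classify(train, unknown, 3)); // prints A
    }
}
```

Scaling each feature to a common range before computing distances usually matters more than the exact choice of k.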
I want to develop an app that will recognize an object (like a monument or something) in front of the camera using OpenCV and then show information about it.
So the question is: how do I recognize an object's shape, or compare it against reference images, with OpenCV?
And what is the best method for doing this?
It would be good if there were some samples or tutorials for object detection and comparison.
Thank you.
The best method for what you ask is to use local feature detectors and descriptors, such as OpenCV's SIFT, SURF, and ORB.
You need at least one picture of the object you want to detect. Afterwards, those algorithms can compare that image with other images to see if they are similar enough.
Here is the Documentation for the algorithms.
ORB and others:
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html
SURF and SIFT ('nonfree'):
http://docs.opencv.org/modules/nonfree/doc/feature_detection.html
The way these algorithms work for this task is by selecting interesting points in each image and comparing them to see if they match. If several matches are found, it is most likely that the images contain the same object; a minimal matching sketch follows.
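Here is a minimal ORB matching sketch using the OpenCV Java bindings (assuming the OpenCV 3.x+ API; the file names and the distance/match-count thresholds are illustrative assumptions):

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfDMatch;
import org.opencv.core.MatOfKeyPoint;
import org.opencv.features2d.DescriptorMatcher;
import org.opencv.features2d.ORB;
import org.opencv.imgcodecs.Imgcodecs;

public class OrbMatcher {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        // Hypothetical inputs: one reference photo of the object, one query frame.
        Mat reference = Imgcodecs.imread("monument_ref.jpg", Imgcodecs.IMREAD_GRAYSCALE);
        Mat query = Imgcodecs.imread("camera_frame.jpg", Imgcodecs.IMREAD_GRAYSCALE);

        // Detect keypoints and compute binary descriptors for both images.
        ORB orb = ORB.create();
        MatOfKeyPoint kp1 = new MatOfKeyPoint(), kp2 = new MatOfKeyPoint();
        Mat desc1 = new Mat(), desc2 = new Mat();
        orb.detectAndCompute(reference, new Mat(), kp1, desc1);
        orb.detectAndCompute(query, new Mat(), kp2, desc2);

        // Hamming distance is the right metric for ORB's binary descriptors.
        DescriptorMatcher matcher =
                DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
        MatOfDMatch matches = new MatOfDMatch();
        matcher.match(desc1, desc2, matches);

        // Count "good" matches below an (illustrative) distance threshold; a real
        // application would also verify geometry, e.g. with Calib3d.findHomography.
        long good = java.util.Arrays.stream(matches.toArray())
                .filter(m -> m.distance < 40).count();
        System.out.println(good > 20 ? "Object likely present" : "No match");
    }
}
```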
Tutorials (from Feature Detection and below):
http://docs.opencv.org/doc/tutorials/features2d/table_of_content_features2d/table_of_content_features2d.html
You can also find C++ samples related to this topic here (the samples are also included in the OpenCV download package):
e.g. "matching_to_many_images.cpp"
"video_homography.cpp"
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/cpp
And Android Java samples here (unrelated but also helpful):
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/android
Or Python samples, which are actually the most up-to-date ones for this topic (at the time this post was written):
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/python2
As a final note, as @BDFun said in the comment, this is not trivial to do.
More - if you want an overview of OpenCV Feature detection and description, check this post.
First, sorry about my poor English.
I'm planning to build an augmented-reality app for the Android platform, and the main feature is the ability for the user to take a shot of a shop and have the application recognize which shop is being photographed. I don't know whether the best option would be to use one of the many existing image-recognition APIs; I think I need something more specific. Maybe having my own bank of images would help.
My plan was to have a database of stores with their locations, use one of the many image-recognition tools, and search my database for a match at the same location. But I found that the image search engines (Kooba, IQ Engines, etc.) are not free, and not cheap either. So I would like a tool that could work with a limited catalogue, like the shop images in a single shopping mall, and accept photos sent from smartphones (Android or iPhone).
Can someone help me get started?
I've been doing something similar for my dissertation at University. I developed an application which detected signposts, read the content on them, then personalised / prioritised it depending on the user's preferences (with mixed success).
As part of this I had to look into Image Recognition.
Two things you may want to look at are:
The Qualcomm QCAR SDK. This was a little too image-specific for what I was after, but if you were to do it on a small range of shops it may work. It would require a collection of shop images to match against; I don't know how successful it would be.
What I implemented used JavaCV (a Java wrapper for OpenCV), which also has an Android port. It seems to allow for image recognition a bit more generally than the previous option, which is why I used it. It would require you to run your own training to create a classifier, though (unless there is another way of doing image recognition within it). But there are a number of guides that can help with that.
I used it for recognising signposts with reasonable success from just some basic training, though it did tend to produce a number of false positives.
Within my application I then used location to match up with previous detections etc.
Hopefully these will get you started.