The requirement is to create an Android application running on one specific mobile device that records video of a human eye pupil dilating in response to a bright light (which is physically attached to the mobile device). The video is then post-processed frame by frame on the device to detect & measure the diameter of the pupil AND the iris in each frame. Note the image processing does NOT need doing in real-time. The end result will be a dataset describing the changes in pupil (& iris) size over time. It's expected that the iris size can be used to enhance confidence in the pupil diameter data (eg removing pupil size data that's wildly wrong), but also as a relative measure for how dilated the eye is at any point.
I am familiar with developing Android mobile apps, but my experience with image processing is very limited. I've researched solutions and it seems that the answer may lie with the OpenCV/JavaCv libraries, which should provide shape detection (eg http://opencvlover.blogspot.co.uk/2012/07/hough-circle-in-javacv.html) but can anyone provide guidance on these specific questions:
Am I right to think it can detect the two circle shapes within a bitmap, one inside the other? ie shapes inside each other is not a problem.
Is it true that JavaCv can detect a circle, and return a position & radius/diameter? ie it doesn't return a set of vertices that then require further processing to compare with a circle? It seems to have a HoughCircle method, so I think yes.
What processing of each frame is typically used before doing shape detection? For example an algorithm to enhance edges, smooth, or remove colour?
Can I use it to not just detect presence of, but measure the diameter of the detected circles? (in pixels, but then can easily be converted to real-world measurements because known hardware is being used). I think yes, but would be great to hear confirmation from those more familiar.
This project is a non-commercial charitable project, so any help especially appreciated.
I would really suggest using ndk as it is a bit richer in features. Also it allows you to run and test your algorithms on a laptop with images before pushing it to a device, speeding up development.
Pre-processing steps:
Typically one would use thresholding or canny edge detection and morphological operations like erode dilate.
For detection of iris / pupil, houghcircles is not a very good method, feature detection methods like MSER work better for not-so-well-defined circles. Here is another answer I wrote on the same topic which has code that could help.
If you are looking to measure the regions, I would suggest going through this blog. It has a clear explanation on the steps involved for a reasonably accurate measurement.
Related
I'm building an Android app that has to identify, in realtime, a mark/pattern which will be on the four corners of a visiting card. I'm using a preview stream of the rear camera of the phone as input.
I want to overlay a small circle on the screen where the mark is present. This is similar to how reference dots will be shown on screen by a QR reader at the corner points of the QR code preview.
I'm aware about how to get the frames from camera using native Android SDK, but I have no clue about the processing which needs to be done and optimization for real time detection. I tried messing around with OpenCV and there seems to be a bit of lag in its preview frames.
So I'm trying to write a native algorithm usint raw pixel values from the frame. Is this advisable? The mark/pattern will always be the same in my case. Please guide me with the algorithm to use to find the pattern.
The below image shows my pattern along with some details (ratios) about the same (same as the one used in QR, but I'm having it at 4 corners instead of 3)
I think one approach is to find black and white pixels in the ratio mentioned below to detect the mark and find coordinates of its center, but I have no idea how to code it in Android. I looking forward for an optimized approach for real-time recognition and display.
Any help is much appreciated! Thanks
Detecting patterns on four corners of a visiting card:
Assuming background is white, you can simply try this method.
Needs to be done and optimization for real time detection:
Yes, you need OpenCV
Here is an example of real-time marker detection on Google Glass using OpenCV
In this example, image showing in tablet has delay (blutooth), Google Glass preview is much faster than that of tablet. But, still have lag.
It seems I've found myself in the deep weeds of the Google Vision API for barcode scanning. Perhaps my mind is a bit fried after looking at all sorts of alternative libraries (ZBar, ZXing, and even some for-cost third party implementations), but I'm having some difficulty finding any information on where I can implement some sort of scan region limiting.
The use case is a pretty simple one: if I'm a user pointing my phone at a box with multiple barcodes of the same type (think shipping labels here), I want to explicitly point some little viewfinder or alignment straight-edge on the screen at exactly the thing I'm trying to capture, without having to worry about anything outside that area of interest giving me some scan results I don't want.
The above case is handled in most other Android libraries I've seen, taking in either a Rect with relative or absolute coordinates, and this is also a part of iOS' AVCapture metadata results system (it uses a relative CGRect, but really the same concept).
I've dug pretty deep into the sample app for the barcode-reader
here, but the implementation is a tad opaque to get anything but the high level implementation details down.
It seems an ugly patch to, on successful detection of a barcode anywhere within the camera's preview frame, to simple no-op on barcodes outside of an area of interest, since the device is still working hard to compute those frames.
Am I missing something very simple and obvious on this one? Any ideas on a way to implement this cleanly, otherwise?
Many thanks for your time in reading through this!
The API currently does not have an option to limit the detection area. But you could crop the preview image before it gets passed into the barcode detector. See here for an outline of how to wrap a detector with your own class:
Mobile Vision API - concatenate new detector object to continue frame processing
You'd implement the "detect" method to take the frame received from the camera, create a cropped version of the frame, and pass that through to the underlying detector.
I need to implement a simple Android application that allows users to draw a "simple" shape (circle, triangle etc) on their phone and then ask a server if the drawn shape matches one of the shapes in its database, which consists of a low number of shapes (let's say < 100, but can be more). In order to make this application work, I was thinking to use the following steps (we assume that the input image consists only of black & white pixels);
A. re-size & crop the input image in order to bring it to the same scale as the ones in the DB
B. rotate the input image by a small angle (let's say 15 degrees) x times (24 in this case) and try to match each of these rotations against each shape in the DB.
Questions:
For A, what would be the best approach? I was thinking to implement this step in the Android application, before sending the data to the server.
For B, what would be a decent algorithm of comparing 2 black & white pixel images that contain only a shape?
Is there any better / simpler way of implementing this? A solution that also has an implementation is desirable.
PS: I can see that many people have discussed similar topics around here, but I can't seem to find something that matches my requirements well enough.
Machine learning approach
You choose some features which describe contours, choose some classification method, prepare a training set of tagged contours, train the classifier, use it in the program.
Contour features. Given a contour(detected in the image or constructed from the user input), you can calculate rotation-invariant moments. The oldest and the most well known is a set of Hu moments.
You can also consider such features of the contour as eccentricity, area, convexity defects, FFT transform of the centroid distance function and many others.
Classifiers. Now you need to train a classifier. Support Vector Machines, Neural Networks, decision trees, Bayes classifiers are some of the popular methods. There are many methods to choose from. If you choose SVM, LIBSVM is a free SVM library, which works also in Java, and it works on Android too.
Ad-hoc rule approach
You can also approximate contour with a polygonal curve (see Ramer-Douglas-Peucker algorithm, there is a free implementation in OpenCV library, now available on Android). For certain simple forms like triangles or rectangles you can easily invent some ad-hoc heuristic rule which will "recognize" them (for example, if a closed contour can be approximated with just three segments and small error, then it is likely to be a triangle; if the centroid distance function is almost constant and there are zero convexity defects, then it is likely to be a circle).
Since this is very much related to hand writing recognition, you can use a simple hmm algorithm to compare shapes with pre-learnt db.
But for a much simpler approach you can detect the corners in the image and then count the corners to detect shapes.
The first approach can be used for any complicated shapes and the second only suits basic shapes.
You can use a supervised learning approach. For the problem you are trying to solve I think simple classifiers like Naive Bayes, KNN, etc. should give you good results.
You need to extract features from each of the images. For each image you can save the them in a vector. Lets call it the feature vector. For the images you have in your database you already know the type of shape so you can include the id of the type in the feature vector. This will serve as the training set.
Once you have your training set, you can train your classifier and every time you want to classify a new shape you just get its feature vector and use it to query the classifier.
I recommend you to use scale and size invariant features, so you will not have to re-size each image and you just need to compare it once instead of rotating it.
You can do a quick search for Scale/Rotate invariant features and try them.
i just need some guide on how to detect a marker and make an output text.. for ex: a marker with an image of a dog , when detected, i have an output text "DOG" in a textfield .. can someone help me with my idea? oh, btw which one is more effective to use nyartoolkit or andar for my idea?thanks:) need help..!
What you're looking for isn't augmented reality, it's object recognition. AR is chiefly concerned with presenting data overlaid on the the real world, so computation is devoted each frame to determining the position relative to the camera of the object. If you don't intent to use this data, AR libraries may be an inefficient. That said...
AR marker tracking libraries usually find markers by prominent features like corners, and can distinguish markers by binary patters encoded inside the marker, or in the marker's borders. If you're happy with having the "dog" part encoded in the border of a marker, there are libraries you can use like Qualcomm's AR development kit. This library, and Metaio's Unifeye mobile can also do natural feature tracking on pre-defined images. If you're happy with being able to recognize one specific image or images of dogs that you have defined in advance, either of these should be ok. You might have to manipulate your dog images to get good features they can identify and track. Natural objects can be problematic.
General object recognition (being able to recognize a picture of any dog, not known beforehand) is still a research topic. There are approaches, but they're mostly very computationally intensive, and most mobile solutions involve offloading the serious computation to a server. Recognition of simple outline sketches however is more tractable, there's a great paper called "Shape recognition and pose estimation for mobile augmented reality" (I can't find a copy online, but the IEEE link is here) that uses contours to identify objects - this is light enough to run on a mobile (and it's pure genius).
I'm developing an augmented reality application for Android that uses the phone's camera to recognise the arrangement of the coloured squares on each face of a Rubik's Cube.
One thing that I am unsure about is how exactly I would go about detecting and recognising the coloured squares on each face of the cube. If you look at a Rubik's Cube then you can see that each square is one of six possible colours with a thin black border. This lead me to think that it should be relativly simply to detect a square, possibly using an existing marker detection API.
My question is really, has anybody here had any experience with image recognition and Android? Ideally I'd like to be able to implement and existing API, but it would be an interesting project to do from scratch if somebody could point me in the right direction to get started.
Many thanks in advance.
Do you want to point the camera at a cube, and have it understand the configuration?
Recognizing objects in photographs is an open AI problem. So you'll need to constrain the problem quite a bit to get any traction on it. I suggest starting with something like:
The cube will be photographed from a distance of exactly 12 inches, with a 100W light source directly behind the camera. The cube will be set diagonally so it presents exactly 3 faces, with a corner in the center. The camera will be positioned so that it focuses directly on the cube corner in the center.
A picture will taken. Then the cube will be turned 180 degrees vertically and horizontally, so that the other three faces are visible. A second picture will be taken. Since you know exactly where each face is expected to be, grab a few pixels from each region, and assume that is the color of that square. Remember that the cube will usually be scrambled, not uniform as shown in the picture here. So you always have to look at 9*6 = 54 little squares to get the color of each one.
The information in those two pictures defines the cube configuration. Generate an image of the cube in the same configuration, and allow the user to confirm or correct it.
It might be simpler to take 6 pictures - one of each face, and travel around the faces in well-defined order. Remember that the center square of each face does not move, and defines the correct color for that face.
Once you have the configuration, you can use OpenGL operations to rotate the cube slices. This will be a program with hundreds of lines of code to define and rotate the cube, plus whatever you do for image recognition.
In addition to what Peter said, it is probably best to overlay guide lines on the picture of the cube as the user takes the pictures. The user then lines up the cube within the guide lines, whether its a single side (a square guide line) or three sides (three squares in perspective). You also might want to have the user specify the number of colored boxes in each row. In your code, sample the color in what should be the center of each colored box and compare it to the other colored boxes (within some tolerance level) to identify the colors. In addition to providing the recognized results to the user, it would be nice to allow the user to make changes to the recognized colors. It does not seem like fancy image recognition is needed.
Nice idea, I'm planing to use computer vision and marker detectors too, but for another project. I am still looking if there is any available information on the web, ex: linking openCV or ARtoolkit to the Android SDK. If you have any additional information, about how to link a computer vision API, please let me know.
See you soon and goodluck!
NYARToolkit uses marker detection and is made in JAVA (as well as managed C# for windows devices). I don't know how well it works on the android platform, but I have seen it used on windows mobile devices, and its very well done.
Good luck, and happy programming!
I'd suggest looking at the Andoid OpenCV library. You probably want to examine the blob detection algorithms. You may also want to consider Hough lines or Countours to detect quads.