I must create an Android app that recognizes some objects from the camera (car steering wheel, car wheel). I tried with a Haar classifier but without success, and I'm running out of time (it's a school project). So I decided to look for another way, and I found another method for my goal: ORB. I found what I should do in this answer. My problem is that things are messed up in my head. Can you give me a step-by-step explanation of how to implement the answer from the question I linked:
From extracting the feature points to training the KD tree and using it for every frame from the camera.
Bonus questions:
Can you give a definition of a feature point? It's something I couldn't quite understand.
Will the detection be slow using ORB? I know OpenCV can be used in native Android code; wouldn't that make things faster?
I need to create this app as soon as possible. Please help!
I am currently developing a similar application. I would recommend getting something working with a single reference image first for a couple of reasons:
It's easier to do and understand if you're just starting out, and you can change it later.
For Android applications you have limited processing capabilities, so more images = lower fps.
You should have a look at the OpenCV tutorials which are quite helpful. Once you go through the “OpenCV for Android SDK” section and understand the three tutorials you can pretty easily add in functionality that will allow you to analyse the video feed.
The basic logic path I'd recommend following when making the app is (a rough code sketch of steps 2-4 follows the list):
Read in the reference image.
Create and use your FeatureDetector, DescriptorExtractor and DescriptorMatcher.
Use the first two of the above to detect keypoints and then describe them (don't forget to convert the image to a Mat and then to greyscale).
Every time you get a frame from your camera, repeat step 3 on it and then compare the keypoints in the two images (using the DescriptorMatcher from step 2).
Use the result to determine if there is a match (if there is then draw a box around it or something).
Get a new frame.
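To make that concrete, here is a minimal sketch of steps 2-4 using OpenCV's 2.4-era Java API for Android; the class name and the RGBA-input assumption are mine, not from the original answer:

    import org.opencv.core.Mat;
    import org.opencv.core.MatOfDMatch;
    import org.opencv.core.MatOfKeyPoint;
    import org.opencv.features2d.DescriptorExtractor;
    import org.opencv.features2d.DescriptorMatcher;
    import org.opencv.features2d.FeatureDetector;
    import org.opencv.imgproc.Imgproc;

    public class OrbMatcher {
        // Step 2: created once, reused for every frame.
        private final FeatureDetector detector = FeatureDetector.create(FeatureDetector.ORB);
        private final DescriptorExtractor extractor = DescriptorExtractor.create(DescriptorExtractor.ORB);
        // BRUTEFORCE_HAMMING because ORB produces binary descriptors.
        private final DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);

        private final Mat refDescriptors = new Mat();

        // Step 3: detect and describe keypoints in the (greyscaled) reference image.
        public void setReference(Mat refImage) {
            Mat gray = new Mat();
            Imgproc.cvtColor(refImage, gray, Imgproc.COLOR_RGBA2GRAY);
            MatOfKeyPoint keypoints = new MatOfKeyPoint();
            detector.detect(gray, keypoints);
            extractor.compute(gray, keypoints, refDescriptors);
        }

        // Step 4: repeat detection on each camera frame, then match against the reference.
        public MatOfDMatch matchFrame(Mat frameGray) {
            MatOfKeyPoint keypoints = new MatOfKeyPoint();
            Mat descriptors = new Mat();
            detector.detect(frameGray, keypoints);
            extractor.compute(frameGray, keypoints, descriptors);
            MatOfDMatch matches = new MatOfDMatch();
            matcher.match(descriptors, refDescriptors, matches);
            return matches;
        }
    }

If matchFrame returns enough low-distance matches, treat that as a detection (step 5) and draw your box.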
Try making it work for a single object first and then add in others later. Another thing you could add is a screen at the start to let users pick what they want to search for.
Also, ORB is reasonably fast, especially compared to SIFT and SURF. I get about 3 fps on an HTC One with a single reference image.
Related
I am thinking of a project for my university; the teachers liked it, but I am not sure if it's even possible.
I am trying to make an Android app.
What I want to do is take a picture of a hand-drawn logic circuit (having the AND, OR, NOT ... gates), recognize the gates, build the circuit on the mobile, and run it on all possible inputs.
Example of a logic circuit (assume it's hand drawn).
For this I will have to make a simulator on the mobile; I don't think that's the hard part. The problem is how to recognize the gates from a picture.
I found out that there's an edge-detection plugin in Java, but I still don't think that's enough to recognize the gates. Please share any algorithm, technique, or tool that I can use to make this work.
This is actually for my FYP; I can't find any good ideas and have to present this on Thursday.
You will need to do some kind of object recognition. The easiest way (conceptually) to identify gates is to simply do a correlation between the image and a bank of gates, or an "alphabet". You run each gate template over the entire image and look for the highest correlation; a high correlation means the region matches the template closely, and you have likely found your gate of interest. Here are a few interesting SO posts:
Simple text reader (OCR) in Matlab
MATLAB Optical character recognition - need help
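Those examples are MATLAB, but the same correlation idea is essentially one call in OpenCV if you end up doing this from Java; a minimal sketch, assuming image and gateTemplate are greyscale Mats you've already loaded:

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.Point;
    import org.opencv.imgproc.Imgproc;

    // Returns the top-left corner of the best match, or null if nothing correlates well.
    static Point findGate(Mat image, Mat gateTemplate) {
        Mat result = new Mat();
        // Slide the template over the whole image, scoring normalized correlation.
        Imgproc.matchTemplate(image, gateTemplate, result, Imgproc.TM_CCOEFF_NORMED);
        Core.MinMaxLocResult mmr = Core.minMaxLoc(result);
        // 0.8 is a guessed threshold - tune it on real scans.
        return mmr.maxVal > 0.8 ? mmr.maxLoc : null;
    }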
On its own this could be a daunting task, but you can simplify the problem by adding constraints.
For instance, the user must draw on graph paper and can only have one gate per grid square. This ensures you won't have to check a large variety of sizes for each gate.
If you use graph paper with colored lines (like blue) and the user is only allowed to use a non-blue pen/pencil, you MAY be able to easily remove the grid when processing the image by filtering out the blue channel, and still have a clean image to process with.
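If you try that coloured-grid trick, here's a rough sketch of dropping the grid by keeping only the blue channel, where blue lines are nearly as bright as the paper and effectively vanish while pencil or black pen stays dark (scan is an assumed, already-loaded Mat):

    import java.util.ArrayList;
    import java.util.List;
    import org.opencv.core.Core;
    import org.opencv.core.Mat;

    // Split the scan into colour planes and keep the blue one; the blue grid
    // lines disappear into the white paper while the drawing stays dark.
    static Mat dropBlueGrid(Mat scan) {
        List<Mat> channels = new ArrayList<Mat>();
        Core.split(scan, channels);
        return channels.get(0);  // 0 = blue for BGR Mats (imread); use 2 for RGBA frames
    }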
Of course there are more advanced methods than correlation, but as I said before, conceptually this model is very easy to understand. Hope that helps.
Edit:
I just realized both my examples were in MATLAB; the important point here is the logic/process used, not the exact code.
I am trying to create an Android app which can recognize billiard balls on a pool table in an image coming from the camera. What would be the best approach to do this?
We can assume that the camera and the pool table are in fixed positions, but there could be objects other than the balls on the pool table.
I am currently looking into two possible solutions:
Vuforia SDK - Simple API for object tracking/recognition, but I couldn't find any information about ball/sphere shape tracking. They have Cylinder and Image targets, which could possibly be used somehow to track the balls.
OpenCV - Seems much richer, with a steeper learning curve than Vuforia, but there is some information about billiard ball detection online (e.g. this and this).
Are there any additional approaches for solving this problem? What would be the easiest working approach?
Thanks!
Are the balls moving or not?
I've used SURF (and SIFT); they work great for stationary objects. Have a look at the documentation page. There are also two questions you should see: this and this. Then, if you want to calculate the trajectory, I've tried Pymecavideo, which uses OpenCV; a look into its source code could be interesting for your work.
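If you go the SURF/SIFT route, here's a hedged sketch of the matching step in OpenCV's 2.4-era Java API. Note that SURF and SIFT live in the 'nonfree' module, which you may need to build yourself for Android; the 0.75 ratio and all names here are assumptions, not from the original answer:

    import java.util.LinkedList;
    import java.util.List;
    import org.opencv.core.Mat;
    import org.opencv.core.MatOfDMatch;
    import org.opencv.features2d.DMatch;
    import org.opencv.features2d.DescriptorMatcher;

    // templDescriptors/frameDescriptors come from a SURF detector + extractor
    // run on the ball template and the table image.
    static List<DMatch> goodMatches(Mat templDescriptors, Mat frameDescriptors) {
        DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE);
        List<MatOfDMatch> knn = new LinkedList<MatOfDMatch>();
        matcher.knnMatch(templDescriptors, frameDescriptors, knn, 2);

        // Lowe's ratio test: keep a match only when the best neighbour is
        // clearly better than the runner-up.
        List<DMatch> good = new LinkedList<DMatch>();
        for (MatOfDMatch pair : knn) {
            DMatch[] m = pair.toArray();
            if (m.length == 2 && m[0].distance < 0.75f * m[1].distance) {
                good.add(m[0]);
            }
        }
        return good;
    }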
I'm looking for a 'basic' AR SDK that allows me to draw images and 3D shapes around the user (no matter where he is). It would be even better if the SDK includes a simple way to detect interaction with the shapes (something like onClick).
I made a project from scratch on Android but there's still a lot of work to do and I'll need to do the same on iOS after... So that's why I'm looking for an SDK or a similar project (no matter what platform).
I tested Metaio but it's quite expensive and maybe overkill for my purpose because it uses LLA coordinates.
I tested DroidAR on Android but it's only for Android and it looks heavy too (don't need the GPS).
How about Qualcomm's Vuforia? I was able to quickly get a sample project running on it.
EDIT: Looks like I was wrong about what it could do. According to this (which is slightly dated, so who knows), Metaio might be your only choice.
I'm not really sure what you want to do, but if you simply want to show images or 3D models on the camera without any detection, you can achieve this very easily. I'll explain for Android; you can extend the same logic to iOS.
First approach:
You have to use a custom camera in your Android app, then use any game engine as per your need. I suggest jPCT-AE or Rajawali.
They are very simple to integrate and can be used for 2D images and 3D models.
This tutorial explains a lot.
Keep the GL surface transparent and you can have the model floating in space.
Second approach:
To add more effect to your AR app, you can use sensor values to move the model in 3D space as the device moves. It gives a cool effect.
Use the first approach, additionally collect sensor values, and apply the resulting matrix to the GL camera of your game engine. For sensor values, follow here.
Good tutorial here.
I hope this helps. I did these a long time ago, but I'll try to help if you want.
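For the sensor part of the second approach, a rough sketch of turning Android's rotation-vector sensor into a 4x4 matrix (which Android documents as usable with OpenGL); what you do with the matrix depends entirely on your engine, so that part is left as a comment:

    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;

    public class RotationListener implements SensorEventListener {
        private final float[] rotationMatrix = new float[16];  // 4x4 output

        @Override
        public void onSensorChanged(SensorEvent event) {
            if (event.sensor.getType() == Sensor.TYPE_ROTATION_VECTOR) {
                SensorManager.getRotationMatrixFromVector(rotationMatrix, event.values);
                // Hand rotationMatrix to your engine's camera here;
                // the exact call depends on the engine (jPCT-AE, Rajawali, ...).
            }
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) { }
    }

Register it with SensorManager.registerListener(...) against the TYPE_ROTATION_VECTOR sensor, e.g. with SENSOR_DELAY_GAME.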
I want to develop an app which will recognize an object (like a monument or something) in front of the camera using OpenCV, and then show information about it.
So the question is: how do I recognize an object's shape, or compare it to images, with OpenCV?
And what is the best method for doing this?
It would be good if there were some samples or tutorials for object detection and comparison.
Thank you.
The best method for what you ask is to use local feature detectors, like OpenCV's SIFT, SURF, and ORB, for example.
You need at least one picture of the object you want to detect. Afterwards, those algorithms can compare that image with other images to see if they are similar enough.
Here is the Documentation for the algorithms.
ORB and others:
http://docs.opencv.org/modules/features2d/doc/feature_detection_and_description.html
SURF and SIFT ('nonfree'):
http://docs.opencv.org/modules/nonfree/doc/feature_detection.html
The way these algorithms work for that task is by selecting interesting points in each image and comparing them to see if they match. If several matches are found, it is most likely that the images contain the same object.
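As a hedged sketch of that last decision - counting how many matches are "good enough" to declare the object present - both thresholds below are made-up numbers to tune for your images:

    import org.opencv.core.MatOfDMatch;
    import org.opencv.features2d.DMatch;

    // 'matches' comes from DescriptorMatcher.match() on the two images' descriptors.
    static boolean sameObject(MatOfDMatch matches) {
        int good = 0;
        for (DMatch m : matches.toArray()) {
            if (m.distance < 50) {  // descriptor distance; its scale depends on the algorithm
                good++;
            }
        }
        return good > 15;  // arbitrary minimum match count
    }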
Tutorials (from Feature Detection and below):
http://docs.opencv.org/doc/tutorials/features2d/table_of_content_features2d/table_of_content_features2d.html
You can also find C++ samples related to this topic here (samples are also within OpenCV download package):
eg. "matching_to_many_images.cpp"
"video_homography.cpp"
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/cpp
And Android Java samples here (unrelated but also helpful):
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/android
Or Python samples which are actually the more updated ones for this topic (at the time this post was written):
http://code.opencv.org/projects/opencv/repository/revisions/master/show/samples/python2
As a final note, as @BDFun said in the comment, this is not trivial to do.
More: if you want an overview of OpenCV feature detection and description, check this post.
I've been able to export 3D models from Maya into OBJ files, which, in turn, are read by my Android app. These models can now be displayed and I can apply transformations to them as well, even on high-polygon-count objects, which is nice.
The next step is to figure out if there's any reasonable way to display animation defined within Maya. I really have no clue how to approach this and my initial research on this essentially came up empty.
Has anyone attempted this before? If so, how would this work?
I think it's worth noting that this question has little or nothing to do with Maya. Maya's file formats are proprietary and opaque; you will NOT find a way to directly display them on Android (or anywhere else, come to that). But you can export data from Maya to (basically) any format, which is what you actually want to do.
So, here's the process:
Figure out how you're going to display 3D models and animations on Android.
Figure out how to get stuff from Maya into the format your answer from step 1 requires.
There are a lot of ways to do step 1. For sheer ease of use I'd probably go for Unity myself. Basically, it's a game development tool that can create 3D apps and games that run on Android (and iOS, OS X, Windows, etc.). It's not free - the Android add-on costs $400 - but if you're actually planning on doing anything serious with Android, you'll find it worthwhile. With it, it's actually pretty trivial to make a little Android app in Unity that displays an animated model (and a LOT of 3D Android and iOS games are made in Unity). Unity also wants models and animations in FBX format, which is a widely supported interchange format - you'll have no problem getting stuff out of Maya into FBX.
If you've gone with Unity in step 1, then step 2 is trivial: Export your models from Maya as FBXes, and you're done. If you've decided to roll your own Android rendering app, well, good luck. :)
Anyhow, the point is that what you want to do is find a generic solution for rendering animated models on Android, and only then figure out how to get your content out of Maya.
Instead of outputting each individual frame to a separate OBJ file like spicyweenie suggests, why not export just the keyframes to OBJ files? Implement interpolation in your code to fill in the missing frames. If your models are complex, you'll probably want to cache the interpolated models in memory, but at least you don't have to load them all from files too.
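A minimal sketch of that interpolation, assuming each keyframe is loaded as a flat x,y,z vertex array with the same vertex order in every keyframe:

    // Linearly blend vertex positions between two keyframes.
    // t runs from 0 (keyframe a) to 1 (keyframe b).
    static float[] lerpVertices(float[] a, float[] b, float t) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + (b[i] - a[i]) * t;
        }
        return out;
    }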
Unlike Cody Hatch's answer, I'm interested in this too. This is my theory as to how to go about animating a model:
Let's say your model has 30 frames. One way would be to export each frame as an individual OBJ model. From there, you can possibly make a folder for those 30 OBJs. So now you have 31 files total. If the person hits a button, the trick would be to load each OBJ in order for as long as the button (or whatever action) is held, and if it lasts over 1 second (30 frames), loop back to the beginning.
The only problem with this theory is that it would most likely be resource- and power-intensive, not to mention a space hog if you try to load a lot of things into one scene.
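For what it's worth, the frame-cycling part of this theory is only a few lines; a sketch assuming 30 fps, where frames is a hypothetical list holding the 30 loaded OBJ models:

    // Pick which of the 30 pre-loaded OBJ frames to draw, looping once per second.
    static int currentFrame(long animationStartMs) {
        long elapsedMs = System.currentTimeMillis() - animationStartMs;
        return (int) ((elapsedMs * 30 / 1000) % 30);
    }

Each render pass would then draw frames.get(currentFrame(startMs)); the real cost, as noted above, is keeping all 30 models in memory.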