I'm implementing a face tracker on Android, and as a literature study, would like to identify the underlying technique of Android's FaceDetector.
Simply put: I want to understand how the android.media.FaceDetector classifier works.
A brief Google search didn't yield anything informative, so I thought I'd take a look at the code.
Looking at the Java source code, FaceDetector.java, there isn't much to be learned: FaceDetector is simply a class that is given the image dimensions and the maximum number of faces, and returns an array of faces.
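For context, this is essentially all you see at the Java level; a minimal usage sketch (the real work happens behind findFaces(), and the bitmap has to be RGB_565):

```java
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.graphics.PointF;
import android.media.FaceDetector;

public class FaceDetectorDemo {
    public static void detect(String path) {
        // FaceDetector only accepts RGB_565 bitmaps, and the image width must be even.
        BitmapFactory.Options opts = new BitmapFactory.Options();
        opts.inPreferredConfig = Bitmap.Config.RGB_565;
        Bitmap bitmap = BitmapFactory.decodeFile(path, opts);

        int maxFaces = 5;
        FaceDetector detector =
                new FaceDetector(bitmap.getWidth(), bitmap.getHeight(), maxFaces);
        FaceDetector.Face[] faces = new FaceDetector.Face[maxFaces];
        int found = detector.findFaces(bitmap, faces);

        for (int i = 0; i < found; i++) {
            PointF mid = new PointF();
            faces[i].getMidPoint(mid);                   // point between the eyes
            float eyeDistance = faces[i].eyesDistance(); // in pixels
            System.out.println("Face " + i + " at " + mid
                    + ", eye distance " + eyeDistance
                    + ", confidence " + faces[i].confidence());
        }
    }
}
```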
The Android source contains the JNI code for this class. I followed the function calls through and, reduced to the bare essentials, this is what I learned:
The "FaceFinder" is created in FaceFinder.c:75
On line 90, bbs_MemSeg_alloc returns a btk_HFaceFinder object (which contains the function that actually finds faces), essentially copying into it the hsdkA->contextE.memTblE.espArrE array of the original btk_HSDK object, which is initialized within initialize() (FaceDetector_jni.cpp:145) by btk_SDK_create().
It appears that a maze of functions provides each other with pointers and instances of btk_HSDK, but nowhere can I find a concrete instantiation of sdk->contextE.memTblE.espArrE[0], which supposedly contains the magic.
What I have discovered is a little clue: the JNI code references an FFTEm library that I can't find the source code for. By the looks of it, FFT stands for Fast Fourier Transform, which is probably used together with a pre-trained neural network. The only literature I can find that aligns with this theory is a paper by Ben-Yacoub et al.
I don't even really know if I'm set on the right path, so any suggestions at all would undoubtedly help.
Edit: I've added a +100 bounty for anybody who can give any insight.
I found a couple of links too... Not sure if they will help you:
http://code.google.com/p/android-playground-erdao/source/browse/#svn/trunk/SnapFace
http://code.google.com/p/jjil/
http://benosteen.wordpress.com/2010/03/03/face-recognition-much-easier-than-expected/
I'm on a phone, so I can't respond extensively, but the Google keywords "neven vision algorithm" pull up some useful papers...
Also, US patent 6222939 is related.
Possibly also some of the links on http://peterwilliams97.blogspot.com/2008/09/google-picasa-to-have-face-recognition.html might be handy...
Have a look at this:
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1562271
I think I once saw some MATLAB code doing this in a presentation.
Maybe it's somewhere online.
Greetings,
Lars
I'm struggling to understand how I'm meant to interact with the "graph" and "calculator" concepts from the MediaPipe library. More specifically, I'd like to write some Android code that uses landmarks from the holistic solution (pose + hands in my case), with the final goal of writing a Flutter application that compiles for both Android and iOS.
I've managed to build a few of the sample apps (thanks to Docker), and I think I roughly understand what the graphs do. However, I don't understand how to interact with them from within the code. The Hello World! for Android tutorial doesn't really explain this. There are examples that include this type of behaviour (e.g. here), but I don't really know where all the required information is coming from (e.g. how would I find out the right functions and string constants to get holistic landmarks?).
For example, in Python I could get data via something like holistic.process(image).pose_landmarks, and then compute, say, the position or angle of the hips. As far as I can see, there are some similar Android APIs available, although not for every solution, holistic included. So what if I don't want to wait for those APIs to be developed and want to use the graphs directly instead? That part is not so clear.
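To show what I mean by "using the graphs instead", here's the rough sketch I've pieced together from the Android sample apps. The binary graph file name and the "pose_landmarks" stream name are my guesses from the holistic graph config, so please correct me if this is the wrong direction:

```java
import com.google.mediapipe.components.FrameProcessor;
import com.google.mediapipe.formats.proto.LandmarkProto.NormalizedLandmarkList;
import com.google.mediapipe.framework.AndroidAssetUtil;
import com.google.mediapipe.framework.PacketGetter;

public class HolisticGraphBridge {
    // Names below are assumptions taken from the holistic graph's .pbtxt; verify
    // against the graph you actually bundle with the app.
    private static final String BINARY_GRAPH_NAME = "holistic_tracking_gpu.binarypb";
    private static final String INPUT_VIDEO_STREAM = "input_video";
    private static final String OUTPUT_VIDEO_STREAM = "output_video";
    private static final String POSE_LANDMARKS_STREAM = "pose_landmarks";

    public void setup(android.content.Context context, long nativeEglContext) {
        // The samples also load the mediapipe JNI libraries before this point.
        AndroidAssetUtil.initializeNativeAssetManager(context);
        FrameProcessor processor = new FrameProcessor(
                context, nativeEglContext, BINARY_GRAPH_NAME,
                INPUT_VIDEO_STREAM, OUTPUT_VIDEO_STREAM);

        // Whenever the graph emits a packet on "pose_landmarks", parse the proto
        // and read out the normalized (x, y, z) coordinates.
        processor.addPacketCallback(POSE_LANDMARKS_STREAM, packet -> {
            try {
                byte[] raw = PacketGetter.getProtoBytes(packet);
                NormalizedLandmarkList landmarks = NormalizedLandmarkList.parseFrom(raw);
                // e.g. landmarks 23/24 are the hips in the pose model
                float leftHipX = landmarks.getLandmark(23).getX();
            } catch (Exception e) {
                // proto parse failure: skip this frame
            }
        });
    }
}
```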
As a bonus, please do feel free to drop any links which further explain/document the "graph" and "calculator" stuff I've mentioned earlier, as I've not used anything like this before and find it a little difficult to google.
Thanks.
I've realised there are docs available: https://google.github.io/mediapipe/framework_concepts/framework_concepts.html
Not sure how I've missed this...
I'm developing (or at least trying to develop) an Android game and cannot find an actual implementation of collisions between sprites for the GLES2-AnchorCenter branch. My searches turn up nothing.
All the pixel-perfect collision implementations I've found (for example m5's and MakersF's on GitHub) have a lot of errors that I can't resolve (maybe I'm just missing something). Is AnchorCenter even supported? I can't post links to all of them because I need more rep.
Here is my issue against one of those implementations, as an example:
https://github.com/MakersF/CollisionTest/issues/1
Thanks for any help, and sorry for my English.
There is no official pixel-perfect collision extension for AndEngine. There is, however, a lot of information on how to make MakersF's extension work. It's a bit of reading, and it explains that the collision core has been separated into another project. For more information, see the forum post from here onwards.
It will take some effort, but it will help you understand how the collisions work :)
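To give you an idea of what the collision core boils down to, here is a rough, engine-agnostic sketch of pixel-perfect collision using plain Android bitmaps (this is not AndEngine code): intersect the two bounding boxes, then check whether any overlapping pixel is non-transparent in both sprites.

```java
import android.graphics.Bitmap;
import android.graphics.Color;
import android.graphics.Rect;

public final class PixelPerfectCollision {
    // ax/ay and bx/by are the top-left screen positions of each sprite
    // (axis-aligned, no rotation, for simplicity).
    public static boolean collide(Bitmap a, int ax, int ay, Bitmap b, int bx, int by) {
        Rect boundsA = new Rect(ax, ay, ax + a.getWidth(), ay + a.getHeight());
        Rect boundsB = new Rect(bx, by, bx + b.getWidth(), by + b.getHeight());

        Rect overlap = new Rect();
        if (!overlap.setIntersect(boundsA, boundsB)) {
            return false; // the bounding boxes don't even touch
        }

        // Walk every pixel in the overlapping region; a collision requires both
        // sprites to be non-transparent at the same screen coordinate.
        for (int y = overlap.top; y < overlap.bottom; y++) {
            for (int x = overlap.left; x < overlap.right; x++) {
                int alphaA = Color.alpha(a.getPixel(x - ax, y - ay));
                int alphaB = Color.alpha(b.getPixel(x - bx, y - by));
                if (alphaA > 0 && alphaB > 0) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A real extension precomputes a boolean alpha mask per sprite and transforms coordinates to handle rotation and anchor offsets, which is roughly what the separated collision-core project has to deal with.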
I want to record a dog bark, save the file and compare with several files containing different types of bark (warning bark, crying bark, etc..).
How could I do that comparison in order to get a match? What is the process to follow in this type of app?
Thank you for the tips.
There is no simple answer to your problem. However, for starters, you might look into how audio fingerprinting works. This paper, written by the creators of Shazam, is an excellent start:
http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf
I'm not sure how well that approach would work for dog barking, but there are some concepts there that might prove useful.
Another thing to look into is how the FFT works. Here's a tutorial with code that I wrote for pitch tracking, which is one way to use the FFT. You are looking more at how the tone and pitch interact with the formant structure of a given dog, so the parameters you'll want to derive might include the fundamental pitch (which, alone, might be enough to distinguish whining from other kinds of barks) and the ratio of the fundamental pitch to the higher harmonics, which would help identify how aggressive the bark is (I'm guessing a bit here):
http://blog.bjornroche.com/2012/07/frequency-detection-using-fft-aka-pitch.html
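To make that concrete, here's a small, untuned sketch (a naive DFT rather than a real FFT, so only suitable for short frames; a real app would use a proper FFT library) that computes the magnitude spectrum of a mono frame, takes the strongest bin as a crude fundamental, and compares it to the energy at its first two harmonics:

```java
public final class BarkFeatures {
    /** Magnitude spectrum via a naive DFT (O(n^2)); fine for a sketch, too slow for production. */
    public static double[] magnitudeSpectrum(double[] frame) {
        int n = frame.length;
        double[] mags = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Crude features: fundamental frequency (strongest bin) and harmonic-to-fundamental ratio. */
    public static double[] features(double[] frame, double sampleRate) {
        double[] mags = magnitudeSpectrum(frame);
        int peak = 1; // skip the DC bin
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[peak]) peak = k;
        }
        double fundamentalHz = peak * sampleRate / frame.length;
        double harmonicEnergy = 0;
        if (2 * peak < mags.length) harmonicEnergy += mags[2 * peak];
        if (3 * peak < mags.length) harmonicEnergy += mags[3 * peak];
        double ratio = harmonicEnergy / (mags[peak] + 1e-9);
        return new double[] { fundamentalHz, ratio };
    }
}
```

Note that the strongest bin isn't always the true fundamental (it can be a harmonic), so a real implementation would be smarter about picking it, but this should show the kind of parameters I mean.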
Finally, you might want to do some research into basic speech recognition and speech processing, as there will be some overlap. Wikipedia will probably be enough to get you started.
EDIT: Oh, also, once you've identified some parameters to use for comparison, you'll need a way to compare your multiple parameters against your database of sounds, each with their own parameters. I don't think the techniques in the Shazam article will work for that. One thing you could try is logistic regression. There are other options, but this is probably the simplest.
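As a sketch of what I mean by logistic regression over those parameters (binary case, plain stochastic gradient descent, no library; in practice you'd scale the features and probably reach for a toolkit instead):

```java
public final class SimpleLogisticRegression {
    private final double[] weights;
    private double bias;

    public SimpleLogisticRegression(int numFeatures) {
        weights = new double[numFeatures];
    }

    private static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    /** Probability that a feature vector (e.g. {fundamentalHz, harmonicRatio}) is class 1. */
    public double predict(double[] x) {
        double z = bias;
        for (int i = 0; i < weights.length; i++) z += weights[i] * x[i];
        return sigmoid(z);
    }

    /** Stochastic gradient descent on the log-loss; labels are 0 or 1 (e.g. whine vs. warning bark). */
    public void train(double[][] xs, int[] labels, double learningRate, int epochs) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (int n = 0; n < xs.length; n++) {
                double error = predict(xs[n]) - labels[n];
                for (int i = 0; i < weights.length; i++) {
                    weights[i] -= learningRate * error * xs[n][i];
                }
                bias -= learningRate * error;
            }
        }
    }
}
```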
I'd check out the open-source musicg library, hosted on Google Code: http://code.google.com/p/musicg/
It's Java, so it works on Android, and it gives similarity metrics for two audio files.
It's only compatible with .wav files, though.
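If I remember the API correctly, the comparison looks roughly like this (double-check the class and method names against the project's README):

```java
import com.musicg.fingerprint.FingerprintSimilarity;
import com.musicg.wave.Wave;

public class BarkComparison {
    public static void compare(String recordedPath, String referencePath) {
        // Both files must be .wav, as noted above.
        Wave recorded = new Wave(recordedPath);
        Wave reference = new Wave(referencePath);

        FingerprintSimilarity similarity = recorded.getFingerprintSimilarity(reference);
        System.out.println("similarity = " + similarity.getSimilarity()
                + ", score = " + similarity.getScore());
    }
}
```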
I want to achieve a nice 3D page curl animation in Android. I read some articles and found that a nice effect can be achieved with OpenGL ES, so I started learning OpenGL ES (I did some of the tutorials and am still continuing), but I found it too complex for achieving this functionality. I also found some examples available on Stack Overflow and elsewhere on the net; they work, but I am not able to understand them. Can someone guide me toward achieving this functionality?
Based on the question comments, I have an answer to this question.
YES, you can do that with OpenGL, BUT you need a deep understanding of math and graphics. That is a lot to learn; it will cost you at least a couple of weeks, and it's definitely a hard path to take if you do it only for this single animation (all of this applies if you don't just take code that you probably won't understand and that another human being put their whole effort into).
Nevertheless, there might be a ready-to-use implementation, but unfortunately I can't point you to one because I don't know whether any exist.
Update
You challenged me, so I was eager to find out whether there is something out there (because I'd seen that effect before and couldn't believe there wasn't a project that already does it for you).
And actually I found this question, which seems to address the very same issue. And yes, there's someone who published his results here. And I have to admit: it looks awesome. It's also a pure Java implementation.
But still: having some background knowledge of OpenGL would improve your whole outlook as a developer. I'm not saying it's a must; not everyone will succeed at OpenGL programming, because it's quite hard to learn and involves a lot of math. But I think it's worth it, because you will gain a deep understanding of current and future graphical interfaces.
Does anyone know of good documentation for the Skia drawing library used by Android?
The main Canvas object has hardly any state, so I'm thinking especially of the objects you can embed into the Paint object. I've worked out by trial and error how to use some ColorFilters and made a cool effect with ColorMatrixColorFilter. I now also have the drop shadows I want, using the LinearGradient shader. I think I understand PathEffects and have some ideas about XferModes. MaskFilters and Rasterizers are still utterly opaque to me. But trial and error is not a good way to understand a complicated library.
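For example, this is the kind of thing I've been figuring out by trial and error so far (a desaturating ColorMatrixColorFilter plus a LinearGradient shader faking a drop shadow, just to illustrate what I mean, not a pattern from any documentation):

```java
import android.graphics.Bitmap;
import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.ColorMatrix;
import android.graphics.ColorMatrixColorFilter;
import android.graphics.LinearGradient;
import android.graphics.Paint;
import android.graphics.Shader;

public class PaintEffectsDemo {
    public static void draw(Canvas canvas, Bitmap bitmap) {
        // Desaturate the bitmap with a ColorMatrixColorFilter.
        Paint grayscale = new Paint(Paint.ANTI_ALIAS_FLAG);
        ColorMatrix matrix = new ColorMatrix();
        matrix.setSaturation(0f); // 0 = grayscale, 1 = unchanged
        grayscale.setColorFilter(new ColorMatrixColorFilter(matrix));
        canvas.drawBitmap(bitmap, 0, 0, grayscale);

        // Fake a drop shadow under it with a vertical LinearGradient shader.
        Paint shadow = new Paint(Paint.ANTI_ALIAS_FLAG);
        shadow.setShader(new LinearGradient(
                0, bitmap.getHeight(), 0, bitmap.getHeight() + 20,
                Color.argb(160, 0, 0, 0), Color.TRANSPARENT,
                Shader.TileMode.CLAMP));
        canvas.drawRect(0, bitmap.getHeight(), bitmap.getWidth(), bitmap.getHeight() + 20, shadow);
    }
}
```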
Mostly I'm concerned that the Android docs don't discuss 2D graphics and the means of using them at all. Even the class javadocs often don't explain what the class is doing. The actual functionality is all in the native Skia code, which I can get, but it also lacks documentation. I've seen some cool demos, but Google explained little about how they were done.
Is the only way to understand these things experimentation and reading the native code? What about efficiency and best practices? The Dalvik/Android VM is sensitive to memory allocations and sometimes slow, and I'm concerned that I'm not doing things the best way.
Skia has its own Google Code project site, where you can find a high-level overview.
The inline documentation can be browsed at this link on the project site:
http://skia.googlecode.com/svn/trunk/docs/html/hierarchy.html
You can also join the discussion mailing list; the designers and the community will answer questions.
Another good reference, surprisingly, comes from Apple: the QuickDraw GX documentation explains a lot of 2D vector graphics concepts that apply well enough to Skia.
The Android canvas API has two different implementations: one is Skia, and the other is OpenGL ES. The latter is the so-called HWUI.
Regardless of the implementation, understanding the pipeline underlying the draw process is critical to understanding how to use the canvas API.
Below is the best doc available so far describing that pipeline. You will definitely find it useful:
http://www.xenomachina.com/2011/05/androids-2d-canvas-rendering-pipeline.html