Newer Android devices offer the possibility of unlocking the phone with your face. This will also be possible with the iPhone X.
Is there a way of using these sensors/camera to check if the user is looking at the screen?
Edit:
I found that there's also a Vision Framework from Google: Vision Framework
Yes and no.
The built-in Face ID feature on iPhone X can unlock the device and authorize other built-in features (Apple Pay, iTunes/App Store payment, etc). You can also use it as a method of authorization in your app — the same LocalAuthentication framework calls that you use to support Touch ID on other devices automatically use Face ID instead on iPhone X.
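For illustration, a minimal sketch of that LocalAuthentication call (the reason string and the completion handling here are just placeholders):

```swift
import LocalAuthentication

// Minimal sketch: the same call covers Touch ID and Face ID.
// On iPhone X the system requires the user's attention by default.
func authenticateUser(completion: @escaping (Bool) -> Void) {
    let context = LAContext()
    var error: NSError?

    // Check whether biometric authentication is available at all.
    guard context.canEvaluatePolicy(.deviceOwnerAuthenticationWithBiometrics, error: &error) else {
        completion(false)
        return
    }

    // On iOS 11 you can inspect which biometry is present, if you care.
    if #available(iOS 11.0, *), context.biometryType == .faceID {
        print("Face ID is available on this device")
    }

    context.evaluatePolicy(.deviceOwnerAuthenticationWithBiometrics,
                           localizedReason: "Unlock your documents") { success, _ in
        DispatchQueue.main.async { completion(success) }
    }
}
```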
Face ID, by default, requires the user to be looking at the screen. Thus, if your use case for attention detection has to do with authorization or unlocking, you can use LocalAuthentication to do it. (However, the user can disable attention detection in Accessibility settings, reducing the security but increasing the usability of Face ID. Third-party apps can't control or even read this setting.)
If you're talking about more directly doing attention detection or gaze tracking... Apple doesn't provide any API that exposes the inner workings of Face ID, or at least the gaze tracking part. Here's what they do have:
ARKit offers ARFaceTrackingConfiguration (see also sample code), which provides a detailed 3D model of the face in real time (supposedly using some of the same Neural Engine stuff as Face ID for detail and performance).
But as far as ARKit is concerned, eyes are just two holes in the face — there's no gaze tracking.
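Still, for reference, here's roughly what the face-tracking setup looks like (a sketch assuming a SceneKit-backed ARSCNView; the class and outlet names are just illustrative):

```swift
import UIKit
import ARKit
import SceneKit

class FaceTrackingViewController: UIViewController, ARSCNViewDelegate {
    @IBOutlet var sceneView: ARSCNView!

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // Face tracking only works on devices with a TrueDepth camera.
        guard ARFaceTrackingConfiguration.isSupported else { return }
        sceneView.delegate = self
        sceneView.session.run(ARFaceTrackingConfiguration())
    }

    // Called every time the tracked face is updated.
    func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
        guard let faceAnchor = anchor as? ARFaceAnchor else { return }
        // faceAnchor.geometry is the detailed 3D face mesh;
        // faceAnchor.transform is the face's pose in world space.
        // Note: no gaze information here -- the eyes are just part of the mesh.
        _ = faceAnchor.geometry.vertices.count
    }
}
```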
Apple's Vision framework offers face detection and face landmark recognition (that is, it locates eyes, nose, mouth, etc). Vision does identify the eye outline and the pupil, which you could theoretically use as a basis for gaze tracking.
However, since Vision offers such data only in 2D and doesn't get a 3D pose for the face, you're still left with a hefty computer vision problem if you want to build gaze tracking yourself. Vision processes 2D images, which means that it doesn't require iPhone X (but also means that it doesn't benefit from the TrueDepth camera on iPhone X either).
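To make that concrete, here's a rough sketch of pulling the eye and pupil landmarks out of a single image with Vision (iOS 11). The points come back as normalized 2D coordinates, so turning them into a gaze estimate is entirely up to you (the function name is just illustrative):

```swift
import Vision
import UIKit

// Rough sketch: detect face landmarks in a UIImage and read out the eye regions.
func detectEyes(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectFaceLandmarksRequest { request, error in
        guard let faces = request.results as? [VNFaceObservation] else { return }
        for face in faces {
            guard let landmarks = face.landmarks else { continue }
            // Each region is a set of points normalized to the face bounding box.
            let leftEye = landmarks.leftEye?.normalizedPoints ?? []
            let leftPupil = landmarks.leftPupil?.normalizedPoints ?? []
            print("left eye outline: \(leftEye.count) points, pupil: \(leftPupil.count) points")
            // Estimating *where* the eye is looking from these 2D points
            // is the part you'd have to build yourself.
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```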
AVCapture offers access to the TrueDepth camera, so you can get the same color + depth imagery that Face ID and ARKit use to do their magic. (You just don't get said magic for yourself.)
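A hedged sketch of wiring that up (device discovery plus a depth-data output; session preset/format selection and error handling are omitted for brevity):

```swift
import AVFoundation

// Sketch: capture depth frames from the TrueDepth camera (iPhone X, iOS 11).
class TrueDepthCapture: NSObject, AVCaptureDepthDataOutputDelegate {
    let session = AVCaptureSession()
    let depthOutput = AVCaptureDepthDataOutput()

    func start() throws {
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video,
                                                   position: .front) else {
            return // no TrueDepth camera on this device
        }
        session.beginConfiguration()
        session.addInput(try AVCaptureDeviceInput(device: device))
        session.addOutput(depthOutput)
        depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth"))
        session.commitConfiguration()
        session.startRunning()
    }

    // Per-frame depth maps arrive here; what you do with them is up to you.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        _ = depthData.depthDataMap // a CVPixelBuffer of depth/disparity values
    }
}
```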
None of this is to say that gaze tracking isn't possible on iOS in general or iPhone X specifically — all the building blocks are there, so given enough R&D effort you can implement it yourself. But Apple doesn't provide any developer access to the built-in gaze tracking mechanism.
Yes, in iOS 11 developers can use this feature in their third-party applications too, through the latest iOS Vision framework.
The whole idea behind this feature is to use the front camera with facial recognition.
But you have to optimize when you capture images for processing.
Tips
Capture when the application becomes active or comes into the foreground.
Also capture when the user interacts with any UI control or widget (buttons, tables, touch events, etc.) -- a sketch follows these tips.
Make sure to stop or pause processing when the application is not active.
You can also use the gyroscope and other sensors to determine the device's physical state.
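As a rough illustration of the first two tips above, assuming an iOS app (the type and method names here are just placeholders for your own capture pipeline):

```swift
import UIKit

// Rough sketch: only run face processing while the app is active,
// and pause it as soon as the app leaves the foreground.
final class AttentionCaptureController {
    private var isProcessing = false
    private var observers: [NSObjectProtocol] = []

    init() {
        let center = NotificationCenter.default
        observers.append(center.addObserver(forName: UIApplication.didBecomeActiveNotification,
                                            object: nil, queue: .main) { [weak self] _ in
            self?.isProcessing = true
        })
        observers.append(center.addObserver(forName: UIApplication.willResignActiveNotification,
                                            object: nil, queue: .main) { [weak self] _ in
            // Stop burning CPU and battery while not in the foreground.
            self?.isProcessing = false
        })
    }

    deinit {
        observers.forEach { NotificationCenter.default.removeObserver($0) }
    }

    // Call this from user-interaction points (button taps, table selections, touch events, ...).
    func userDidInteract() {
        guard isProcessing else { return }
        processFrame()
    }

    private func processFrame() {
        // Placeholder for the actual Vision-based face/eye processing.
    }
}
```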
If you are open to bulking up your app with an ML model, Google's MediaPipe is another option. You can even track the user's iris this way:
https://google.github.io/mediapipe/solutions/iris
Obviously this is overkill for simple eye detection, but you should be able to do much more with these models and this framework.
Related
I was wondering if there are any plans to have the Holistic Detection (face + pose + hand tracking) implemented in MLKit, or if there is an easy and efficient way to add the face and the hand detection to the pose detection results.
We are looking at this indeed. However, since it is a very compute-intensive set of tasks, we want to make sure it can run on a wide range of devices, rather than just flagship phones. If you are looking for this functionality today, I would go with the MediaPipe solution you reference as well.
I am trying to create an application like Snapchat that applies face filters while recording the video and saves it with the filter on.
I know there are packages like ARCore and flutter_camera_ml_vision, but these are not helping me.
I want to provide face filters and apply them at the time of video recording on the face, and also save the video with the filter on the face.
Not The Easiest Question To Answer, But...
I'll give it a go, let's see how things turn out.
First of all, you should fill in some more details about the statements given in the question, especially what you're trying to say here:
I know there are packages like ARCore and flutter_camera_ml_vision, but these are not helping me.
How did you approach the problem and what makes you say that it didn't help you?
In the Beginning...
First of all, let's get some needed basics out of the way to better understand your current situation and level in the prerequisite areas of knowledge:
Do you have any experience using Computer Vision & Machine Learning frameworks in other languages / in other apps?
Do you have the required math skills needed to use this technology?
As you're using Flutter, my guess is that cross-platform compatibility is a high priority. Have you done much Flutter programming before, and what devices are your main targets?
So, What is required for creating a Snapchat-like filter for use in live video recording?
Well, quite a lot of work happens behind the scenes when you apply a filter to live video using any app that implements this in a decent way.
Snapchat uses in-house software that they've built up over years, combining their own efforts with technology gained through multiple multi-million-dollar acquisitions of established companies specializing in Computer Vision and AR. It has steadily grown to be quite impressive, particularly over the last 5-6 years.
This isn't something you can throw together by yourself in an all-nighter and expect good results. There are tools available to ease the general learning curve, but those tools also require a firm understanding of the underlying concepts and technologies being used, and quite a lot of math.
The Technical Detour
OK, I know I may have gone a bit overboard here, but these are fundamental building blocks. Not many are aware of the actual amount of computation needed for seemingly "basic" functionality, so please, TL;DR or not, this is fundamental stuff.
To create a good filter for live capture using a camera on something like an iPhone or Android device, you could, and most probably would, use AR as you mentioned you wanted to in the end. But realize that this is a subset of the broad field of Computer Vision (CV), which uses various algorithms from Artificial Intelligence (AI) and Machine Learning (ML) for the main tasks of:
Facial Recognition
Given frames of video content from the live camera, define the area containing a human face (some also work with animals, but let's keep it as simple as possible) and output a rectangle suitable for use as a starting point (x, y, width & height).
The analysis phase alone requires a rather complex combination of algorithms / techniques from different parts of the AI universe, and since this is video rather than a single static image, the result must be continuously updated as the person / camera moves, so it has to run in close to real time, in the millisecond range.
I believe different implementations combining HOG (Histogram of Oriented Gradients) from Computer Vision and SVMs (Support Vector Machines / Networks) from Machine Learning are still pretty common.
Detection of Facial Landmarks
This is what will define how well a certain effect / filter will adapt to different types of facial features and detect accessories like glasses, hats etc. Also called "facial keypoint detection", "facial feature detection" and other variants in different literature on the subject.
Head Pose Estimation
Once you know a few landmark points, you can also estimate the pose of the head.
This is an important part of effects like "face swap" to correctly re-align one face with another in an acceptable manner. A toolkit like OpenFace (Uses Python, OpenCV, OpenBLAS, Dlib ++) contains a lot of useful functionality, capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation, delivering pretty decent results.
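To add one step of detail that is glossed over here (my own summary, not part of the original answer): head pose estimation from 2D landmarks is typically framed as a Perspective-n-Point problem, i.e. finding the head rotation and translation that best reproject a reference 3D face model onto the detected landmarks:

$$(\mathbf{R}^{*},\mathbf{t}^{*}) \;=\; \arg\min_{\mathbf{R},\,\mathbf{t}} \sum_{i=1}^{n} \bigl\lVert\, \mathbf{p}_i - \pi\!\big(\mathbf{K}(\mathbf{R}\mathbf{P}_i + \mathbf{t})\big) \bigr\rVert^{2}$$

where the P_i are 3D landmark positions on a generic head model, the p_i are the corresponding detected 2D landmarks, K is the camera intrinsic matrix, and pi is the perspective division (x/z, y/z). Libraries like OpenCV solve exactly this with solvePnP.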
The Compositing of Effects into the Video Frames
After the work with the above is done, the rest involves applying the target filter (dog ears, rabbit teeth, whatever) to the video frames, using compositing techniques.
As this answer is starting to look more like an article, I'll leave it to you to go figure out if you want to know more of the details in this part of the process.
Hey, Dude. I asked for AR in Flutter, remember?
Yep.
I know, I can get a bit carried away.
Well, my point is that creating something like you're asking for takes a lot more than one would usually imagine.
BUT.
My best advice if Flutter is your tool of choice would be to learn how to use the Cloud-Based ML services from Google's Firebase suite of tools, Firebase Machine Learning and Google's MLKit.
Add to this some AR-specific plugins, like the ARCore Plugin, and I'm sure you'll be able to get the pieces together if you have the right background and attitude, plus a good amount of aptitude for learning.
Hope this wasn't digressing too far from your core question, but there are no shortcuts that I know of that cut more corners than what I've already mentioned.
You could absolutely use the flutter_camera_ml_vision plugin and its face recognition, which will give you positions for the landmarks of a face, such as the nose, eyes, etc. Then simply stack the CameraPreview with a CustomPaint(foregroundPainter:) widget in which you draw your filters, using the different landmarks as coordinates for e.g. glasses, beards or whatever you want, at the correct position of the face in the camera preview.
Google ML Kit also has face recognition that produces landmarks and you could write your own flutter plugin for that.
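If you go down that road, the iOS side of such a plugin would look roughly like the sketch below. This is from memory of the standalone ML Kit SDK (the MLKitFaceDetection / MLKitVision module names, the FaceDetectorOptions fields and the landmark types should be double-checked against the current docs), and the function is just illustrative:

```swift
import UIKit
import MLKitFaceDetection
import MLKitVision

// Sketch: run ML Kit face detection on a frame and read out landmark positions.
func detectFaceLandmarks(in image: UIImage) {
    let options = FaceDetectorOptions()
    options.performanceMode = .fast
    options.landmarkMode = .all

    let detector = FaceDetector.faceDetector(options: options)
    let visionImage = VisionImage(image: image)
    visionImage.orientation = image.imageOrientation

    detector.process(visionImage) { faces, error in
        guard error == nil, let faces = faces else { return }
        for face in faces {
            // Landmark positions are in the image's coordinate space; a Flutter
            // plugin would serialize these and send them over a MethodChannel.
            if let nose = face.landmark(ofType: .noseBase) {
                print("nose at \(nose.position.x), \(nose.position.y)")
            }
            if let leftEye = face.landmark(ofType: .leftEye) {
                print("left eye at \(leftEye.position.x), \(leftEye.position.y)")
            }
        }
    }
}
```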
You can capture frames from the live camera preview, reformat them, and then send them as a byte buffer to ML Kit or ML Vision. I am currently writing a Flutter plugin for ML Kit pose detection with live capture, so if you have any specific questions about that, let me know.
You will then have to merge the two surfaces and save to a file in an appropriate format. This is unknown territory for me, so I cannot provide any details about this part.
I have a Sony Alpha 7R camera and am looking for information about the "built-in" application support. What are those apps, Android-based? Is there public information about how to create and install your own camera app -- NOT talking about the remote API.
The few available apps are kind of primitive and limited; in particular, I'd like to create a more versatile "interval timer" app -- the time-lapse thing is too simple for my purpose.
To be specific: versatile bracketing, absolute start/stop times, complex shooting programs with pre-programmed ISO, shutter, bracketing, etc. for a series of programmed interval shots, or simply shooting as fast as possible... As an example -- I just lost valuable time shooting an eclipse because I had to reconfigure/switch modes, etc.
Ideal would be a scenario I could upload a shooting script to the app on the camera.
The real answer is that you can build applications for the Camera API using many different methods. When you create an application for the Camera API, you are just making API calls to the camera while your code is connected to the camera's Wi-Fi somehow. In the end, the easiest way to distribute your code is via smartphones, as it will work for iOS, Windows, etc. as well as Android, but you are not limited to these technologies. Please let me know if I can provide more information.
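To make "just making API calls over the camera's Wi-Fi" concrete, here's a hedged sketch of the kind of JSON-RPC-style HTTP call the Camera Remote API expects. The host/port are the ones commonly used in Sony's samples; treat the exact endpoint for your model as something to verify (it is advertised by the camera over SSDP):

```swift
import Foundation

// Sketch: trigger a shot over the camera's Wi-Fi via the JSON-RPC-style API.
// Verify the actual endpoint for your camera model before relying on it.
func takePicture() {
    let url = URL(string: "http://192.168.122.1:8080/sony/camera")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    let body: [String: Any] = [
        "method": "actTakePicture",
        "params": [],
        "id": 1,
        "version": "1.0"
    ]
    request.httpBody = try? JSONSerialization.data(withJSONObject: body)

    URLSession.shared.dataTask(with: request) { data, _, _ in
        if let data = data, let reply = String(data: data, encoding: .utf8) {
            print(reply) // the result includes the URL of the captured image
        }
    }.resume()
}
```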
I've been exploring 3D scanning and reconstruction using Google's project Tango.
So far, some apps I've tried like Project Tango Constructor and Voxxlr do a nice job over short time-spans (I would be happy to get recommendations for other potential scanning apps). The problem is, regardless of the app, if I run it long enough the scans accumulate so much drift that eventually everything is misaligned and ruined.
Drift is also very likely whenever I point the device at a featureless space like a blank wall, or when I point the cameras upward to scan ceilings. The device gets disoriented temporarily, thereby destroying the alignment of future scans. Whatever the case, getting the device to know where it is and what it is pointing at is a problem for me.
I know that some of the 3D scanning apps use Area Learning to some extent, since these apps ask me for permission to allow area learning upon startup of the app. I presume that this is to help localize the device and stabilize its pose (please correct me if this is inaccurate).
From the apps I've tried, I have never been given an option to load my own ADF. My understanding is that loading in a carefully learned feature-rich ADF helps to better anchor the device pose. Is there a reason for this dearth of apps that allow users to load in their homemade ADFs? Is it hard/impossible to do? Are current apps already optimally leveraging on area learning to localize, and is it the case that no self-recorded ADF I provide could ever do better?
I would appreciate any pointers/instruction on this topic - the method and efficacy of using ADFs in 3D scanning and reconstruction is not clearly documented. Ultimately, I'm looking for a way to use the Tango to make high quality 3D scans. If ADFs are not needed in the picture, that's fine. If the answer is that I'm endeavoring on an impossible task, I'd like to know as well.
If off-the-shelf solutions are not yet available, I am also willing to try to process the point cloud myself, though I have a feeling it's probably much easier said than done.
Unfortunately, Tango doesn't have any application that can do this at the moment; you will need to develop your own application for it. Just in case you wonder how to do this in code, here are the steps:
First, the learning mode of the application should be on. When we turn the learning mode on, the system starts to record an ADF, which allows the application to recognize an existing area it has been to before. For each point cloud we save, we should save the associated timestamp as well.
After walking around and collecting the points, we need to call the TangoService_saveAreaDescription function from the API. This step runs some optimization on each key pose saved in the system. Once saving is done, we use the timestamp saved with each point cloud to query the optimized pose again; to do that, we use the function TangoService_getPoseAtTime. After this step, you will see the point clouds set to the right transformations, and the points will overlap correctly.
Just as a recap of the steps:
Turn on learning mode in Tango config.
Walk around, saving each point cloud along with its associated timestamp.
Call the TangoService_saveAreaDescription function.
After saving is done, call TangoService_getPoseAtTime to query the optimized pose based on the timestamp saved with each point cloud.
I am new to Android development. Now I want to integrate a fingerprint lock in my application. Which is the best? Please help me find a good fingerprint lock.
USING CAMERA AS FINGER LOCK
As a reference, check this:
Fingerprint Scanner using Camera
As someone who's done significant research on this exact problem, I can tell you it's difficult to get a suitable image for templating (feature extraction) using a stock camera found on any current Android device. The main debilitating issue is achieving significant contrast between the finger's ridges and valleys. Commercial optical fingerprint scanners (which you are attempting to mimic) typically achieve the necessary contrast through frustrated total internal reflection in a prism.
FTIR in Biometrics
In this case, light from the ridges contacting the prism are transmitted to the CMOS sensor while light from the valleys are not. You're simply not going to reliably get the same kind of results from an Android camera, but that doesn't mean you can't get something useable under ideal conditions.
I took the image on the left with a commercial optical fingerprint scanner (Futronics FS80) and the one on the right with a normal camera (15 MP Canon DSLR). After cropping, inverting (to match the other scanner's convention), contrasting, etc. the camera image, we got the following results.
[Image: fingerprint captured with the optical scanner (left) vs. the camera (right)]
The low contrast of the camera image is apparent.
[Image: ridge flow extracted from the camera image]
But the software is able to accurately determine the ridge flow.
[Image: matching minutiae between the two prints, marked with red circles]
And we end up finding a decent number of matching minutia (marked with red circles.)
Here's the bad news. Taking these types of up close shots of the tip of a finger is difficult. I used a DSLR with a flash to achieve these results. Additionally most fingerprint matching algorithms are not scale invariant. So if the finger is farther away from the camera on a subsequent "scan", it may not match the original.
The software package I used for the visualizations is the excellent and BSD-licensed SourceAFIS. No corporate "open source version" / "paid version" shenanigans either, although it's currently only ported to C# and Java (limited).
Non-Camera-Based Solutions:
For the frighteningly small number of devices with hardware that supports "USB Host Mode", you can write a custom driver to integrate a fingerprint scanner with Android. I'll be honest: for the two models I've done this for, it was a huge pain. I accomplished it by using Wireshark to sniff USB packets between the scanner and a Linux box that had a working driver, and then writing an Android driver based on the sniffed commands.
Cross Compiling FingerJetFX
Once you have worked out a solution for image acquisition (both potential solutions have their drawbacks), you can start to worry about getting FingerJetFX running on Android. First you'll use their SDK to write a self-contained C++ program that takes an image and turns it into a template. After that you really have two options.
Compile it to a library and use JNI to interface with it.
Compile it to an executable and let your Android program call it as a subprocess.
For either you'll need the NDK. I've never used JNI, so I'll defer to the wisdom of others on how best to use it. I always tend to choose route #2. For this application I think it's appropriate, since you're only really calling the native code to do one thing: template your image. Once you've got your native program running and cross-compiled, you can use the answer to this question to package it with your Android app and call it from your Android code.
1] There are no APIs or hardware support for fingerprint detection in the Android platform.
2] Existing fingerprint lock systems do not work on fingerprint pattern matching.
3] They work on pressure comparison, area of the finger impression, etc.
Reference: Link