I am using Camera.Face to detect faces and min3D to load 3D models.
I want the model to move with the face, but it is not working well.
@Override
public void updateScene() {
    if (mFaces == null) {
        animeModel.position().x = animeModel.position().y = animeModel.position().z = 0;
        return;
    }
    for (Face face : mFaces) {
        if (face == null) {
            continue;
        }
        animeModel.position().x = face.rect.centerX();
        animeModel.position().y = face.rect.centerY();
    }
}
Are the model's coordinates and the rectangle's coordinates in different systems
(world coordinates versus screen coordinates, or something like that)?
How can I solve this?
UPDATE:
I have tried logging the model's coordinates and the face's coordinates.
The two values are completely different.
How do I convert face.rect.centerX() to animeModel.position().x?
Here is an article all about how a face tracking demo was developed:
http://www.smallscreendesign.com/2011/02/07/about-face-detection-on-android-%E2%80%93-part-1/
That app is also available on the Play store. Part 1 of the above article has some performance metrics on recognition time. It looks like it may take up to two seconds or more to detect a face.
You could use the code in that article to do your prototyping. You may discover that face detection doesn't happen fast or often enough to track a face in real time.
Here is the documentation for face tracking on the Android Developer site:
http://developer.android.com/reference/android/hardware/Camera.Face.html
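Note that, per that documentation, face.rect is reported in the camera driver's coordinate space, which runs from -1000 to 1000 on both axes with (0, 0) at the center of the field of view, while a min3D model's position() is in the renderer's own world units, so the two values will indeed look totally different. A minimal sketch of mapping the face rect into preview-view pixels (assuming a back-facing camera and ignoring display rotation and front-camera mirroring) might look like this:
import android.graphics.Matrix;
import android.graphics.RectF;
import android.hardware.Camera;

public class FaceCoordinateMapper {

    // Maps a Camera.Face rect from the driver's [-1000, 1000] space into
    // preview-view pixel coordinates. viewWidth/viewHeight are the dimensions
    // of the preview surface (hypothetical parameter names).
    public static RectF faceRectToViewRect(Camera.Face face, int viewWidth, int viewHeight) {
        Matrix matrix = new Matrix();
        // Scale the 2000 x 2000 driver space to the view's size.
        matrix.postScale(viewWidth / 2000f, viewHeight / 2000f);
        // Move the origin from the center of the view to its top-left corner.
        matrix.postTranslate(viewWidth / 2f, viewHeight / 2f);

        RectF mapped = new RectF(face.rect);
        matrix.mapRect(mapped);
        return mapped;
    }
}
From view pixels you would still need a second conversion into whatever scale the min3D scene uses, for example normalizing to [-1, 1] and multiplying by the visible extent of the scene at the model's depth.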
UPDATE:
Check out this library: https://code.google.com/p/asmlib-opencv/
I'm stretching my very limited ARCore knowledge.
My question is similar (but different) to this question
I want to work out whether my device's camera node intersects/overlaps with my other nodes, but I've not had any luck so far.
I'm trying something like this (the camera is another node):
scene.setOnUpdateListener(frameTime -> {
    Node x = scene.overlapTest(scene.getCamera());
    if (x != null) {
        Log.i(TAG, "setUpArComponents: CAMERA HIT DETECTED at: " + x.getName());
        logNodeStatus(x);
    }
});
Firstly, does this make sense?
I can detect all node collisions in my scene using:
for (Node node : nodes) {
...
ArrayList<Node> results = scene.overlapTestAll(node);
...
}
Assuming that there isn't a renderable for the Camera node (so no default collision shape), I tried to set my own collision shape, but this was actually catching all the tap events I was trying to perform, so I figured I must be doing this wrong.
I'm thinking about things like fixing a deactivated node in front of the camera.
I may be asking for too much of ARCore, but has anyone found a way to detect a collision between the "user" (i.e. camera node) and another node? Or should I be doing this "collision detection" via indoor positioning instead?
Thanks in advance :)
UPDATE: It's really hacky and performance-heavy, but you can compare the camera's and the node's world-space positions from within onUpdate inside a node. You'll probably have to manage some tolerance and other things to smooth out the interactions.
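For reference, a minimal sketch of that approach as a Sceneform Node subclass; the 0.5 m threshold and the callback name are arbitrary examples:
import com.google.ar.sceneform.FrameTime;
import com.google.ar.sceneform.Node;
import com.google.ar.sceneform.math.Vector3;

public class CameraProximityNode extends Node {

    private static final float PROXIMITY_THRESHOLD_METERS = 0.5f;

    @Override
    public void onUpdate(FrameTime frameTime) {
        if (getScene() == null) {
            return;  // not attached to a scene yet
        }
        Vector3 cameraPosition = getScene().getCamera().getWorldPosition();
        Vector3 nodePosition = getWorldPosition();
        float distance = Vector3.subtract(cameraPosition, nodePosition).length();
        if (distance < PROXIMITY_THRESHOLD_METERS) {
            // "Collision" with the user: react here (and possibly debounce,
            // since this fires every frame while the camera stays close).
            onCameraNearby();
        }
    }

    private void onCameraNearby() {
        // Placeholder for whatever reaction you need.
    }
}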
One idea for doing the same thing is to use a ray cast to hit the objects and, if they are close, do something. You could use something like this in the onUpdateListener:
Camera camera = arSceneView.getScene().getCamera();
Ray ray = new Ray(camera.getWorldPosition(), camera.getForward());
HitTestResult result = arSceneView.getScene().hitTest(ray);
if (result.getNode() != null && result.getDistance() <= SOME_THRESHOLD) {
    // Hit something
    doSomething(result.getNode());
}
I'm facing the following problem while developing a Tango app, and I'm not sure whether I'm on the right track.
What I'm trying to achieve:
The user takes a picture. In the background the app saves the current point cloud and pose to persistent storage.
The server receives that image, does some processing behind the scenes, and sends an (x, y) coordinate back to the app (asynchronously and unrelated to the current Tango session).
After restarting the app and starting a new Tango session, show a 3D object at (x, y) using the persisted copy of the point cloud and pose.
I expect to be able to use these parameters, (x, y), the point cloud and the pose, in the following algorithm and get back a Pose, a Rajawali object that RajawaliRenderer knows how to render.
Tango is initialized with the following coordinate frame pair:
TANGO_WORLD_BASE_COORDINATE_FRAME = new TangoCoordinateFramePair(
TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION,
TangoPoseData.COORDINATE_FRAME_DEVICE
);
Plane fit using the intersection point:
private void convertByIntersectionPoint(float x, float y,
TangoPointCloudData tangoPointCloudData, TangoPoseData devicePose,
TangoPoseData colorTdepthPose) {
if (tangoPointCloudData != null) {
TangoSupport.IntersectionPointPlaneModelPair intersectionPointPlaneModelPair =
TangoSupport.fitPlaneModelNearPoint(tangoPointCloudData,
colorTdepthPose, x, y);
if (devicePose.statusCode == TangoPoseData.POSE_VALID) {
mRenderer.updateObjectPose(
intersectionPointPlaneModelPair.intersectionPoint,
intersectionPointPlaneModelPair.planeModel,
devicePose);
}
}
}
It throws a TangoErrorException at TangoSupport.fitPlaneModelNearPoint.
My understanding is that fitPlaneModelNearPoint should be a pure algorithm that doesn't rely on the current Tango session, but I can't be sure because I don't have its implementation.
Any help would be much appreciated.
Okay, it was totally my mistake.
There was a bug in serializing the point cloud.
The Gson library does not know how to deserialize into a subclass and always constructs the parent class, which in this case produced corrupted data.
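For anyone hitting the same thing, one way to sidestep it (a sketch of the general pattern rather than my exact fix) is to persist a plain copy of the cloud with only primitive fields, so Gson never has to reconstruct a Tango or buffer-backed type:
import com.google.gson.Gson;
import java.nio.FloatBuffer;

// Class and field names are hypothetical; the point is that only a double and
// a float[] are serialized, which Gson can round-trip without guessing types.
public class PersistedPointCloud {

    public double timestamp;
    public float[] xyz;  // packed x0, y0, z0, x1, y1, z1, ...

    public static PersistedPointCloud fromBuffer(double timestamp, FloatBuffer buffer) {
        PersistedPointCloud copy = new PersistedPointCloud();
        copy.timestamp = timestamp;
        FloatBuffer duplicate = buffer.duplicate();  // don't disturb the original position
        duplicate.rewind();
        copy.xyz = new float[duplicate.remaining()];
        duplicate.get(copy.xyz);
        return copy;
    }

    public String toJson() {
        return new Gson().toJson(this);
    }

    public static PersistedPointCloud fromJson(String json) {
        return new Gson().fromJson(json, PersistedPointCloud.class);
    }
}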
After some weeks of waiting I finally have my Project Tango. My idea is to create an app that generates a point cloud of my room and exports this to .xyz data. I'll then use the .xyz file to show the point cloud in a browser! I started off by compiling and adjusting the point cloud example that's on Google's github.
Right now I use onXyzIjAvailable(TangoXyzIjData tangoXyzIjData) to get a frame of x, y and z values; the points. I then save these frames in a PCLManager in the form of Vector3. After I'm done scanning my room, I simply write all the Vector3s from the PCLManager to a .xyz file using:
OutputStream os = new FileOutputStream(file);
int size = pointCloud.size();
for (int i = 0; i < size; i++) {
    String row = String.valueOf(pointCloud.get(i).x) + " "
            + String.valueOf(pointCloud.get(i).y) + " "
            + String.valueOf(pointCloud.get(i).z) + "\r\n";
    os.write(row.getBytes());
}
os.close();
Everything works fine: no compilation errors or crashes. The only thing that seems to be going wrong is the rotation or translation of the points in the cloud. When I view the point cloud, everything is messed up; the area I scanned is not recognizable, though the number of points is the same as recorded.
Could this have something to do with the fact that I don't use PoseData together with the XyzIjData? I'm kind of new to this subject and have a hard time understanding what the PoseData exactly does. Could someone explain it to me and help me fix my point cloud?
Yes, you have to use TangoPoseData.
I guess you are using TangoXyzIjData correctly, but the data you get this way is relative to where the device is and how it is tilted when you take the shot.
Here's how I solved this:
I started from the java_point_to_point_example. In that example they get the coordinates of two different points in two different coordinate systems and then write those coordinates with respect to the base coordinate frame pair.
First of all you have to set up your extrinsics, so you'll be able to perform all the transformations you'll need. To do that I call mExtrinsics = setupExtrinsics(mTango) at the end of my setTangoListener() function. Here's the code (which you can also find in the example I linked above).
private DeviceExtrinsics setupExtrinsics(Tango mTango) {
    // IMU to color camera transform
    TangoCoordinateFramePair framePair = new TangoCoordinateFramePair();
    framePair.baseFrame = TangoPoseData.COORDINATE_FRAME_IMU;
    framePair.targetFrame = TangoPoseData.COORDINATE_FRAME_CAMERA_COLOR;
    TangoPoseData imu_T_rgb = mTango.getPoseAtTime(0.0, framePair);
    // IMU to device transform
    framePair.targetFrame = TangoPoseData.COORDINATE_FRAME_DEVICE;
    TangoPoseData imu_T_device = mTango.getPoseAtTime(0.0, framePair);
    // IMU to depth camera transform
    framePair.targetFrame = TangoPoseData.COORDINATE_FRAME_CAMERA_DEPTH;
    TangoPoseData imu_T_depth = mTango.getPoseAtTime(0.0, framePair);
    return new DeviceExtrinsics(imu_T_device, imu_T_rgb, imu_T_depth);
}
Then, when you get the point cloud, you have to "normalize" it. Using your extrinsics this is pretty simple:
public ArrayList<Vector3> normalize(TangoXyzIjData cloud, TangoPoseData cameraPose, DeviceExtrinsics extrinsics) {
ArrayList<Vector3> normalizedCloud = new ArrayList<>();
TangoPoseData camera_T_imu = ScenePoseCalculator.matrixToTangoPose(extrinsics.getDeviceTDepthCamera());
while (cloud.xyz.hasRemaining()) {
Vector3 rotatedV = ScenePoseCalculator.getPointInEngineFrame(
new Vector3(cloud.xyz.get(),cloud.xyz.get(),cloud.xyz.get()),
camera_T_imu,
cameraPose
);
normalizedCloud.add(rotatedV);
}
return normalizedCloud;
}
This should be enough: now you have a point cloud with respect to your base frame of reference.
If you superimpose two or more of these "normalized" clouds you can get a 3D representation of your room.
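For completeness, here is a minimal sketch of how the normalize() call above might be wired into the point cloud callback of the Tango update listener; mTango, mExtrinsics and mPclPoints are assumed fields (made-up names), while the Tango calls themselves are the real API:
@Override
public void onXyzIjAvailable(TangoXyzIjData xyzIj) {
    // Device pose at the exact time the depth frame was captured, expressed
    // in the area description (or start-of-service) frame.
    TangoCoordinateFramePair framePair = new TangoCoordinateFramePair(
            TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION,
            TangoPoseData.COORDINATE_FRAME_DEVICE);
    TangoPoseData devicePose = mTango.getPoseAtTime(xyzIj.timestamp, framePair);
    if (devicePose.statusCode != TangoPoseData.POSE_VALID) {
        return;  // skip clouds for which no valid pose is available
    }
    mPclPoints.addAll(normalize(xyzIj, devicePose, mExtrinsics));
}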
There is another way to do this with a rotation matrix, explained here.
My solution is pretty slow (it takes the dev kit around 700 ms to normalize a cloud of ~3000 points), so it is not suitable for a real-time 3D reconstruction application.
At the moment I'm trying to use the Tango 3D Reconstruction Library in C using the NDK and JNI. The library is well documented, but it is very painful to set up your environment and start using JNI. (I'm stuck at the moment, in fact.)
Drifting
There still is a problem when I turn around with the device. It seems that the point cloud spreads out a lot.
I guess you are experiencing some drift.
Drift happens when you use motion tracking alone: it consists of a lot of very small errors in estimating your pose which together cause a big error in your pose relative to the world. For instance, if you take your Tango device and walk in a circle while tracking your TangoPoseData, and then plot the trajectory in a spreadsheet or whatever you like, you'll notice that the tablet never returns to its starting point, because it is drifting away.
The solution to that is Area Learning.
If you have no clear idea about this topic, I suggest watching this talk from Google I/O 2016. It covers a lot of points and gives you a nice introduction.
Using area learning is quite simple.
You just have to change your base frame of reference to TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION. This way you tell Tango to estimate its pose not with respect to where it was when you launched the app, but with respect to some fixed point in the area.
Here's my code:
private static final ArrayList<TangoCoordinateFramePair> FRAME_PAIRS =
        new ArrayList<TangoCoordinateFramePair>();
static {
    FRAME_PAIRS.add(new TangoCoordinateFramePair(
            TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION,
            TangoPoseData.COORDINATE_FRAME_DEVICE
    ));
}
Now you can use FRAME_PAIRS as usual.
Then you have to modify your TangoConfig to tell Tango to use area learning for drift correction, via the key TangoConfig.KEY_BOOLEAN_DRIFT_CORRECTION. Remember that when using TangoConfig.KEY_BOOLEAN_DRIFT_CORRECTION you CAN'T use learning mode or load an ADF (area description file).
So you can't use:
TangoConfig.KEY_BOOLEAN_LEARNINGMODE
TangoConfig.KEY_STRING_AREADESCRIPTION
Here's how I initialize TangoConfig in my app:
TangoConfig config = tango.getConfig(TangoConfig.CONFIG_TYPE_DEFAULT);
// Turn the depth sensor on.
config.putBoolean(TangoConfig.KEY_BOOLEAN_DEPTH, true);
// Turn motion tracking on.
config.putBoolean(TangoConfig.KEY_BOOLEAN_MOTIONTRACKING, true);
// If Tango gets stuck, it tries to recover automatically.
config.putBoolean(TangoConfig.KEY_BOOLEAN_AUTORECOVERY, true);
// Tango tries to store and remember places and rooms;
// this is used to reduce drift.
config.putBoolean(TangoConfig.KEY_BOOLEAN_DRIFT_CORRECTION, true);
// Turn the color camera on.
config.putBoolean(TangoConfig.KEY_BOOLEAN_COLORCAMERA, true);
Using this technique you'll get rid of those spreads.
PS
In the talk I linked above, at around 22:35, they show how to port your application to area learning. In their example they use TangoConfig.KEY_BOOLEAN_ENABLE_DRIFT_CORRECTION. That key no longer exists (at least in the Java API); use TangoConfig.KEY_BOOLEAN_DRIFT_CORRECTION instead.
I have some code that allows me to detect faces in a live camera preview and draw a few GIFs over their landmarks using the play-services-vision library provided by Google.
It works well enough when the face is static, but when the face moves at a moderate speed, the face detector takes longer than the camera's frame rate to detect the landmarks at the face's new position. I know it might have something to do with the bitmap drawing speed, but I took steps to minimize the lag there.
(Basically I get complaints that the GIFs' repositioning isn't 'smooth enough')
EDIT: I did try running the coordinate detection code...
List<Landmark> landmarksList = face.getLandmarks();
for(int i = 0; i < landmarksList.size(); i++)
{
Landmark current = landmarksList.get(i);
//canvas.drawCircle(translateX(current.getPosition().x), translateY(current.getPosition().y), FACE_POSITION_RADIUS, mFacePositionPaint);
//canvas.drawCircle(current.getPosition().x, current.getPosition().y, FACE_POSITION_RADIUS, mFacePositionPaint);
if(current.getType() == Landmark.LEFT_EYE)
{
//Log.i("current_landmark", "l_eye");
leftEyeX = translateX(current.getPosition().x);
leftEyeY = translateY(current.getPosition().y);
}
if(current.getType() == Landmark.RIGHT_EYE)
{
//Log.i("current_landmark", "r_eye");
rightEyeX = translateX(current.getPosition().x);
rightEyeY = translateY(current.getPosition().y);
}
if(current.getType() == Landmark.NOSE_BASE)
{
//Log.i("current_landmark", "n_base");
noseBaseY = translateY(current.getPosition().y);
noseBaseX = translateX(current.getPosition().x);
}
if(current.getType() == Landmark.BOTTOM_MOUTH) {
botMouthY = translateY(current.getPosition().y);
botMouthX = translateX(current.getPosition().x);
//Log.i("current_landmark", "b_mouth "+translateX(current.getPosition().x)+" "+translateY(current.getPosition().y));
}
if(current.getType() == Landmark.LEFT_MOUTH) {
leftMouthY = translateY(current.getPosition().y);
leftMouthX = translateX(current.getPosition().x);
//Log.i("current_landmark", "l_mouth "+translateX(current.getPosition().x)+" "+translateY(current.getPosition().y));
}
if(current.getType() == Landmark.RIGHT_MOUTH) {
rightMouthY = translateY(current.getPosition().y);
rightMouthX = translateX(current.getPosition().x);
//Log.i("current_landmark", "l_mouth "+translateX(current.getPosition().x)+" "+translateY(current.getPosition().y));
}
}
eyeDistance = (float)Math.sqrt(Math.pow((double) Math.abs(rightEyeX - leftEyeX), 2) + Math.pow(Math.abs(rightEyeY - leftEyeY), 2));
eyeCenterX = (rightEyeX + leftEyeX) / 2;
eyeCenterY = (rightEyeY + leftEyeY) / 2;
noseToMouthDist = (float)Math.sqrt(Math.pow((double)Math.abs(leftMouthX - noseBaseX), 2) + Math.pow(Math.abs(leftMouthY - noseBaseY), 2));
...in a separate thread within the View draw method, but it just nets me a SIGSEGV error.
My questions:
Is syncing the Face Detector's processing speed with the Camera Preview framerate the right thing to do in this case, or is it the other way around, or is it some other way?
As the Face Detector finds the faces in a camera preview frame, should I drop the frames that the preview feeds before the FD finishes? If so, how can I do it?
Should I just use setClassificationMode(NO_CLASSIFICATIONS) and setTrackingEnabled(false) in a camera preview just to make the detection faster?
Does the play-services-vision library use OpenCV, and which is actually better?
EDIT 2:
I read one research paper claiming that face detection and the other functions available in OpenCV run faster on Android. I was wondering whether I can leverage that to speed up face detection.
There is no way you can guarantee that face detection will be fast enough to show no visible delay, even when the head motion is moderate. Even if you succeed in optimizing the hell out of it on your development device, you will surely find another model among the thousands out there that is too slow.
Your code should be resilient to such situations. You can predict the face position a second ahead, assuming that it moves smoothly. If the users decide to twitch their head or device, no algorithm can help.
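To illustrate the prediction idea, here is a minimal sketch of a constant-velocity extrapolation between detector results; the class and its names are hypothetical, and you would still want to clamp or damp the predicted values:
// Sketch: predict a landmark position between detector results by assuming
// it keeps moving at its last observed velocity.
public class LandmarkPredictor {

    private float lastX, lastY;      // last detected position
    private float velX, velY;        // estimated velocity in px/ms
    private long lastTimestampMs;

    /** Call whenever the face detector delivers a fresh landmark position. */
    public void onDetected(float x, float y, long timestampMs) {
        if (lastTimestampMs != 0) {
            float dt = timestampMs - lastTimestampMs;
            if (dt > 0) {
                velX = (x - lastX) / dt;
                velY = (y - lastY) / dt;
            }
        }
        lastX = x;
        lastY = y;
        lastTimestampMs = timestampMs;
    }

    /** Call on every drawn frame to get an extrapolated position. */
    public float predictedX(long nowMs) {
        return lastX + velX * (nowMs - lastTimestampMs);
    }

    public float predictedY(long nowMs) {
        return lastY + velY * (nowMs - lastTimestampMs);
    }
}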
If you use the deprecated Camera API, you should pre-allocate a buffer and use setPreviewCallbackWithBuffer(). This way you can guarantee that the frames arrive at your image processor one at a time. You should also not forget to open the Camera on a background thread, so that the [onPreviewFrame()](http://developer.android.com/reference/android/hardware/Camera.PreviewCallback.html#onPreviewFrame(byte[], android.hardware.Camera)) callback, where your heavy image processing takes place, will not block the UI thread.
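A minimal sketch of that buffered setup, assuming the default NV21 preview format and omitting error handling and the preview-surface wiring:
import android.graphics.ImageFormat;
import android.hardware.Camera;
import android.os.Handler;
import android.os.HandlerThread;

public class BufferedPreviewSetup {

    /** Opens the (deprecated) Camera on a background thread with a reusable preview buffer. */
    public static void start() {
        HandlerThread cameraThread = new HandlerThread("CameraThread");
        cameraThread.start();
        new Handler(cameraThread.getLooper()).post(() -> {
            Camera camera = Camera.open();
            Camera.Size previewSize = camera.getParameters().getPreviewSize();
            // One buffer, sized for a single NV21 preview frame.
            int bufferSize = previewSize.width * previewSize.height
                    * ImageFormat.getBitsPerPixel(ImageFormat.NV21) / 8;
            camera.addCallbackBuffer(new byte[bufferSize]);
            camera.setPreviewCallbackWithBuffer((data, cam) -> {
                // ... run the face detector on 'data' here ...
                cam.addCallbackBuffer(data);  // return the buffer for the next frame
            });
            // setPreviewDisplay()/setPreviewTexture() and startPreview() would follow here.
        });
    }
}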
Yes, OpenCV face detection may be faster in some cases, but more importantly it is more robust than the Google face detector.
Yes, it's better to turn the classifier off if you don't care about smiles and open eyes. The performance gain may vary.
I believe that turning tracking off will only slow the Google face detector down, but you should make your own measurements, and choose the best strategy.
The most significant gain can be achieved by turning setProminentFaceOnly() on, but again I cannot predict the actual effect of this setting for your device.
There's always going to be some lag, since any face detector takes some amount of time to run. By the time you draw the result, you will usually be drawing it over a future frame in which the face may have moved a bit.
Here are some suggestions for minimizing lag:
The CameraSource implementation provided by Google's vision library automatically handles dropping preview frames when needed so that it can keep up the best that it can. See the open source version of this code if you'd like to incorporate a similar approach into your app: https://github.com/googlesamples/android-vision/blob/master/visionSamples/barcode-reader/app/src/main/java/com/google/android/gms/samples/vision/barcodereader/ui/camera/CameraSource.java#L1144
Using a lower camera preview resolution, such as 320x240, will make face detection faster.
If you're only tracking one face, using the setProminentFaceOnly() option will make face detection faster. Using this and LargestFaceFocusingProcessor as well will make this even faster.
To use LargestFaceFocusingProcessor, set it as the processor of the face detector. For example:
Tracker<Face> tracker = *your face tracker implementation*
detector.setProcessor(
new LargestFaceFocusingProcessor.Builder(detector, tracker).build());
Your tracker implementation will receive face updates for only the largest face that it initially finds. In addition, it will signal back to the detector that it only needs to track that face for as long as it is visible.
If you don't need to detect smaller faces, setting setMinFaceSize() larger will make face detection faster. It's faster to detect only larger faces, since it doesn't need to spend time looking for smaller faces.
You can turn off classification if you don't need eyes-open or smile indication. However, this only gives you a small speed advantage.
Using the tracking option will make this faster as well, but at some expense in accuracy. This uses a predictive algorithm for some intermediate frames, to avoid the expense of running full face detection on every frame.
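Putting several of these suggestions together, a detector and camera source tuned for speed might look roughly like this (a sketch; the resolution, FPS and minimum face size are example values, and the wrapper class is just for illustration):
import android.content.Context;
import com.google.android.gms.vision.CameraSource;
import com.google.android.gms.vision.Tracker;
import com.google.android.gms.vision.face.Face;
import com.google.android.gms.vision.face.FaceDetector;
import com.google.android.gms.vision.face.LargestFaceFocusingProcessor;

public class FastFaceDetectionSetup {

    public static CameraSource build(Context context, Tracker<Face> tracker) {
        FaceDetector detector = new FaceDetector.Builder(context)
                .setProminentFaceOnly(true)                              // track only the largest face
                .setMinFaceSize(0.15f)                                   // ignore small faces
                .setClassificationType(FaceDetector.NO_CLASSIFICATIONS)  // no smile / eyes-open
                .setLandmarkType(FaceDetector.ALL_LANDMARKS)             // still need landmarks for the GIFs
                .setMode(FaceDetector.FAST_MODE)
                .build();

        detector.setProcessor(
                new LargestFaceFocusingProcessor.Builder(detector, tracker).build());

        return new CameraSource.Builder(context, detector)
                .setRequestedPreviewSize(320, 240)  // lower resolution -> faster detection
                .setRequestedFps(30.0f)
                .setFacing(CameraSource.CAMERA_FACING_FRONT)
                .build();
    }
}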
I am an Android programmer. Because of the many objects in my scene, my game has lag.
I have a theory for removing the lag from my game:
if I can control rendering in Unity, I can remove the lag.
using UnityEngine;
using System.Collections;

public class Enemy : MonoBehaviour {

    GameObject object2;

    void Start() {
        object2 = GameObject.Find("TR");
        GetComponent<Renderer>().enabled = false;
    }

    void Update() {
        var distance = Vector3.Distance(gameObject.transform.position, object2.transform.position);
        print(distance);
        if (distance <= 80) {
            GetComponent<Renderer>().enabled = true;
        }
    }
}
This doesn't work. How can I have a boolean for the renderer so that an object is rendered when there is a collision and hidden otherwise?
I want a zone such that every object inside the zone is rendered and everything outside it is not.
void OnTriggerEnter(Collider collision)
{
    if (collision.gameObject.tag == "zone")
    {
        GetComponent<Renderer>().enabled = true;
    }
    else
    {
        GetComponent<Renderer>().enabled = false;
    }
}
This doesn't work. Neither does this:
void OnTriggerEnter(Collider collision)
{
    if (collision.gameObject.tag == "zone")
    {
        gameObject.SetActive(false);
    }
    else
    {
        gameObject.SetActive(true);
    }
}
This is either already implemented in Unity, or implementing it yourself is a bad idea because raycasts are expensive and you need a lot of them. Try finding the other problems that cause lag in your game: disable features one by one and note the frame rate each time; this will give you the best overview of where the problem is. Look online for which operations are expensive (Instantiate, Destroy, FindGameObjectByName (or by tag...)); try merging the models you have, using fewer shaders, faster shaders, and fewer textures to load.
Here you will find a great document about optimization. It's prepared for mobile devices, but I hope you will find what you need: Unity Optimization Guide for x86 Android
I would recommend keeping your blue blobs in an object pool and disabling the ones that leave your screen.
You know your position and the positions of the objects in the pool, so you can compute the distance in one direction, for instance behind you, and disable objects past a certain amount.
Raycasting or collision checks are unnecessary for this.
In your terrain generation scripts, check for disabled pool objects; if one exists, it should be moved ahead in the level and repositioned, or whatever logic you have there.
Don't instantiate and destroy unless you really need to; do it at level load instead of on the fly.
(It's expensive.)
There are some really good tutorials on the Unity page; have a look there.
They cover things like endless runners.