How to use the numbers from Game Rotation Vector in Android? - android

I am working on an AR app that needs to move an image depending on device's position and orientation.
It seems that Game Rotation Vector should provide the necessary data to achieve this.
However I can't seem to understand what the values I get from the GRV sensor actually represent. For instance, to get back to the same value on the Z axis I have to rotate the device 720 degrees, which seems odd.
If I could somehow convert these numbers into angles of the device relative to the x, y, z axes, my problem would be solved.
I have googled this issue for days and haven't found any sensible information on the meaning of the GRV components or how to use them.
TL;DR: What do the numbers from the GRV sensor represent, and how do I convert them to angles?

As the docs state, the GRV sensor gives back a 3D rotation vector, represented by three components:
x axis (x * sin(θ/2))
y axis (y * sin(θ/2))
z axis (z * sin(θ/2))
This can be confusing at first. The three components together describe a single rotation: θ (pronounced theta) is the angle of that rotation and (x, y, z) is the unit vector of the axis it happens around, which isn't clear at all from the way the components are written. The sin(θ/2) half-angle is also why the raw values only repeat after a full 720 degrees of physical rotation, which is what you observed.
Note also that when working with angles, especially in 3D, we generally use radians, not degrees, so θ is in radians. This looks like a good introductory explanation.
But the reason it's given to us in this format is that it can be used directly in matrix rotations, and especially as a quaternion. In fact, these three numbers are the vector part of a unit quaternion, the components which encode the rotation axis. The 4th component is cos(θ/2); together the four numbers fully describe the rotation. So a quaternion packs a complete 3D rotation into four plain numbers.
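If what you want is an angle, you can recover the single axis-angle pair directly from the three components. A minimal sketch in Java, assuming values[] holds the three numbers from the sensor event (the variable names are just illustrative):
float x = values[0], y = values[1], z = values[2];
// the length of the vector is sin(θ/2)
double sinHalfTheta = Math.sqrt(x * x + y * y + z * z);
// θ: how far the device has rotated, in radians, about the axis below
double theta = 2.0 * Math.asin(Math.min(1.0, sinHalfTheta));
// the unit rotation axis (only meaningful when sinHalfTheta > 0)
double axisX = x / sinHalfTheta;
double axisY = y / sinHalfTheta;
double axisZ = z / sinHalfTheta;
Note that this gives you one angle about one axis, not three separate Euler angles; for rendering, converting to a rotation matrix as shown further down is usually the more convenient route.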
These are directly usable in OpenGL which is the Android (and the rest of the world's) 3D library of choice. Check this tutorial out for some OpenGL rotations info, this one for some general quaternion theory as applied to 3D programming in general, and this example by Google for Android which shows exactly how to use this information directly.
If you read the articles, you can see why you get it in this form and why it's called Game Rotation Vector - it's what's been used by 3D programmers for games for decades at this point.
TL;DR: This example is excellent.
Edit - How to use this to show a 2D image which is rotated by this vector in 3D space.
In the example above, SensorManager.getRotationMatrixFromVector converts the Game Rotation Vector into a rotation matrix which can be applied to rotate anything in 3D. To apply this rotation to a 2D image, you have to think of the image in 3D, so it's actually a segment of a plane, like a sheet of paper. So you'd map your image, which in the jargon is called a texture, onto this plane segment.
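As a rough sketch of that conversion step (the listener wiring here is illustrative and error handling is omitted):
private final float[] rotationMatrix = new float[16];

// registered e.g. in onResume() with something like:
// sensorManager.registerListener(this,
//         sensorManager.getDefaultSensor(Sensor.TYPE_GAME_ROTATION_VECTOR),
//         SensorManager.SENSOR_DELAY_GAME);

@Override
public void onSensorChanged(SensorEvent event) {
    if (event.sensor.getType() == Sensor.TYPE_GAME_ROTATION_VECTOR) {
        // turn the rotation vector into a 4x4 rotation matrix usable with OpenGL
        SensorManager.getRotationMatrixFromVector(rotationMatrix, event.values);
    }
}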
Here is a tutorial on texturing cubes in OpenGL for Android with example code and an in depth discussion. From cubes it's a short step to a plane segment - it's just one face of a cube! In fact that's a good resource for getting to grips with OpenGL on Android, I'd recommend reading the previous and subsequent tutorial steps too.
You mentioned translation as well. Look at the onDrawFrame method in the Google code example. Note that there is a translation using gl.glTranslatef and then a rotation using gl.glMultMatrixf. This is how you translate and rotate.
The order in which these operations are applied matters. Here's a fun way to experiment with that: check out Livecodelab, a live 3D sketch coding environment which runs inside your browser. In particular this tutorial encourages reflection on the ordering of operations. Obviously the command move is a translation.
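To make the translate-then-rotate order concrete, here is a minimal onDrawFrame sketch in the style of the Google example (GL10 fixed-function pipeline; the distance of -3 is just a placeholder):
public void onDrawFrame(GL10 gl) {
    gl.glClear(GL10.GL_COLOR_BUFFER_BIT | GL10.GL_DEPTH_BUFFER_BIT);
    gl.glMatrixMode(GL10.GL_MODELVIEW);
    gl.glLoadIdentity();
    gl.glTranslatef(0f, 0f, -3f);         // first move the plane segment away from the camera
    gl.glMultMatrixf(rotationMatrix, 0);  // then apply the device rotation from the sensor
    // ... draw the textured plane segment here ...
}
Swapping the two calls gives a different result: rotating first and translating afterwards makes the object orbit the camera instead of spinning in place, which is exactly the ordering effect the Livecodelab tutorial lets you play with.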

Related

Spatial offset of a virtual object with Opengl ES perspective

Context: I'm currently working on an Augmented Reality (AR) application using OpenGL ES 2.0 and some AR glasses running on Android. My goal is to display a virtual cursor at the tip of a real object: a screwdriver. Both the glasses and the screwdriver are tracked by a fixed external camera. The left image just below can give you an idea of the setup.
Things that are working: So far, I'm able to display a virtual 3D object (for example a cube) at a given location in space. For example, I am able to position it at (more or less 1 cm from) the tip of the tracked screwdriver. When I just rotate my head, the virtual cube appears to stay at the same place in the real world, which is nice. This behavior is what I expected, and is consistent with its real-world anchor.
Issue: However, when I translate my head (and thus translate the OpenGL camera), the cube shows a strange spatial offset, as if it were shifted away from the object's tip (case 2 in the drawing above). This shift can be pretty significant (up to 5 or 6 cm) and is inconsistent with the real world. But if I align the object exactly with one of the camera axes, the cube seems well placed at the tip of the object, which confuses me.
Question: Is it just a strange visual perspective effect? How can it work with head rotations but not head translations? Did I miss something about perspective projection in OpenGL ES?
Implementation details: The fixed external camera is the origin of the world coordinates. It is really precise, and gives me both the world-space position and rotation of each object (including the glasses and the screwdriver). To be more precise, it continuously sends this data via Bluetooth to my Android program to make sure what the user sees is up to date.
In case 1, this works like a charm: the camera correctly detects that the screwdriver is at position (0, 0, 1 meter) with some arbitrary rotation, I display a cube centered around that position, and it appears correctly placed. But after a head translation (case 2), the screwdriver is still detected at the correct position (it didn't move after all), yet the cube is shifted in a way that does not make sense to me.
If it were a small offset, I would put it down to an accumulation of small errors, but here it seems too big to be the only explanation. Depending on the head translation I do, the cube gains a different offset, and overall it gives the impression of not having a single fixed position in the world.
I am using a perspective projection with the FOV and aspect ratio of the AR glasses. The position of the OpenGL camera is set to the position of the AR glasses, and the look-at values are computed according to the direction the head is currently facing.
If I modify the FOV, I lose the expected behavior regarding head rotations and correct positioning. Finally, I am using the glasses as a stereo display.
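For reference, the camera setup described above boils down to something like the following sketch with android.opengl.Matrix (variable names and clip distances here are illustrative, not my exact code):
float[] projectionMatrix = new float[16];
float[] viewMatrix = new float[16];

// perspective projection built from the glasses' FOV and aspect ratio
Matrix.perspectiveM(projectionMatrix, 0, fovYDegrees, aspectRatio, 0.1f, 100f);

// camera placed at the tracked glasses position, looking along the head's forward direction
Matrix.setLookAtM(viewMatrix, 0,
        eyeX, eyeY, eyeZ,                                   // glasses position from the external tracker
        eyeX + forwardX, eyeY + forwardY, eyeZ + forwardZ,  // a point along the gaze direction
        upX, upY, upZ);                                     // the head's up vector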

How to find orientation of a surface using OpenCV?

I was inspired by Pokemon GO and wanted to make a simple prototype for learning purposes. I am a total beginner in image processing.
I did a little research on the subject. Here is what I came up with. In order to place any 3D model in the real world, I must know the orientation. Say if I am placing a cube on a table:
1) I need to know the angles θ, φ and α, where θ is the rotation about the global UP vector, φ is the rotation about the camera's FORWARD vector and α is the rotation about the camera's RIGHT vector.
2) Then I have to multiply these three rotation matrices with the object's transform using these Euler angles.
3) The object's position would be at the center point of the surface for prototyping.
4) I can find the distance to the surface using the Android camera's built-in distance estimation based on focal length. Then I can scale the object accordingly.
Is there any more straightforward way to do this using OpenCV, or am I on the right track?
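To illustrate step 2, here is a rough sketch of how I imagine composing those rotations into a model matrix with android.opengl.Matrix (it naively treats the camera's RIGHT and FORWARD vectors as the world x and z axes; all names are placeholders):
float[] modelMatrix = new float[16];
Matrix.setIdentityM(modelMatrix, 0);
Matrix.translateM(modelMatrix, 0, surfaceCenterX, surfaceCenterY, surfaceCenterZ); // step 3: surface center
Matrix.rotateM(modelMatrix, 0, thetaDegrees, 0f, 1f, 0f);  // θ about the global UP vector
Matrix.rotateM(modelMatrix, 0, alphaDegrees, 1f, 0f, 0f);  // α about the camera's RIGHT vector
Matrix.rotateM(modelMatrix, 0, phiDegrees, 0f, 0f, 1f);    // φ about the camera's FORWARD vector
Matrix.scaleM(modelMatrix, 0, scale, scale, scale);        // step 4: scale from the estimated distance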

OpenCV: Finding the pixel width from squares of known size

In OpenCV I use the camera to capture a scene containing two squares a and b, both at the same distance from the camera, whose known real sizes are, say, 10cm and 30cm respectively. I find the pixel widths of each square, which let's say are 25 and 40 pixels (to get the 'pixel-width' OpenCV detects the squares as cv::Rect objects and I read their width field).
Now I remove square a from the scene and change the distance from the camera to square b. The program gets the width of square b now, which let's say is 80. Is there an equation, using the configuration of the camera (resolution, dpi?) which I can use to work out what the corresponding pixel width of square a would be if it were placed back in the scene at the same distance as square b?
The math you need for your problem can be found in chapter 9 of "Multiple View Geometry in Computer Vision", which happens to be freely available online: https://www.robots.ox.ac.uk/~vgg/hzbook/hzbook2/HZepipolar.pdf.
The short answer to your problem is:
No, not in this exact form. Since you are working in a 3D world, you have one degree of freedom left. As a result you need more information in order to eliminate this degree of freedom (e.g. knowing the depth and/or the relation of the two squares with respect to each other, the movement of the camera...). This mainly depends on your specific situation. Anyhow, reading and understanding chapter 9 of the book should help you out here.
PS: to me it seems like your problem fits into the broader category of "baseline matching" problems. Reading around about this, in addition to epipolar geometry and the fundamental matrix, might help you out.
Since you write of "squares" with just a "width" in the image (as opposed to "trapezoids" with some wonky vertex coordinates) I assume that you are considering an ideal pinhole camera and ignoring any perspective distortion/foreshortening - i.e. there is no lens distortion and your planar objects are exactly parallel to the image/sensor plane.
Then it is a very simple 2D projective geometry problem, and no separate knowledge of the camera geometry is needed. Just write down the projection equations in the first situation: you have 4 unknowns (the camera focal length, the common depth of the squares, and the horizontal positions of, say, their left sides) and 4 equations (the projections of the left and right sides of each square). Solve the system and keep the focal length and the relative distance between the squares. Do the same in the second image, but now with known focal length, and compute the new depth and horizontal location of square b. Then add the previously computed relative distance to find where square a would be.
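Under those ideal assumptions the arithmetic for the pixel width alone is very short. A sketch using the numbers from the question, treating square a as if it were placed at the same new depth as square b:
double realWidthA = 0.10;   // 10 cm
double realWidthB = 0.30;   // 30 cm
double pixelWidthB = 80.0;  // measured width of square b at the new distance

// pixelWidth = focalLength * realWidth / depth, so at a shared depth the
// pixel widths are simply proportional to the real widths:
double pixelWidthA = pixelWidthB * realWidthA / realWidthB;  // ≈ 26.7 pixels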
In order to understand the transformations performed by the camera to project the 3D world into the 2D image, you need to know its calibration parameters. These are basically divided into two sets:
Intrinsic parameters: these are fixed parameters specific to each camera. They are normally represented by a matrix called K.
Extrinsic parameters: these depend on the camera's position in the 3D world. They are normally represented by two matrices, R and T, where the first represents the rotation and the second the translation.
In order to calibrate a camera you need some pattern (basically a set of 3D points whose coordinates are known). There are several examples of this in the OpenCV library, which provides support for performing the camera calibration:
http://docs.opencv.org/doc/tutorials/calib3d/camera_calibration/camera_calibration.html
Once you have your camera calibrated you can transform from 3D to 2D easily by the following equation:
P_image = K · [R | T] · P_3D
So it depends not only on the position of the camera but on all the calibration parameters. The following presentation goes through the camera calibration details and the different steps and equations that are used during the 3D <-> image transformations.
https://www.cs.umd.edu/class/fall2013/cmsc426/lectures/camera-calibration.pdf
With this in mind you can project any 3D point onto the image and get its coordinates in it. The reverse transformation is not unique, since going back from 2D to 3D gives you a line instead of a unique point.
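As a plain-Java sketch of that forward projection (a hand-rolled version of the equation above, assuming K is the usual 3x3 intrinsic matrix, R a 3x3 rotation, T a translation vector, and no lens distortion):
// returns the pixel coordinates (u, v) of a 3D world point P
static double[] projectPoint(double[][] K, double[][] R, double[] T, double[] P) {
    // camera coordinates: Pc = R * P + T
    double[] Pc = new double[3];
    for (int i = 0; i < 3; i++) {
        Pc[i] = T[i];
        for (int j = 0; j < 3; j++) {
            Pc[i] += R[i][j] * P[j];
        }
    }
    // apply the intrinsic matrix, then divide by depth (perspective division)
    double u = K[0][0] * Pc[0] + K[0][1] * Pc[1] + K[0][2] * Pc[2];
    double v = K[1][1] * Pc[1] + K[1][2] * Pc[2];
    double w = Pc[2];
    return new double[] { u / w, v / w };
}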

Open GL Android 3D Object Rotation Issue

I am having trouble rotating my 3D objects in Open GL. I start each draw frame by loading the identity (glLoadIdentity()) and then I push and pop on the stack according to what I need (for the camera, etc). I then want 3D objects to be able to roll, pitch and yaw and then have them displayed correctly.
Here is the catch... I want to be able to do incremental rotations as if I were flying an airplane. So every time the up button is pushed the object rotates around its own x axis. But then if the object is pitched down and chooses to yaw, the rotation should be around the object's up vector and not the world Y axis.
I've tried doing the following:
glRotatef(pitchTotal, 1,0,0);
glRotatef(yawTotal, 0,1,0);
glRotatef(rollTotal, 0,0,1);
and those don't seem to work. (Keeping in mind that the vectors are being computed correctly.) I've also tried...
glRotatef(pitchTotal, 1,0,0);
glRotatef(yawTotal, 0,1,0);
glRotatef(rollTotal, 0,0,1);
and I still get weird rotations.
Long story short... What is the proper way to rotate a 3D object in Open GL using the object's look, right and up vector?
You need to do the yaw rotation (around Y) before you do the pitch one. Otherwise, the pitch will be off.
E.g. you have a 45 degree downward pitch and a 180 degree yaw. By doing the pitch first and then rotating the yaw around the airplane's Y vector, the airplane would end up pointing up and backwards despite the pitch being downwards. By doing the yaw first, the plane points backwards, then the pitch around the plane's X vector will make it point downwards correctly.
The same logic applies for roll, which needs to be applied last.
So your code should be:
glRotatef(yawTotal, 0,1,0);
glRotatef(pitchTotal, 1,0,0);
glRotatef(rollTotal, 0,0,1);
Cumulative rotations will suffer from gimbal lock. Look at it this way: suppose you are in an aeroplane, flying level. You apply a yaw of 90 degrees anticlockwise. You then apply a roll of 90 degrees clockwise. You then apply a yaw of 90 degrees clockwise.
Your plane is now pointing straight downward — the total effect is a pitch of 90 degrees clockwise. But if you just tried to add up the different rotations then you'd end up with a roll of 90 degrees, and no pitch whatsoever because you at no point applied pitch to the plane.
Trying to store and update rotation as three separate angles doesn't work.
Commonly cited solutions are to use a quaternion or to store the object's orientation directly as a matrix. The matrix solution is easier to build because you can prototype it with OpenGL's built-in matrix stacks. Most people also seem to find matrices easier to understand than quaternions.
So, assuming you want to go matrix, your prototype might do something like (please forgive my lack of decent Java knowledge; I'm going to write C essentially):
GLfloat myOrientation[16];
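// assumed to have been initialised to the identity matrix before the first frame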
// to draw the object:
glMultMatrixf(myOrientation);
/* drawing here */
// to apply roll, assuming the modelview stack is active:
glPushMatrix(); // backup what's already on the stack
glLoadIdentity(); // start with the identity
glRotatef(angle, 0, 0, 1);
glMultMatrixf(myOrientation); // premultiply the current orientation by the roll
// update our record of orientation
glGetFloatv(GL_MODELVIEW_MATRIX, myOrientation);
glPopMatrix();
You possibly don't want to use the OpenGL stack in shipping code because it's not really built for this sort of use and so performance may be iffy. But you can prototype and profile rather than making an assumption. You also need to consider floating point precision problems — really you should be applying a step that ensures myOrientation is still orthonormal after it has been adjusted.
It's probably easiest to check Google for that, but briefly: keep the first axis, use the dot product to remove from the second axis its erroneous crosstalk along the first, then remove from the third axis its components along the first two, and finally renormalise all three.
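As a rough sketch of that re-orthonormalisation for the 3x3 rotation part of a column-major 4x4 matrix like myOrientation (in Java; names are illustrative):
// columns of the rotation part start at offsets 0 (right), 4 (up) and 8 (forward)
static void orthonormalise(float[] m) {
    normalise(m, 0);
    removeComponentAlong(m, 4, 0);  // strip column 0's direction out of column 1
    normalise(m, 4);
    removeComponentAlong(m, 8, 0);  // strip columns 0 and 1 out of column 2
    removeComponentAlong(m, 8, 4);
    normalise(m, 8);
}

static void normalise(float[] m, int c) {
    float len = (float) Math.sqrt(m[c] * m[c] + m[c + 1] * m[c + 1] + m[c + 2] * m[c + 2]);
    m[c] /= len; m[c + 1] /= len; m[c + 2] /= len;
}

// subtract from column c its projection onto the (already unit-length) column at offset onto
static void removeComponentAlong(float[] m, int c, int onto) {
    float dot = m[c] * m[onto] + m[c + 1] * m[onto + 1] + m[c + 2] * m[onto + 2];
    m[c] -= dot * m[onto];
    m[c + 1] -= dot * m[onto + 1];
    m[c + 2] -= dot * m[onto + 2];
}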
Thanks for the responses. The first response pointed me in the right direction, and the second helped a little too, but ultimately it boiled down to a combination of both. Initially, your 3D object should have a member variable which is a float array of size 16 [0-15]. You then have to initialize it to the identity matrix. The member methods of your 3D object like yawObject(float amount) then just know that you are yawing the object from the object's point of view rather than the world's, which allows the incremental rotation. Inside the yawObject method (or the pitch/roll equivalents) you need to call Matrix.rotateM(myFloatArray, 0, angle, 0, 1, 0). That stores the new rotation matrix (as described in the first response). Then, when you are about to draw your object, you multiply the model matrix by the myFloatArray matrix using gl.glMultMatrixf.
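A minimal sketch of that arrangement, assuming the GL10 pipeline from the answer above (names are illustrative):
private final float[] myFloatArray = new float[16];
// in the constructor: Matrix.setIdentityM(myFloatArray, 0);

public void yawObject(float amount) {
    // Matrix.rotateM post-multiplies, so the extra yaw is applied in the
    // object's own frame rather than around the world Y axis
    Matrix.rotateM(myFloatArray, 0, amount, 0f, 1f, 0f);
}

public void draw(GL10 gl) {
    gl.glMultMatrixf(myFloatArray, 0);
    // ... issue the object's draw calls here ...
}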
Good luck and let me know if you need more information than that.

Implementing Anti-Aliasing (FSAA?) on Android, OpenGL

I'm currently using OpenGL on Android to draw lines of a set width, which works great except for the fact that OpenGL on Android does not natively support anti-aliasing of such lines. I have done some research, but I'm stuck on how to implement my own AA.
FSAA
The first possible solution I have found is Full Screen Anti-Aliasing. I have read this page on the subject but I'm struggling to understand how I could implement it.
First of all, I'm unsure on the entire concept of implementing FSAA here. The article states "One straightforward jittering method is to modify the projection matrix, adding small translations in x and y". Does this mean I need to be constantly moving the same line extremely quickly, or drawing the same line multiple times?
Secondly, the article says "To compute a jitter offset in terms of pixels, divide the jitter amount by the dimension of the object coordinate scene, then multiply by the appropriate viewport dimension". What's the difference between the dimension of the object coordinate scene and the viewport dimension? (I'm using a 800 x 480 resolution)
Now, based on the information given in that article the 'jitter' coordinates should be relatively easy to compute. Based on my assumptions so far, here is what I have come up with (Java)...
float currentX = 50;
float currentY = 75;
// I'm assuming the "jitter" amount is essentially
// the amount of anti-aliasing (e.g 2x, 4x and so on)
int jitterAmount = 2;
// don't know what these two are
int coordSceneDimensionX;
int coordSceneDimensionY;
// I assume screen size
int viewportX = 800;
int viewportY = 480;
float newX = (jitterAmount/coordSceneDimensionX)/viewportX;
float newY = (jitterAmount/coordSceneDimensionY)/viewportY;
// and then I don't know what to do with these new coordinates
That's as far as I've got with FSAA.
Anti-Aliasing with textures
In the same document I was referencing for FSAA, there is also a page that briefly discusses implementing anti-aliasing with the use of textures. However, I don't know what the best way to go about implementing AA in this way would be and whether it would be more efficient than FSAA.
Hopefully someone out there knows a lot more about Anti-Aliasing than I do and can help me achieve this. Much appreciated!
The method presented in the articles predates the time when GPUs were capable of performing antialiasing themselves. This jittered rendering into an accumulation buffer is not really state of the art for realtime graphics (it is a widely implemented form of antialiasing for offline rendering, though).
What you do these days is request an antialiased framebuffer. That's it. The keyword here is multisampling. See this SO answer:
How do you activate multisampling in OpenGL ES on the iPhone? Although written for iOS, doing it for Android follows a similar path. AFAIK, on Android this extension is used instead: http://www.khronos.org/registry/gles/extensions/ANGLE/ANGLE_framebuffer_multisample.txt
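As a rough sketch of requesting a multisampled surface via EGL on Android (using GLSurfaceView.setEGLConfigChooser; the attribute values are illustrative and error handling is omitted):
glSurfaceView.setEGLConfigChooser(new GLSurfaceView.EGLConfigChooser() {
    @Override
    public EGLConfig chooseConfig(EGL10 egl, EGLDisplay display) {
        int[] attribs = {
                EGL10.EGL_RED_SIZE, 8, EGL10.EGL_GREEN_SIZE, 8, EGL10.EGL_BLUE_SIZE, 8,
                EGL10.EGL_DEPTH_SIZE, 16,
                EGL10.EGL_SAMPLE_BUFFERS, 1,  // ask for a multisampled config...
                EGL10.EGL_SAMPLES, 4,         // ...with 4 samples per pixel
                EGL10.EGL_NONE
        };
        EGLConfig[] configs = new EGLConfig[1];
        int[] numConfigs = new int[1];
        egl.eglChooseConfig(display, attribs, configs, 1, numConfigs);
        return numConfigs[0] > 0 ? configs[0] : null;  // fall back to fewer samples if this fails
    }
});
Note the chooser has to be set before setRenderer is called on the GLSurfaceView.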
First of all, the article you refer to uses the accumulation buffer, whose existence in OpenGL ES I really doubt, but I might be wrong here. If the accumulation buffer really is supported in ES, then you at least have to explicitly request it when creating the GL context (however that is done on Android).
Note that this technique is extremely inefficient and also deprecated, since nowadays GPUs usually support some kind of multisample antialiasing (MSAA). You should research whether your system/GPU/driver supports multisampling. This may require you to request a multisample framebuffer during context creation or something similar.
Now back to the article. The basic idea of this article is not to move the line quickly, but to render the line (or actually the whole scene) multiple times at very slightly different (at sub-pixel accuracy) locations (in image space) and average these multiple renderings to get the final image, every frame.
So you have a set of sample positions (in [0,1]), which are actually sub-pixel positions. This means if you have a sample position of (0.25, 0.75), you move the whole scene about a quarter of a pixel in the x direction and three quarters of a pixel in the y direction (in screen space, of course) when rendering. When you have done this for each sample, you average all these renderings together to get the final antialiased rendering.
The dimension of the object coordinate scene is basically the dimension of the screen (actually the near plane of the viewing volume) in object space, or more practically, the values you passed into glOrtho or glFrustum (or a similar function, but with gluPerspective it is not that obvious). For modifying the projection matrix to realize this jittering, you can use the functions presented in the article.
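For the orthographic case, one jittered pass could look roughly like this (GL10; left/right/bottom/top are whatever you already pass to glOrthof, and jx/jy is the current sub-pixel sample offset in [0, 1)):
// convert the sub-pixel offset into object-coordinate units
float dx = jx * (right - left) / viewportWidth;
float dy = jy * (top - bottom) / viewportHeight;

gl.glMatrixMode(GL10.GL_PROJECTION);
gl.glLoadIdentity();
gl.glOrthof(left + dx, right + dx, bottom + dy, top + dy, near, far);
gl.glMatrixMode(GL10.GL_MODELVIEW);
// ... render the whole scene, then average this pass with the other jittered passes ...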
The jitter amount is not the antialiasing factor, but the sub-pixel sample location. The antialiasing factor in this context is the number of samples and therefore the number of jittered renderings you perform. And your code won't work if, as I assume, you are only trying to jitter the line end points. You have to draw the whole scene multiple times using this jittered projection, not just this single line (it may work with a simple black background and appropriate blending, though).
You might also be able to achieve this without an accum buffer using blending (with glBlendFunc(GL_CONSTANT_COLOR, GL_ONE) and glBlendColor(1.0f/n, 1.0f/n, 1.0f/n, 1.0f/n), with n being the antialiasing factor/sample count). But keep in mind that you have to render the whole scene like this, not just this single line.
But as said, this technique is completely outdated and you should rather look for a way to enable MSAA on your ES platform.
