I have two 4x4 rotation matrices, M and N. M describes my current object attitude in space, and N is a desired object attitude. Now I would like to rotate the M matrix towards N, so the object slowly rotates towards the desired attitude over the following iterations. Any idea how to approach this?
If these matrices are proper rotation matrices (orthonormal, as the name "rotation matrices" suggests), you can do this by interpolating their basis vectors in a polar system.
To do that, convert the top-left 3x3 submatrix into three basis vectors, each defined by angles and a length. Once this is done, do a linear interpolation on the angles and lengths for that top-left 3x3 part, while the rest gets a direct Cartesian interpolation. From the angles and lengths you can then convert back to Cartesian coordinates.
Naturally there is still some work to do internally, like choosing which way to rotate (use the closest direction) and checking for edge cases where one basis vector would rotate in a different direction than the others...
I managed to do this successfully in a 2D system, which is a bit easier, but it should be no different in 3D.
Note that a plain Cartesian interpolation works fairly well as long as the angles are relatively small (under roughly 10 degrees at a guess), which is most likely not your case at all.
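For reference, a different and also common approach that avoids per-axis interpolation entirely: take the delta rotation N·Mᵀ, extract its axis and angle, and apply a fraction of that angle each iteration. Below is a minimal pure-Python sketch for the 3x3 rotation part; all helper names are mine, not from the answer above, and the 180-degree edge case is deliberately not handled.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def axis_angle(R):
    # extract (axis, angle) from a 3x3 rotation matrix;
    # the 180-degree edge case is not handled in this sketch
    tr = R[0][0] + R[1][1] + R[2][2]
    angle = math.acos(max(-1.0, min(1.0, (tr - 1.0) / 2.0)))
    if angle < 1e-9:
        return (1.0, 0.0, 0.0), 0.0
    s = 2.0 * math.sin(angle)
    return ((R[2][1] - R[1][2]) / s,
            (R[0][2] - R[2][0]) / s,
            (R[1][0] - R[0][1]) / s), angle

def rot_about_axis(axis, angle):
    # Rodrigues' rotation formula
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    C = 1.0 - c
    return [[c + x*x*C,   x*y*C - z*s, x*z*C + y*s],
            [y*x*C + z*s, c + y*y*C,   y*z*C - x*s],
            [z*x*C - y*s, z*y*C + x*s, c + z*z*C]]

def step_toward(M, N, t):
    # rotate M a fraction t (0..1) of the way toward N
    axis, angle = axis_angle(mat_mul(N, transpose(M)))
    return mat_mul(rot_about_axis(axis, t * angle), M)
```

Calling step_toward(M, N, 0.1) repeatedly walks M along the shortest rotation arc toward N.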
I was inspired by Pokemon GO and wanted to make a simple prototype for learning purposes. I am a total beginner in image processing.
I did a little research on the subject. Here is what I came up with. In order to place any 3D model in the real world, I must know the orientation. Say if I am placing a cube on a table:
1) I need to know the angles $\theta$, $\phi$ and $\alpha$ where $\theta$ is the rotation along the global UP vector, $\phi$ is the rotation along the camera's FORWARD vector and $\alpha$ is rotation along camera's RIGHT vector.
2) Then I have to multiply these three rotation matrices with the object's transform using these Euler angles.
3) The object's position would be at the center point of the surface for prototyping.
4) I can find the distance to the surface using the Android camera's built-in distance estimation based on focal length. Then I can scale the object accordingly.
Is there any more straightforward way to do this using OpenCV, or am I on the right track?
Sorry for my bad English. I have the following problem:
Let's say the camera of my mobile device is showing this picture.
In the picture you can see 4 different positions. Every position is known to me (longitude, latitude).
Now I want to know where in the picture a specific position is. For example, I want to have a rectangle 20 meters in front of me and 5 meters to the left of me. I just know the latitude/longitude of this point, but I don't know where I have to place it inside the picture (x,y). For example, POS3 is at (0,400) in my view, POS4 is at (600,400), and so on.
Where do I have to put the new point, which is 20 meters in front and 5 meters to the left of me? (So my input is (LatXY, LonXY) and my result should be (x,y) on the screen.)
I also have the height of the camera and the angles of the camera around the x, y and z axes.
Can I use simple mathematical operations to solve this problem?
Thank you very much!
The answer you want will depend on the accuracy of the result you need. As danaid pointed out, nonlinearity in the image sensor and other factors, such as atmospheric distortion, may induce errors, but would be difficult problems to solve with different cameras, etc., on different devices. So let's start by getting a reasonable approximation which can be tweaked as more accuracy is needed.
First, you may be able to ignore the directional information from the device, if you choose. If you have the five locations (POS1-POS4 and the camera) in a consistent set of coordinates, you have all you need. In fact, you don't even need all those points.
A note on consistent coordinates: at this scale, once you convert the lat and long to meters, using cos(lat) as your scaling factor for longitude, you should be able to treat everything from a "flat earth" perspective. You then just need to remember that the camera's x-y plane is roughly the global x-z plane.
Conceptual Background
The diagram below lays out the projection of the points onto the image plane. The dz used for perspective can be derived directly using the proportion of the distance in view between far points and near points, vs. their physical distance. In the simple case where the line POS1 to POS2 is parallel to the line POS3 to POS4, the perspective factor is just the ratio of the scaling of the two lines:
Scale (POS1, POS2) = pixel distance (pos1, pos2) / Physical distance (POS1, POS2)
Scale (POS3, POS4) = pixel distance (pos3, pos4) / Physical distance (POS3, POS4)
Perspective factor = Scale (POS3, POS4) / Scale (POS1, POS2)
So the perspective factor to apply to a vertex of your rect would be the proportion of the distance to the vertex between the lines. Simplifying:
Factor(rect) ~= [(Rect.z - (POS3, POS4).z) / ((POS1, POS2).z - (POS3, POS4).z)] * Perspective factor.
Answer
A perspective transformation is linear with respect to the distance from the focal point in the direction of view. The diagram below is drawn with the X axis parallel to the image plane, and the Y axis pointing in the direction of view. In this coordinate system, for any point P and an image plane any distance from the origin, the projected point p has an X coordinate p.x which is proportional to P.x/P.y. These values can be linearly interpolated.
In the diagram, tp is the desired projection of the target point. To get tp.x, interpolate between, for example, pos1.x and pos3.x, using adjustments for the distance, as follows:
tp.x = pos1.x + (pos3.x - pos1.x) * ((TP.x/TP.y) - (POS1.x/POS1.y)) / ((POS3.x/POS3.y) - (POS1.x/POS1.y))
The advantage of this approach is that it does not require any prior knowledge of the angle viewed by each pixel, and it will be relatively robust against reasonable errors in the location and orientation of the camera.
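As a sketch, the interpolation formula above might be coded like this (hypothetical names; POS1, POS3 and TP are (x, y) pairs in camera-aligned coordinates, with x lateral and y along the direction of view):

```python
def interpolate_projection_x(pos1_x, pos3_x, POS1, POS3, TP):
    # pos1_x, pos3_x: pixel x coordinates of the two reference points
    # POS1, POS3, TP: (x, y) world points, y = distance along the view direction
    r1 = POS1[0] / POS1[1]   # P.x / P.y is proportional to the projected x
    r3 = POS3[0] / POS3[1]
    rt = TP[0] / TP[1]
    return pos1_x + (pos3_x - pos1_x) * (rt - r1) / (r3 - r1)
```

For an ideal pinhole camera with focal length f, the projected x of a point is exactly f * P.x / P.y, which is why the linear interpolation of these ratios reproduces the projection.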
Further refinement
Using more data means being able to compensate for more errors. With multiple points in view, the camera location and orientation can be calibrated using the Tienstra method. A concise proof of this approach (using barycentric coordinates) can be found here.
Since the transformations required are all linear based on homogeneous coordinates, you could apply barycentric coordinates to interpolate based on any three or more points, given their X,Y,Z,W coordinates in homogeneous 3-space and their (x,y) coordinates in image space. The closer the points are to the destination point, the less significant the nonlinearities are likely to be, so in your example you would use POS1 and POS3, since the rect is on the left, and POS2 or POS4 depending on the relative distance.
(Barycentric coordinates are likely most familiar as the method used to interpolate colors on a triangle (fragment) in 3D graphics.)
Edit: Barycentric coordinates still require the W homogeneous coordinate factor, which is another way of expressing the perspective correction for the distance from the focal point. See this article on GameDev for more details.
Two related SO questions: perspective correction of texture coordinates in 3d and Barycentric coordinates texture mapping.
I see a couple of problems.
The only real mistake is that you're scaling your projection up by _canvasWidth/2 etc. instead of translating that far from the principal point. Add those values to the projected result; multiplying is like "zooming" that far into the projection.
Second, dealing in a global Cartesian coordinate space is a bad idea. With the formulae you're using, the difference between (60.1234, 20.122) and (60.1235, 20.122) (i.e. a small latitude difference) causes changes of similar magnitude in all 3 axes, which doesn't feel right.
It's more straightforward to take the same approach as computer graphics: set your camera as the origin of your "camera space", and convert between world objects and camera space by getting the haversine distance (or similar) between your camera location and the location of the object. See here: http://www.movable-type.co.uk/scripts/latlong.html
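For reference, a minimal sketch of the haversine distance described on that page (assuming a spherical Earth with mean radius 6371 km):

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    # great-circle distance in metres between two lat/lon points (degrees)
    R = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))
```

Combined with the bearing from the camera to the object, this gives the object's position in a camera-centred space.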
Third, your perspective projection calculations are for an ideal pinhole camera, which you probably do not have. It will only be a small correction, but to be accurate you need to figure out how to additionally apply the projection that corresponds to the intrinsic camera parameters of your camera. There are two ways to accomplish this: you can do it as a post-multiplication to the scheme you already have, or you can change from multiplying by a 3x3 matrix to using a full 4x4 camera matrix (http://en.wikipedia.org/wiki/Camera_matrix) with the parameters in there.
Using this approach, the perspective projection is symmetric about the origin: if you don't check for z depth, you'll project points behind you onto your screen as if they were the same z distance in front of you.
Lastly, I'm not sure about the Android APIs, but make sure you're getting a true north bearing and not a magnetic north bearing. Some platforms return either depending on an argument or configuration. (And check that your degrees are radians if that's what the APIs want, etc. Silly things, but I've lost hours debugging less :) ).
If you know the points in the camera frame and the real world coordinates, some simple linear algebra will suffice. A package like OpenCV will have this type of functionality, or alternatively you can create the projection matrices yourself:
http://en.wikipedia.org/wiki/3D_projection
Once you have a set of points, it is as simple as filling in a few vectors to solve the system of equations, which gives you a projection matrix. Since the 4 points can be assumed planar, you can then multiply any 3D coordinate by the matrix to find the corresponding 2D image-plane coordinate.
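A rough sketch of that system of equations for the planar case (a homography from 4 point correspondences), in pure Python with illustrative names; in practice OpenCV's findHomography does this for you:

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def find_homography(world, image):
    # world: 4 planar (X, Y) points; image: corresponding (x, y) pixels;
    # fixes h33 = 1, giving 8 equations in 8 unknowns
    A, b = [], []
    for (X, Y), (x, y) in zip(world, image):
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y]); b.append(x)
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y]); b.append(y)
    h = solve(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_homography(H, X, Y):
    # project a world point through H, dividing by the homogeneous w
    w = H[2][0] * X + H[2][1] * Y + H[2][2]
    return ((H[0][0] * X + H[0][1] * Y + H[0][2]) / w,
            (H[1][0] * X + H[1][1] * Y + H[1][2]) / w)
```

Once solved, apply_homography maps any point on the same plane into the image.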
In an Android app I want to draw a running leg. To output the top part of the leg I do something like:
// legCX,legCY is the location on screen about which the leg rotates.
Matrix m = new Matrix();
m.postTranslate(-legCX,-legCY);
m.postRotate(legRot);
m.postTranslate(legCX,legCY);
I then set the matrix to the canvas and draw the leg.
How do I draw the second part of the leg below the knee? It rotates at a different rate than the leg above it and has a center point which moves with the leg above it. I tried the following, but it turns out that the end result is rotation around some single point which doesn't follow the leg above.
Matrix m = new Matrix();
m.postTranslate(-legCX,-legCY);
m.postRotate(legRot);
m.postTranslate(0,-legLength);
m.postRotate(footRot);
m.postTranslate(0,legLength);
m.postTranslate(legCX,legCY);
I suspect that it's probably necessary to do the two rotations in two different Matrix objects and then combine them somehow, but I can't figure out how exactly to do that.
EDIT:
This type of matrix seems to be called a "transformation matrix". Combining multiple operations is called composition of transformations. However, none of the pages on this topic mention how to do a series of translations and rotations.
Surely, if you can use a matrix to do rotation about one point, it must be possible to do multiple matrix operations somehow to allow rotation about one point and then an additional rotation around a different point.
I've tried looking at pages on skeletal animation, but I can't make head nor tail of what they're talking about.
If I understand your problem correctly, you have a relative rotation case. You can try searching for "double pendulum".
Using a rotation matrix R0, the new coordinates of point p1 rotated around point p0 can be found as p1' = p0 + R0 * (p1 - p0).
The new coordinates of point p2 rotated around point p1 will be p2' = p1 + R1 * (p2 - p1).
Finally, the new coordinates of point p2 rotated first around p0 and then around the moved p1' will be p2'' = p1' + R1 * R0 * (p2 - p1).
The order of matrix multiplication matters, as does the sign of the angles.
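The relations above can be sketched in 2D like this (illustrative names; angles in radians, positive = counterclockwise):

```python
import math

def rot2(theta):
    # 2D rotation matrix for angle theta (radians)
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def apply(R, v):
    return (R[0][0] * v[0] + R[0][1] * v[1],
            R[1][0] * v[0] + R[1][1] * v[1])

def double_pendulum(p0, p1, p2, theta0, theta1):
    # p1' = p0 + R0 (p1 - p0);  p2'' = p1' + R1 R0 (p2 - p1)
    R0, R1 = rot2(theta0), rot2(theta1)
    d1 = apply(R0, (p1[0] - p0[0], p1[1] - p0[1]))
    p1n = (p0[0] + d1[0], p0[1] + d1[1])
    d2 = apply(R1, apply(R0, (p2[0] - p1[0], p2[1] - p1[1])))
    p2n = (p1n[0] + d2[0], p1n[1] + d2[1])
    return p1n, p2n
```

For the leg case, p0 is the hip, p1 the knee and p2 the foot, with theta0 the upper-leg rotation and theta1 the extra lower-leg rotation.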
I'm afraid this is going to be language-agnostic - I'm actually doing something similar at the moment in Android, but I'm learning both Android and matrix math! You seem to know how to use matrices in Android, so I'm guessing this won't be a problem.
So - let's say we've got two meshes (where a mesh is the thing that you can draw independently to the screen): UpperLeg and LowerLeg.
For UpperLeg you're going to have the point where the mesh rotates (RotationPoint) (in the real world I guess this would be the hip) and you're going to have the point where the LowerLeg attaches to it (AttachmentPoint) (in the real world I guess this would be the knee).
For LowerLeg you're going to have the point where the mesh rotates (RotationPoint) (in the real world I guess this would be the knee).
UpperLeg.AttachmentPoint = LowerLeg.RotationPoint (that way your leg won't fall off).
Let's now imagine that you've got two amounts of rotation (one for UpperLeg and one for LowerLeg): UpperLeg.Rotation and LowerLeg.Rotation.
(On the subject of rotation: if you haven't heard of quaternions, you should look them up - it amazes me that some guy back in 1843 (Hamilton) came up with these - they basically encapsulate the concept of rotation, can be turned into rotation matrices, can be combined (by multiplication) and don't suffer from gimbal lock).
First up, you rotate UpperLeg by:
Moving the UpperLeg mesh so that UpperLeg.RotationPoint is the origin
Rotating by UpperLeg.Rotation
Moving the UpperLeg mesh so that it is where it needs to be in the real world.
I see that you're doing this.
So for the LowerLeg it'd be:
Moving the LowerLeg mesh so that LowerLeg.RotationPoint is the origin
Rotating by (UpperLeg.Rotation combined with LowerLeg.Rotation)
Moving the LowerLeg mesh by the same amount that the UpperLeg mesh was moved by in step 3
Moving the LowerLeg mesh by the Vector which is (the Vector from UpperLeg.RotationPoint to UpperLeg.AttachmentPoint) rotated by UpperLeg.Rotation.
The above steps can be combined and optimized.
Essentially I'm saying:
Rotate LowerLeg as it needs to be rotated, then shove it where it needs to go - where it needs to go will be determined by where UpperLeg went, plus how you get to where LowerLeg is attached to UpperLeg.
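The steps above can be sketched in 2D (hypothetical names; angles in radians, positive = counterclockwise):

```python
import math

def rot(v, angle):
    # rotate 2D vector v about the origin
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def place_lower_leg_point(p, lower_rotation_point,
                          upper_rotation, lower_rotation,
                          upper_rotation_point, attachment_offset):
    # steps 1-2: rotate about LowerLeg.RotationPoint by the combined rotation
    local = (p[0] - lower_rotation_point[0], p[1] - lower_rotation_point[1])
    rotated = rot(local, upper_rotation + lower_rotation)
    # steps 3-4: move to wherever UpperLeg put the attachment point;
    # attachment_offset is the rest-pose vector from hip to knee
    attach = rot(attachment_offset, upper_rotation)
    knee_world = (upper_rotation_point[0] + attach[0],
                  upper_rotation_point[1] + attach[1])
    return (knee_world[0] + rotated[0], knee_world[1] + rotated[1])
```

With both rotations at 90 degrees, a foot hanging straight down ends up to the side and above the knee, as you'd expect from a leg folded twice.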
This is my first post here so, if I've broken any cardinal rules, please let me know what they are.
In OpenGL ES 1, I have a Rubik's Cube that consists of 27 smaller cubes. I want a rotation which causes a particular small cube to end up exactly in front of the viewpoint. So I need two vectors: one is the vector that goes from the origin of the object to that particular cube, and the other is the vector that goes from the origin to the viewpoint. Then the cross product of them gives me the axis of the rotation and the dot product gives me the angle.
But I can't convert (0,0,1) - which is the vector that goes from the origin to the viewpoint in world coordinates - to object coordinates.
How can I do that? How can I convert world coordinates to object coordinates?
It's easier to rotate the camera around than to rotate the object in front of a stationary camera.
You can do what you asked for by placing the camera at the origin (center) of the Rubik's Cube, giving it the direction opposite from the small cube, and then translating backwards along z.
I know it doesn't answer the question in the title, but I think it's a simpler solution. (As for your question, I keep world and object coordinates the same, and set the object scale as needed when rendering.)
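To answer the title question directly, a sketch under the assumption that the model matrix is a rigid rotation R plus translation t: the inverse of such a matrix is Rᵀ together with -Rᵀt, so a world-space direction like (0,0,1) converts to object space by multiplying with Rᵀ (helper names are illustrative):

```python
def world_dir_to_object(R, d):
    # R: 3x3 rotation part of the model matrix (object -> world).
    # For a pure rotation the inverse is the transpose, so multiply by R^T.
    return tuple(sum(R[i][j] * d[i] for i in range(3)) for j in range(3))

def world_point_to_object(R, t, p):
    # p_object = R^T * (p_world - t), where t is the model translation
    d = (p[0] - t[0], p[1] - t[1], p[2] - t[2])
    return world_dir_to_object(R, d)
```

With the view direction converted this way, the cross and dot products from the question can be computed entirely in object space.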
I am having trouble rotating my 3D objects in OpenGL. I start each draw frame by loading the identity (glLoadIdentity()) and then I push and pop on the stack according to what I need (for the camera, etc.). I then want 3D objects to be able to roll, pitch and yaw, and have them displayed correctly.
Here is the catch... I want to be able to do incremental rotations as if I were flying an airplane. So every time the up button is pushed, the object rotates around its own x axis. But then, if the object is pitched down and chooses to yaw, the rotation should be around the object's up vector and not the Y axis.
I've tried doing the following:
glRotatef(pitchTotal, 1,0,0);
glRotatef(yawTotal, 0,1,0);
glRotatef(rollTotal, 0,0,1);
and those don't seem to work. (Keeping in mind that the vectors are being computed correctly.) I've also tried...
glRotatef(pitchTotal, 1,0,0);
glRotatef(yawTotal, 0,1,0);
glRotatef(rollTotal, 0,0,1);
and I still get weird rotations.
Long story short... what is the proper way to rotate a 3D object in OpenGL using the object's look, right and up vectors?
You need to do the yaw rotation (around Y) before you do the pitch one. Otherwise, the pitch will be off.
E.g. you have 45 degrees of downward pitch and 180 degrees of yaw. By doing the pitch first and then rotating the yaw around the airplane's Y vector, the airplane would end up pointing up and backwards despite the pitch being downwards. By doing the yaw first, the plane points backwards; then the pitch around the plane's X vector will make it point downwards correctly.
The same logic applies for roll, which needs to be applied last.
So your code should be :
glRotatef(yawTotal, 0,1,0);
glRotatef(pitchTotal, 1,0,0);
glRotatef(rollTotal, 0,0,1);
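A quick numeric check of the reasoning above (a sketch using the OpenGL column-vector convention with forward = (0,0,-1); the glRotatef sequence yaw-then-pitch corresponds to the matrix product Ry * Rx):

```python
import math

def rx(t):
    # rotation about the X axis (pitch)
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def ry(t):
    # rotation about the Y axis (yaw)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def mul_m(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mul_v(M, v):
    return tuple(sum(M[i][k] * v[k] for k in range(3)) for i in range(3))

forward = (0.0, 0.0, -1.0)              # OpenGL-style forward direction
yaw, pitch = math.pi, -math.pi / 4      # 180 deg yaw, 45 deg nose-down pitch

# yaw first: pitch then happens around the plane's own X axis
good = mul_v(mul_m(ry(yaw), rx(pitch)), forward)
# pitch first: yaw drags the nose around the world Y axis instead
bad = mul_v(mul_m(rx(pitch), ry(yaw)), forward)
```

Here good ends up with a negative y component (nose down, facing backwards), while bad points up and backwards, exactly as described in the answer.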
Cumulative rotations will suffer from gimbal lock. Look at it this way: suppose you are in an aeroplane, flying level. You apply a yaw of 90 degrees anticlockwise. You then apply a roll of 90 degrees clockwise. You then apply a yaw of 90 degrees clockwise.
Your plane is now pointing straight downward — the total effect is a pitch of 90 degrees clockwise. But if you just tried to add up the different rotations then you'd end up with a roll of 90 degrees, and no pitch whatsoever because you at no point applied pitch to the plane.
Trying to store and update rotation as three separate angles doesn't work.
Commonly cited solutions are to use a quaternion or to store the object orientation directly as a matrix. The matrix solution is easier to build because you can prototype it with OpenGL's built-in matrix stacks. Most people also seem to find matrices easier to understand than quaternions.
So, assuming you want to go matrix, your prototype might do something like (please forgive my lack of decent Java knowledge; I'm going to write C essentially):
GLfloat myOrientation[16];
// to draw the object:
glMultMatrixf(myOrientation);
/* drawing here */
// to apply roll, assuming the modelview stack is active:
glPushMatrix(); // backup what's already on the stack
glLoadIdentity(); // start with the identity
glRotatef(angle, 0, 0, 1);
glMultMatrixf(myOrientation); // premultiply the current orientation by the roll
// update our record of orientation
glGetFloatv(GL_MODELVIEW_MATRIX, myOrientation);
glPopMatrix();
You possibly don't want to use the OpenGL stack in shipping code because it's not really built for this sort of use and so performance may be iffy. But you can prototype and profile rather than making an assumption. You also need to consider floating point precision problems — really you should be applying a step that ensures myOrientation is still orthonormal after it has been adjusted.
It's probably easiest to check Google for that, but briefly speaking, you'll use the dot product to remove erroneous crosstalk from two of the axes into the third, then remove the crosstalk from one of the first two axes into the second, then renormalise all three.
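A minimal sketch of that renormalisation step (Gram-Schmidt on the three basis vectors of the 3x3 rotation part; helper names are illustrative):

```python
import math

def orthonormalize(R):
    # R is 3x3; rows are treated as the object's basis vectors
    def dot(a, b): return sum(x * y for x, y in zip(a, b))
    def sub(a, b, f): return [x - f * y for x, y in zip(a, b)]
    def norm(a):
        n = math.sqrt(dot(a, a))
        return [x / n for x in a]
    x = norm(R[0])
    y = norm(sub(R[1], x, dot(R[1], x)))      # remove crosstalk from x
    z = norm(sub(sub(R[2], x, dot(R[2], x)),  # remove crosstalk from x...
                 y, dot(R[2], y)))            # ...and from y, then renormalise
    return [x, y, z]
```

Running this every few frames keeps the accumulated floating-point drift from gradually skewing the orientation matrix.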
Thanks for the responses. The first response pointed me in the right direction, and the second helped a little too, but ultimately it boiled down to a combination of both. Initially, your 3D object should have a member variable which is a float array of size 16 [0-15]. You then have to initialize it to the identity matrix. Then the member methods of your 3D object, like yawObject(float amount), just know that you are yawing the object from "the object's point of view" and not the world's, which allows the incremental rotation. Inside the yawObject method (or the pitch/roll methods) you need to call Matrix.rotateM(myfloatarray, 0, angle, 0, 1, 0). That will store the new rotation matrix (as described in the first response). Then, when you are about to draw your object, multiply the model matrix by the myfloatarray matrix using gl.glMultMatrixf.
Good luck and let me know if you need more information than that.