So here's the problem overview: I render a number of models using OpenGL ES in GvrView using the GvrView.StereoRenderer and I want to determine which exact model I'm looking at.
My idea is to reproject the screen coordinates back into model space (discarding Z) and check whether the point (let's call it Point) is in the range:
(ModelsMinX < Point.x < ModelsMaxX) and (ModelsMinY < Point.y < ModelsMaxY).
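In code, the check I have in mind is nothing more than this (just a sketch; the names are placeholders from the description above):
// Sketch of the intended hit test: 'point' is the unprojected screen point
// in model space, the min/max values are the model's bounding rectangle.
boolean isLookingAt(float[] point,
                    float modelsMinX, float modelsMaxX,
                    float modelsMinY, float modelsMaxY) {
    return modelsMinX < point[0] && point[0] < modelsMaxX
            && modelsMinY < point[1] && point[1] < modelsMaxY;
}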
I was trying to use GLU.gluUnProject to get the initial coordinates. This function requires the current viewport and that's where the problems begin:
GvrView.StereoRenderer has a method .onDrawEye, which is called whenever something specific to one eye needs to be set up before rendering (i.e. the view and projection matrices should be acquired from the Eye instance). An Eye also has a method .getViewport, which is supposed to return the viewport for the current eye; however, the returned result is completely unclear to me. More specifically, I'm developing on a Nexus 6 (1440x2560 pixels) and .getViewport returns:
x = 0, y = 0, width = 1280, height = 1440 // for the first eye
x = 1280, y = 0, width = 1280, height = 1440 // for the second eye.
Now this is interesting. Somehow I had assumed two things about the current viewport:
width = 1440, height = 1280 (we are in the landscape mode after all);
the viewport size for each eye will be half the size of the whole viewport.
Hence, calling .gluUnProject on the middle point of the viewport:
GLU.gluUnProject(viewport.width / 2, viewport.height / 2, 0, mEyeViewMatrix, 0, mEyeProjectionMatrix, 0, new int[] {viewport.x, viewport.y, viewport.width, viewport.height}, 0, center, 0);
does not yield the expected results; in fact, it gives me all 0s. I found this question (Determining exact eye view size), but that asker gets even stranger viewport values and it doesn't contain an answer.
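For reference, here is a fuller sketch of that call as I understand it. Android's GLU.gluUnProject seems to return a success flag and a 4-component result that still needs the perspective divide, and it expects winX/winY in window coordinates rather than viewport-relative ones, so any of that may be part of my problem:
// Sketch only: check the return value and divide by w
// (my understanding of android.opengl.GLU.gluUnProject, not verified against the GVR matrices).
float[] unprojected = new float[4];
int[] vp = {viewport.x, viewport.y, viewport.width, viewport.height};
int ok = GLU.gluUnProject(
        viewport.x + viewport.width / 2f,   // window-space x of the eye viewport centre
        viewport.y + viewport.height / 2f,  // window-space y
        0f,                                 // winZ = 0 -> near clip plane
        mEyeViewMatrix, 0, mEyeProjectionMatrix, 0, vp, 0, unprojected, 0);
if (ok == GL10.GL_TRUE && unprojected[3] != 0f) {
    float x = unprojected[0] / unprojected[3];
    float y = unprojected[1] / unprojected[3];
    float z = unprojected[2] / unprojected[3];
    // (x, y, z) should now be the world-space point on the near plane.
}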
So the question is: how do I get from the 'eye'-space coordinates to the model space? And what are those coordinates, anyway?
Here's the project's github for reference: https://github.com/bowlingforsoap/CardboardDataVisualizationJava.
Some other approaches I'm aware of:
In the treasurehunt demo they use the opposite way of doing things: they go from a model coordinate (0, 0, 0, 1) into head view space using the headView matrix from the HeadTransform (look for the method .isLookingAtObject in https://github.com/googlevr/gvr-android-sdk/blob/master/samples/sdk-treasurehunt/src/main/java/com/google/vr/sdk/samples/treasurehunt/TreasureHuntActivity.java); a rough sketch of that check follows this list.
Using raycasting. I'm not sure this is going to help my cause, because after I get the observed object I would like to create a 'floating' activity which is going to contain information about it (I certainly don't want to render that data through shaders).
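For reference, the treasure hunt check (approach 1 above) boils down to roughly this (paraphrased from the sample; the 0.12 limits are my recollection of theirs, and headView comes from HeadTransform.getHeadView in onNewFrame):
// Transform the model origin into head/eye space and test how far it is off the view axis.
private static final float PITCH_LIMIT = 0.12f;
private static final float YAW_LIMIT = 0.12f;

private boolean isLookingAtObject(float[] headView, float[] modelMatrix) {
    float[] initVec = {0f, 0f, 0f, 1f};
    float[] objPositionVec = new float[4];
    float[] modelView = new float[16];

    // Convert object space to camera space using the head view (android.opengl.Matrix).
    Matrix.multiplyMM(modelView, 0, headView, 0, modelMatrix, 0);
    Matrix.multiplyMV(objPositionVec, 0, modelView, 0, initVec, 0);

    float pitch = (float) Math.atan2(objPositionVec[1], -objPositionVec[2]);
    float yaw = (float) Math.atan2(objPositionVec[0], -objPositionVec[2]);

    return Math.abs(pitch) < PITCH_LIMIT && Math.abs(yaw) < YAW_LIMIT;
}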
Whew! That's a lot of text. But it seems like a generic problem, yet I haven't found an easy/elegant/working solution to it. I would appreciate any feedback.
I'm hoping someone can help me out. I'm making an image manipulation app, and I found I needed a better way to load in large images.
My plan is to iterate through the "hypothetical" pixels of an image (a for loop covering the width/height of the base image, so each iteration represents a pixel), scale/translate/rotate each pixel's position relative to the view, and then use that information to determine which pixels are being displayed in the view itself. I can then use a combination of BitmapRegionDecoder and BitmapFactory.Options to load in only the section of the image that the output actually needs, rather than the full (even if scaled) image.
So far I seem to have covered image scale and translation properly, but I can't figure out how to calculate rotation. Since it's not a real Bitmap pixel I can't use Matrix.rotate =( Here are the image transformations in the onDraw of the view; imgPosX and imgPosY hold the center point of the image:
m.setTranslate(-userImage.getWidth() / 2.0f, -userImage.getHeight() / 2.0f);
m.postScale(curScale, curScale);
m.postRotate(angle);
m.postTranslate(imgPosX, imgPosY);
mCanvas.drawBitmap(userImage.get(), m, paint);
and here is the math so far for how I'm trying to determine whether an image's pixel is on the screen:
for (int j = 0; j < imageHeight; j++) {
    for (int i = 0; i < imageWidth; i++) {
        // image starts completely centered in the view; assume image is original size for simplicity
        // this is the original starting position for each pixel
        int x = Math.round(((float) viewSizeWidth / 2.0f) - ((float) newImageWidth / 2.0f) + i);
        int y = Math.round(((float) viewSizeHeight / 2.0f) - ((float) newImageHeight / 2.0f) + j);

        // first we scale the pixel here, easy operation
        x = Math.round(x * imageScale);
        y = Math.round(y * imageScale);

        // now we translate, by determining how many pixels
        // our image's x/y coordinates have differed from its original
        // starting point; imgPosX and imgPosY start at the center of the view
        x = x + Math.round((imgPosX - ((float) viewSizeWidth / 2.0f)));
        y = y + Math.round((imgPosY - ((float) viewSizeHeight / 2.0f)));

        // TODO need rotation here
    }
}
so, assuming my math up until rotation is correct (probably not, but it appears to be working so far), how would I then calculate the rotation from that pixel's position? I've tried other similar questions like:
Link 1
Link 2
Link 3
Without using rotation, the pixels I expect to actually be on the screen are represented (I made a text file that outputs the results in 1's and 0's so I can have a visual representation of what's on the screen), but with the formula found in those questions the information isn't what is expected. (Scenario: I've rotated an image so only the top-left corner is visible in the view. Using the info from Here to rotate the pixel, I would expect to see a triangular set of 1's in the upper-left corner of the output file, but that's not the case.)
So, how would I calculate a pixel's position after rotation without using the Android Matrix, and still get the same results?
And if I've just messed it up entirely, my apologies =( Any help would be appreciated; this project has gone on for so long and I want to finally be done lol
If you need any more information I will provide as much as I possibly can =) Thank you for your time
I realize this question is particularly difficult so I will be posting a bounty as soon as SO allows.
You do not need to create your own matrix math; use the existing Matrix.
http://developer.android.com/reference/android/graphics/Matrix.html
You can map bitmap coordinates to screen coordinates by using
float[] coords = {x, y};
m.mapPoints(coords);
float sx = coords[0];
float sy = coords[1];
If you want to map screen to bitmap coordinates, you can create the inverse matrix
Matrix inverse = new Matrix();
m.invert(inverse);
inverse.mapPoints(...)
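For example, to work out which part of the bitmap is actually visible (say, to feed BitmapRegionDecoder), you could map the view's corners through the inverse matrix, roughly like this (a sketch; viewWidth/viewHeight and the bitmap dimensions are whatever your view already has):
// Map the four view corners back into bitmap space and take their bounding box;
// intersect with the bitmap bounds to get the visible region.
Matrix inverse = new Matrix();
m.invert(inverse);

float[] corners = {
        0, 0,
        viewWidth, 0,
        viewWidth, viewHeight,
        0, viewHeight
};
inverse.mapPoints(corners);

RectF visible = new RectF(corners[0], corners[1], corners[0], corners[1]);
for (int i = 2; i < corners.length; i += 2) {
    visible.union(corners[i], corners[i + 1]);
}

// Round outwards and clamp to the bitmap before handing the region to BitmapRegionDecoder.
Rect region = new Rect();
visible.roundOut(region);
region.intersect(0, 0, bitmapWidth, bitmapHeight);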
I think your overall approach is going to be slow, as doing the pixel manipulation on the CPU from Java has a lot of overhead. When drawing bitmaps normally, the pixel manipulation is done on the GPU.
I'm using OpenGL ES 2.0 on Android and I initialise my display like so:
float ratio = (float) width / height;
Matrix.orthoM(mProjMatrix, 0, -ratio, ratio, -1, 1, 3, 7); //Using Orthographic as developing 2d
What I'm having trouble understanding is this:
Let's say my app is a 'fixed screen' game (like Pac-Man ie, no scrolling, just the whole game visible on the screen).
Now at the moment, if I draw a quad at -1 to +1 on both x and y I get something like this:
Obviously, this is because I am setting -ratio, ratio as seen above. So this is correct.
But am I supposed to use this as my 'whole' screen? With rather massive letterboxing on the left and right?
I want a rectangular display that is the whole height of the physical display (and as much of the width as possible), but this would mean drawing at less than -1 and more than +1. Is this a problem?
I realise the option may be to use clipping if this was a scrolling game, but for this particular scenario I want the whole 'game board' on the screen and to be static (And to use as much of the available screen real estate as possible without 'stretching' thus causing elongation of my sprites).
As I like to work with 0,0 as the top of the screen, basically what I do is pass my draw method something like so:
quad1.drawQuad (10,0);
When the drawQuad method gets this, it basically takes the range from left to right as expressed by OpenGL and divides it by the screen width (so, in my case -1.7 through +1.7, i.e. 3.4/2560 = 0.001328125). So if I specify 10 as my X (as above), it does something like:
-1.7 + (10*0.001328125) = -1.68671875
It then plots the quad at -1.68671875.
Doing this I am able to work with normal coordinates (and I just subtract rather than add for the y axis so I can have 0 at the top).
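In code, the conversion drawQuad performs is essentially this (just a sketch of the idea; ratio is the value passed to orthoM, and screenWidth/screenHeight are the surface size in pixels):
// Sketch of the pixel -> GL coordinate mapping described above.
float glPerPixelX = (2f * ratio) / screenWidth;   // e.g. 3.4 / 2560 = 0.001328125
float glPerPixelY = 2f / screenHeight;            // -1..1 over the full height

float glX = -ratio + pixelX * glPerPixelX;        // pixelX = 10 gives -1.68671875 as above
float glY = 1f - pixelY * glPerPixelY;            // subtract so 0 is at the top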
Is this a good way to do things?
Because with this method, at the moment, if I specify a 100,100 square, it isn't a square, it's a rectangle. However, on the plus side, I can fill the whole physical screen by scaling the quad by width x height.
You are drawing a 1x1 quad, so that is why you see a 1x1 quad. Try translating the quad 0.25 to the right or left and you will see that you can draw in that space too.
In graphics, you create an object, like a quad, in your case you made it 1x1. Then you position it wherever you want. If you do not position it, then it will be at the origin, which is what you see.
If you draw a wider shape, you will also see you can draw outside this area on the screen.
By the way, with your ortho matrix function, you aren't just specifying the screen aspect ratio, you are also specifying the coordinate unit size you have to work with. This is why a 1x1 is filling the height of the screen, because your upper and lower boundaries are set to 1 and -1. Your ratio is a little more than one, since your width is longer than your height, so your left and right boundaries are essentially something like -1.5 and 1.5 (whatever your ratio happens to be).
But you can also do something like this:
Matrix.orthoM(mProjMatrix, 0, -width/2, width/2, -height/2, height/2, 3, 7);
Here, your ratio is the same, but you are sending it to your ortho projection with screen coordinates. (Disclaimer: I don't use the same math library you do, but this appears to be a conventional ortho matrix function based on the arguments you are passing to it.)
So let's say you have a 1000x500 pixel resolution. In OpenGL your origin of 0,0 is in the middle. So now your left edge is at (-500,y), right edge at (500,y) and your top is (x,250). So if you draw your 1x1 quad, it will be very tiny, but if you draw a 250x250 square, it will look like your 1x1 quad in your previous ortho projection.
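And if you would rather have 0,0 at the top-left in pixel units (which the question above mentions wanting), you can bake that into the same call, something like:
// Sketch: pixel coordinates with the origin at the top-left corner,
// so x grows to the right and y grows downwards, matching screen pixels.
Matrix.orthoM(mProjMatrix, 0,
        0, width,      // left, right
        height, 0,     // bottom, top (flipped so y = 0 is at the top)
        3, 7);         // same near/far planes as before
// Now a quad drawn from (0,0) to (100,100) is a 100x100-pixel square
// anchored at the top-left of the screen.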
So you can specify the coordinates you want, the ratio, the unit size, etc., for how you want to work. Personally, I don't like specifying coordinates as fractions between 0 and 1; I like to think about them in the same sense as the screen pixels.
But whether or not you choose to do this, hopefully you understand what you are actually passing to these matrix functions.
One of the best ways to learn is to draw an object to the screen and just play around with the different numbers you send to your modelview and projection matrices so you can see what they are actually doing.
I have gone through so many tutorials and also implemented some small apps in OpenGL. Still, I am confused about the mapping of the OpenGL coordinate system to the Android view coordinate system.
I faced the problem while trying to display a texture full screen. By trial and error I was eventually able to show the texture full screen, but I have so many doubts that I could not proceed quickly.
In OpenGL the coordinate system has its origin at the bottom-left, whereas on the device the origin is at the top-left. How are things mapped correctly to the device?
In OpenGL we specify vertices in the range -1 to 1. How does this range map to the device, where coordinates range from 0 to width and height?
Can vertices be mapped exactly the same way as device coordinates, so that a vertex at 0,100 maps to device coordinate 0,100?
While trying to show the texture full screen, I changed the code according to some blogs and it worked. Here are the changes:
glOrtho(0, width, height, 0, -1, 1); // changed from: glOrtho(0, width, 0, height, -1, 1);

and

vertices[] = {
    0, 0,
    width, 0,
    width, height,
    0, height
};

// changed from:

vertices[] = {
    -1, -1,
     1, -1,
    -1,  1,
     1,  1
};
Please help me understand the coordinate mapping.
When you set glOrtho to the width and the height, OpenGL is going to stretch that to fit the device you are using. Say your width = 320 and height = 480: when you call glOrtho(0, width, height, 0, 1, -1), OpenGL stretches that range to fit your screen, so the coordinates can be whatever you want them to be, simply by setting the width and height you pass to glOrtho().
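In a renderer that looks roughly like this (a sketch for GLES 1.x; width and height are the values passed to onSurfaceChanged):
// Set up a top-left-origin pixel coordinate system that matches the surface size,
// so a full-screen quad is simply (0,0)-(width,height).
@Override
public void onSurfaceChanged(GL10 gl, int width, int height) {
    gl.glViewport(0, 0, width, height);
    gl.glMatrixMode(GL10.GL_PROJECTION);
    gl.glLoadIdentity();
    gl.glOrthof(0, width, height, 0, -1, 1);  // y flipped: origin at the top-left
    gl.glMatrixMode(GL10.GL_MODELVIEW);
    gl.glLoadIdentity();
}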
I am working on an Android Application in which a 3d scene is displayed and the user should be able to select an area by clicking/tapping the screen. The scene is pretty much a planar (game) board on which different objects are placed.
Now, the problem is how do I get the clicked area on the board from the actual screen-space coordinates?
I was planning on using gluUnProject(), as I have access to (almost) all the necessary parameters. Unfortunately I am missing the winZ param, and cannot get the current depth as the touch event is occurring in a different thread than the GL-thread.
My new plan is to still use gluUnProject, but with a winZ of 0, and then project the resulting point onto the board (the board stretches from 0,0,0 to 10,0,10 in model space). However, I can't seem to figure out how to do this.
I would be very happy if anyone could help me out with the maths needed to do this (matrices were never my strongest side), or perhaps find a better solution.
To clarify, here is an image of what I want to do:
The red rectangle represents the device screen, the green x is the touch event and the black square is the board (grey subdivisions represent a square of one unit). I need to figure out where on the board the touch happened (in this case it is in square 1,1).
As you are basically working in 2D already (I presume you mean your 3D board stretches from 0,0,0 to 10,10,0 (x,y,z)), you could translate and interpolate/extrapolate the 2D/3D space coordinates from your screen space coordinates without gluUnProject(). You will need your screen resolution, and to pick the resolution of the 3D space grid you wish to convert to.

If both the screen and 3D space origins are aligned (0,0 screen space is at 0,0,0 in 3D space), your screen dimensions are 320x240, and you use your existing 10x10 3D grid, then 320/10 = 32 and 240/10 = 24, so the screen-space size of a single 1x1 area is 32x24. If the user presses at 162, 40, then the user is pressing within (5, 1, 0) (162/32 >= 5 but < 6, 40/24 >= 1 but < 2) in the 3D space.

If you need greater resolution than this, you can select a higher 3D space grid resolution (i.e. using 20 instead of 10). You don't need to update the GL matrix to use this factor. Though it may make things simpler in some ways, I'm sure from a modeling perspective you would have additional work to do. Just be aware that with a factor like 20, 1,3 would be at (.5, 1.5, 0).

If your screen and 3D space origins are not already aligned, you will need to translate the screen space coordinate prior to this. If 0,0 in screen space corresponds to 10,10,0, take your screen resolution and subtract the current point from it, making 0,0 into 320, 240 in this example; our example point of 162, 40 would become 158, 200 (since 320 - 162 == 158 and 240 - 40 == 200).
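A sketch of that conversion with the example numbers above (320x240 screen, 10x10 board, origins aligned; touchX/touchY are the raw touch coordinates):
// Map a screen touch straight onto the 10x10 board grid,
// assuming the board fills the screen and both origins are aligned.
int boardCols = 10, boardRows = 10;
float cellWidth  = screenWidth  / (float) boardCols;  // 320 / 10 = 32
float cellHeight = screenHeight / (float) boardRows;  // 240 / 10 = 24

int boardX = (int) (touchX / cellWidth);   // 162 / 32 -> 5
int boardY = (int) (touchY / cellHeight);  // 40  / 24 -> 1
// Touch (162, 40) lands in board square (5, 1).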
If you'd like an overview of the projection matrix and how that all works, which could help you understand where to put the screen space dimensions in the unproject matrix, read this chapter of the OpenGL red book. http://www.glprogramming.com/red/chapter03.html
Hope this helps, and good luck!
So, I managed to solve this by doing the following:
float[] clipPoint = new float[4];
int[] viewport = new int[]{0, 0, width, height};
//screenY/screenX are the screen-coordinates, y should be flipped:
screenY = viewport[3] - screenY;
//Calculate a z-value appropriate for the far clip:
float dist = 1.0f;
float z = (1.0f/clip[0] - 1.0f/dist)/(1.0f/clip[0]-1.0f/clip[1]);
//Use gluUnProject to create a 3d point in the far clip plane:
GLU.gluUnProject(screenX, screenY, z, vMatrix, 0, pMatrix, 0, viewport, 0, clipPoint, 0);
//Get a point representing the 'camera':
float eyeX = lookat[0] + eyeOffset[0];
float eyeY = lookat[1] + eyeOffset[1];
float eyeZ = lookat[2] + eyeOffset[2];
//Do some magic to calculate where the line between clipPoint and eye/camera would intersect the y-plane:
float dX = eyeX - clipPoint[0];
float dY = eyeY - clipPoint[1];
float dZ = eyeZ - clipPoint[2];
float resX = glu[0] - (dX/dY)*glu[1];
float resZ = glu[2] - (dZ/dY)*glu[1];
//resX and resZ is the wanted result.
I am taking a screenshot with glReadPixels to perform a "cross-over" effect between two images.
On the Marmalade SDK simulator, the screenshot is taken just fine and the "cross-over" effect works a treat:
However, this is how it looks on iOS and Android devices - corrupted:
I always read the screen as RGBA 1 byte/channel, as the documentation says it's ALWAYS accepted.
Here is the code used to take the screenshot:
uint8* Gfx::ScreenshotBuffer(int& deviceWidth, int& deviceHeight, int& dataLength) {
    /// width/height
    deviceWidth = IwGxGetDeviceWidth();
    deviceHeight = IwGxGetDeviceHeight();
    int rowLength = deviceWidth * 4; /// data always returned by GL as RGBA, 1 byte/each
    dataLength = rowLength * deviceHeight;

    // set pixel storage alignment for the read
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);

    uint8* buffer = new uint8[dataLength];
    glReadPixels(0, 0, deviceWidth, deviceHeight, GL_RGBA, GL_UNSIGNED_BYTE, buffer);

    return buffer;
}

void Gfx::ScreenshotImage(CIwImage* img, uint8*& pbuffer) {
    int deviceWidth, deviceHeight, dataLength;
    pbuffer = ScreenshotBuffer(deviceWidth, deviceHeight, dataLength);

    img->SetFormat(CIwImage::ABGR_8888);
    img->SetWidth(deviceWidth);
    img->SetHeight(deviceHeight);
    img->SetBuffers(pbuffer, dataLength, 0, 0);
}
That is a driver bug. Simple as that.
The driver got the pitch of the surface in the video memory wrong. You can clearly see this in the upper lines. Also the garbage you see at the lower part of the image is the memory where the driver thinks the image is stored but there is different data there. Textures / Vertex data maybe.
And sorry, I know of no way to fix that. You may have better luck with a different surface-format or by enabling/disabling multisampling.
In the end, it was a lack of memory. The "new uint8[dataLength];" never returned a valid pointer, so the whole process produced corrupted output.
TomA, your idea of clearing the buffer actually helped me to solve the problem. Thanks.
I don't know about Android or the SDK you're using, but on iOS when I take a screenshot I have to make the buffer the size of the next POT texture, something like this:
int x = NextPot((int)screenSize.x*retina);
int y = NextPot((int)screenSize.y*retina);
void *buffer = malloc( x * y * 4 );
glReadPixels(0,0,x,y,GL_RGBA,GL_UNSIGNED_BYTE,buffer);
The function NextPot just gives me the next POT size, so if the screen size was 320x480, the x,y would be 512x512.
Maybe what you're seeing is the wrap-around of the buffer because it's expecting a bigger buffer size?
Also, this could be the reason it works in the simulator and not on the device: my graphics card doesn't have the POT size limitation, and I get a similar (weird-looking) result.
What I assume is happening is that you are trying to use glReadPixels on the window that is covered. If the view area is covered, then the result of glReadPixels is undefined.
See How do I use glDrawPixels() and glReadPixels()? and The Pixel Ownership Problem.
As said here: "The solution is to make an offscreen buffer (FBO) and render to the FBO."
Another option is to make sure the window is not covered when you use glReadPixels.
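If you go the FBO route, a rough sketch in GLES 2.0 terms (Java/Android style here; the Marmalade C++ calls would be analogous), just to show the shape of it:
// Render into an offscreen FBO and read pixels from it,
// so the result no longer depends on window/pixel ownership.
int[] fbo = new int[1], tex = new int[1];
GLES20.glGenFramebuffers(1, fbo, 0);
GLES20.glGenTextures(1, tex, 0);

GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, tex[0]);
GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_RGBA, width, height,
        0, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, null);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
        GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_NEAREST);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
        GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_NEAREST);

GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, fbo[0]);
GLES20.glFramebufferTexture2D(GLES20.GL_FRAMEBUFFER, GLES20.GL_COLOR_ATTACHMENT0,
        GLES20.GL_TEXTURE_2D, tex[0], 0);
// (check glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE before drawing)

// ... draw the scene here ...

ByteBuffer pixels = ByteBuffer.allocateDirect(width * height * 4)
        .order(ByteOrder.nativeOrder());
GLES20.glReadPixels(0, 0, width, height,
        GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, pixels);

GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, 0);  // back to the default target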
I am taking screenshots of my Android game without any problems on an Android device using glReadPixels.
I am not sure yet what the problem is in your case; I need more information.
So let's start:
I would recommend that you not specify the PixelStore alignment. I am worried about your 1-byte alignment; do you really use it, and do you know what it does? It seems you get exactly what you specify: one extra byte (look at your image, there is one extra pixel all the time!) instead of a fully packed image. So try to remove this:
glPixelStorei(GL_UNPACK_ALIGNMENT, 1);
glPixelStorei(GL_PACK_ALIGNMENT, 1);
I am not sure about the C code, as I have only worked in Java, but this looks like a possible problem point:
// width/height
deviceWidth = IwGxGetDeviceWidth();
deviceHeight = IwGxGetDeviceHeight();
Are you getting the device size? You should use your OpenGL surface size, like this:
public void onSurfaceChanged(GL10 gl, int width, int height) {
    int surfaceWidth = width;
    int surfaceHeight = height;
}
What are you doing next with the captured image? Are you aware that the memory block you get from OpenGL is RGBA, but non-OpenGL image operations generally expect ARGB?
For example, here in your code you expect alpha to be the first channel, not the last:
img->SetFormat(CIwImage::ABGR_8888);
If 1, 2 and 3 did not help, you might want to save the captured screen to the phone's SD card to examine later. I have a program that converts an OpenGL RGBA block to a normal bitmap for examination on a PC. I may share it with you.
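The conversion is essentially this kind of repacking (a Java sketch, not the exact program I mentioned):
// Turn a glReadPixels RGBA buffer into an Android Bitmap for saving.
// glReadPixels returns rows bottom-up, so the image is flipped vertically here.
Bitmap rgbaToBitmap(byte[] rgba, int width, int height) {
    int[] pixels = new int[width * height];
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int i = (y * width + x) * 4;
            int r = rgba[i]     & 0xFF;
            int g = rgba[i + 1] & 0xFF;
            int b = rgba[i + 2] & 0xFF;
            int a = rgba[i + 3] & 0xFF;
            // Flip vertically and repack as ARGB, which Bitmap expects.
            pixels[(height - 1 - y) * width + x] =
                    (a << 24) | (r << 16) | (g << 8) | b;
        }
    }
    return Bitmap.createBitmap(pixels, width, height, Bitmap.Config.ARGB_8888);
}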
I don't have a solution for fixing glReadPixels. My suggestion is that you change your algorithm to avoid the need to read the data back from the screen.
Take a look at this page. These guys have done a page flip effect all in Flash. It's all in 2D, the illusion is achieved just with shadow gradients.
I think you can use a similar approach, but a little better in 3D. Basically you have to split the effect into three parts: the front-facing top page (clouds), the bottom page (the girl), and the back side of the front page. You have to draw each part separately. You can easily draw the front-facing top page and the bottom page together on the same screen; you just need to invoke the drawing code for each with a preset clipping region that is aligned with the split line where the top page bends. After you have the top and bottom page sections drawn, you can draw the gray back-facing portion on top, also aligned to the split line.
With this approach the only thing you lose is the slight deformation where the clouds image starts to bend up; of course, no such deformation will occur with my method. Hopefully that will not diminish the effect; I think the shadows are far more important for giving the depth effect and will hide this minor inconsistency.