OpenGL/Android big sprite quads bad performance


I am developing a game on Android using OpenGL and am having a performance problem.
Let's say, for example, that I want to draw a background partially filled with grass "bushes". The bushes have different x, y, z positions, different sizes and so on (each bush is a 2D sprite), and they can partially hide each other (I use a perspective camera). I have a big performance problem when those sprites are big (i.e. the quad sizes, not the texture size/resolution):
If I use a classical front-to-back draw order (to avoid overdraw), I run into problems because of (I think) alpha testing. Even though the bushes have only opaque and fully transparent pixels (no partial transparency), and even though I use the proper alpha test comparison (GL_EQUAL 1), performance is bad because a lot of pixels have to be alpha tested (if I understand correctly).
If I draw back to front with alpha testing disabled, I lose a lot of performance too (this time because of overdraw), even when disabling depth buffer writes (I am not sure that does anything when the depth test is disabled, by the way).
I get good performance when drawing front to back without alpha testing, but then of course the sprite cutout is completely gone, which is really bad.
All the bushes share the same texture, and I use 16-bit colors, mipmapping, geometry batching, face culling, no shaders, etc. That is everything I can think of to improve performance (which is not bad in other cases), except texture compression. I even filter the sprites to avoid "displaying" the ones that are off screen. I have also tried some "violent optimizations" for test purposes, such as making the textures fully opaque, lowering the texture resolution a lot, disabling blending, etc., but nothing made a real difference performance-wise except removing alpha testing.
I was wondering whether I am forgetting something that would help with performance. Back to front creates overdraw, and front to back is slow because of alpha testing (and I do not want my bushes to be "square" images, so I cannot disable alpha testing). If I create smaller sprites, performance is far better (even with a lot more sprites), but this is only a workaround.
To summarize: how can you display big overlapping quads that need cutout without losing performance?
PS: I am testing on a Nexus One.
PS2: Some optimization guides suggest not using quads but geometry that fits the texture more closely. That seems like a really tedious process, and I do not think it would help me much.

Drawing front-to-back is normally a benefit because of early-z: the hardware can do the depth test right after rasterization, before doing the texture fetch or shading. With front-to-back sorting, most fragments fail the depth test, and you save a lot of texture bandwidth, shading throughput, and zbuffer-write bandwidth.
But alpha test breaks that. If a fragment passes the depth test, it might still be killed by alpha test, so zwrite can't happen until after texturing/shading. Most hardware that can do early-z still has to do the depth test at the same point in the pipeline as it does zwrite, so with alpha test you end up doing ztest + zwrite after texturing and shading. As a result, front-to-back sorting only saves you zwrite bandwidth, nothing else.
I think you have two options, if you really want large sprites that overlap significantly:
(a) Only use two or three distinct Z values for your sprites. Draw them back-to-front with blending (and alpha-test, if it helps). No overlap within a layer: you can pre-render each layer either in the original assets or once at runtime, then just shift it left and right.
(b) If your sprites have large opaque regions surrounded by a semi-transparent border, you can draw the opaque regions in a first pass with no alpha test, then draw borders as a separate pass. This will cut down on the number of alpha-tested fragments.
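As a rough sketch of the ordering part of option (a): snap each sprite to one of a few layer depths, then draw the farthest layer first. The `Sprite` class, its `depth` field (assumed here to be distance from the camera, larger meaning farther) and the layer values are all made up for illustration; the actual GL draw calls are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LayeredSprites {
    /** Hypothetical sprite record: a name plus its distance from the camera. */
    static final class Sprite {
        final String name;
        final float depth; // assumption: larger value means farther from the camera

        Sprite(String name, float depth) {
            this.name = name;
            this.depth = depth;
        }
    }

    /** Returns the index of the layer depth closest to the given depth. */
    static int layerIndex(float depth, float[] layerDepths) {
        int best = 0;
        for (int i = 1; i < layerDepths.length; i++) {
            if (Math.abs(depth - layerDepths[i]) < Math.abs(depth - layerDepths[best])) {
                best = i;
            }
        }
        return best;
    }

    /** Returns a copy of the sprites ordered back to front by their snapped layer depth. */
    static List<Sprite> backToFront(List<Sprite> sprites, float[] layerDepths) {
        List<Sprite> sorted = new ArrayList<>(sprites);
        // List.sort is stable, so sprites within one layer keep their original order.
        sorted.sort(Comparator
                .comparingDouble((Sprite s) -> layerDepths[layerIndex(s.depth, layerDepths)])
                .reversed());
        return sorted;
    }
}
```

With only two or three layers, the sorted list can be drawn in order with blending enabled, and each layer could be pre-rendered as described in (a).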

Related

Clipping a partly off screen textured quad via code gives performance gain?

I'm writing a 2D mobile game that has huge textures for the backgrounds, and I was wondering if and how the graphics hardware optimizes the drawing of items that are partly or completely off screen.
I'm expecting it to at least skip rasterization of pixels that fall outside the buffer.
So I was wondering whether I could expect a performance gain from drawing only the visible part of the texture to the screen using a smaller quad, or whether the gain would be trivial.
The situation could be summed up like this:
1) few quads,
2) big textures,
3) many of the quads are partly on screen.
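For anyone who wants to measure this, here is a minimal sketch of what "clipping via code" could look like: shrink an axis-aligned quad to the screen rectangle and move the texture coordinates by the same proportion so the visible part looks unchanged. The `Quad` class and the screen-space convention (origin at the bottom-left corner) are assumptions for the example, and whether this beats the hardware's own clipping is exactly the open question.

```java
final class QuadClipper {
    /** Hypothetical axis-aligned quad in screen space with matching texture coordinates. */
    static final class Quad {
        final float left, bottom, right, top; // screen-space edges
        final float u0, v0, u1, v1;           // texture coords at (left, bottom) and (right, top)

        Quad(float left, float bottom, float right, float top,
             float u0, float v0, float u1, float v1) {
            this.left = left; this.bottom = bottom; this.right = right; this.top = top;
            this.u0 = u0; this.v0 = v0; this.u1 = u1; this.v1 = v1;
        }
    }

    /** Returns the on-screen part of the quad, or null if it is entirely off screen. */
    static Quad clip(Quad q, float screenW, float screenH) {
        float l = Math.max(q.left, 0f);
        float r = Math.min(q.right, screenW);
        float b = Math.max(q.bottom, 0f);
        float t = Math.min(q.top, screenH);
        if (l >= r || b >= t) {
            return null; // nothing visible
        }
        float w = q.right - q.left;
        float h = q.top - q.bottom;
        // Interpolate texture coordinates linearly along each axis.
        float u0 = q.u0 + (q.u1 - q.u0) * (l - q.left) / w;
        float u1 = q.u0 + (q.u1 - q.u0) * (r - q.left) / w;
        float v0 = q.v0 + (q.v1 - q.v0) * (b - q.bottom) / h;
        float v1 = q.v0 + (q.v1 - q.v0) * (t - q.bottom) / h;
        return new Quad(l, b, r, t, u0, v0, u1, v1);
    }
}
```

A quad spanning x = -100..100 on a 200-pixel-wide screen would come back as x = 0..100 with u running from 0.5 to 1.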

Trying to figure out OpenGLES and its "space"

Currently I'm working on an application involving OpenGL ES 2.0. I'm using the Java wrapper for it, since the OpenGL part will probably not be very complex. Nonetheless, I'm currently stuck.
First, I'm trying to draw something like this:
So I just want to draw some sort of indicator, how big my "space" is - if there even are limitations? How would I draw such a cage around the center of the camera? (Of course I just want a simple one, basically a square, indicating boundaries, not something with rounded borders etc)

To draw something like this without rounded corners, I suggest you simply draw a textured cube (there are plenty of examples of those around the web). For it to look as nice as the one in the image, you will also need to add some lights to the scene, as they are what gives a true 3D effect (a sphere without shading/lights will always appear as a 2D circle).
As for the limitations: there is no specific size limit other than numeric overflow. In most cases your vectors hold 32-bit floating-point values, so the largest representable value bounds how big your space can be. The other limitations are visual: for this type of scene you usually use a frustum, whose zNear and zFar parameters define the clipping planes. You cannot see anything nearer than zNear or farther than zFar. You can set zFar to a very large value, but there is a penalty in depth buffer precision for doing so (the result can be incorrect drawing when two objects are too close together).
So in general you are the one who has to take care of the scene scale or size and consider your field of view.
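To make the precision penalty concrete, here is a small sketch that maps an eye-space distance to a quantized depth buffer value. It assumes the standard perspective (glFrustum-style) projection and the default depth range of 0 to 1; the class and method names are only for illustration.

```java
final class DepthPrecision {
    /**
     * Window-space depth in [0, 1] for a point at distance d in front of the camera,
     * assuming a standard perspective projection: depth = f * (d - n) / (d * (f - n)).
     */
    static double windowDepth(double d, double zNear, double zFar) {
        return zFar * (d - zNear) / (d * (zFar - zNear));
    }

    /** Quantizes a depth in [0, 1] to an integer depth buffer value with the given bit count. */
    static long quantize(double depth, int bits) {
        long max = (1L << bits) - 1;
        return Math.round(depth * max);
    }
}
```

With zNear = 0.1 and zFar = 10000, points at distances 1000 and 1001 quantize to the same 16-bit value, so the depth test cannot tell them apart. With a tighter range such as zNear = 100 and zFar = 2000, the same two distances get different values.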

Seamlessly layering transparent sprites in OpenGL ES

I am working on an Android app, based on the LibGDX framework (Though I don't think that should affect this problem too much), and I am having trouble finding a way to get the results I want when drawing using transparent sprites. The problem is that the sprites visibly layer on top of each other where they overlap, similar to what is displayed in this image :
This is pretty unsightly for some of what I want to do, and even completely breaks other parts. What I would like them to do is merge together seamlessly, like so:
The only success I have had thus far is to draw the entire sequence of sprites on a separate texture at full opacity, and then draw that texture back with the desired opacity. I had this working moderately well, and I could likely make it work for most of what I need it to, but the large problem right now is that these things are dynamically drawn onto the screen, and the process of modifying a fairly large texture and sending it back are pretty taxing on mobile devices, and causes an unacceptable level of performance.
I've spent a good chunk of time looking for more ideal solutions, including experimenting with blend modes and coming up with quirky formulas that balanced out alpha and color values in ways to even things out, but nothing was particularly successful. My guess is that the only viable route for this is the previously mentioned way of creating a texture and applying the alpha difference to that, but I am unsure of the best way to make that work with lower powered mobile devices.

There might be a few other ways to do this. The most straightforward would be to attach a stencil buffer, draw the circles to the stencil first, and then draw a full-screen rect with the desired color and alpha using the stencil test. This should be much faster than an FBO with a separate texture.
Another thing that might work is drawing those circles first with blending disabled and then drawing your whole scene over them with an inverted "blendFunc". Note that this might be impossible if other elements also need blending.
Third, instead of using the stencil you could use the alpha channel of your render buffer. Use a color mask to draw only to alpha and draw the circles, then re-enable RGB on the color mask and draw the full-screen rect using an appropriate "blendFunc". Also note that if previously drawn shapes used blending, you will need to clear the alpha to 1.0 before doing this (color mask set to alpha only, blending disabled, draw a full-screen rect with a color whose alpha is 1.0).
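To see why marking coverage first helps, here is a CPU-side simulation on a single color channel. It is not GL code: the boolean mask only plays the role that the stencil (or destination alpha) would play, and all names are hypothetical.

```java
final class UniformAlpha {
    /** Standard "source over" blend of one channel: src * alpha + dst * (1 - alpha). */
    static float blend(float dst, float src, float alpha) {
        return src * alpha + dst * (1f - alpha);
    }

    /** Naive approach: every shape is blended on its own, so overlaps get blended twice. */
    static float[] drawNaive(float[] background, boolean[][] shapes, float color, float alpha) {
        float[] out = background.clone();
        for (boolean[] shape : shapes) {
            for (int i = 0; i < out.length; i++) {
                if (shape[i]) {
                    out[i] = blend(out[i], color, alpha);
                }
            }
        }
        return out;
    }

    /** Mask approach: mark coverage first, then blend exactly once per covered pixel. */
    static float[] drawWithMask(float[] background, boolean[][] shapes, float color, float alpha) {
        boolean[] mask = new boolean[background.length];
        for (boolean[] shape : shapes) {
            for (int i = 0; i < mask.length; i++) {
                mask[i] |= shape[i]; // stands in for the stencil write
            }
        }
        float[] out = background.clone();
        for (int i = 0; i < out.length; i++) {
            if (mask[i]) {
                out[i] = blend(out[i], color, alpha); // the single "full-screen rect" pass
            }
        }
        return out;
    }
}
```

For two shapes that overlap in one pixel, drawn black at 50% alpha over a white background, the naive version darkens the overlap to 0.25 while the mask version leaves every covered pixel at 0.5.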

OpenGL ES drop shadows for 2D sprites

I've got an OpenGL scene rendered with a bunch of sprites, and I'd like to automagically add drop shadows to all of them. Here's a picture showing what I mean:
The scene uses orthographic projection, the sprites are textured quads, and I'm using the depth buffer to draw them front to back. I'm working with OpenGL ES 2.0, but thoughts from the iOS or non-ES worlds would be appreciated as well. I've tossed a few ideas around in my head of how I can go about this, and I'd like to find out which has the most promise.
1) Draw each sprite twice, the first time normally, the second time with some kind of drop shadow shader a bit deeper in the scene. Not sure if this is possible?
2) Draw a sprite, then draw it again, darkened and with some alpha, several times with some random jitter applied to the vertices. This may look silly and not at all like a shadow.
3) Draw the base scene without background to a texture, then blur and darken it to create one large drop shadow. Then draw the base scene over the drop shadow texture, and finally draw the result over the background. This would lose the shadows between sprites, though.
4) SSAO in a post-processing pass. Might be the most dynamic and automatic, but could look fuzzy/grainy and really slow things down.
5) At creation time, generate a shadow texture for each sprite. For rendering, draw a sprite and then its shadow texture a bit deeper in the scene. I think I'd like to avoid this due to the loading time and extra memory requirements, but this may be the fastest and best looking?
I don't want to do any shadow work with external textures, since I use the same sprite textures at varying scales, and pre-baked shadows would scale unnaturally.
So are any of these better than the others? Are there other options I'm not thinking of? Thanks!

Those are all well-thought-out options. Here are my thoughts on each:
1) It is definitely possible to use a shader, but it might not be the most performant option, since the blurring will have to be done inside the shader and might involve multiple texture lookups.
2) Drawing the texture multiple times would work and would look like a shadow, because each "jittered" image would have slightly modified alpha values. But again, blending and multiple renders of each sprite would add up and might affect performance.
3) I like and recommend this option, because you can set a shader that puts black pixels instead of colored pixels (considering alpha) into a render target smaller than the screen (1/4th?) and then use this as the shadow texture. Since the texture is now being stretched, you'd get the "blurring" for free, too. The pixel shader that does the "blackening" would be very simple and not affect performance too much.
4) Unless you really need high-quality shadows (and the previous method doesn't suffice) I wouldn't recommend this.
5) This is of course the most flexible option and has an x2 rendering complexity. Unfortunately, it will consume more memory than all the other options above.
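For the render-target option (blackened sprites drawn into a smaller target), here is a rough CPU-side approximation of what ends up in the shadow texture. It assumes "1/4th" means half the width and half the height, and it uses a 2x2 box average as a stand-in for what the GPU would do when rasterizing into the smaller target. The color is always black, so only the alpha channel is computed.

```java
final class ShadowMap {
    /**
     * Builds the alpha channel of a half-resolution shadow texture from the scene's
     * alpha channel. Each output texel is the average of a 2x2 block of input alpha,
     * scaled by the desired shadow opacity. Width and height are assumed to be even.
     */
    static float[][] shadowAlpha(float[][] alpha, float opacity) {
        int h = alpha.length / 2;
        int w = alpha[0].length / 2;
        float[][] out = new float[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                float sum = alpha[2 * y][2 * x] + alpha[2 * y][2 * x + 1]
                          + alpha[2 * y + 1][2 * x] + alpha[2 * y + 1][2 * x + 1];
                out[y][x] = opacity * sum / 4f;
            }
        }
        return out;
    }
}
```

Stretching this texture back to screen size with linear filtering is what would give the soft edge; the averaging above only shows why sprite borders already come out partially transparent.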
Hope this helps!

Scrolling/zooming a scene in OpenGL and subdivision

We are to develop a scrolling/zooming scene in OpenGL ES on Android, very much like a level in Angry Birds but more like a level in World Of Goo. More like the latter as the world will not consist of repeated layers as featured in Angry Birds but of a large image. As the scene needs to scroll/zoom and therefore a lot of it will not be visible, I was wondering about the most efficient way to implement the rendering, focusing on the environment only (ie not the objects within the world but background layers).
We will be using an orthographic projection.
The first thing that comes to mind is creating a large 4-vertex rectangle at world size, with the background texture mapped to it, and translating/scaling it using glTranslatef / glScalef. However, I was wondering whether the non-visible area outside the screen's boundaries is still being rendered by OpenGL, since the rectangle cannot be culled (culling it would lose the visible area as well, as there are only 4 vertices). Would it therefore be more efficient to subdivide this rectangle, so that the smaller non-visible rectangles can be culled?
Another option would be creating a 4-vertex rectangle that fills the screen, then moving the background by adjusting its texture coordinates. However, I guess we would run into problems when building bigger worlds, considering the texture size limit. It seems like a nice implementation for repeated backgrounds like Angry Birds has.
Maybe there is another way..?
If someone has an idea on how it might be done in AngryBirds / World of Goo, please share as I'd love to hear. They seem to have implemented a system that allows for the world to be moved and zoomed very (WorldOfGoo = VERY) smoothly.

This is probably your best bet for implementation.
In my experience, keeping a large texture in memory is very expensive on Android. I would get quite a few OutOfMemoryError exceptions for the background texture before I moved to tiling.
I think the biggest rendering bottleneck would be with memory transfer speeds and fill rate instead of any graphics computation.
Edit: Check out 53:28 of this presentation from Google I/O 2009.

You could split the background rectangle into smaller rectangles, so that OpenGL only renders the visible ones. Instead of one huge rectangle with one huge texture loaded, you would have smaller rectangles with smaller textures that you could load and unload depending on what is visible on screen.

Afaik there would be no performance drop due to large areas being rendered off-screen, subdividing and culling is normally done just to reduce vertex count, but you would actually be adding to it here.
Putting that aside for now; from the way you phrased the question I am unsure whether you have a large background texture or a small repeating one. If it is large, then you will need to subdivide because of texture size limitations anyway, so the question is moot! If it is small, then I would suggest the second method, fit a quad to the screen and move the background by changing the texture coordinates.
I feel like I may have missed something, though, as I am unsure why you mentioned the texture size limitation when talking about the texture coordinate method and not the large quad method. Surely for both this is not a problem with repeating textures, as you can use the GL_REPEAT texture wrap mode...
But for both it is a problem for a single large texture unless you subdivide, which would make the texture coordinate tactic way more complicated than necessary. In this case subdividing the mesh along texture subdivisions would be best, and culling off-screen sections. Deciding which parts to cull should be trivial with this technique.
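Deciding which sections to cull could look roughly like this, assuming the background is cut into a regular grid of square tiles whose origin is at world (0, 0). The function name, parameters and grid layout are assumptions made for the example.

```java
final class TileCulling {
    /**
     * Returns {firstCol, firstRow, lastCol, lastRow} (inclusive) of the tiles that
     * overlap the visible world rectangle, or null if no tile is visible.
     * Tiles are squares of side tileSize laid out from world origin (0, 0).
     */
    static int[] visibleTiles(float viewLeft, float viewBottom, float viewRight, float viewTop,
                              float tileSize, int cols, int rows) {
        int c0 = Math.max(0, (int) Math.floor(viewLeft / tileSize));
        int r0 = Math.max(0, (int) Math.floor(viewBottom / tileSize));
        // ceil(...) - 1 so a view edge lying exactly on a tile border does not pull in the next tile.
        int c1 = Math.min(cols - 1, (int) Math.ceil(viewRight / tileSize) - 1);
        int r1 = Math.min(rows - 1, (int) Math.ceil(viewTop / tileSize) - 1);
        if (c0 > c1 || r0 > r1) {
            return null;
        }
        return new int[] {c0, r0, c1, r1};
    }
}
```

Only the tiles inside the returned range need their quads drawn (and their textures resident), and the range only changes when the camera scrolls or zooms across a tile border.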
Cheers.
