I need to apply a full-screen, photographic-style vignette effect over a rendered scene. Obviously, I have to use blending to achieve this. I would like to choose the fastest possible blending mode because it will be applied to the whole screen.
Do some blending modes in OpenGL ES work faster than others? Or do all blending modes run at the same fill rate? So far I haven't found any resources on the Internet saying that certain blending modes are slower or faster than others, so I decided to ask this question on SO.
This is for an Android app, so I understand that this behavior can of course depend on the GPU vendor, but maybe there are some common considerations for faster blending?
The one slow part of blending is reading pixels from the back buffer (it doesn't matter whether it is alpha only, RGB, or both). So as long as it's 'real' blending that uses the dst color/alpha (i.e. not a degenerate blend func like glBlendFunc(GL_ONE, GL_ZERO), glBlendFunc(GL_ZERO, GL_ONE) or similar), there's no performance difference.
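To make the "degenerate" case concrete, here is a minimal sketch of the per-channel blend equation, result = src * srcFactor + dst * dstFactor. The `BlendSketch` class and its `Factor` enum are hypothetical stand-ins for a few glBlendFunc arguments, and whether a given driver really skips the destination read for the degenerate pairs is an assumption, not something guaranteed by the API.

```java
// Sketch of the blend equation for a single color channel in [0, 1].
// Factor is a hypothetical stand-in for a few glBlendFunc constants.
final class BlendSketch {
    enum Factor { ZERO, ONE, SRC_ALPHA, ONE_MINUS_SRC_ALPHA, DST_COLOR }

    // Resolves a blend factor to its numeric weight for one channel.
    static float weight(Factor f, float srcAlpha, float dst) {
        switch (f) {
            case ZERO: return 0f;
            case ONE: return 1f;
            case SRC_ALPHA: return srcAlpha;
            case ONE_MINUS_SRC_ALPHA: return 1f - srcAlpha;
            case DST_COLOR: return dst;
            default: throw new IllegalArgumentException("unknown factor");
        }
    }

    // result = src * srcFactor + dst * dstFactor, clamped to [0, 1].
    static float blend(Factor sf, Factor df, float src, float srcAlpha, float dst) {
        float r = src * weight(sf, srcAlpha, dst) + dst * weight(df, srcAlpha, dst);
        return Math.max(0f, Math.min(1f, r));
    }

    // True when the result actually depends on combining src with dst.
    // (ONE, ZERO) just writes src; (ZERO, ONE) leaves dst untouched.
    static boolean needsDestinationRead(Factor sf, Factor df) {
        if (sf == Factor.ONE && df == Factor.ZERO) return false;
        if (sf == Factor.ZERO && df == Factor.ONE) return false;
        return df != Factor.ZERO || sf == Factor.DST_COLOR;
    }
}
```

A multiplicative vignette would correspond to the (DST_COLOR, ZERO) pair, and classic alpha blending to (SRC_ALPHA, ONE_MINUS_SRC_ALPHA); both need the destination pixel, which is why they are expected to cost about the same.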
It doesn't matter which blending option you choose; it is going to slow down fragment processing because the pixel values have to be read back from the target framebuffer. You can save some cycles by splitting your effect into a few quads set up around the screen borders, leaving the central part of the framebuffer without any overlaying quads. You can also try trickier approaches that exploit the early fragment discard employed by some tile-based mobile GPUs like the Mali ones, but it may just not be worth the effort.
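The border-quad idea can be sketched as follows. This is only an illustration of the geometry: the `VignetteBands` class and the inset fractions are hypothetical, and the rectangles are given in pixels as {left, top, right, bottom}. It assumes both insets are at most 0.5 so that the bands do not overlap.

```java
// Computes four non-overlapping bands around a central hole that is left unblended.
final class VignetteBands {
    // insetX / insetY are fractions of the screen size (hypothetical tuning values, each <= 0.5).
    static int[][] borderQuads(int width, int height, float insetX, float insetY) {
        int ix = Math.round(width * insetX);
        int iy = Math.round(height * insetY);
        return new int[][] {
            {0, 0, width, iy},                     // top band, full width
            {0, height - iy, width, height},       // bottom band, full width
            {0, iy, ix, height - iy},              // left band, between top and bottom
            {width - ix, iy, width, height - iy}   // right band, between top and bottom
        };
    }

    // Total number of pixels that still go through blending.
    static long coveredPixels(int[][] quads) {
        long sum = 0;
        for (int[] q : quads) {
            sum += (long) (q[2] - q[0]) * (q[3] - q[1]);
        }
        return sum;
    }
}
```

With insets of 0.25 on a 1280x720 screen, the central 640x360 region is skipped, so a quarter of the screen never pays the blending cost. The vignette texture coordinates would have to be set per band so that the gradient still lines up across the bands.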
In short: no, there is probably no measurably worse blend mode (as long as you are doing "real" blending).
Blending can be implemented either with a fixed-function blend stage or by adding a short tail to the shader program that does the actual blending. Another possibility is that fixed-function hardware handles the most common blend modes, while a shader takes over for uncommon ones. If you hit the shader path, your performance might take a hit.
Knowing which is good or bad would be very hardware specific, and the difference might not even be measurable, because the biggest cost is reading and combining two buffers, not the relatively minor extra shading cost.
How can I do antialiasing on triangles on the entire render? Should I put it on the fragmentShader? Is there any other good solution to improve this sort of thing?
Here is my "view", with very crispy edges (not very nice).
After doing some deep research, I found that it's in fact pretty simple. The most common approach is to render as if the screen were 4 times bigger (or even more than 4 times). After rendering to this much bigger screen, the GPU takes the average of each such area and sets the pixel color based on that.
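The averaging step described above can be illustrated on the CPU. This is only a sketch of the idea on a grayscale image rendered at twice the width and twice the height (4 times the pixels); the `SupersampleSketch` class is hypothetical, and on a real device the GPU performs the resolve itself.

```java
// Averages each 2x2 block of a high-resolution grayscale image into one output pixel.
final class SupersampleSketch {
    // hi must have an even number of rows and columns; values are 0..255.
    static int[][] downsample2x(int[][] hi) {
        int h = hi.length / 2;
        int w = hi[0].length / 2;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sum = hi[2 * y][2 * x] + hi[2 * y][2 * x + 1]
                        + hi[2 * y + 1][2 * x] + hi[2 * y + 1][2 * x + 1];
                out[y][x] = sum / 4; // integer average of the four samples
            }
        }
        return out;
    }
}
```

A block that straddles a triangle edge ends up with an intermediate value instead of a hard black-or-white step, which is what softens the jagged edges.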
It's pretty easy to enable this with this library:
https://code.google.com/p/gdc2011-android-opengl/source/browse/trunk/src/com/example/gdc11/MultisampleConfigChooser.java
However, you should keep in mind that it can take 4 or more times as long to render everything, meaning more processing time and perhaps a lower FPS...
Also, if you are emulating an Android device with OpenGL, find out whether your GPU supports this kind of multisampling. Mine, for example, doesn't (Tegra).
Here is the final result, with and without multisampling:
I have been unable to find any internet articles or Google documentation on the relative performance of compositing bitmaps using different Porter-Duff modes. What has become very apparent to me whilst programming is that the traditional SRC/DST prefix modes are performing a lot faster (3 - 4 times faster) than the Android Mode.DARKEN, Mode.LIGHTEN, Mode.MULTIPLY modes. Use of the latter modes can bring down my game engine's performance from 40+ to around 13 FPS when rendering a lighting mask on a 720p screen.
My questions are thus:
Is there a faster way for compositing images using the darken/lighten property than the supplied Porter-Duff modes? Would it be worth the switch to OpenGL?
Are there data available on the relative speeds of different compositing modes?
Yes, there are many faster ways. For a game engine, switching to OpenGL (or to something like Unity if you want something higher level) can be a very good idea. Renderscript is also a very good alternative that already has a built-in multiply intrinsic.
You should probably benchmark these things yourself; there are few measurements on this kind of topic and hardware moves fast.
I am trying to write a libgdx livewallpaper (OpenGL ES 2.0) which will display a unique background image (non splittable into sprites).
I want to target tablets, so I need to somehow be able to display at least 1280x800 background image on top of which a lot more action will also happen, so I need it to render as fast as possible.
Now I have only basic knowledge both about libgdx and about opengl es, so I do not know what is the best way to approach this.
By googling I found some options:
split texture into smaller textures. It seems like GL_MAX_TEXTURE_SIZE on most devices is at least 1024x1024, but I do not want to hit max, so maybe I can use 512x512, but wouldn't that mean drawing a lot of tiles, rebinding many textures on every frame => low performance?
libgdx has GraphicsTileMaps which seems to be the tool to automate drawing tiles. But it also has support for many features (mapping info to tiles) that I do not need, maybe it would be better to use splitting by hand?
Again, the main point here is performance for me - because drawing background is expected to be the most basic thing, more animation will be on top of it!
And with tablet screen growing in size I expect soon I'll need to be able to comfortably render even bigger image sizes :)
Any advice is greatly appreciated! :)
Many tablets (and some cellphones) support 2048 textures. Drawing it in one piece will be the fastest option. If you still need to be 100% sure, you can divide your background into smaller pieces (640x400) whenever GL_MAX_TEXTURE_SIZE happens to be smaller.
'Future' tablets will surely support bigger textures, so don't worry so much about it.
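As a rough sketch of the fallback, the number of pieces needed for a given limit is a ceiling division per axis. The `BackgroundTiles` class is hypothetical; on a device the limit itself would come from querying GL_MAX_TEXTURE_SIZE at runtime.

```java
// Counts how many tiles a background needs when no tile may exceed the texture size limit.
final class BackgroundTiles {
    // Tiles along one axis: ceiling of imageSize / maxTextureSize.
    static int tilesAlong(int imageSize, int maxTextureSize) {
        return (imageSize + maxTextureSize - 1) / maxTextureSize;
    }

    // Total tiles for a width x height image.
    static int tileCount(int width, int height, int maxTextureSize) {
        return tilesAlong(width, maxTextureSize) * tilesAlong(height, maxTextureSize);
    }
}
```

For a 1280x800 background this gives a single texture with a 2048 limit, 2 tiles with a 1024 limit, and 6 tiles with a 512 limit, so the number of texture binds per frame stays small in any of these cases.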
For the actual drawing just create a libgdx mesh which uses VBOs whenever possible! ;)
Two things you didn't mention will be very important to performance: the texture filter (GL_NEAREST is the ugliest if you don't do a pixel-perfect mapping, but the fastest) and the texture format (RGBA_8888 would be the best and slowest; you can downgrade it until it suits your needs. At least you can remove alpha, can't you?).
You can also look into compressed formats, which will reduce the memory bandwidth needed considerably!
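To get a feel for what downgrading the format buys, here is a small sketch that estimates the uncompressed size of a texture. The `TextureMemory` class and its `Format` enum are hypothetical and only list the nominal bytes per pixel; actual storage is up to the driver.

```java
// Estimates the size of an uncompressed texture from its dimensions and pixel format.
final class TextureMemory {
    // Hypothetical subset of formats with their nominal bytes per pixel.
    enum Format {
        RGBA_8888(4), RGB_888(3), RGB_565(2);

        final int bytesPerPixel;

        Format(int bytesPerPixel) {
            this.bytesPerPixel = bytesPerPixel;
        }
    }

    static long bytes(int width, int height, Format format) {
        return (long) width * height * format.bytesPerPixel;
    }
}
```

A 2048x1024 texture comes to 8 MiB as RGBA_8888 and 4 MiB as RGB_565, so dropping alpha and some color precision halves the amount of data the GPU has to fetch for the background.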
I suggest you start coding something, and then tune the performance up. This particular problem you have is not that hard to optimize later.
I found a 3D graphics framework for Android called Rajawali and I am learning how to use it. I followed the most basic tutorial, which renders a sphere object with a 1024x512 JPG image as the texture. It worked fine on a Galaxy Nexus, but it didn't work on the Galaxy Player GB70.
When I say it didn't work, I mean that the object appears but the texture is not rendered. Eventually, I changed some parameters that I use for the Rajawali framework when creating textures and got it to work. Here is what I found out.
The cause was coming from where the GL_TEXTURE_MIN_FILTER was being set. Among the following four values
GLES20.GL_LINEAR_MIPMAP_LINEAR
GLES20.GL_NEAREST_MIPMAP_NEAREST
GLES20.GL_LINEAR
GLES20.GL_NEAREST
the texture is only rendered when GL_TEXTURE_MIN_FILTER is not set to a filter that uses mipmaps. So it works when GL_TEXTURE_MIN_FILTER is set to one of the last two.
Now here is what I don't understand and am curious about. When I shrink the image I'm using as the texture to 512x512, the GL_TEXTURE_MIN_FILTER setting does not matter. All four settings of the min filter work.
So my question is: is there a requirement on the dimensions of the image when using a min filter for the texture? For example, am I required to use a square image? Can other things, such as the wrap style or the configuration of the mag filter, be a problem?
Or does it seem like an OpenGL implementation bug on the device?
Good morning, this is a typical example of non-power-of-2 textures.
Textures should have power-of-2 dimensions for a multitude of reasons. This is a very common mistake and everybody falls into this pitfall at some point :) me too.
Whether non-power-of-2 textures work smoothly on a given device/GPU depends purely on the OpenGL driver implementation. Some GPUs support them cleanly, others don't. I strongly suggest you go for power-of-2 textures so that you can guarantee they work on all devices.
Last but not least, using non-power-of-2 textures can lead to catastrophic scenarios in GPU memory utilization, since most of the drivers that accept non-power-of-2 textures need to rescale them in memory to the nearest higher power of 2. For instance, a 520x520 texture could lead to an actual memory mapping of 1024x1024.
This is something you don't want, because in the real world "size matters", especially on mobile devices.
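The memory blow-up can be estimated with a short sketch. The `PaddedTexture` class is hypothetical, and it assumes the driver rounds each dimension up to the next power of two as described above; real drivers may behave differently.

```java
// Estimates how much memory a texture takes if each side is rounded up to a power of two.
final class PaddedTexture {
    // Smallest power of two that is >= n (for n >= 1).
    static int nextPowerOfTwo(int n) {
        if (n <= 1) {
            return 1;
        }
        return Integer.highestOneBit(n - 1) << 1;
    }

    // Bytes used after padding both dimensions, for a given bytes-per-pixel value.
    static long paddedBytes(int width, int height, int bytesPerPixel) {
        return (long) nextPowerOfTwo(width) * nextPowerOfTwo(height) * bytesPerPixel;
    }
}
```

For the 520x520 example, both sides round up to 1024, so a 4-byte-per-pixel texture would occupy 4 MiB instead of roughly 1 MiB, close to four times the size of the actual image data.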
You can find quite a good explanation in the OpenGL ES 2.0 Gold Book:
In OpenGL ES 2.0, textures can have non-power-of-two (npot)
dimensions. In other words, the width and height do not need to be a
power of two. However, OpenGL ES 2.0 does have a restriction on the
wrap modes that can be used if the texture dimensions are not power of
two. That is, for npot textures, the wrap mode can only be
GL_CLAMP_TO_EDGE and the minification filter can only be GL_NEAREST
or GL_LINEAR (in other words, not mipmapped). The extension
GL_OES_texture_npot relaxes these restrictions and allows wrap modes
of GL_REPEAT and GL_MIRRORED_REPEAT and also allows npot textures to
be mipmapped with the full set of minification filters.
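The rule in the quoted passage can be written as a small check. This is a sketch only: the `NpotRules` class and its boolean parameters are hypothetical, and it models core OpenGL ES 2.0 without the GL_OES_texture_npot extension.

```java
// Checks a texture setup against the core OpenGL ES 2.0 restrictions on npot textures.
final class NpotRules {
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // minFilterUsesMipmaps: true for the GL_*_MIPMAP_* minification filters.
    // clampToEdge: true when both wrap modes are GL_CLAMP_TO_EDGE.
    static boolean allowedWithoutNpotExtension(int width, int height,
                                               boolean minFilterUsesMipmaps,
                                               boolean clampToEdge) {
        if (isPowerOfTwo(width) && isPowerOfTwo(height)) {
            return true; // power-of-two textures have no such restrictions
        }
        return !minFilterUsesMipmaps && clampToEdge;
    }
}
```

Under this model, a 520x520 texture is fine with GL_LINEAR and GL_CLAMP_TO_EDGE, but not with a mipmapped minification filter or with GL_REPEAT.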
I suggest you take a look at this book, since it covers this topic quite decently.
I'm currently playing around with 2D graphics on Android and have been using a plain old SurfaceView to draw Drawables and Bitmaps to the screen. This has been working alright, but there's a little stutter in the sprite movement, and I'm wondering how feasible it is to do a real-time (but not terribly fast) game with this.
I know GLSurfaceView exists which uses OpenGL, but I'm curious as to the extent to which this makes a difference. Is a plain SurfaceView hardware accelerated, or do I need to use OpenGL? What type of speed difference could I expect from switching to OpenGL, and how much altering of code would it require to switch (the game logic is all in a separate object that provides an ordered array of drawables to the SurfaceView)?
As far as I can tell, you have to use OpenGL to get HW acceleration. But don't take it for granted, and wait for other answers ^^
If that really is the case, the speedup should be quite significant. Any 2D application should run at 20 fps at the very least (it generally has fewer polygons than a 3D application).
It would take a substantial amount of code, but 1) as a first attempt, you could try with only one square VBO and change the matrix each time, and 2) your rendering already seems quite encapsulated, so that should simplify things a lot.
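The "one square VBO, change the matrix each time" idea boils down to building a different model matrix per sprite for the same unit quad. Here is a minimal CPU-side sketch of that matrix; the `SpriteMatrix` class is hypothetical, and it uses the column-major layout that OpenGL ES expects for uniform matrices.

```java
// Builds a per-sprite model matrix for a shared unit quad with corners (0,0)..(1,1).
final class SpriteMatrix {
    // Column-major 4x4 matrix: scale the unit quad to w x h, then move it to (x, y).
    static float[] modelMatrix(float x, float y, float w, float h) {
        return new float[] {
            w, 0, 0, 0,   // column 0
            0, h, 0, 0,   // column 1
            0, 0, 1, 0,   // column 2
            x, y, 0, 1    // column 3 (translation)
        };
    }

    // Applies the matrix to a 2D point (z = 0, w = 1) and returns the transformed x and y.
    static float[] transform(float[] m, float px, float py) {
        return new float[] {
            m[0] * px + m[4] * py + m[12],
            m[1] * px + m[5] * py + m[13]
        };
    }
}
```

Each drawable from the game logic would then only need its position and size converted into such a matrix and uploaded as a uniform before drawing the same quad again.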
SurfaceView is not hardware accelerated by default.
If you want HW acceleration,
use GLSurfaceView, which uses OpenGL and is hardware accelerated.
Hardware acceleration has been possible for a regular SurfaceView since Android 3.0.
http://developer.android.com/guide/topics/graphics/hardware-accel.html