I'm developing a drawing application where the user can select from a range of brushes and paint on the screen. I'm using textures as brushes and drawing vertices as points with GL_POINT_SPRITE_OES enabled, as shown below.
gl.glEnable(GL10.GL_TEXTURE_2D);
gl.glEnableClientState(GL10.GL_VERTEX_ARRAY);
gl.glEnable(GL11.GL_POINT_SPRITE_OES);
gl.glTexEnvf(GL11.GL_POINT_SPRITE_OES, GL11.GL_COORD_REPLACE_OES, GL10.GL_TRUE);
The application worked just as desired, but I needed to optimize it, as its framerate dropped below 30 when dealing with a lot of vertices. Since the application's domain allows it, it seemed a good idea to skip the glClear call and the redrawing of already existing lines, as that work is really unnecessary. However, this resulted in a very strange bug I haven't been able to fix since. When OpenGL is not rendering (I have set the render mode to RENDERMODE_WHEN_DIRTY), only about a third of all the vertices are visible on the screen. Requesting a redraw by calling requestRender() makes these vertices disappear and others appear. There are three states I can tell apart, each state showing approximately one third of all vertices.
I have uploaded screenshots (http://postimg.org/image/d63tje56l/, http://postimg.org/image/npeds634f/) to make it a bit easier to understand. The screenshots show the state where I have drawn three lines with different colors (SO didn't let me link all 3 images, but I hope you can imagine the third one: it has the segments missing from the 1st and the 2nd). It can clearly be seen that if I could merge the screens into a single one, I would get the desired result.
I'm only guessing at the cause, since I'm not an OpenGL expert. My best guess is that OpenGL uses triple buffering, and only a single buffer is shown at a given time, while the other vertices end up on the back buffers. I have tried forcing all buffers to be rendered, as well as forcing all vertices to appear in all buffers, but I couldn't manage either.
Could you help me solve this?
I believe your guess is exactly right. The way OpenGL is commonly used, you're expected to draw a complete frame, including an initial clear, every time you're asked to redraw. If you don't do that, behavior is generally undefined. In your case, it certainly looks like triple buffering is used, and your drawing is distributed over 3 separate surfaces.
This model does not work very well for incremental drawing, where drawing a full frame is very expensive. There are a few options you can consider.
Optimize your drawing
This is not directly a solution, but always something worth thinking about. If you can find a way to make your rendering much more efficient, there might be no need to render incrementally. You're not showing your rendering code, so it's possible that you simply have too many points to get a good framerate.
But in any case, make sure that you use OpenGL efficiently. For example, store your points in VBOs, and update only the parts that change with glBufferSubData().
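As a sketch of that suggestion, assuming ES 1.1 via the GL11 interface (matching the question's code), the point storage could look something like this. The class and constant names here are my own inventions, and MAX_POINTS is an arbitrary capacity:

```java
// Sketch only: keep all points in one VBO, allocated once, and upload each
// newly drawn point with glBufferSubData() instead of re-sending everything.
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import javax.microedition.khronos.opengles.GL11;

class PointBuffer {
    static final int MAX_POINTS = 100000;   // capacity, chosen arbitrarily
    static final int FLOATS_PER_POINT = 2;  // x, y
    static final int BYTES_PER_FLOAT = 4;

    private int vboId;
    private int pointCount = 0;
    private final FloatBuffer scratch =
            ByteBuffer.allocateDirect(FLOATS_PER_POINT * BYTES_PER_FLOAT)
                      .order(ByteOrder.nativeOrder()).asFloatBuffer();

    void create(GL11 gl) {
        int[] ids = new int[1];
        gl.glGenBuffers(1, ids, 0);
        vboId = ids[0];
        gl.glBindBuffer(GL11.GL_ARRAY_BUFFER, vboId);
        // Allocate the full buffer once; its contents are filled incrementally.
        gl.glBufferData(GL11.GL_ARRAY_BUFFER,
                MAX_POINTS * FLOATS_PER_POINT * BYTES_PER_FLOAT,
                null, GL11.GL_DYNAMIC_DRAW);
    }

    // Upload one new point into its slot without touching the rest.
    void appendPoint(GL11 gl, float x, float y) {
        scratch.clear();
        scratch.put(x).put(y).flip();
        gl.glBindBuffer(GL11.GL_ARRAY_BUFFER, vboId);
        gl.glBufferSubData(GL11.GL_ARRAY_BUFFER,
                pointCount * FLOATS_PER_POINT * BYTES_PER_FLOAT,
                FLOATS_PER_POINT * BYTES_PER_FLOAT, scratch);
        pointCount++;
    }

    // Assumes GL_VERTEX_ARRAY is enabled, as in the question's setup code.
    void draw(GL11 gl) {
        gl.glBindBuffer(GL11.GL_ARRAY_BUFFER, vboId);
        gl.glVertexPointer(FLOATS_PER_POINT, GL11.GL_FLOAT, 0, 0);
        gl.glDrawArrays(GL11.GL_POINTS, 0, pointCount);
    }
}
```

This keeps the per-frame CPU-to-GPU traffic proportional to the number of new points, not the total number drawn so far.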
Draw to FBO, then blit
This is the most generic and practical solution. Instead of drawing directly to the primary framebuffer, use a Frame Buffer Object (FBO) to render to a texture. You do all of your drawing to this FBO, and copy it to the primary framebuffer when it's time to redraw.
For copying from FBO to the primary framebuffer, you will need a simple pair of vertex/fragment shaders in ES 2.0. In ES 3.0 and later, you can use glBlitFramebuffer().
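A minimal sketch of the render-to-texture part in ES 2.0 could look like the following. The class name and sizing are assumptions; the pass-through shader for the final copy is not shown:

```java
// Sketch: ES 2.0 offscreen render target backed by a texture.
import android.opengl.GLES20;

class OffscreenTarget {
    int framebufferId, textureId;

    void create(int width, int height) {
        int[] ids = new int[1];

        // Texture that will accumulate the drawing.
        GLES20.glGenTextures(1, ids, 0);
        textureId = ids[0];
        GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureId);
        GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_RGBA,
                width, height, 0, GLES20.GL_RGBA,
                GLES20.GL_UNSIGNED_BYTE, null);
        GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
                GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_NEAREST);
        GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D,
                GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_NEAREST);

        // FBO with the texture as its color attachment.
        GLES20.glGenFramebuffers(1, ids, 0);
        framebufferId = ids[0];
        GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, framebufferId);
        GLES20.glFramebufferTexture2D(GLES20.GL_FRAMEBUFFER,
                GLES20.GL_COLOR_ATTACHMENT0, GLES20.GL_TEXTURE_2D,
                textureId, 0);
    }

    // New strokes are drawn while this is bound; the texture accumulates them.
    void bind() {
        GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, framebufferId);
    }

    // On redraw: bind the default framebuffer and draw a fullscreen quad
    // sampling textureId with a trivial pass-through shader (not shown).
    void unbind() {
        GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, 0);
    }
}
```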
Pros:
Works on any device, using only standard ES 2.0 features.
Easy to implement.
Cons:
Requires a copy of the framebuffer on every redraw.
Single Buffering
EGL, which is the underlying API to connect OpenGL to the window system in Android, does have attributes to create single buffered surfaces. While single buffered rendering is rarely advisable, your use case is one of the few where it could still be considered.
While the API definition exists, the documentation specifies support as optional:
Client APIs may not be able to respect the requested rendering buffer. To determine the actual buffer being rendered to by a context, call eglQueryContext.
I have never tried this myself, so I have no idea how widespread support is, or if it's supported on Android at all. The following sketches how it could be implemented if you want to try it out:
If you derive from GLSurfaceView for your OpenGL rendering, you need to provide your own EGLWindowSurfaceFactory, which would look something like this:
class SingleBufferFactory implements GLSurfaceView.EGLWindowSurfaceFactory {
    public EGLSurface createWindowSurface(EGL10 egl, EGLDisplay display,
                                          EGLConfig config, Object nativeWindow) {
        int[] attribs = {EGL10.EGL_RENDER_BUFFER, EGL10.EGL_SINGLE_BUFFER,
                         EGL10.EGL_NONE};
        return egl.eglCreateWindowSurface(display, config, nativeWindow, attribs);
    }

    public void destroySurface(EGL10 egl, EGLDisplay display, EGLSurface surface) {
        egl.eglDestroySurface(display, surface);
    }
}
Then in your GLSurfaceView subclass constructor, before calling setRenderer():
setEGLWindowSurfaceFactory(new SingleBufferFactory());
Pros:
Can draw directly to primary framebuffer, no need for copies.
Cons:
May not be supported on some or all devices.
Single buffered rendering may be inefficient.
Use EGL_BUFFER_PRESERVED
The EGL API allows you to specify a surface attribute that requests the buffer content to be preserved on eglSwapBuffers(). This is not available in the EGL10 interface, though. You'll have to use the EGL14 interface, which requires at least API level 17.
To set this, use:
EGL14.eglSurfaceAttrib(EGL14.eglGetCurrentDisplay(),
                       EGL14.eglGetCurrentSurface(EGL14.EGL_DRAW),
                       EGL14.EGL_SWAP_BEHAVIOR, EGL14.EGL_BUFFER_PRESERVED);
You should be able to place this in the onSurfaceCreated() method of your GLSurfaceView.Renderer implementation.
This is supported on some devices, but not on others. You can query whether it's supported by reading the EGL_SURFACE_TYPE attribute of the config and checking it against the EGL_SWAP_BEHAVIOR_PRESERVED_BIT bit. Or you can make this part of your config selection.
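The capability check could be sketched like this (EGL14, so API level 17+; the helper class name is mine, and display/config are assumed to come from your existing EGL setup):

```java
// Sketch: does the chosen EGLConfig support preserved swap behavior?
import android.opengl.EGL14;
import android.opengl.EGLConfig;
import android.opengl.EGLDisplay;

class EglCaps {
    static boolean supportsPreserve(EGLDisplay display, EGLConfig config) {
        int[] surfaceType = new int[1];
        EGL14.eglGetConfigAttrib(display, config,
                EGL14.EGL_SURFACE_TYPE, surfaceType, 0);
        // The bit is set only if the config can preserve the color buffer
        // across eglSwapBuffers().
        return (surfaceType[0] & EGL14.EGL_SWAP_BEHAVIOR_PRESERVED_BIT) != 0;
    }
}
```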
Pros:
Can draw directly to primary framebuffer, no need for copies.
Can still use double/triple buffered rendering.
Cons:
Only supported on subset of devices.
Conclusion
I would probably check for EGL_BUFFER_PRESERVED support on the specific device, and use it if it is supported. Otherwise, go for the FBO and blit approach.
Related
I am trying to generate a movie using MediaMuxer. The Grafika example is an excellent effort, but when I try to extend it, I have some problems.
I am trying to draw some basic shapes like squares, triangles, and lines into the movie. My OpenGL code works well when I draw the shapes to the screen, but I couldn't draw the same shapes into the video.
I also have questions about setting up the OpenGL matrix, program, shader, and viewport. Normally there are methods like onSurfaceCreated and onSurfaceChanged where I can set these things up. What is the best way to do it in GeneratedMovie?
Any examples of writing more complicated shapes into a video would be welcome.
The complexity of what you're drawing shouldn't matter. You draw whatever you're going to draw, then call eglSwapBuffers() to submit the buffer. Whether you draw one flat-shaded triangle or 100K super-duper-shaded triangles, you're still just submitting a buffer of data to the video encoder or the surface compositor.
There is no equivalent to SurfaceView's surfaceCreated() and surfaceChanged(), because the Surface is created by MediaCodec#createInputSurface() (so you know when it's created), and the Surface does not change.
The code that uses GeneratedMovie does some fairly trivial rendering (set scissor rect, call clear). The code in RecordFBOActivity is what you should probably be looking at -- it has a bouncing rect and a spinning triangle, and demonstrates three different ways to deal with the fact that you have to render twice.
(The code in HardwareScalerActivity uses the same GLES routines and demonstrates texturing, but it doesn't do recording.)
The key thing is to manage your EGLContext and EGLSurfaces carefully. The various bits of GLES state are held in the EGLContext, which can be current on only one thread at a time. It's easiest to use a single context and set up a separate EGLSurface for each Surface, but you can also create separate contexts (with or without sharing) and switch between them.
Some additional background material is available here.
I was trying to render a Rubik's cube with OpenGL ES on Android. Here is how I do it: I render 27 adjacent cubes. The faces of the cubes that are hidden are textured with a black bmp picture, and the faces that can be seen are textured with colorful pictures. I used face culling and depth testing to avoid rendering useless faces. But look at what I got; it is pretty weird. The black faces show up sometimes. Can anyone tell me how to get rid of the artifacts?
Screenshots:
With the benefit of screenshots, it looks like depth buffering simply isn't having any effect. Would it be safe to conclude that you render the side of the cube with the blue faces first, then the central section behind it, then the back face?
I'm slightly out of my depth with the Android stuff but I think the confusion is probably just that enabling the depth test within OpenGL isn't sufficient. You also have to ensure that a depth buffer is allocated.
Probably you have a call to setEGLConfigChooser that's disabling the depth buffer. There are a bunch of overloaded variants of that method, but the single-boolean version and the one that lets you specify redSize, greenSize, etc. give you explicit control over the depth buffer size. So you'll want to check those.
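For example, a minimal fix could look like this, assuming glSurfaceView and renderer are your own objects. The sizes here (RGB 8/8/8, alpha 0, depth 16, stencil 0) are one reasonable choice, not a requirement:

```java
// Sketch: explicitly request a 16-bit depth buffer, before setRenderer().
glSurfaceView.setEGLConfigChooser(8, 8, 8, 0, 16, 0);
glSurfaceView.setRenderer(renderer);
```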
If you're creating your framebuffer explicitly then make sure you are attaching a depth renderbuffer.
I'm trying to figure a way to recreate or at least have a similar result to the max clamp blend equation in OpenGl es 2.0 on Android devices.
Unfortunately, glBlendEquation(GL_MAX_EXT) is not supported on Android. The GL_MAX enum is defined in the GL header on Android, but executing it results in a GL_INVALID_ENUM (0x0500) error.
I have a solution using shaders and off screen textures where each render ping-pongs back and forth between textures using the shader to calculate the max pixel value.
However, this solution isn't fast enough for any real time execution on most Android devices.
So given this limitation, is there any way to recreate a similar result using just different blend equations and blend factors?
I have tried many blend function combinations, the closest have been:
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA): This comes close, but textures become too transparent. Textures with low alpha values are difficult to see.
glBlendFunc(GL_ONE_MINUS_DST_ALPHA, GL_SRC_ALPHA): This also comes somewhat close, but the alpha accumulates too much and the colors become darker than intended.
If you could do GL_MAX blending without needing a special blend function... OpenGL would never have added it in the first place. So your options are to do without or to use your shader method.
I was reading this article, and the author writes:
Here's how to write high-performance applications on every platform in two easy steps:
[...]
Follow best practices. In the case of Android and OpenGL, this includes things like "batch draw calls", "don't use discard in fragment shaders", and so on.
I have never before heard that discard would have a bad impact on performance or such, and have been using it to avoid blending when a detailed alpha hasn't been necessary.
Could someone please explain why and when using discard might be considered bad practice, and how discard + depth test compares with alpha + blend?
Edit: After receiving an answer to this question, I did some testing by rendering a background gradient with a textured quad on top of it.
Using GL_DEPTH_TEST and a fragment shader ending with the line "if (gl_FragColor.a < 0.5) { discard; }" gave about 32 fps.
Removing the if/discard statement from the fragment shader increased the rendering speed to about 44 fps.
Using GL_BLEND with the blend function (GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) instead of GL_DEPTH_TEST also resulted in around 44 fps.
It's hardware-dependent. For PowerVR hardware, and other GPUs that use tile-based rendering, using discard means that the TBR can no longer assume that every fragment drawn will become a pixel. This assumption is important because it allows the TBR to evaluate all the depths first, then only evaluate the fragment shaders for the top-most fragments. A sort of deferred rendering approach, except in hardware.
Note that you would get the same issue from turning on alpha test.
"discard" is bad for every mainstream graphics acceleration technique - IMR, TBR, TBDR. This is because visibility of a fragment(and hence depth) is only determinable after fragment processing and not during Early-Z or PowerVR's HSR (hidden surface removal) etc. The further down the graphics pipeline something gets before removal tends to indicate its effect on performance; in this case more processing of fragments + disruption of depth processing of other polygons = bad effect
If you must use discard, make sure that only the triangles that need it are rendered with a shader containing it, and, to minimise its effect on overall rendering performance, render your objects in the order: opaque, discard, blended.
Incidentally, only PowerVR hardware determines visibility in the deferred step (hence it's the only GPU termed as "TBDR"). Other solutions may be tile-based (TBR), but are still using Early Z techniques dependent on submission order like an IMR does.
TBRs and TBDRs do blending on-chip (faster, less power-hungry than going to main memory) so blending should be favoured for transparency. The usual procedure to render blended polygons correctly is to disable depth writes (but not tests) and render tris in back-to-front depth order (unless the blend operation is order-independent). Often approximate sorting is good enough. Geometry should be such that large areas of completely transparent fragments are avoided. More than one fragment still gets processed per pixel this way, but HW depth optimisation isn't interrupted like with discarded fragments.
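The usual procedure described above could be sketched like this in ES 1.x, matching the question's GL10 interface (the gl variable is assumed to be your renderer's GL10 instance):

```java
// Sketch: standard state for a sorted transparency pass after opaque geometry.
gl.glEnable(GL10.GL_BLEND);
gl.glBlendFunc(GL10.GL_SRC_ALPHA, GL10.GL_ONE_MINUS_SRC_ALPHA);
gl.glDepthMask(false);   // keep depth *tests* on, disable depth *writes*
// ... draw transparent triangles in back-to-front order here ...
gl.glDepthMask(true);
gl.glDisable(GL10.GL_BLEND);
```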
Also, just having an "if" statement in your fragment shader can cause a big slowdown on some hardware. (Specifically, GPUs that are heavily pipelined, or that do single instruction/multiple data, will have big performance penalties from branch statements.) So your test results might be a combination of the "if" statement and the effects that others mentioned.
(For what it's worth, testing on my Galaxy Nexus showed a huge speedup when I switched to depth-sorting my semitransparent objects and rendering them back to front, instead of rendering in random order and discarding fragments in the shader.)
Object A is in front of Object B. Object A has a shader using 'discard'. As such, I can't do 'Early-Z' properly, because I need to know which sections of Object B will be visible through Object A. This means that Object A has to pass all the way through the processing pipeline, until almost the last moment (until fragment processing is performed), before I can determine whether Object B is actually visible.
This is bad for HSR and 'Early-Z', as potentially occluded objects have to sit and wait for the depth information to be updated before they can be processed. As stated above, it's bad for everyone; or, to put it in a slightly more friendly way: "Friends don't let friends use discard".
In your test, the if statement executes at the per-fragment level:
if (gl_FragColor.a < 0.5) { discard; }
This would be processed once per fragment being rendered (pretty sure that's per fragment and not per texel).
If your if statement were testing a uniform or a constant instead, you'd most likely get a different result, since constants are only evaluated once at compile time and uniforms once per update.
I'm working on implementing picking for an OpenGL game I'm writing for Android. It uses the "unique color" method: each touchable object is drawn in a solid color that is unique to it, and the input handler then reads the pixel at the touch location with glReadPixels(). I've gotten the coloring and glReadPixels working, but I have been unable to separate the "color" rendering from the main, actual rendering, which complicates the use of glReadPixels.
Supposedly the trick is to render the second scene (the one used for input) into an offscreen buffer, but this seems to be a bit problematic. I've investigated using OpenGL ES 1.1 FBOs to act as an offscreen buffer, but it seems my handset (Samsung Galaxy S Vibrant, Android 2.2) does not support FBOs. I'm at a loss for how to correctly render this scene (and run glReadPixels on it) without the user seeing it.
Any ideas how offscreen rendering of this sort can be done?
If FBOs are not supported, you can always resort to rendering to your normal back buffer.
Typical usage would be:
Clear the back buffer
Draw the "color-as-id" objects
Read back the touched pixel with glReadPixels()
Clear the back buffer again
Draw normally
SwapBuffers
The second clear will make sure the picking pass does not show up in the final image.
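The color-as-id encoding itself is plain arithmetic, independent of any GL calls. A minimal sketch, assuming an RGBA8 framebuffer (the class and method names here are my own):

```java
// Sketch: pack an object id into an RGB color for the picking pass, and
// recover it from the bytes returned by glReadPixels (RGBA8 assumed).
public class PickColor {
    // Encode id (0 .. 2^24-1) as {r, g, b}, each component 0..255.
    public static int[] encode(int id) {
        return new int[] { (id >> 16) & 0xFF, (id >> 8) & 0xFF, id & 0xFF };
    }

    // Decode from the first pixel of a glReadPixels RGBA byte array.
    // Java bytes are signed, hence the & 0xFF masking.
    public static int decode(byte[] rgba) {
        return ((rgba[0] & 0xFF) << 16)
             | ((rgba[1] & 0xFF) << 8)
             |  (rgba[2] & 0xFF);
    }
}
```

With 24 bits of id space there is no practical risk of collisions, but make sure dithering and blending are disabled during the picking pass so the read-back color is exact.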