I have an Android OpenGL ES application with two threads. Thread 1, the "display thread", blends its current texture with a texture coming from Thread 2, the "worker thread". Thread 2 performs off-screen rendering (render to texture), and then Thread 1 combines this texture with its own texture to generate the frame that is displayed to the user.
I have a working solution, but I know it is inefficient and am trying to improve on it. In its onSurfaceCreated() method, Thread 1 creates two textures. Thread 2, in its draw method, does a glReadPixels() into a ByteBuffer (let's refer to it as bb). Thread 2 then signals to Thread 1 that a new frame is ready, at which point Thread 1 invokes glTexSubImage2D(bb) to update its texture with the new data from Thread 2 and proceeds with its blending to generate a new frame.
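In outline, the hand-off looks like this (a simplified sketch; signalFrameReady() stands in for my actual notification mechanism, and the buffer allocation is really done once up front):

// Thread 2, after its off-screen render:
ByteBuffer bb = ByteBuffer.allocateDirect(width * height * 4).order(ByteOrder.nativeOrder());
GLES20.glReadPixels(0, 0, width, height, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, bb);
signalFrameReady(); // hypothetical cross-thread notification

// Thread 1, once signaled:
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, workerTexId);
GLES20.glTexSubImage2D(GLES20.GL_TEXTURE_2D, 0, 0, 0, width, height,
        GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, bb);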
This architecture works better on some Android devices than others, and I was able to gain a slight improvement in performance by using PBOs. I figured that by using so-called "direct textures" via the EGLImage extension (https://software.intel.com/en-us/articles/using-opengl-es-to-accelerate-apps-with-legacy-2d-guis) I would gain some benefit by removing the need for the costly glTexSubImage2D() call. I'd still have the glReadPixels() call, which still bothers me, but at least I should measure some improvement. In fact, at least on a Samsung Galaxy Tab S (Mali T628 GPU), my new code is dramatically slower than before! How can this be?
In the new code Thread 1 instantiates the EGLImage object by using gralloc and proceeds to bind it to a texture:
// note gbuffer::create() is a wrapper around gralloc
buffer = gbuffer::create(width, height, gbuffer::FORMAT_RGBA_8888);
EGLClientBuffer anb = buffer->getNativeBuffer();
// EGL_NO_CONTEXT is required when creating an image from an Android native buffer
EGLImageKHR pEGLImage = _eglCreateImageKHR(eglGetCurrentDisplay(), EGL_NO_CONTEXT, EGL_NATIVE_BUFFER_ANDROID, anb, attrs);
glBindTexture(GL_TEXTURE_2D, texid); // texid from glGenTextures(...)
// bind the EGLImage as the texture's storage ("direct texture")
_glEGLImageTargetTexture2DOES(GL_TEXTURE_2D, pEGLImage);
Then Thread 2, in its main loop, does its off-screen render-to-texture work and essentially pushes the data back over to Thread 1 via glReadPixels(), with the destination address being the backing storage behind the EGLImage:
void* vaddr = buffer->lock();
glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, vaddr);
buffer->unlock();
How can this be slower than glReadPixels() into a ByteBuffer followed by glTexSubImage2D() from that ByteBuffer? I'm also interested in alternative techniques, as I am not limited to OpenGL ES 2.0 and can use OpenGL ES 3.0. I have tried FBOs but ran into some issues.
In response to the first answer, I decided to take a stab at a different approach: sharing textures between Thread 1 and Thread 2. While I don't have the sharing part working yet, I do have Thread 1's EGLContext passed down to Thread 2's EGLContext, so that in theory Thread 2 can share textures with Thread 1. With these changes in place, and with the glReadPixels() and glTexSubImage2D() calls still present, the app works but is far slower than before. Strange.
The other oddity I uncovered is the difference between javax.microedition.khronos.egl.EGLContext and android.opengl.EGLContext. GLSurfaceView exposes a method setEGLContextFactory(), which allows me to pass Thread 1's EGLContext to Thread 2, as in the following:
public class Thread1SurfaceView extends GLSurfaceView {
    public Thread1SurfaceView(Context context) {
        super(context);
        // here is how I pass Thread 1's EGLContext to Thread 2
        setEGLContextFactory(new EGLContextFactory() {
            @Override
            public javax.microedition.khronos.egl.EGLContext createContext(
                    final javax.microedition.khronos.egl.EGL10 egl,
                    final javax.microedition.khronos.egl.EGLDisplay display,
                    final javax.microedition.khronos.egl.EGLConfig eglConfig) {
                // Configure context for OpenGL ES 3.0.
                int[] attrib_list = {EGL14.EGL_CONTEXT_CLIENT_VERSION, 3, EGL14.EGL_NONE};
                javax.microedition.khronos.egl.EGLContext renderContext =
                        egl.eglCreateContext(display, eglConfig, EGL10.EGL_NO_CONTEXT, attrib_list);
                mThread2 = new Thread2(renderContext);
                return renderContext; // the factory must return the context
            }

            @Override
            public void destroyContext(
                    final javax.microedition.khronos.egl.EGL10 egl,
                    final javax.microedition.khronos.egl.EGLDisplay display,
                    final javax.microedition.khronos.egl.EGLContext context) {
                egl.eglDestroyContext(display, context); // required by the interface
            }
        });
    }
}
Previously I used the EGL14 classes, but since the GLSurfaceView interface relies on EGL10, I had to change the implementation for Thread 2: everywhere I used EGL14 I replaced it with javax.microedition.khronos.egl.EGL10. Then my shaders stopped compiling until I added the GLES 3.0 client version to the attribute list. Now things work, albeit slower than before (but next I will remove the calls to glReadPixels() and glTexSubImage2D()).
My follow-on question is: is this the right way to handle the javax.microedition.khronos.egl.* versus android.opengl.* issue? Can I typecast javax.microedition.khronos.egl.EGL10 to android.opengl.EGL14, javax.microedition.khronos.egl.EGLDisplay to android.opengl.EGLDisplay, and javax.microedition.khronos.egl.EGLContext to android.opengl.EGLContext? What I have right now seems ugly and doesn't feel right, but this proposed casting doesn't sit right either. Am I missing something?
Based on the description of what you're trying to do, both approaches sound way more complicated and inefficient than necessary.
The way I've always understood EGLImage, it's a mechanism for sharing images between different processes, and possibly different APIs.
For multiple OpenGL ES contexts in the same process, you can simply share the textures. All you need to do is make both contexts part of the same share group, and they can both use the same textures. In your use case, you can then have one thread rendering to a texture using an FBO, and then sample from it in the other thread. This way, there is no extra data copying, and your code should become much simpler.
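For reference, the render-to-texture side is just the standard FBO setup; a minimal sketch (error checking omitted; sharedTex is assumed to have been created beforehand with glGenTextures/glTexImage2D):

// In the worker thread's context: render into sharedTex via an FBO
int[] fbo = new int[1];
GLES20.glGenFramebuffers(1, fbo, 0);
GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, fbo[0]);
GLES20.glFramebufferTexture2D(GLES20.GL_FRAMEBUFFER, GLES20.GL_COLOR_ATTACHMENT0,
        GLES20.GL_TEXTURE_2D, sharedTex, 0);
// ... issue draw calls here ...
GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, 0);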
The only slightly tricky aspect is synchronization. ES 2.0 does not have synchronization mechanisms in the API. The best you can probably do is call glFinish() in one thread (e.g. after your thread 2 finished rendering to the texture), and then use standard IPC mechanisms to signal the other thread. ES 3.0 has sync objects, which make this much more elegant.
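With ES 3.0 the hand-off can look roughly like this (a sketch; how you deliver the sync handle between threads is up to you):

// Worker thread, after rendering to the shared texture:
long sync = GLES30.glFenceSync(GLES30.GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
GLES30.glFlush(); // ensure the fence is submitted to the GPU
// hand 'sync' to the display thread via your usual signaling

// Display thread, before sampling the texture:
GLES30.glWaitSync(sync, 0, GLES30.GL_TIMEOUT_IGNORED);
GLES30.glDeleteSync(sync);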
My earlier answer here sketches some of the steps needed to create multiple contexts that are in the same share group: about opengles and texture on android. The key part of creating multiple contexts in the same share group is the 3rd argument to eglCreateContext, where you specify the context that you want to share objects with.
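In code that boils down to a single call (sketch in the EGL10 style used elsewhere in this thread; ctx1 is the existing context you want to share with, attrib_list as usual):

EGLContext ctx2 = egl.eglCreateContext(display, config, ctx1, attrib_list);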
Related
I have the following situation:
In a cross-platform rendering library for iOS and Android (written in C(++)), I have two threads that each need their own EGLContext:
Thread A is the main thread; it renders to the Window.
Thread B is a generator thread that does various calculations and renders the results into textures that are later used by thread A.
Since I can't use EGL on iOS, the library uses function pointers to static Obj-C functions to create a new context and set it current.
This already works; I create the context for thread A using:
EAGLContext *contextA = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];
The context for thread B is created using
EAGLContext *contextB = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2 sharegroup:[contextA sharegroup]];
I can then set either of the two current:
[EAGLContext setCurrentContext:context];
To use the same logic (function pointers passed into the library) on Android, I want to do this on the C side of the JNI bindings, this time using real EGL instead of Apple's EAGL.
I can easily create contextA using a WindowSurface and the native Window, and I can create contextB while passing contextA to the shareContext parameter of the eglCreateContext call.
But when I want to make contextB current, I have to pass a surface to the eglMakeCurrent call, and I'm trying to figure out what kind of surface to pass there.
I cannot use the WindowSurface i use for contextA as the spec says in section 3.7 that "At most one context for each supported client API may be current to a particular thread at a given time, and at most one context may be bound to a particular surface at a given time."
I cannot specify EGL_NO_SURFACE, because that would result in an EGL_BAD_MATCH error in the eglMakeCurrent call.
It seems I could use a pbuffer surface, but I hesitate because I'd have to specify the width and height when creating it, and thread B might want to create textures of different sizes. In addition, the "OpenGL ES 2.0 Programming Guide" by Munshi, Ginsburg, and Shreiner states in section 3.8 that "Pbuffers are most often used for generating texture maps. If all you want to do is render to a texture, we recommend using framebuffer objects [...] instead of pbuffers because they are more efficient", which is exactly what I want to do in thread B.
I don't understand what Munshi, Ginsburg, and Shreiner mean by that sentence; how would a framebuffer object be a replacement for a pbuffer surface? What if I create a very small (say 1x1 px) pbuffer surface to make the context current, can I then still render into arbitrarily large FBOs? Are there any other possibilities I'm not yet aware of?
Thanks a lot for your help!
The surface you pass to eglMakeCurrent() must be an EGL surface from eglCreateWindowSurface(). For example:
EGLSurface EglSurface = mEgl.eglCreateWindowSurface(mEglDisplay, maEGLconfigs[0], surfaceTexture, null);
mEgl.eglMakeCurrent(mEglDisplay, EglSurface, EglSurface, mEglContext);
However, eglCreateWindowSurface() requires a SurfaceTexture. One is provided to the onSurfaceTextureAvailable() callback when a TextureView is created, but you can also create off-screen SurfaceTextures without any View.
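A view-less SurfaceTexture can be created straight from a GL texture name, along these lines (sketch):

int[] tex = new int[1];
GLES20.glGenTextures(1, tex, 0);
SurfaceTexture surfaceTexture = new SurfaceTexture(tex[0]); // no View involved
surfaceTexture.setDefaultBufferSize(width, height);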
There is an example app that uses TextureView in the Android SDK here, although it uses the SurfaceTexture for camera video rather than OpenGL ES rendering:
sources\android-17\com\android\test\hwui\GLTextureViewActivity.java
By default, the EGL surface for FBOs will have the same size as the SurfaceTexture they were created from. You can change the size of a SurfaceTexture with:
surfaceTexture.setDefaultBufferSize(width, height);
Don't use pbuffers on Android because some platforms (Nvidia Tegra) do not support them.
This article explains the advantages of FBOs over pbuffers in detail:
http://processors.wiki.ti.com/index.php/Render_to_Texture_with_OpenGL_ES
I ended up using a pbuffer surface (sized 1x1); I then create an FBO and render into textures just fine. For displaying them (in a different thread and a different, shared OpenGL context), I use a window surface with an ANativeWindow (there's a sample of that somewhere in the SDK).
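The pbuffer part is only a few lines (sketch using the javax.microedition.khronos.egl API seen elsewhere in this thread; egl, display, and config are the same instances used to create contextB):

int[] pbufferAttribs = { EGL10.EGL_WIDTH, 1, EGL10.EGL_HEIGHT, 1, EGL10.EGL_NONE };
EGLSurface pbuffer = egl.eglCreatePbufferSurface(display, config, pbufferAttribs);
egl.eglMakeCurrent(display, pbuffer, pbuffer, contextB);
// FBO rendering in contextB can now target textures of any supported size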
If drawing to an FBO is the only thing you want to do, you can grab any EGLContext that was already created by you or someone else (e.g. by GLSurfaceView), make it current, and then just generate your FBO and draw with it.
The problem is how to share the context created by, say, GLSurfaceView with your cross-platform C++ library. I did this by calling a static function inside the C++ library to grab the EGL context and surface immediately after the context has been made current by the Java layer, like the code below:
//This is a GLSurfaceView renderer method
@Override
public void onSurfaceCreated(GL10 gl, EGLConfig config) {
    // up to this point, we know the EGLContext
    // has already been set current on this thread,
    // so call the JNI function to capture it
    graphics_library_setup_context();
}
The C++ counterpart:
void setup_context() {
    context = eglGetCurrentContext();
    display = eglGetCurrentDisplay();
    surface = eglGetCurrentSurface(EGL_DRAW);
}
I am trying to batch draw a bunch of lines on Android using OpenGL ES 2.0 and I need to know the best way to do this.
Right now I made a class called LineEngine which builds up a FloatBuffer of all the vertices to draw and then draws all the lines at once. The problem is that apparently FloatBuffer.put() is very slow and is gobbling up CPU time like crazy.
Here is my class
public class LineEngine {
    private static final float[] IDENTITY = new float[16];
    private FloatBuffer mLinePoints;
    private FloatBuffer mLineColors;
    private int mCount;

    public LineEngine(int maxLines) {
        Matrix.setIdentityM(IDENTITY, 0);
        // 2 vertices per line, 4 floats per vertex, 4 bytes per float
        ByteBuffer byteBuf = ByteBuffer.allocateDirect(maxLines * 2 * 4 * 4);
        byteBuf.order(ByteOrder.nativeOrder());
        mLinePoints = byteBuf.asFloatBuffer();
        byteBuf = ByteBuffer.allocateDirect(maxLines * 2 * 4 * 4);
        byteBuf.order(ByteOrder.nativeOrder());
        mLineColors = byteBuf.asFloatBuffer();
        reset();
    }

    public void addLine(float[] position, float[] color) {
        mLinePoints.put(position, 0, 8); //These lines
        mLineColors.put(color, 0, 4);    // are taking
        mLineColors.put(color, 0, 4);    // the longest!
        mCount++;
    }

    public void reset() {
        mLinePoints.position(0);
        mLineColors.position(0);
        mCount = 0;
    }

    public void draw() {
        mLinePoints.position(0);
        mLineColors.position(0);
        GraphicsEngine.setMMatrix(IDENTITY);
        GraphicsEngine.setColors(mLineColors);
        GraphicsEngine.setVertices4d(mLinePoints);
        GraphicsEngine.disableTexture();
        GLES20.glDrawArrays(GLES20.GL_LINES, 0, mCount * 2);
        GraphicsEngine.disableColors();
        reset();
    }
}
Is there a better way to batch all these lines together?
What you are trying to do is called sprite batching. If you want your program to be robust in OpenGL ES, check the following (a list of what makes a program slow):
- Changing too many OpenGL ES states each frame.
- Enabling/disabling things (textures, etc.) again and again. Once you enable something you don't have to do it again; it stays applied for the frame.
- Using too many individual assets instead of sprite sheets. Yes, that makes a huge performance impact. For example, if you have 10 images, you have to load a different texture for each image, and that is SLOW. A better way is to create a sprite sheet (you can check Google for free sprite sheet creators).
- Your images are high quality. Check your resolution and file format; prefer .png files.
- Watch out for the API level 8 FloatBuffer bug (you have to use an IntBuffer instead to work around it).
- Using Java containers. This is the biggest pitfall: if you use string concatenation (that's not a container, but it returns a new String each time) or a List or any other container that returns a new instance, chances are your program will be slow due to garbage collection. For input handling, I would suggest you look up a technique called the pool class; its purpose is to recycle objects instead of creating new ones. The garbage collector is the big enemy; try to keep it happy and avoid any programming technique that might invoke it.
- Never load things on the fly; instead, make a loader class and load all the necessary assets at the beginning of the app.
If you implement this list, chances are your program will be robust.
One last addition. What is a sprite batcher? A sprite batcher is a class that uses a single texture to render multiple objects. It automatically creates the vertices, indices, colors, and u-v coordinates and renders them as a batch. This pattern saves GPU as well as CPU power. From your posted code I can't be sure what causes the CPU to slow down, but from my experience it is due to one (or more) items on the list above. Check the list, follow it, search Google for sprite batching, and I am sure your program will run fast. Hope I helped!
Edit: I think I found what causes your program to slow down: you don't flip the buffer! You just reset the position, keep adding more data, and overload the buffer. In the reset() method, flip the buffer instead: mLinePoints.flip() and mLineColors.flip() will do the job. Make sure you call them each frame if you send new vertices each frame.
FloatBuffer.put(float[]) with a relatively large float array should be considerably faster. The single put(float) calls have plenty of overhead.
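For example, you could stage everything in a plain float[] and copy it into the buffer with one large put per frame; a sketch of the idea (field names mirror the posted class, the color side would be handled the same way):

// staging array, allocated once; 'used' counts floats written this frame
private final float[] scratch = new float[maxLines * 2 * 4];
private int used = 0;

public void addLine(float[] position, float[] color) {
    System.arraycopy(position, 0, scratch, used, 8);
    used += 8;
    // ... colors staged the same way ...
}

public void draw() {
    mLinePoints.position(0);
    mLinePoints.put(scratch, 0, used); // one bulk put instead of many small ones
    used = 0;
    // ... existing draw code ...
}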
Just go for a very simple native function which is called from your class. You can pass a float[] to OpenGL directly; no need to kill CPU time with a silly buffer interface.
I'm working on an app that needs to load textures for frame animations at certain times while it's executing. The rendering thread needs to continue to run, and I need to load the textures in a background thread. Is there a way to do this on Android? I was able to on iOS by creating a separate OpenGL context on the other thread that used the same sharegroup, but am not sure if there is a similar facility on Android.
Yes, you can share textures between contexts (as long as your driver supports it). Create your background loading context like this (meaning you want to share objects with rendering_context):
eglCreateContext(display, config, rendering_context, attrs);
Then after doing something like this in your background context:
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexImage2D(...);
You can then bind and use tex from your rendering context.
I am a developer of an Android game.
I have created a GLSurfaceView and draw in its onDrawFrame(GL10 gl) method, like below:
public void onDrawFrame(GL10 gl)
{
    frame_limit_wait();
    game_logic();
    draw_game();
}
Everything is good, but one thing is strange: every so often a frame coincides with a GC_EXPLICIT (paused 92 ms) log entry and the game pauses briefly. In a regular application that's OK, but not in a smooth game.
In the original game I used a SurfaceView with an update thread, and it was smooth.
If I add a call like System.gc() below draw_game(), it seems to work, but it still feels a bit slow.
Compared to other game engines, my engine is running slowly.
How do I resolve the latency problem?
Edit: I have solved the issue.
Just initialize the native FloatBuffer once, and use put() and position(0) to modify the buffer's contents.
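The reuse pattern is roughly (sketch; capacityBytes and vertexData are illustrative):

// allocate once (e.g. in the constructor):
FloatBuffer buf = ByteBuffer.allocateDirect(capacityBytes)
        .order(ByteOrder.nativeOrder())
        .asFloatBuffer();

// each frame: overwrite in place, no new allocations
buf.position(0);
buf.put(vertexData);  // vertexData is a reused float[]
buf.position(0);      // rewind before handing to GL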
The garbage collector is running and holding up your frame. I would recommend taking a close look at the code that executes in frame_limit_wait(), game_logic(), and draw_game(), and doing everything possible to avoid creating new objects.
The most common techniques for this include:
use fields instead of local variables (see the sketch after this list)
use for (int i = 0; i < x; i++) instead of for (variable : list), which allocates an iterator
make sure no extra processing is done while drawing frames.
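A small sketch of the first technique (the field replaces a per-frame allocation; names are illustrative):

private final float[] mTmpMatrix = new float[16]; // allocated once, reused every frame

public void onDrawFrame(GL10 gl) {
    // BAD: float[] m = new float[16]; creates garbage on every frame
    Matrix.setIdentityM(mTmpMatrix, 0); // GOOD: reuses the preallocated field
    // ... rest of the frame ...
}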
If that still doesn't work then you can try offloading some processing to a separate thread or consider using the NDK and writing native code. With C++ you don't have to worry about the GC at all. However both of these methods will complicate your code quite a bit.
Try running your game_logic() and draw_game() code in a separate thread, so that while it is waiting in frame_limit_wait(), the GC can be serviced more gracefully.
I mean: instead of using the main thread, try running your rendering/update code in a separate thread.
I have a large number of textures in JPG format, and I need to preload them into OpenGL memory before the actual drawing starts.
I've asked a question and I've been told that the way to do this is to separate JPEG unpacking from glTexImage2D(...) calls to another thread.
The problem is I'm not quite sure how to do this.
The OpenGL handle needed to execute glTexImage2D is only available in GLSurfaceView.Renderer's onSurfaceCreated and onDrawFrame methods.
I can't unpack all my textures and then load them into OpenGL in onSurfaceCreated(...),
because they probably won't fit in the VM's limited memory (20-40 MB?).
That means I have to unpack and load them one by one, but in that case I can't get at the OpenGL context.
Could someone please give me an example of threaded texture loading for an OpenGL game?
It must be a typical procedure, but I can't find any info anywhere.
As explained in 'OpenGLES preloading textures in other thread', there are two separate steps: bitmap creation and bitmap upload. In most cases you should be fine by just doing the bitmap creation on a secondary thread, which is fairly easy.
If you experience frame drops while uploading the textures, call texImage2D from a background thread as well. To do so you'll need to create a new OpenGL context which shares its textures with your rendering thread, because each thread needs its own OpenGL context.
EGLContext textureContext = egl.eglCreateContext(display, eglConfig, renderContext, null);
Getting the parameters for eglCreateContext is a little bit tricky. You need to use setEGLContextFactory on your SurfaceView to hook into the EGLContext creation:
@Override
public EGLContext createContext(final EGL10 egl, final EGLDisplay display, final EGLConfig eglConfig) {
    EGLContext renderContext = egl.eglCreateContext(display, eglConfig, EGL10.EGL_NO_CONTEXT, null);
    // create your texture context here
    return renderContext;
}
Then you are ready to start a texture loading thread:
public void run() {
    int[] pbufferAttribs = { EGL10.EGL_WIDTH, 1, EGL10.EGL_HEIGHT, 1,
            EGL14.EGL_TEXTURE_TARGET, EGL14.EGL_NO_TEXTURE,
            EGL14.EGL_TEXTURE_FORMAT, EGL14.EGL_NO_TEXTURE,
            EGL10.EGL_NONE };
    EGLSurface localSurface = egl.eglCreatePbufferSurface(display, eglConfig, pbufferAttribs);
    egl.eglMakeCurrent(display, localSurface, localSurface, textureContext);
    int textureId = loadTexture(R.drawable.waterfalls);
    // here you can pass the textureId to your
    // render thread to be used with glBindTexture
}
I've created a working demonstration of the above code snippets at https://github.com/perpetual-mobile/SharedGLContextsTest.
This solution is based on many sources around the internet. The most influential ones were these three:
http://www.khronos.org/message_boards/showthread.php/9029-Loading-textures-in-a-background-thread-on-Android
http://www.khronos.org/message_boards/showthread.php/5843-Texture-Sharing
Why is eglMakeCurrent() failing with EGL_BAD_MATCH?
You just have your main thread with the uploading routine, that has access to OpenGL and calls glTexImage2D. The other thread loads (and decodes) the image from file to memory. While the secondary thread loads the next image, the main thread uploads the previously loaded image into the texture. So you only need memory for two images, the one currently loaded from file and the one currently uploaded into the GL (which is the one loaded previously). Of course you need a bit of synchronization, to prevent the loader thread from overwriting the memory, that the main thread currently sends to GL and to prevent the main thread from sending unfinished data.
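A minimal sketch of that two-slot hand-off (all names are illustrative; java.util.concurrent classes would work just as well):

private Bitmap pending;                   // slot filled by the loader thread
private final Object lock = new Object();

// Loader thread:
Bitmap decoded = BitmapFactory.decodeResource(getResources(), resId);
synchronized (lock) {
    while (pending != null) {             // don't overwrite an unconsumed image
        try { lock.wait(); } catch (InterruptedException e) { return; }
    }
    pending = decoded;                    // publish the finished image
}

// GL thread, e.g. at the top of onDrawFrame (target texture already bound):
synchronized (lock) {
    if (pending != null) {
        GLUtils.texImage2D(GLES20.GL_TEXTURE_2D, 0, pending, 0);
        pending.recycle();
        pending = null;
        lock.notifyAll();                 // the loader may reuse the slot
    }
}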
"There has to be a way to call GL functions outside of the initialization function." - Yes. Just copy the pointer to gl and use it anywhere.
"Just be sure to only use OpenGL in the main thread." Very important. You cannot call in your Game Engine (which may be in another thread) a texture-loading function which is not synchronized with the gl-thread. Set there a flag to signal your gl-thread to load a new texture (for example, you can place a function in OnDrawFrame(GL gl) which checks if there must be a new texture loaded.
To add to Rodja's answer, if you want an OpenGL ES 2.0 context, then use the following to create the context:
final int EGL_CONTEXT_CLIENT_VERSION = 0x3098;
int[] contextAttributes =
{ EGL_CONTEXT_CLIENT_VERSION, 2, EGL10.EGL_NONE };
EGLContext renderContext = egl.eglCreateContext(
display, config, EGL10.EGL_NO_CONTEXT, contextAttributes);
You still need to call setEGLContextClientVersion(2) as well, as that is also used by the default config chooser.
This is based on Attribute list in eglCreateContext
Found a solution for this, which is actually very easy: after you load the bitmap (in a separate thread), store it in an instance variable, and in the draw method check whether it's initialized; if so, load the texture. Something like this:
if (bitmap != null && textureId == -1) {
    initTexture(gl, bitmap);
}