I'm confused about Android Surface. From its documentation I understand that Surface is a producer and SurfaceTexture is a consumer, and there are examples that pass a SurfaceTexture to Camera or to MediaPlayer. I assume they (i.e., Camera and MediaPlayer) internally stream images to the SurfaceTexture (the consumer) through a Surface (the producer). But I could not find any example showing how OpenGL streams images directly to a Surface, even though the documentation says OpenGL can also be on the producer side. Here, I'm assuming a flow like "texture drawn by OpenGL -> Surface -> SurfaceTexture".
I'm asking because I'm making an app that does offscreen rendering with an OpenGL FBO and needs to copy the rendered result (i.e., the output texture) to multiple SurfaceTexture destinations.
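For reference, the producer-side flow I have in mind would be set up roughly like this (just a sketch; eglDisplay, eglConfig, eglContext, and consumerTexId are assumed to exist already, and error handling is omitted):

    // Wrap the SurfaceTexture in a Surface and make it the EGL window surface,
    // so that eglSwapBuffers() queues each rendered frame to the SurfaceTexture
    // (the consumer). Uses android.opengl.EGL14 and android.view.Surface.
    SurfaceTexture consumerTexture = new SurfaceTexture(consumerTexId);
    Surface producerSurface = new Surface(consumerTexture);
    int[] surfaceAttribs = { EGL14.EGL_NONE };
    EGLSurface eglSurface = EGL14.eglCreateWindowSurface(
            eglDisplay, eglConfig, producerSurface, surfaceAttribs, 0);
    EGL14.eglMakeCurrent(eglDisplay, eglSurface, eglSurface, eglContext);
    // ... draw the FBO result with GLES here ...
    EGL14.eglSwapBuffers(eglDisplay, eglSurface);  // frame arrives at consumerTexture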
I guess I've misunderstood something. I'd appreciate it if anybody could correct me.
I'm trying to understand graphics memory usage/flow in Android and specifically with respect to encoding frames from the camera using MediaCodec. In order to do that I'm having to understand a bunch of graphics, OpenGL, and Android terminology/concepts that are unclear to me. I've read the Android graphics architecture material, a bunch of SO questions, and a bunch of source but I'm still confused primarily because it seems that terms have different meanings in different contexts.
I've looked at CameraToMpegTest from fadden's site here. My question is how MediaCodec::createInputSurface() works in conjunction with Camera::setPreviewTexture(). It seems that an OpenGL texture is created and then used to create an Android SurfaceTexture, which can then be passed to setPreviewTexture(). My specific questions:
1. What does calling setPreviewTexture() actually do in terms of which memory buffer the frames from the camera go to?
2. From my understanding, an OpenGL texture is a chunk of memory that is accessible by the GPU. On Android this has to be allocated using gralloc with the correct usage flags. The Android description of SurfaceTexture mentions that it allows you to "stream images to a given OpenGL texture": https://developer.android.com/reference/android/graphics/SurfaceTexture.html#SurfaceTexture(int). What does a SurfaceTexture do on top of an OpenGL texture?
3. MediaCodec::createInputSurface() returns an Android Surface. As I understand it, an Android Surface represents the producer side of a buffer queue, so it may be backed by multiple buffers. The API reference mentions that "the Surface must be rendered with a hardware-accelerated API, such as OpenGL ES". How do the frames captured by the camera get from the SurfaceTexture to the Surface that is the input to the encoder? I see that CameraToMpegTest creates an EGLSurface from this Surface somehow, but not knowing much about EGL I don't get this part.
4. Can someone clarify the usage of "render"? I see things such as "render to a surface" and "render to the screen", among other usages that seem to mean different things.
Edit: Follow-up to mstorsjo's responses:
I dug into the code for SurfaceTexture and CameraClient::setPreviewTarget() in CameraService some more to try to understand the inner workings of Camera::setPreviewTexture() better, and I have some more questions. Regarding my original question about memory allocation: it seems that SurfaceTexture creates a BufferQueue, and CameraService passes the associated IGraphicBufferProducer to the platform camera HAL implementation. The camera HAL can then set the gralloc usage flags appropriately (e.g. GRALLOC_USAGE_SW_READ_RARELY | GRALLOC_USAGE_SW_WRITE_NEVER | GRALLOC_USAGE_HW_TEXTURE) and also dequeue buffers from this BufferQueue. So the buffers the camera captures frames into are gralloc-allocated buffers with some special usage flags like GRALLOC_USAGE_HW_TEXTURE. I work on ARM platforms with unified memory architectures, where the GPU and CPU can access the same memory, so what impact does the GRALLOC_USAGE_HW_TEXTURE flag have on how the buffer is allocated?
The OpenGL (ES) part of SurfaceTexture seems to be implemented mainly in GLConsumer, and the magic seems to be in updateTexImage(). Are additional buffers allocated for the OpenGL (ES) texture, or can the same gralloc buffer that was filled by the camera be used? Does some memory copying have to happen here to get the camera pixel data from the gralloc buffer into the OpenGL (ES) texture? I guess I don't understand what calling updateTexImage() does.
It means that the camera provides the output frames via an opaque handle, instead of in a user-provided buffer within the application's address space (as it would if you used setPreviewCallback or setPreviewCallbackWithBuffer). This opaque handle, the texture, can be used within OpenGL drawing.
Almost. In this case, the OpenGL texture is not a physical chunk of memory, but a handle to a variable chunk of memory within an EGL context. The sample code itself doesn't actually allocate or size the texture; it only creates a "name"/handle for a texture using glGenTextures - basically just an integer. In normal OpenGL (ES), you'd use OpenGL functions to allocate the actual storage for the texture and fill it with content. In this setup, SurfaceTexture provides an Android-level API/abstraction to populate the texture with data (i.e. allocate storage for it with the right flags, and provide it with a size and content) - allowing you to pass the SurfaceTexture to other classes that can fill it with data (either Camera, which takes a SurfaceTexture directly, or wrapped in the Surface class so it can be used in other contexts). This allows the OpenGL texture to be filled with content efficiently, without having to pass a buffer of raw data to your application's process and having your app upload it to OpenGL.
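A minimal sketch of that distinction (assuming a current EGL context and an open Camera instance; uses android.opengl.GLES20/GLES11Ext):

    // glGenTextures only creates a name (an integer); no storage is allocated.
    int[] textures = new int[1];
    GLES20.glGenTextures(1, textures, 0);
    int texId = textures[0];
    GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, texId);

    // SurfaceTexture takes over storage and content for that texture name.
    SurfaceTexture surfaceTexture = new SurfaceTexture(texId);
    camera.setPreviewTexture(surfaceTexture);  // Camera fills it directly (throws IOException)
    // or wrap it for APIs that want a Surface instead:
    Surface surface = new Surface(surfaceTexture);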
(Answering points 3 and 4 in reverse order.) OpenGL (ES) is a generic API for drawing. In the normal/original setup, consider a game: you'd have a number of textures for different parts of the game content (backgrounds, props, actors, etc.), and you'd then draw them to the screen with OpenGL APIs. The textures could either be more or less copied as-is to the screen, or be wrapped around a 3D object built out of triangles. This process is called "rendering": taking the input textures and a set of triangles and drawing them. In the simplest case, you would render content straight to the screen. The GPU can usually do the same rendering into any other output buffer as well. In games, it is common to render some scene into a texture, and use that prerendered texture as part of the final render which actually ends up displayed on the screen.
An EGL context is created for passing the output from the camera into the encoder input. An EGL context is basically a context for doing OpenGL rendering. The target for the rendering is the Surface from the encoder. That is, whatever graphics are drawn using OpenGL end up in the encoder input buffer instead of on the screen. Now the scene that is drawn using OpenGL could be any sequence of OpenGL function calls, rendering a game scene into the encoder. (This is what the Android Breakout game recorder example does.) Within the context, a texture handle is created. Instead of filling the texture with content by loading a picture from disk, as in normal game graphics rendering, it is made into a SurfaceTexture, to allow Camera to fill it with the camera picture. The SurfaceTexture class provides a callback that signals when the Camera has updated the content. When this callback is received, the EGL context is activated and one frame is rendered into the EGL context output target (which is the encoder input). The rendering itself doesn't do anything fancy, but more or less copies the input texture as-is straight into the output.
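That callback-driven loop would look roughly like this (a sketch only; encoderEglSurface, externalTexId, and drawFrame() are assumed names in the spirit of CameraToMpegTest):

    // Fields assumed on the class doing the rendering:
    private final Object frameLock = new Object();
    private boolean frameAvailable;

    surfaceTexture.setOnFrameAvailableListener(new SurfaceTexture.OnFrameAvailableListener() {
        @Override
        public void onFrameAvailable(SurfaceTexture st) {
            // Called when Camera has queued a new frame; wake the EGL thread.
            synchronized (frameLock) {
                frameAvailable = true;
                frameLock.notifyAll();
            }
        }
    });

    // On the thread whose EGL context targets the encoder's input Surface:
    surfaceTexture.updateTexImage();                      // latch the new camera frame
    drawFrame(externalTexId);                             // copy the texture to the output
    EGL14.eglSwapBuffers(eglDisplay, encoderEglSurface);  // submit the frame to the encoder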
This might all sound quite roundabout, but it does give a few benefits:
The actual raw bits of the camera frames never need to be handled directly within the application code (and potentially never within the application's process and address space at all). For low resolutions, this isn't much of an issue, but the setPreviewCallback API is a bottleneck when it comes to higher resolutions.
You can do color adjustments and anything else you can do within OpenGL, almost for free with GPU acceleration.
I managed to write a demo displaying a 3D model on a TextureView, where the model moves according to the phone's sensors. The 3D engine is written in C++, and what I need to do is give the SurfaceTexture of the TextureView to the 3D engine.
The engine calls ANativeWindow_fromSurface to retrieve a native window and draws the 3D model on it. The 3D engine is not the key point of this question.
Now I want to record the moving 3D model to a video. One way is to use a GL_TEXTURE_EXTERNAL_OES texture, just like grafika: make the 3D engine draw frames to the OES texture and draw the texture content to the screen after every call to updateTexImage(). But due to some restrictions, I am not allowed to take that approach.
I plan to use the SurfaceTexture of the TextureView directly. I think functions such as attachToGLContext() and detachFromGLContext() will be useful here.
Could anyone give me some advice?
Grafika's "record GL app" has three different modes of operation:
1. Draw everything twice.
2. Render to an offscreen pbuffer, then blit that twice.
3. Draw once, then copy between framebuffers (requires GLES 3).
If you can configure the EGL surface that is rendered to, approaches 2 and 3 will work. For approach #3, bear in mind that the pixels don't go to the Surface (that's the Android Surface, not the EGL surface) until you call eglSwapBuffers().
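For approach #3, the copy is a single blit per output. A sketch, assuming a GLES 3 context, that offscreenFbo has already been rendered, and that width/height are the surface dimensions:

    // Copy the offscreen framebuffer into the framebuffer of the current
    // EGL surface; repeat with the other surface current for the second output.
    GLES30.glBindFramebuffer(GLES30.GL_READ_FRAMEBUFFER, offscreenFbo);
    GLES30.glBindFramebuffer(GLES30.GL_DRAW_FRAMEBUFFER, 0);  // default framebuffer
    GLES30.glBlitFramebuffer(0, 0, width, height, 0, 0, width, height,
            GLES30.GL_COLOR_BUFFER_BIT, GLES30.GL_NEAREST);
    // As noted above, the pixels reach the Android Surface only on eglSwapBuffers().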
If the engine code is managing the EGL surface and calling eglSwapBuffers() for you, then things are a bit more annoying. The SurfaceTexture attach/detach calls will let you access the GLES texture with the output from a different EGL context, but the render thread needs that when rendering the View UI. I'm not entirely sure how that's going to work out.
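If you do try the attach/detach route, the calls look roughly like this (a sketch; recordTexId is a GL_TEXTURE_EXTERNAL_OES texture name generated in the recording context, and each call must run on the thread that owns the respective EGL context):

    // On the thread/context that currently owns the texture:
    surfaceTexture.detachFromGLContext();

    // After making the recording EGL context current:
    surfaceTexture.attachToGLContext(recordTexId);
    surfaceTexture.updateTexImage();   // latch the latest frame in the new context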
To play media using Android MediaPlayer or MediaCodec, most of the time you use SurfaceView or GLSurfaceView. (There is another way to achieve this using TextureView, but let's not talk about it here, since it's a somewhat different type of view.)
And as far as I know, capturing a video frame from a SurfaceView is not possible - you don't have access to the hardware overlay.
How about GLSurfaceView? Since we have access to the YUV pixels (we do, right?), is it possible?
Can anyone point me to sample code that does this?
I don't think the explanation below can work, because it assumes the color format is RGBA, while in the case above I think it's YUV:
"When using GLES20.glReadPixels on Android, the data returned by it is not exactly the same as the live preview"
Thank you and have a great day.
You are correct in that you cannot read back from a Surface. It's the producer side of a producer-consumer pair. GLSurfaceView is just a bunch of code wrapped around a SurfaceView that (in theory) makes working with GLES easier.
So you have to send the preview somewhere else. One approach is to send it to a SurfaceTexture, which converts every frame sent to its Surface into a GLES texture. The texture can then be rendered twice, once for display and once to an offscreen pbuffer that can be saved as a bitmap (just like this question).
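Saving the pbuffer works roughly like this once a frame has been rendered into it (a sketch, assuming width/height are the pbuffer dimensions; grafika's saveFrame() code is similar in spirit):

    // Read back the current EGL surface (the pbuffer) into a Bitmap.
    // Uses java.nio.ByteBuffer/ByteOrder, android.graphics.Bitmap, android.opengl.GLES20.
    ByteBuffer buf = ByteBuffer.allocateDirect(width * height * 4);
    buf.order(ByteOrder.LITTLE_ENDIAN);
    GLES20.glReadPixels(0, 0, width, height,
            GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, buf);
    buf.rewind();
    Bitmap bmp = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
    bmp.copyPixelsFromBuffer(buf);   // note: rows are bottom-up relative to GL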
I'm not sure why you don't want to talk about TextureView. It's a View that uses SurfaceTexture under the hood, and it provides a getBitmap() call that does exactly what you want.
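With a TextureView it reduces to a single call (assuming textureView is the view showing the video):

    // Grabs the most recently presented frame as a Bitmap.
    Bitmap frame = textureView.getBitmap();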
I am using MediaCodec's decoder to output data to a surface. Using the configure() function, I passed a Surface created through SurfaceComposerClient. The problem is that the codec fails to start. I presume this is an issue with the way my surface is set up (when I set the surface to NULL, the codec starts).
Looking at the MediaCodec decoder Java examples, it seems like I need to create an EGL-backed SurfaceTexture. Is it possible to create a SurfaceTexture natively using C++/NDK? Are there any examples of this?
I assume this is not a "normal" app, since you're interacting with SurfaceFlinger directly.
You can find examples in some internal OpenGL tests -- the code was fixed up for the 5.0 Lollipop release. Take a look at the San Angeles demo, which uses the WindowSurface class to get a surface from SurfaceComposerClient.
You don't need a SurfaceTexture, or do anything with EGL, to decode a video to a surface. Surfaces have a producer-consumer structure, and EGL and MediaCodec are two different examples of producers. (SurfaceFlinger is the consumer.)
It's never easy to know why MediaCodec is failing. You can try drawing on the surface with GLES to see if it's valid, but my guess is that your problem is elsewhere.
For a SurfaceTexture, the app is both the producer and the consumer; it provides a way to decode the video to a surface that you can then manipulate as a GLES texture. This adds unnecessary overhead if you just want the video to play on screen.
Refer to SimplePlayer.h/.cpp in the Android 4.4 source code. It's used to decode a media file and output the decoded video to a surface. I think it is similar to your scenario.
Ideally I'd like to accomplish two goals:
1. Pass the Camera preview data to a MediaCodec encoder via a Surface. I can create the Surface using MediaCodec.createInputSurface(), but Camera.setPreviewDisplay() takes a SurfaceHolder, not a Surface.
2. In addition to passing the Camera preview data to the encoder, I'd also like to display the preview on-screen (so the user can actually see what they are encoding). If the encoder weren't involved, I'd use a SurfaceView, but that doesn't appear to work in this scenario, since SurfaceView creates its own Surface and I think I need to use the one created by MediaCodec.
I've searched online quite a bit for a solution and haven't found one. Some examples on bigflake.com seem like a step in the right direction but they take an approach that adds a bunch of EGL/SurfaceTexture overhead that I'd like to avoid. I'm hoping there is a simpler example or solution where I can get the Camera and MediaCodec talking more directly without involving EGL or textures.
As of Android 4.3 (API 18), the bigflake CameraToMpegTest approach is the correct way.
The EGL/SurfaceTexture overhead is currently unavoidable, especially for what you want to do in goal #2. The idea is:
1. Configure the Camera to send the output to a SurfaceTexture. This makes the Camera output available to GLES as an "external texture".
2. Render the SurfaceTexture to the Surface returned by MediaCodec#createInputSurface(). That feeds the video encoder.
3. Render the SurfaceTexture a second time, to a GLSurfaceView. That puts it on the display for real-time preview.
The only data copying that happens is performed by the GLES driver, so you're doing hardware-accelerated blits, which will be fast.
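Put together, the flow sketches out like this (illustrative only; encoderEglSurface/displayEglSurface and drawExternalTexture() are assumed helpers along the lines of grafika's EglCore and WindowSurface, shown here with a single EGL context and two window surfaces, the simpler variant recommended in the update below; with GLSurfaceView you'd need the shared-context setup described next):

    // Feed the camera into a GLES external texture.
    int[] tex = new int[1];
    GLES20.glGenTextures(1, tex, 0);
    SurfaceTexture cameraTexture = new SurfaceTexture(tex[0]);
    camera.setPreviewTexture(cameraTexture);   // throws IOException
    camera.startPreview();

    // For each frame (after onFrameAvailable fires, on the EGL thread):
    cameraTexture.updateTexImage();            // latch the newest camera frame

    encoderEglSurface.makeCurrent();           // wraps MediaCodec#createInputSurface()
    drawExternalTexture(tex[0]);               // full-screen quad with the external texture
    encoderEglSurface.swapBuffers();           // submits the frame to the encoder

    displayEglSurface.makeCurrent();           // the on-screen preview surface
    drawExternalTexture(tex[0]);
    displayEglSurface.swapBuffers();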
The only tricky bit is you want the external texture to be available to two different EGL contexts (one for the MediaCodec, one for the GLSurfaceView). You can see an example of creating a shared context in the "Android Breakout game recorder patch" sample on bigflake -- it renders the game twice, once to the screen, once to a MediaCodec encoder.
Update: This is implemented in Grafika ("Show + capture camera").
Update: The multi-context approach in "Show + capture camera" is somewhat flawed. The "continuous capture" Activity uses a plain SurfaceView, and is able to do both screen rendering and video recording with a single EGL context. This is the recommended approach.