I am using OpenGLES2 output to display to a SurfaceView or encode to mp4 using MediaCodec.
However, I can only do one at a time. I can obviously draw using OpenGLES2 onto two separate surfaces but that would be a really inefficient use of the GPU.
What I want is some sort of reference counting so that the buffer holding the single OpenGLES2 output can be reused both for drawing on the screen and for encoding, the way the camera service does with its Shared Surfaces concept.
Can one both display and encode the same buffer? Is there some sort of tee element (like in GStreamer) in Android?
There is no tee component available at the moment.
But you can avoid rendering twice by drawing to a Framebuffer Object and then copying the frame both to the screen and to the encoder.
Here's an example (pretty old).
You can't make your surfaceView bigger than the screen.
There are several ways to work around this, but you cannot directly reuse a SurfaceView to encode after it has been displayed on the screen.
Related
Is it possible to render two video streams on one SurfaceView for blending?
I want to make an application that renders two videos, blended together, to a single SurfaceView and then saves the result as a video file.
If that's impossible, is it possible to render the two videos using two SurfaceViews for blending and save the result as one video file?
Please help me.
Thank you for reading.
No, that's not possible. You'll need to use multiple SurfaceTextures instead, one per video decoder, and render all the textures into one view using OpenGL.
See https://source.android.com/devices/graphics/architecture.html for more explanations on how this works; in particular, each Surface can only have one producer and one consumer.
Say you have a 20 second video (perhaps taken with the device camera) and you want to add an overlay in to the video.
(The overlay would simply be a normal raster image, i.e. an Android Image (doc).)
You want to create a new video, with the overlay as part of the video image, and save the video.
In fact, can MediaCodec SDK be used to do this job?
https://developer.android.com/reference/android/media/MediaCodec
In the past, you would usually use FFMPEG for such a problem, but that is a mess and slow.
Is MediaCodec possible here?
Since it is "new" I just can't find any information on this.....
Yes, this is possible using MediaCodec.
For a start, take a look at the DecodeEditEncode example from here
This example shows how to resize a video using an OpenGL ES shader. What you want to do is render your overlay on top of the video, also using an OpenGL ES shader.
Another good source for examples on MediaCodec can be found here
Here you can find some examples on how to use basic rendering techniques. Look at the Hardware scaler exerciser.
Once you have the video part up and running, this is probably where the real struggle starts, since there is no standard way to render text in OpenGL ES. I'd probably just draw the text to a Canvas and make a texture out of it, though that is probably slow.
If you have a static overlay, like a watermark, you could create it beforehand and ship as a resource.
To play media using Android MediaPlayer or MediaCodec, most of the time you use a SurfaceView or a GLSurfaceView. (There is another way to achieve this using TextureView, but let's not talk about it here, since it's a somewhat different type of view.)
And as far as I know, capturing the video frame from SurfaceView is not possible - you don't have access to hw overlay.
How about GLSurfaceView? Since we have access to the YUV pixels (we do, right?), is it possible?
Can anyone point me to some sample code that does this?
I don't think the explanation in the question below applies, because it assumes the color format is RGBA, and in the case above I think it's YUV.
When using GLES20.glReadPixels on android, the data returned by it is not exactly the same with the living preview
Thank you and have a great day.
You are correct in that you cannot read back from a Surface. It's the producer side of a producer-consumer pair. GLSurfaceView is just a bunch of code wrapped around a SurfaceView that (in theory) makes working with GLES easier.
So you have to send the preview somewhere else. One approach is to send it to a SurfaceTexture, which converts every frame sent to its Surface into a GLES texture. The texture can then be rendered twice, once for display and once to an offscreen pbuffer that can be saved as a bitmap (just like this question).
I'm not sure why you don't want to talk about TextureView. It's a View that uses SurfaceTexture under the hood, and it provides a getBitmap() call that does exactly what you want.
The basic issue I am trying to solve is to delay what is sent to a virtual display by a second or so. So basically, I am trying to shift all frames by 1 second after the initial recording. Note that a surface is used as an input and another surface is used as an output through this virtual display. My initial hunch is to explore a few ideas, given that modification of the Android framework or use of non-public APIs is fine. Java or native C/C++ is fine.
a) I tried delaying frames posted to the virtual display or output surface by a second or two in SurfaceFlinger. This does not work as it causes all surfaces to be delayed by the same amount of time (synchronous processing of frames).
b) MediaCodec takes a surface as input to encode, and then produces the encoded data. Is there any way to use MediaCodec so that it does not actually encode and only produces raw, unencoded frames? Seems unlikely. Moreover, how does MediaCodec do this under the hood? Does it process things frame by frame? If I can work out the method, I might be able to extract frames one by one from my input surface and keep them in a ring buffer, delayed by the amount of time I require.
c) How do software decoders, such as FFmpeg, actually do this in Android? I assume they take in a surface, but how would they extract and process frames one by one?
Note that I can certainly encode and decode to retrieve the frames and post them but I want to avoid actually decoding. Note that modifying the Android framework or using non-public APIs is fine.
I also found this: Getting a frame from SurfaceView
It seems like option d) could be using a SurfaceTexture but I would like to avoid the process of encoding/decoding.
As I understand it, you have a virtual display that is sending its output to a Surface. If you just use a SurfaceView for output, frames output by the virtual display appear on the physical display immediately. The goal is to introduce one second of latency between when the virtual display generates a frame and when the Surface consumer receives it, so that (again using SurfaceView as an example) the physical display shows everything a second late.
The basic concept is easy enough: send the virtual display output to a SurfaceTexture, and save the frame into a circular buffer; meanwhile another thread is reading frames out of the tail end of the circular buffer and displaying them. The trouble with this is what #AdrianCrețu pointed out in the comments: one second of full-resolution screen data at 60fps will occupy a significant fraction of the device's memory. Not to mention that copying that much data around will be fairly expensive, and some devices might not be able to keep up.
(It doesn't matter whether you do it in the app or in SurfaceFlinger... the data for up to 60 screen-sized frames has to be held somewhere for a full second.)
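The circular-buffer idea can be sketched in plain Java, leaving out all the graphics work. This is only an illustration: `FrameDelayLine` and its methods are hypothetical names, the generic `T` stands in for whatever holds a copied frame, and the sketch is single-threaded, whereas a real version would copy frames out of the SurfaceTexture and hand them to a separate display thread with proper synchronisation.

```java
import java.util.ArrayDeque;

/** Hypothetical delay line: keeps the most recent frames and releases the oldest one. */
final class FrameDelayLine<T> {
    private final ArrayDeque<T> frames = new ArrayDeque<>();
    private final int capacity; // number of frames held, i.e. delay * frame rate

    FrameDelayLine(int delaySeconds, int framesPerSecond) {
        if (delaySeconds <= 0 || framesPerSecond <= 0) {
            throw new IllegalArgumentException("delay and frame rate must be positive");
        }
        this.capacity = delaySeconds * framesPerSecond;
    }

    /**
     * Stores the newest frame. Once the line is full, returns the frame that was
     * pushed `capacity` calls earlier; returns null while the line is still filling.
     */
    T push(T frame) {
        frames.addLast(frame);
        return frames.size() > capacity ? frames.removeFirst() : null;
    }

    /** Number of frames currently held. */
    int size() {
        return frames.size();
    }
}
```

With `new FrameDelayLine<>(1, 60)`, the first 60 calls to `push` return null and the 61st returns the first frame, so the buffer really does hold 60 frames at all times, which is where the memory cost comes from.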
You can reduce the volume of data in various ways:
Reduce the resolution. Scaling 2560x1600 to 1280x800 removes 3/4 of the pixels. The loss of quality should be difficult to notice on most displays, but it depends on what you're viewing.
Reduce the color depth. Switching from ARGB8888 to RGB565 will cut the size in half. This will be noticeable though.
Reduce the frame rate. You're generating the frames for the virtual display, so you can choose to update it more slowly. Animation is still reasonably smooth at 30fps, halving the memory requirements.
Apply image compression, e.g. PNG or JPEG. Fairly effective, but too slow without hardware support.
Encode inter-frame differences. If not much is changing from frame to frame, the incremental changes can be very small. Desktop-mirroring technologies like VNC do this. Somewhat slow to do in software.
A video codec like AVC will both compress frames and encode inter-frame differences. That's how you get 1GByte/sec down to 10Mbit/sec and still have it look pretty good.
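As a rough check on those numbers, here is the arithmetic, assuming the 2560x1600 display from the example above running ARGB8888 (4 bytes per pixel) at 60fps. `FrameBudget` and `bytesPerSecond` are names made up for this sketch.

```java
final class FrameBudget {
    /** Bytes per second of uncompressed video at the given size, pixel depth and frame rate. */
    static long bytesPerSecond(int width, int height, int bytesPerPixel, int fps) {
        return (long) width * height * bytesPerPixel * fps;
    }

    public static void main(String[] args) {
        long full = bytesPerSecond(2560, 1600, 4, 60);    // ARGB8888 at 60fps
        long halfRes = bytesPerSecond(1280, 800, 4, 60);  // resolution halved in each dimension
        long rgb565 = bytesPerSecond(2560, 1600, 2, 60);  // 2 bytes per pixel
        long combined = bytesPerSecond(1280, 800, 2, 30); // all three reductions together
        long avc = 10_000_000L / 8;                       // 10 Mbit/sec expressed in bytes

        System.out.println(full);       // 983040000 (roughly 1 GByte/sec)
        System.out.println(halfRes);    // 245760000 (one quarter)
        System.out.println(rgb565);     // 491520000 (one half)
        System.out.println(combined);   // 61440000
        System.out.println(full / avc); // 786, the approximate reduction factor of the codec
    }
}
```

Even with every cheap reduction applied, one second of raw frames is still about 60 MB, while a 10Mbit/sec encoded stream needs a little over 1 MB for the same second.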
Consider, for example, the "continuous capture" example in Grafika. It feeds the Camera output into a MediaCodec encoder, and stores the H.264-encoded output in a ring buffer. When you hit "capture", it saves the last 7 seconds. This could just as easily play the camera feed with a 7-second delay, and it only needs a few megabytes of memory to do it.
The "screenrecord" command can dump H.264 output or raw frames across the ADB connection, though in practice ADB is not fast enough to keep up with raw frames (even on tiny displays). It's not doing anything you can't do from an app (now that we have the mediaprojection API), so I wouldn't recommend using it as sample code.
If you haven't already, it may be useful to read through the graphics architecture doc.
I was able to decode an mp4 video. If I configure the decoder with a Surface, I can see the video on screen. Now I want to edit each frame (adding a yellow line or, even better, overlaying a tiny image) and encode the result as a new video. It is not necessary to show the video, and I don't care about performance for now. (If I show the frames while editing, I could get a gap if the editing function takes a long time.) So, what do you recommend: configure the decoder with a GLSurface anyway and use OpenGL (GLES), or configure it with null and somehow convert the ByteBuffer to a Bitmap, modify it, and encode the bitmap as a byte array? I also saw on the Grafika page that you can use a Surface with a custom Renderer and use OpenGL (GLES). Thanks
You will have to use OpenGLES. ByteBuffer/Bitmap approach can not give realistic performance/features.
Now that you've been able to decode the video (using MediaExtractor and MediaCodec) to a Surface, you need to take the SurfaceTexture that the Surface was created from, use it as an external texture, and render it with GLES to another Surface obtained from a MediaCodec configured as an encoder.
Though Grafika doesn't have an exactly equivalent complete project, you can start with your existing project and borrow from either of these Grafika subprojects: Continuous Camera or Show + capture camera. Both currently render camera frames (fed to a SurfaceTexture) to a video (and to the display).
So essentially, the only change is the MediaCodec feeding frames to SurfaceTexture instead of the Camera.
Google CTS DecodeEditEncodeTest does exactly the same and can be used as a reference in order to make the learning curve smoother.
Using this approach, you can certainly do all sorts of things like manipulating the playback speed of video (fast forward and slow-down), adding all sorts of overlays on the scene, play with colors/pixels in the video using shaders etc.
Check out the filters in Show + capture camera for an illustration of this.
Decode-edit-Encode flow
When using OpenGLES, 'editing' of the frame happens via rendering using GLES to the Encoder's input surface.
If decoding and rendering+encoding run on separate threads, you are bound to skip frames unless you implement some synchronisation between the two threads that keeps the decoder waiting until the render+encode for the current frame has finished on the other thread.
Although modern hardware codecs support simultaneous video encoding and decoding, I'd suggest doing the decoding, rendering and encoding on the same thread, especially in your case, where performance is not a major concern right now. That avoids having to handle the synchronisation yourself, and it avoids frame jumps.
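If you do end up with two threads, the synchronisation described above can look something like the following plain-Java sketch. It contains no MediaCodec or GLES calls: integers stand in for decoded frames, adding to a list stands in for render+encode, and `LockstepPipeline` is a hypothetical name. It only shows the handshake that keeps the decoder from running ahead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.SynchronousQueue;

final class LockstepPipeline {
    private static final int END_OF_STREAM = -1;

    /** Pushes frameCount fake frames through a two-thread pipeline, one frame at a time. */
    static List<Integer> run(int frameCount) {
        SynchronousQueue<Integer> handoff = new SynchronousQueue<>();
        Semaphore frameDone = new Semaphore(0);
        List<Integer> encoded = new ArrayList<>();

        Thread renderEncode = new Thread(() -> {
            try {
                while (true) {
                    int frame = handoff.take();
                    if (frame == END_OF_STREAM) {
                        return;
                    }
                    encoded.add(frame);  // stand-in for rendering and encoding the frame
                    frameDone.release(); // tell the decoder this frame is finished
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        renderEncode.start();

        try {
            for (int frame = 0; frame < frameCount; frame++) {
                handoff.put(frame);  // stand-in for the decoder producing a frame
                frameDone.acquire(); // decoder waits until render+encode has finished
            }
            handoff.put(END_OF_STREAM);
            renderEncode.join();     // join also makes the list contents visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return encoded;
    }
}
```

Because the decoder blocks on `frameDone` after every frame, the two threads never actually work in parallel, which is the argument for simply running everything on one thread when throughput is not the priority.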