Decoding and Rendering Video on Android

What I need to do is decode video frames and render them on a trapezoidal surface. I'm using Android 2.2 as my development platform.
I'm not using the MediaPlayer service because I need access to the decoded frames.
Here's what I have so far:
I am using the Stagefright framework to extract decoded video frames.
Each frame is then converted from YUV420 to RGB.
The converted frames are then copied to a texture and rendered to an OpenGL surface.
Note that I am using Processing and not using OpenGL calls directly.
My problems so far are:
I can only decode MP4 files with Stagefright.
The rendering is too slow: around 100 ms for a 320x420 frame.
There is no audio yet. I can render video, but I don't know how to synchronize it with the playback of the audio frames.
So, my questions are:
How can I support other video formats? Should I keep using Stagefright, or should I switch to FFmpeg?
How can I improve the performance? I need to support at least 720p.
Should I use OpenGL calls directly instead of Processing? Would this improve the performance?
How can I sync the audio frames during playback?

Adding other video formats and codecs to stagefright
If you have parsers for other video formats, you need to implement a Stagefright MediaExtractor plug-in and integrate it into AwesomePlayer. Similarly, if you have OMX components for the required video codecs, you need to integrate them into the OMXCodec class.
Using FFmpeg components in Stagefright, or using an FFmpeg-based player instead of Stagefright, does not seem trivial.
However, if the required formats are already available in OpenCore, you can modify the Android stack so that OpenCore is chosen for those formats. You would then need to port the logic for getting the YUV data to OpenCore, which means getting your hands dirty with the MIOs.
Playback performance
SurfaceFlinger, which is used for normal playback, uses an overlay for rendering. It usually provides around 4 to 8 video buffers (from what I have seen so far). Check how many distinct buffers you are getting in your OpenGL rendering path; increasing the number of buffers should improve performance.
Also, check the time taken by the YUV-to-RGB conversion. You can optimize it yourself or use an open-source library to improve performance.
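To make the conversion step concrete, here is a minimal sketch of a CPU-side YUV420-to-RGB conversion using integer arithmetic only. It assumes a tightly packed planar I420 layout (Y plane, then U, then V), even dimensions, and BT.601 limited-range coefficients; the frames Stagefright hands back may use a different layout or stride, so treat the class name Yuv420ToArgb and the buffer layout as illustrative assumptions rather than what your decoder produces.

```java
// Sketch only: assumes tightly packed I420 input with even width and height.
final class Yuv420ToArgb {
    private Yuv420ToArgb() {}

    private static int clamp(int v) {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }

    /** Converts one I420 frame to packed ARGB pixels (alpha is always 0xFF). */
    static int[] convert(byte[] yuv, int width, int height) {
        int frameSize = width * height;
        int chromaSize = frameSize / 4;
        int[] argb = new int[frameSize];
        for (int row = 0; row < height; row++) {
            // Each chroma sample covers a 2x2 block of luma samples.
            int chromaRow = (row / 2) * (width / 2);
            for (int col = 0; col < width; col++) {
                int y = (yuv[row * width + col] & 0xFF) - 16;
                if (y < 0) {
                    y = 0;
                }
                int chromaIndex = chromaRow + col / 2;
                int u = (yuv[frameSize + chromaIndex] & 0xFF) - 128;
                int v = (yuv[frameSize + chromaSize + chromaIndex] & 0xFF) - 128;
                // BT.601 limited-range coefficients scaled by 256 (assumed colour space).
                int c = 298 * y;
                int r = clamp((c + 409 * v + 128) >> 8);
                int g = clamp((c - 100 * u - 208 * v + 128) >> 8);
                int b = clamp((c + 516 * u + 128) >> 8);
                argb[row * width + col] = 0xFF000000 | (r << 16) | (g << 8) | b;
            }
        }
        return argb;
    }
}
```

Timing a call such as Yuv420ToArgb.convert(frame, width, height) on a single frame shows how much of the per-frame budget the conversion alone consumes, which helps decide whether it is worth moving to native code or to the GPU.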
OpenGL is usually used for graphics rather than for video rendering, so I am not sure what performance to expect from it.
Audio Video Sync
Audio time is used as the reference. In Stagefright, AwesomePlayer uses AudioPlayer to play out the audio. AudioPlayer implements an interface that provides time data, and AwesomePlayer uses this to schedule video rendering. Basically, a video frame is rendered when its presentation time matches that of the audio sample being played.
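The scheduling rule described above can be sketched as a small decision function that compares a frame's presentation time with the audio clock. The class name AvSyncSketch and the two thresholds are illustrative assumptions, not values taken from this answer; a real player would also need to obtain the audio position from its audio output.

```java
// Sketch only: thresholds are assumed values chosen for illustration.
final class AvSyncSketch {
    enum Action { RENDER, WAIT, DROP }

    // Frames earlier than this are held back until the audio catches up.
    static final long EARLY_THRESHOLD_US = 10_000;
    // Frames later than this are dropped so the video can catch up.
    static final long LATE_THRESHOLD_US = 40_000;

    private AvSyncSketch() {}

    /** Decides what to do with a video frame, using the audio clock as the reference. */
    static Action decide(long videoPtsUs, long audioClockUs) {
        long latenessUs = audioClockUs - videoPtsUs;
        if (latenessUs < -EARLY_THRESHOLD_US) {
            return Action.WAIT;
        }
        if (latenessUs > LATE_THRESHOLD_US) {
            return Action.DROP;
        }
        return Action.RENDER;
    }
}
```

In a render loop, WAIT would typically mean re-checking after a short delay, while DROP means releasing the frame without displaying it and moving on to the next one.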
Shash

Related

Is it possible to render frames in Exoplayer?

I am pulling H.264 and AAC frames, and at the moment I am feeding them to MediaCodec and decoding and rendering them myself, but the code is getting too complicated and I need to cover all cases. I was wondering whether it's possible to set up an ExoPlayer instance and feed the frames to it as a source.
I can only find that it supports normal files and streams, but not separate frames? Do I need to mux the frames myself, and if so is there an easy way to do it?

If you mean that you are extracting frames from a video file or a live stream, and then want to work on them individually or display them individually, you may find that OpenCV would suit your use case.
You can fairly simply open a stream or file, go frame by frame and do what you want with the resulting decoded bitmap.
This answer has a Python and Android example that might be useful: https://stackoverflow.com/a/58921325/334402

Video in Android : change visual properties (e.g. saturation, brightness)

Assuming we have a Surface in Android that displays a video (e.g. h264) with a MediaPlayer:
1) Is it possible to change the saturation, contrast and brightness of the video displayed on the surface, in real time? For example, images can use setColorFilter; is there anything similar in Android for processing video frames?
Alternative question (if no. 1 is too difficult):
2) If we would like to export this video with e.g. an increased saturation, we should use a Codec, e.g. MediaCodec. What technology (method, class, library, etc...) should we use before the codec/save action to apply the saturation change?

For display only, one easy approach is to use a GLSurfaceView, a SurfaceTexture to render the video frames, and a MediaPlayer. Prokash's answer links to an open source library that shows how to accomplish that. There are a number of other examples around if you search those terms together. Taking that route, you draw video frames to an OpenGL texture and create OpenGL shaders to manipulate how the texture is rendered. (I would suggest asking Prokash for further details and accepting his answer if this is enough to fill your requirements.)
Similarly, you could use the OpenGL tools with MediaCodec and MediaExtractor to decode video frames. The MediaCodec would be configured to output to a SurfaceTexture, so you would not need to do much more than code some boilerplate to get the output buffers rendered. The filtering process would be the same as with a MediaPlayer. There are a number of examples using MediaCodec as a decoder available, e.g. here and here. It should be fairly straightforward to substitute the TextureView or SurfaceView used in those examples with the GLSurfaceView of Prokash's example.
The advantage of this approach is that you have access to all the separate tracks in the media file. Because of that, you should be able to filter the video track with OpenGL and straight copy other tracks for export. You would use a MediaCodec in encode mode with the Surface from the GLSurfaceView as input and a MediaMuxer to put it all back together. You can see several relevant examples at BigFlake.
You can use a MediaCodec without a Surface to access decoded byte data directly and manipulate it that way. This example illustrates that approach. You can manipulate the data and send it to an encoder for export or render it as you see fit. There is some extra complexity in dealing with the raw byte data. Note that I like this example because it illustrates dealing with the audio and video tracks separately.
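As a rough illustration of manipulating decoded byte data directly, a saturation change on YUV data amounts to scaling the chroma samples around their neutral value of 128 while leaving luma untouched. The sketch below assumes the chroma bytes (U and V, planar or interleaved) occupy one contiguous, tightly packed region of the frame; the class name ChromaSaturation and its parameters are hypothetical, and the actual buffer layout depends on the decoder.

```java
// Sketch only: assumes 8-bit chroma samples stored contiguously with no padding.
final class ChromaSaturation {
    private ChromaSaturation() {}

    /**
     * Scales chroma in place. saturation = 1.0f leaves the frame unchanged,
     * 0.0f produces greyscale, values above 1.0f increase saturation.
     */
    static void scale(byte[] frame, int chromaOffset, int chromaLength, float saturation) {
        int end = chromaOffset + chromaLength;
        for (int i = chromaOffset; i < end; i++) {
            int centered = (frame[i] & 0xFF) - 128;          // distance from neutral grey
            int scaled = Math.round(centered * saturation) + 128;
            frame[i] = (byte) Math.max(0, Math.min(255, scaled));
        }
    }
}
```

For a tightly packed 4:2:0 frame of width w and height h, the chroma region would start at offset w * h and span w * h / 2 bytes. The same arithmetic is what a fragment shader would do per pixel on the GPU path.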
You can also use FFMpeg, either in native code or via one of the Java wrappers out there. This option is more geared towards export than immediate playback. See here or here for some libraries that attempt to make FFMpeg available to Java. They are basically wrappers around the command line interface. You would need to do some extra work to manage playback via FFMpeg, but it is definitely doable.
If you have questions, feel free to ask, and I will try to expound upon whatever option makes the most sense for your use case.

If you are using a player that supports video filters, then you can do that.
Example of such a player is VLC, which is built around FFMPEG [1].
VLC is pretty easy to compile for Android. Then all you need is the libvlc (aar file) and you can build your own app. See compile instructions here.
You will also need to write your own module. Just duplicate an existing one and modify it. Needless to say that VLC offers strong transcoding and streaming capabilities.
As powerful as VLC for Android is, it has one major drawback: video filters cannot work with hardware decoding (on Android only). This means that all of the video processing runs on the CPU.
Your other option is to use GLSL / OpenGL on surfaces such as GLSurfaceView and TextureView, which ensures that the processing runs on the GPU.

how to retrieve NV21 data from DJI camera Phantom 3 Professional drone

As I described in a previous post, I'm working on an Android mobile app for real-time augmented visualization of a drone's camera view (specifically, a DJI Phantom 3 Professional with its SDK), using the Wikitude framework for the AR part. Thanks to Alex's response, I implemented my own Wikitude Input Plugin in combination with DJI's Video Stream Decoding demo.
I have some issues now. First of all, DJI's Video Stream Decoding demo uses FFmpeg for video frame parsing and MediaCodec for hardware decoding. So it parses the video frames, decodes the raw video stream data from the DJI camera, and outputs YUV data. You advised me to "get the raw video data from the dji sdk and pass it to the Wikitude SDK". Since the Wikitude Input Plugin needs YUV 420 data arranged according to the NV21 standard in order to provide the custom camera, I should pass it the YUV output of the MediaCodec, right?
On this point, I tried to retrieve ByteBuffers from the MediaCodec output. This is possible by passing null as the Surface parameter to configure(), after which the decoded data is delivered through a callback to an external listener. However, the colours of the decoded video are wrong: blue and red seem to be reversed, and there is too much noise when the camera moves. (Note that when I pass a non-null Surface, MediaCodec renders frames to it after codec.releaseOutputBuffer(outIndex, true) and shows the video stream properly, but I need to pass the video stream to the Wikitude plugin, so I must set the surface to null.)
I tried to set different MediaFormat.KEY_COLOR_FORMAT but none of them works properly. How can I solve this point?

When decoding into bytebuffers with MediaCodec, you can't decide what color format the buffer uses; the decoder decides, and you have to deal with it. Each decoder can use a different format; some of them can be a standard format like COLOR_FormatYUV420Planar (corresponding to I420) or COLOR_FormatYUV420SemiPlanar (corresponding to NV12 - not NV21), while others can use completely proprietary formats.
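For the two standard layouts mentioned above, repacking into NV21 is a matter of reordering the chroma bytes: NV12 stores interleaved U,V pairs where NV21 expects V,U, and I420 stores U and V as separate planes. The following sketch assumes tightly packed buffers with even dimensions and no stride padding or cropping, which real decoders do not guarantee; the class and method names are hypothetical, and proprietary formats are not handled.

```java
// Sketch only: assumes tightly packed input with even width and height.
final class ToNv21 {
    private ToNv21() {}

    /** NV12 (Y plane, then interleaved U,V) to NV21 (Y plane, then interleaved V,U). */
    static byte[] fromNv12(byte[] nv12, int width, int height) {
        int lumaSize = width * height;
        int total = lumaSize + lumaSize / 2;
        byte[] nv21 = new byte[total];
        System.arraycopy(nv12, 0, nv21, 0, lumaSize);
        for (int i = lumaSize; i + 1 < total; i += 2) {
            nv21[i] = nv12[i + 1];     // V comes first in NV21
            nv21[i + 1] = nv12[i];     // then U
        }
        return nv21;
    }

    /** I420 (Y plane, U plane, V plane) to NV21. */
    static byte[] fromI420(byte[] i420, int width, int height) {
        int lumaSize = width * height;
        int chromaSize = lumaSize / 4;
        byte[] nv21 = new byte[lumaSize + 2 * chromaSize];
        System.arraycopy(i420, 0, nv21, 0, lumaSize);
        for (int i = 0; i < chromaSize; i++) {
            nv21[lumaSize + 2 * i] = i420[lumaSize + chromaSize + i]; // V sample
            nv21[lumaSize + 2 * i + 1] = i420[lumaSize + i];          // U sample
        }
        return nv21;
    }
}
```

Swapped red and blue in the output is the typical symptom of treating NV12 data as NV21 (or the reverse), since the U and V samples end up exchanged.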
See e.g. https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java#401 for an example on what formats the decoder can return that are supported, and https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java#963 for a reference showing that it is ok for decoders to return private formats.
You can have a look at e.g. http://git.videolan.org/?p=vlc.git;a=blob;f=modules/codec/omxil/qcom.c;h=301e9150ae66075ca264e83566504802ed57578c;hb=bdc690e9c0e2516c00a6d3733a77a87a25d9b6e3 for an example on how to interpret one common proprietary color format.

Record GLSurfaceView on < Android 4.3

I'm developing an app for applying effects to the camera image in real-time. Currently I'm using the MediaMuxer class in combination with MediaCodec. Those classes were implemented with Android 4.3.
Now I want to redesign my app and make it compatible with more devices. The only thing I found on the internet was a combination of FFmpeg and OpenCV, but I read that the frame rate is not very good at high resolutions. Is there any way to encode video in real time while capturing the camera image without using MediaMuxer and MediaCodec?
PS: I'm using GLSurfaceView for OpenGL fragment shader effects. So this is a must-have.

Real-time encoding of large frames at a moderate frame rate is not going to happen with software codecs.
MediaCodec was introduced in 4.1, so you can still take advantage of hardware-accelerated compression so long as you can deal with the various problems. You'd still need an alternative to MediaMuxer if you want a .mp4 file at the end.
Some commercial game recorders, such as Kamcord and Everyplay, claim to work on Android 4.1+. So it's technically possible, though I don't know if they used non-public APIs to feed surfaces directly into the video encoder.
In pre-Jellybean Android it only gets harder.
(For anyone interested in recording GL in >= 4.3, see EncodeAndMuxTest or Grafika's "Record GL app".)

Android: What should I use when making a native video player?

I am currently doing research for a native video player project. Initially I tried to use FFmpeg as the decoder, return the decoded bytes to Java, and display the frames with a Canvas in View::onDraw. Unfortunately, the performance of this method is not good, so I am wondering whether there is any other way to display frames than passing them to Java.
Also, apart from displaying the frames, how can I play sound from C/C++ with the NDK?
Thanks.

You can use ffmpeg http://ffmpeg.org and/or libtheora http://www.theora.org to decode video frames. Then just display the result via OpenGL ES 2 using render-to-texture. Refer to http://www.gamedev.net/topic/570295-opengl-and-xvidtheoraanything for details.
For audio you can use OpenAL. Here is the Android port: http://pielot.org/2010/12/14/openal-on-android
