How to capture H.264 encoded frames in Android?

As I understand it, there are two ways of capturing video in Android:
1) using the SurfaceView API
2) using the MediaRecorder API
I want to capture H.264 encoded frames using Android's (3.0+) default encoder, to send them over the network using RTP.
While using preview callbacks with the SurfaceView and SurfaceHolder classes, we are able to get the raw frames shown as preview to the user. We get the frames in the onPreviewFrame method of the PreviewCallback class.
But those frames are not H.264 encoded.
So, I tried the MediaRecorder API to set H.264 encoding and SurfaceView to get the preview frames.
In this case, the preview callbacks are not called.
Can you please let me know how to achieve this? Our main aim is to get the H.264 encoded frames (encoded using Android's default codec).
Ref: 1) https://stackoverflow.com/a/8655244/698316
2) Similar issue: http://permalink.gmane.org/gmane.comp.handhelds.android.devel/214422
Can you suggest a way to capture H.264 encoded frames using Android's default H.264 codec support?

See Spydroid http://code.google.com/p/spydroid-ipcamera/
Basically you let the video encoder write a .mp4 with H.264 to a special file descriptor that calls your code on write. Then strip off the MP4 header and turn the H.264 NALUs into RTP packets.
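A minimal sketch of that idea, assuming a LocalSocket pair serves as the special file descriptor (the socket name and the 3GP output format are illustrative; Spydroid's own setup differs in the details):

    import android.media.MediaRecorder;
    import android.net.LocalServerSocket;
    import android.net.LocalSocket;
    import android.net.LocalSocketAddress;
    import java.io.IOException;
    import java.io.InputStream;

    public final class SocketRecorder {
        public static InputStream start() throws IOException {
            // MediaRecorder writes into one end of a local socket pair; we read the other end.
            LocalServerSocket server = new LocalServerSocket("h264-capture");
            LocalSocket receiver = new LocalSocket();
            receiver.connect(new LocalSocketAddress("h264-capture"));
            LocalSocket sender = server.accept();

            MediaRecorder recorder = new MediaRecorder();
            recorder.setVideoSource(MediaRecorder.VideoSource.CAMERA); // a preview surface is usually needed as well
            recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
            recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H264);
            recorder.setOutputFile(sender.getFileDescriptor());        // write into the socket instead of a file
            recorder.prepare();
            recorder.start();

            // Read this stream on a background thread: skip the MP4/3GP header, find the mdat
            // payload, split it into H.264 NALUs, and wrap them in RTP packets.
            return receiver.getInputStream();
        }
    }

The point is that MediaRecorder never touches a real file: everything it writes is immediately readable by your own code for parsing and packetizing.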

Related

What is the relation between MediaExtractor, MediaCodec and MediaMuxer in the Android SDK?

I'm trying to run some deep learning experiments on video samples on Android, and I've got stuck on remuxing videos. I have a couple of questions to arrange the information in my head :) I have read some pages: https://vec.io/posts/android-hardware-decoding-with-mediacodec and https://bigflake.com/mediacodec/#ExtractMpegFramesTest but things are still a mess.
My questions:
Can I read video with MediaExtractor and then pass data to MediaMuxer to save video in another file? Without using MediaCodec?
If I want to modify frames before saving, can I do that without using Surface? Just by modifying ByteBuffer? I assume that I need to decode data from MediaExtractor, then modify content, then encode it to MediaMuxer.
Is a sample the same as a frame in the context of MediaExtractor::readSampleData?
Do I need to decode samples?
This is a brief description of what each class does:
MediaExtractor: extracts encoded video/audio data from a container file.
MediaCodec: depending on how it's configured, it acts as a decoder or an encoder.
MediaMuxer: muxes streams of data into an output file.
This is how your pipeline should generally look:
MediaExtractor -> MediaCodec(As Decoder) -> Your editing -> MediaCodec(As Encoder) -> MediaMuxer
To answer your questions:
1) MediaExtractor will give you encoded data; if you want to do anything with it you will have to decode it using a MediaCodec (a remux-only variant without MediaCodec is sketched after this list).
2) It might be possible to do so without a Surface, but it will be pretty limited. Surfaces are the way to go. You can find more info here: Editing frames and encoding with MediaCodec
3) A sample can be a video frame or an audio sample.
4) Yes, you do need to decode samples to edit them.
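For question 1 specifically: when no editing is needed, encoded samples can in principle be copied straight from MediaExtractor to MediaMuxer, with no MediaCodec in between. A minimal sketch, with inputPath/outputPath as placeholders:

    import android.media.MediaCodec;
    import android.media.MediaExtractor;
    import android.media.MediaMuxer;
    import java.io.IOException;
    import java.nio.ByteBuffer;

    public final class Remuxer {
        public static void remux(String inputPath, String outputPath) throws IOException {
            MediaExtractor extractor = new MediaExtractor();
            extractor.setDataSource(inputPath);
            MediaMuxer muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);

            // Select every input track and create a matching output track.
            int[] trackMap = new int[extractor.getTrackCount()];
            for (int i = 0; i < trackMap.length; i++) {
                extractor.selectTrack(i);
                trackMap[i] = muxer.addTrack(extractor.getTrackFormat(i));
            }
            muxer.start();

            ByteBuffer buffer = ByteBuffer.allocate(1 << 20);      // big enough for one encoded sample
            MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
            while (true) {
                info.size = extractor.readSampleData(buffer, 0);   // one encoded sample, no decoding
                if (info.size < 0) break;
                info.offset = 0;
                info.presentationTimeUs = extractor.getSampleTime();
                info.flags = extractor.getSampleFlags();           // sync-sample flag maps to the key-frame flag
                muxer.writeSampleData(trackMap[extractor.getSampleTrackIndex()], buffer, info);
                extractor.advance();
            }

            muxer.stop();
            muxer.release();
            extractor.release();
        }
    }

As soon as you want to modify frame contents, the full extract -> decode -> edit -> encode -> mux pipeline above is needed.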

Android MediaCodec: Is it possible to encode audio and video at the same time using MediaCodec and MediaMuxer?

There is some good documentation on the bigflake site about how to use MediaMuxer and MediaCodec to encode and then decode video as MP4, or extract video and then encode it again, and more.
But there doesn't seem to be a way to encode audio and video at the same time; there is no documentation or code about this. It doesn't seem impossible though.
Question
Do you know any stable way of doing it that will work on all devices running API level 18 (Android 4.3) or higher?
Why has no one implemented it? Is it hard to implement?
You have to create two MediaCodec instances, one for video and one for audio, and then use MediaMuxer to mux the video with the audio after encoding. You can take a look at ExtractDecodeEditEncodeMuxTest.java, and at this project, which captures camera/mic input and saves it to an mp4 file using MediaMuxer and MediaCodec.
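A minimal sketch of the muxer bookkeeping this pattern needs, assuming the two MediaCodec encoders and their drain loops already exist elsewhere; the class and method names are made up for illustration:

    import android.media.MediaCodec;
    import android.media.MediaFormat;
    import android.media.MediaMuxer;
    import java.io.IOException;
    import java.nio.ByteBuffer;

    public final class AvMuxerSink {
        private final MediaMuxer muxer;
        private int videoTrack = -1;
        private int audioTrack = -1;
        private boolean started = false;

        public AvMuxerSink(String outputPath) throws IOException {
            muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        }

        // Call when an encoder's dequeueOutputBuffer() returns INFO_OUTPUT_FORMAT_CHANGED.
        public synchronized void onFormatReady(boolean isVideo, MediaFormat format) {
            if (isVideo) videoTrack = muxer.addTrack(format);
            else audioTrack = muxer.addTrack(format);
            // The muxer may only start once both tracks have been added.
            if (videoTrack >= 0 && audioTrack >= 0) {
                muxer.start();
                started = true;
            }
        }

        // Call for every encoded output buffer from either encoder.
        public synchronized void writeSample(boolean isVideo, ByteBuffer data, MediaCodec.BufferInfo info) {
            // Codec config buffers (SPS/PPS, AAC config) are carried in the MediaFormat, so skip them here.
            if (!started || (info.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) return;
            muxer.writeSampleData(isVideo ? videoTrack : audioTrack, data, info);
        }

        public synchronized void release() {
            if (started) muxer.stop();
            muxer.release();
        }
    }

The usual pitfall is calling muxer.start() before both encoders have reported INFO_OUTPUT_FORMAT_CHANGED, or writing samples before start(); buffer or drop early output until the muxer is running.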

How to retrieve NV21 data from the DJI Phantom 3 Professional drone camera

As I described in a previous post, I'm working on an Android mobile app for real-time augmented visualization of a drone's camera view (specifically, I'm working on a DJI Phantom 3 Professional with its SDK), using the Wikitude framework for the AR part. Thanks to Alex's response, I implemented my own Wikitude Input Plugin in combination with DJI's Video Stream Decoding.
I have some issues now. First of all, the DJI Video Stream Decoding demo uses FFmpeg for video frame parsing and MediaCodec for hardware decoding. So it parses video frames, decodes the raw video stream data from the DJI camera, and outputs YUV data. You advised me to "get the raw video data from the dji sdk and pass it to the Wikitude SDK": since the Wikitude Input Plugin needs YUV 420 data arranged to comply with the NV21 standard in order to provide the custom camera, I should pass it the YUV output of the MediaCodec, right?
About this point, I tried to retrieve byte buffers from the MediaCodec output (this is possible by setting the Surface parameter to null in the configure() method, which has the effect of invoking a callback and passing the buffers out to an external listener), but I'm having some issues with the colours in the visualization: the decoded video colour is not right (blue and red seem to be reversed, and there is too much noise when the camera moves). Note that when I pass a non-null Surface, after the instruction codec.releaseOutputBuffer(outIndex, true) MediaCodec renders the frames on it and shows the video stream properly, but I need to pass the video stream to the Wikitude plugin, so I must set the surface to null.
I tried setting different MediaFormat.KEY_COLOR_FORMAT values, but none of them works properly. How can I solve this?
When decoding into bytebuffers with MediaCodec, you can't decide what color format the buffer uses; the decoder decides, and you have to deal with it. Each decoder can use a different format; some of them can be a standard format like COLOR_FormatYUV420Planar (corresponding to I420) or COLOR_FormatYUV420SemiPlanar (corresponding to NV12 - not NV21), while others can use completely proprietary formats.
See e.g. https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java#401 for an example on what formats the decoder can return that are supported, and https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java#963 for a reference showing that it is ok for decoders to return private formats.
You can have a look at e.g. http://git.videolan.org/?p=vlc.git;a=blob;f=modules/codec/omxil/qcom.c;h=301e9150ae66075ca264e83566504802ed57578c;hb=bdc690e9c0e2516c00a6d3733a77a87a25d9b6e3 for an example on how to interpret one common proprietary color format.
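Incidentally, blue and red swapping is exactly what NV12 data looks like when interpreted as NV21, since the two formats differ only in the order of the interleaved chroma bytes. If the decoder reports COLOR_FormatYUV420SemiPlanar, a minimal conversion sketch (ignoring the stride/slice-height padding that some decoders add, which you would also have to honour):

    import android.media.MediaCodec;
    import android.media.MediaCodecInfo;
    import android.media.MediaFormat;
    import java.nio.ByteBuffer;

    /** Returns an NV21 frame, or null if the decoder's output format is not plain NV12. */
    static byte[] toNv21(MediaCodec codec, ByteBuffer outputBuffer) {
        MediaFormat fmt = codec.getOutputFormat();
        if (fmt.getInteger(MediaFormat.KEY_COLOR_FORMAT)
                != MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar) {
            return null;                                         // planar or proprietary: needs its own path
        }
        int width = fmt.getInteger(MediaFormat.KEY_WIDTH);
        int height = fmt.getInteger(MediaFormat.KEY_HEIGHT);

        byte[] nv12 = new byte[width * height * 3 / 2];
        outputBuffer.get(nv12);                                  // copy the frame out of the codec buffer
        byte[] nv21 = new byte[nv12.length];
        System.arraycopy(nv12, 0, nv21, 0, width * height);      // Y plane is identical
        for (int i = width * height; i + 1 < nv12.length; i += 2) {
            nv21[i] = nv12[i + 1];                               // V comes first in NV21
            nv21[i + 1] = nv12[i];                               // then U
        }
        return nv21;                                             // pass to the Wikitude input plugin
    }

For planar (I420) or proprietary output formats, a per-format conversion along the lines of the qcom example above is still needed.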

Decoding a raw H.264 stream in Android?

I have a project where I have been asked to display a video stream in Android. The stream is raw H.264; I connect to a server and receive a byte stream from it.
Basically I'm wondering, is there a way to send raw bytes to a decoder in Android and display them on a surface?
I have been successful in decoding H.264 wrapped in an mp4 container using the new MediaCodec and MediaExtractor APIs in Android 4.1; unfortunately I have not found a way to decode a raw H.264 file or stream using these APIs.
I understand that one way is to compile and use FFmpeg, but I'd rather use a built-in method that can use HW acceleration. I also understand RTSP streaming is supported in Android, but this is not an option. The Android version is not an issue.
I can't provide any code for this unfortunately, but I'll do my best to explain it based on how I got it to work.
So here is my overview of how I got raw H.264 encoded video to work using the MediaCodec class.
The MediaCodec documentation has an example of getting the decoder set up and how to use it; you will need to set it up for decoding H.264 AVC.
The format of H.264 is that it is made up of NAL units, each starting with a start prefix of three bytes with the values 0x00, 0x00, 0x01, and each unit has a different type depending on the value of the 4th byte, right after these 3 starting bytes. One NAL unit IS NOT one frame of video; each frame is made up of a number of NAL units.
Basically I wrote a method that finds each individual unit and passes it to the decoder (one NAL unit being the starting prefix and any bytes thereafter, up until the next starting prefix).
Now, if you have the decoder set up for decoding H.264 AVC and have an InputBuffer from the decoder, then you are ready to go. You need to fill this InputBuffer with a NAL unit, pass it back to the decoder, and continue doing this for the length of the stream.
But to make this work I had to pass the decoder an SPS (Sequence Parameter Set) NAL unit first. This unit has a byte value of 0x67 after the starting prefix (the 4th byte); on some devices the decoder would crash unless it received this unit first.
Basically, until you find this unit, ignore all other NAL units and keep parsing the stream until you get it; then you can pass all other units to the decoder.
Some devices didn't need the SPS first and some did, but you are better off passing it in first.
Now, if you passed a surface to the decoder when you configured it, then once it gets enough NAL units for a frame it should display it on that surface.
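A minimal sketch of that approach, assuming the whole raw stream is already in a byte[] and that the width, height, and an output Surface are known. The class name and the fixed ~30 fps timestamps are made up for illustration; only the 3-byte start code is handled, and getInputBuffer() needs API 21+:

    import android.media.MediaCodec;
    import android.media.MediaFormat;
    import android.view.Surface;
    import java.io.IOException;
    import java.nio.ByteBuffer;

    public final class RawH264Player {

        public static void play(byte[] stream, int width, int height, Surface surface)
                throws IOException {
            MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
            decoder.configure(MediaFormat.createVideoFormat("video/avc", width, height),
                    surface, null, 0);
            decoder.start();

            MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
            long ptsUs = 0;
            int pos = findStartCode(stream, 0);
            while (pos >= 0) {
                int next = findStartCode(stream, pos + 3);
                int end = (next >= 0) ? next : stream.length;

                int inIndex = decoder.dequeueInputBuffer(10_000);
                if (inIndex >= 0) {
                    ByteBuffer in = decoder.getInputBuffer(inIndex);
                    in.clear();
                    in.put(stream, pos, end - pos);                // one NAL unit, start code included
                    decoder.queueInputBuffer(inIndex, 0, end - pos, ptsUs, 0);
                    ptsUs += 33_333;                               // fake ~30 fps timestamps
                }

                int outIndex = decoder.dequeueOutputBuffer(info, 0);
                if (outIndex >= 0) {
                    decoder.releaseOutputBuffer(outIndex, true);   // render to the surface
                }
                pos = next;
            }
            decoder.stop();
            decoder.release();
        }

        // Index of the next 0x00 0x00 0x01 start code at or after 'from', or -1 if none.
        private static int findStartCode(byte[] data, int from) {
            for (int i = from; i + 2 < data.length; i++) {
                if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) return i;
            }
            return -1;
        }
    }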
You can download the raw H.264 from the server, then offer it via a local HTTP server running on the phone and then let VLC for Android do playback from that HTTP server. You should use VLC's http/h264:// scheme to force the demuxer to raw H.264 (if you don't force the demuxer VLC may not be able to recognize the stream, even when the MIME type returned by the HTTP server is set correctly). See
https://github.com/rauljim/tgs-android/blob/integrate_record/src/com/tudelft/triblerdroid/first/VideoPlayerActivity.java#L211
for an example on how to create an Intent that will launch VLC.
Note: raw H.264 apparently has no timing info, so VLC will play as fast as possible.
Embedding it in MPEG-TS first would be better; I haven't found an Android lib that will do that yet.
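For the Intent side, a rough sketch in the spirit of the linked VideoPlayerActivity; the address and path are placeholders, and how the http/h264:// prefix is interpreted depends on the VLC build:

    import android.content.Intent;
    import android.net.Uri;

    // From within an Activity: hand the locally served stream to VLC for Android.
    Intent vlcIntent = new Intent(Intent.ACTION_VIEW);
    vlcIntent.setDataAndType(Uri.parse("http/h264://127.0.0.1:8080/live.h264"), "video/h264");
    startActivity(vlcIntent);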
Here are the resources I've found helpful in a similar project:
This video has been super insightful for understanding how MediaCodec handles raw H.264 streams at a high level.
This thread goes into a bit more detail on handling the SPS/PPS NALUs specifically. As mentioned above, you need to separate individual NAL units using the start prefix, and then hand the remaining data to the MediaCodec.
This repo (libstreaming) is a great example of decoding an H264 stream in Android using RTSP/RTP for transmission.

H.264 Real-time Streaming, Timestamp in NAL Units?

I'm trying to build a system that live-streams video and audio captured by Android phones. Video and audio are captured on the Android side using MediaRecorder, and then pushed directly to a server written in Python. Clients should access this live feed using their browser, so I implemented the streaming part of the system in Flash. Right now both video and audio content appear on the client side, but the problem is that they are out of sync. I'm sure this is caused by wrong timestamp values in Flash (currently I increment the timestamp by 60 ms per video frame, but clearly this value should be variable).
The audio is encoded to AMR on the Android phone, so I know that each AMR frame is exactly 20 ms. However, this is not the case with the video, which is encoded to H.264. To synchronize them, I would have to know exactly how many milliseconds each H.264 frame lasts, so that I can timestamp them later when delivering the content via Flash. My question is: is this kind of information available in the NAL units of H.264? I tried to find the answer in the H.264 standard, but the information there is just overwhelming.
Can someone please point me in the right direction? Thanks.
Timestamps are not in NAL units, but are typically part of RTP. RTP/RTCP also takes care of media synchronisation.
The RTP payload format for H.264 might also be of interest to you.
If you are not using RTP, are you just sending raw data units over the network?
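If you do move to RTP, the timing arithmetic is simple; a small sketch (RFC 6184 specifies a 90 kHz clock for H.264, and AMR-NB uses its 8 kHz sampling clock), with captureTimeUs, rtpInitial, rtpInitialAudio and amrFrameIndex standing in for values your pipeline already has:

    // H.264 over RTP: timestamp units are 1/90000 s, offset by a random initial value.
    long videoRtpTs = (rtpInitial + (captureTimeUs * 90_000L) / 1_000_000L) & 0xFFFFFFFFL;

    // AMR-NB over RTP: 8 kHz clock, so each 20 ms frame advances the timestamp by 160.
    long audioRtpTs = (rtpInitialAudio + 160L * amrFrameIndex) & 0xFFFFFFFFL;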
