Decode H.264 stream using MediaCodec API JNI - android

I am developing an H.264 decoder using the MediaCodec API. I am trying to call the MediaCodec Java API from the JNI layer inside a function like:
void Decompress(const unsigned char *encodedInputdata, unsigned int inputLength,
                unsigned char **outputDecodedData, int &width, int &height) {
    // encodedInputdata is the encoded H.264 remote stream
    // .....
    // outputDecodedData = call JNI function of MediaCodec Java API to decode
    // .....
}
Later I will send the outputDecodedData to my existing video rendering pipeline and render it on a Surface.
I expect I will be able to write a Java function to decode the input stream, but the following would be a challenge:
This resource states that:
...you can't do anything with the decoded video frame but render them
to surface
Here, a Surface is passed to decoder.configure(format, surface, null, 0) to render the output ByteBuffer on the surface, and it is claimed that we can't use this buffer except to render it, due to an API limitation.
So, will I be able to send the output ByteBuffer to the native layer, cast it to unsigned char*, and pass it to my rendering pipeline, instead of passing a Surface to configure()?

I see two fundamental problems with your proposed function definition.
First, MediaCodec operates on access units (NAL units for H.264), not arbitrary chunks of data from a stream, so you need to pass in one NAL unit at a time. Once the chunk is received, the codec may want to wait for additional frames to arrive before producing any output. You cannot in general pass in one frame of input and wait to receive one frame of output.
Second, as you noted, the ByteBuffer output is YUV-encoded in one of several color formats. The format varies from device to device; Qualcomm devices notably use their own proprietary format. (It has been reverse-engineered, though, so if you search around you can find some code to unravel it.)
The common workaround is to send the video frames to a SurfaceTexture, which converts them to GLES "external" textures. These can be manipulated in various ways, or rendered to a pbuffer and extracted with glReadPixels().
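As a rough illustration of that input/output decoupling, here is a minimal Java sketch of a ByteBuffer-output decode loop (API 21+). It is not a complete decoder: nalUnit, width, height and presentationTimeUs are assumed to come from your own stream parser, and the output buffer is still in the device-specific YUV layout described above.

MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
decoder.configure(format, null /* no Surface, so ByteBuffer output */, null, 0);
decoder.start();

// Feed exactly one access unit (NAL unit, including its start code) per input buffer.
int inIndex = decoder.dequeueInputBuffer(10000);
if (inIndex >= 0) {
    ByteBuffer inBuf = decoder.getInputBuffer(inIndex);
    inBuf.clear();
    inBuf.put(nalUnit);
    decoder.queueInputBuffer(inIndex, 0, nalUnit.length, presentationTimeUs, 0);
}

// Output may lag input by several frames, so a valid index is not guaranteed here.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = decoder.dequeueOutputBuffer(info, 10000);
if (outIndex >= 0) {
    ByteBuffer yuv = decoder.getOutputBuffer(outIndex); // device-specific YUV layout
    // ... copy or convert the YUV data here before releasing the buffer ...
    decoder.releaseOutputBuffer(outIndex, false);
} else if (outIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    MediaFormat outFormat = decoder.getOutputFormat(); // reports the actual color format
}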

Related

SPS/PPS VUI Not Used By Android MediaCodec NDK

I'm trying to decode video with non-default colorimetry using the MediaCodec NDK. I provide the SPS and PPS in the csd-0 and csd-1 buffers, respectively, but that information does not seem to affect how the decoded video looks.
First, I initialize the AMediaFormat
AMediaFormat * format = AMediaFormat_new ();
AMediaFormat_setString (format, AMEDIAFORMAT_KEY_MIME, "video/avc");
AMediaFormat_setInt32 (format, AMEDIAFORMAT_KEY_WIDTH, this->width);
AMediaFormat_setInt32 (format, AMEDIAFORMAT_KEY_HEIGHT, this->height);
AMediaFormat_setInt32 (format, AMEDIAFORMAT_KEY_FRAME_RATE, this->fps_n);
Then I provide the SPS and PPS buffers for my video stream
uint8_t sps[] = { 0,0,0,1,103,100,0,52,172,43,64,8,0,24,54,2,220,4,32,6,148,0,0,15,160,0,7,83,2,61,42,128 };
uint8_t pps[] = { 0,0,0,1,104,238,60,176 };
const size_t sps_len = 32;
const size_t pps_len = 8;
AMediaFormat_setBuffer (format, "csd-0", sps, sps_len);
AMediaFormat_setBuffer (format, "csd-1", pps, pps_len);
And finally, I configure and start the codec
AMediaCodec_configure (codec, format, window, NULL, 0);
AMediaCodec_start (codec);
AMediaFormat_delete (format);
I would now begin queueing input buffers for decompression as usual. This runs, without any error in the logs, but the decoded video looks exactly the same, regardless of what I have set for the transfer characteristics (above it's set to '8' for linear gamma).
Does anyone have any suggestions on why the media codec doesn't seem to be actually using the colorimetry data that I have provided?
The color-space information in the H.264 stream is informational metadata only. So your observation is correct, and the decompressor works as it should.
You will get the decompressed bitmap in the same color space as it was encoded.
Usually the decompressor does not handle or care about color spaces; you have to do a color-space conversion yourself after decompression.
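As a purely illustrative example of such a post-decode step, a Java sketch like the following could re-encode linear-light samples (transfer_characteristics = 8, as in the question) into the BT.709 transfer curve. The normalization of samples to [0, 1] and the surrounding per-pixel loop are assumed to be supplied by your own pipeline.

// Hypothetical helper: applies the BT.709 OETF to one normalized linear-light sample.
static double linearToBt709(double linear) {
    return (linear < 0.018) ? 4.5 * linear
                            : 1.099 * Math.pow(linear, 0.45) - 0.099;
}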

Unable to use Android platform's MediaCodec class in "surface input" mode

I'm trying to write a simple video encoder that uses the Android platform's MediaCodec class in "surface input" mode.
These are the steps I'm following (supporting code left out for the sake of brevity):
mediaCodec = MediaCodec::CreateByType(looper, "video/avc", true);
mediaCodec->configure(config, NULL, NULL, CONFIGURE_FLAG_ENCODE);
mediaCodec->createInputSurface(&inputSurface);
mediaCodec->start();
Following this, I'm trying to dequeue a buffer from the created input surface (which is an IGraphicBufferProducer interface object), but it fails with the NO_INIT error:
inputSurface->dequeueBuffer(&slot, &fence, w, h, format, 0);
The error message in the ADB log is:
BufferQueueProducer: [GraphicBufferSource] dequeueBuffer: BufferQueue has no connected producer
Any idea why the buffer queue has no connected producer? I would assume that the MediaCodec class would handle the creation of the buffer queue as well as the connection of the producer and consumers to the queue.
I'm using Android API level 26 (7.1.2). I'm using the platform-level libs because my use case requires access to GraphicBuffer objects.
Thanks in advance!
EDIT: The general idea is to:
Dequeue buffers from the input surface & fill them.
Queue the filled buffers back to the input surface (which would presumably trigger the media codec (video encoder) instance that the surface belongs to).
Dequeue output buffers (containing raw H.264 bitstream data) from the media codec instance, and write it to file.
Release output buffers back to the media codec instance.
From the IGraphicBufferProducer documentation:
// * NO_INIT - the buffer queue has been abandoned or the producer is not
// connected.
I guess that the part missing in your code is this "connect".
IGraphicBufferProducer has such a method; are you using it?

Controlling Frame Rate of VirtualDisplay

I'm writing an Android application, and in it, I have a VirtualDisplay to mirror what is on the screen; I then send the frames from the screen to an instance of a MediaCodec. It works, but I want to add a way of specifying the FPS of the encoded video, and I'm unsure how to do so.
From what I've read and experimented with, dropping encoded frames (based on the presentation times) doesn't work well as it ends up with blocky/artifact ridden video as opposed to a smooth video at a lower framerate. Other reading suggests that the only way to do what I want (limit the FPS) would be to limit the incoming FPS to the MediaCodec, but the VirtualDisplay just receives a Surface which is constructed from the MediaCodec as below
mSurface = <instance of MediaCodec>.createInputSurface();
mVirtualDisplay = mMediaProjection.createVirtualDisplay(
"MyDisplay",
screenWidth,
screenHeight,
screenDensity,
DisplayManager.VIRTUAL_DISPLAY_FLAG_AUTO_MIRROR,
mSurface,
null,
null);
I've also tried subclassing Surface to limit the frames that are fed to the MediaCodec via unlockCanvasAndPost(Canvas canvas), but that method never seems to be called on my instance. There may be some weirdness in how I extended Surface and in its interaction with the Parcel, since writeToParcel() is called on my instance, but that is the only method that gets called (as far as I can tell).
Other reading suggests that I can go from encoder -> decoder -> encoder and limit the rate in which the second encoder is fed frames, but that's a lot of extra computation that I'd rather not do if I can avoid it.
Has anyone successfully limited the rate at which a VirtualDisplay feeds its Surface? Any help would be greatly appreciated!
Starting off with what you can't do...
You can't drop content from the encoded stream. Most of the frames in the encoded stream are essentially "diffs" from other frames. Without knowing how the frames interact, you can't safely drop content, and will end up with that corrupted macroblock look.
You can't specify the frame rate to the MediaCodec encoder. It might stuff that into metadata somewhere, but the only thing that really matters to the codec is the frames you're feeding into it, and the presentation time stamps associated with each frame. The encoder will not drop frames.
You can't do anything useful by subclassing Surface. The Canvas operations are only used for software rendering, which is unrelated to feeding in frames from a camera or virtual display.
What you can do is send the frames to an intermediate Surface, and then choose whether or not to forward them to the MediaCodec's input Surface. One approach would be to create a SurfaceTexture, construct a Surface from it, and pass that to the virtual display. When the SurfaceTexture's frame-available callback fires, you either ignore it, or render the texture onto the MediaCodec input Surface with GLES.
Various examples can be found in Grafika and on bigflake, none of which are an exact fit, but all of the necessary EGL and GLES classes are there.
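A heavily simplified Java sketch of that gating idea is below. The EGL/GLES helpers (codecInputWindowSurface with makeCurrent/setPresentationTime/swapBuffers, and textureRender for drawing the external texture) are assumed to come from something like Grafika's WindowSurface and FullFrameRect classes; externalTextureId is a GL_TEXTURE_EXTERNAL_OES texture you created yourself, and error handling and thread/EGL-context management are omitted.

SurfaceTexture frameSource = new SurfaceTexture(externalTextureId);
Surface displaySurface = new Surface(frameSource);    // pass this to createVirtualDisplay()

final long minFrameIntervalNs = 1_000_000_000L / targetFps;
final long[] lastSentNs = { 0 };
frameSource.setOnFrameAvailableListener(st -> {
    // The listener must run on the thread that owns the EGL context.
    codecInputWindowSurface.makeCurrent();            // EGL surface wrapping the codec's input Surface
    st.updateTexImage();                              // always consume the incoming frame
    long now = System.nanoTime();
    if (now - lastSentNs[0] < minFrameIntervalNs) {
        return;                                       // drop: no draw, no swap, nothing reaches the encoder
    }
    lastSentNs[0] = now;
    textureRender.drawFrame(st);                      // GLES blit of the external texture
    codecInputWindowSurface.setPresentationTime(st.getTimestamp());
    codecInputWindowSurface.swapBuffers();            // this is what actually feeds the encoder
});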
You can refer to the code samples in saki4510t's ScreenRecordingSample or RyanRQ's ScreenRecoder; they both use an additional EGL texture between the virtual display and the media encoder, and the first one can keep at least 15 fps for the output video. You can search for the keyword createVirtualDisplay in their code bases for more details.

Problems of using MediaCodec.getOutputFormat() for an encoder in Android 4.1/4.2 devices

I'm trying to use MediaCodec to encode frames (either by camera or decoder) into a video.
When processing the encoder output by dequeueOutputBuffer(), I expect to receive the return index = MediaCodec.INFO_OUTPUT_FORMAT_CHANGED, so I can call getOutputFormat() to get the encoder output format as the input of the currently used ffmpeg muxer.
I have tested some pad/phone devices with Android versions 4.1~4.3. All of them have at least one hardware video AVC encoder, which is used in the test. On the devices with version 4.3, the encoder gives MediaCodec.INFO_OUTPUT_FORMAT_CHANGED before writing the encoded data, as expected, and the output format returned from getOutputFormat() can be used by the muxer correctly. On the devices with 4.2.2 or lower, the encoder never gives MediaCodec.INFO_OUTPUT_FORMAT_CHANGED while it can still output the encoded elementary stream, but the muxer cannot know the exact output format.
I want to ask the following questions:
Does the behavior of the encoder (whether it gives MediaCodec.INFO_OUTPUT_FORMAT_CHANGED before outputting encoded data) depend on the Android API level or on the chips in individual devices?
If the encoder writes data before MediaCodec.INFO_OUTPUT_FORMAT_CHANGED appears, is there any way to get the output format of the encoded data?
The encoder still outputs the codec config data (with the flag MediaCodec.BUFFER_FLAG_CODEC_CONFIG) on these devices before the encoded data. It is mostly used to configure a decoder, but can I derive the output format from the codec config data?
I have tried these solutions to get the output format but failed:
Call getOutputFormat() frequently during the whole encode process. However, all of the calls throw IllegalStateException, and MediaCodec.INFO_OUTPUT_FORMAT_CHANGED never appears.
Use the initial MediaFormat that was used to configure the encoder at the beginning, as in the example:
m_init_encode_format = MediaFormat.createVideoFormat(m_encode_video_mime, m_frame_width, m_frame_height);
int encode_bit_rate = 3000000;
int encode_frame_rate = 15;
int encode_iframe_interval = 2;
m_init_encode_format.setInteger(MediaFormat.KEY_COLOR_FORMAT, m_encode_color_format);
m_init_encode_format.setInteger(MediaFormat.KEY_BIT_RATE, encode_bit_rate);
m_init_encode_format.setInteger(MediaFormat.KEY_FRAME_RATE, encode_frame_rate);
m_init_encode_format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, encode_iframe_interval);
m_encoder = MediaCodec.createByCodecName(m_video_encoder_codec_info.getName());
m_encoder.configure(m_init_encode_format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
// Assume m_init_encode_format is the output format of the encoder
However, this fails, since the actual output format of the encoder is still "changed" from the initial one.
Please help me understand the behavior of an encoder, and whether there is any way to query the output format if the required MediaCodec.INFO_OUTPUT_FORMAT_CHANGED is missing.
By comparing the output format and the codec config data, the missing fields are csd-0, csd-1, and a "what" field with value = 1869968451.
(I do not understand the "what" field. It seems to be a constant and is not required. Can anyone tell me about its meaning?)
If I parse the codec config data as the csd-1 field (the last 8 bytes) and the csd-0 field (the remaining bytes), it seems that the muxer can work correctly and output a video playable on all of the test devices.
(But I want to ask: is this 8-byte assumption correct, or is there a more reliable way to parse the data?)
However, I have another problem: if I decode the video with Android MediaCodec again, the BufferInfo.presentationTimeUs value returned by dequeueOutputBuffer() is 0 for most of the decoded frames. Only the last few frames have the correct time. The sample time returned by MediaExtractor.getSampleTime() is correct and exactly the value I set on the encoder/muxer, but the decoded frame time is not. This issue only happens on devices with 4.2.2 or lower.
It is strange that the frame time is incorrect, yet the video plays back at the correct speed on the device. (Most of the devices with 4.2.2 or lower that I've tested have only one video AVC decoder.) Do I need to set other fields that may affect the presentation time?
The behavior of MediaCodec encoders was changed in Android 4.3 to accommodate the introduction of the MediaMuxer class. In Android 4.3, you will always receive INFO_OUTPUT_FORMAT_CHANGED from the encoder. In previous releases, you will not. (I've updated the relevant FAQ entry.)
There is no way to query the encoder for the MediaFormat.
I haven't used an ffmpeg-based muxer, so I'm not sure what information it needs. If it's looking for the csd-0 / csd-1 keys, you can extract those from the CODEC_CONFIG packet (I think you have to parse the SPS / PPS values out and place them in the separate keys). Examining the contents of the MediaFormat on a 4.3 device will show you which fields you're lacking.
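For what it's worth, here is a minimal Java sketch (not from the answer above) of that parsing step: it splits a BUFFER_FLAG_CODEC_CONFIG packet into SPS and PPS by nal_unit_type and places them into the csd-0 / csd-1 keys. It only handles 4-byte Annex-B start codes and does no bounds checking; imports from java.util and java.nio are omitted.

// csd holds the bytes of the BUFFER_FLAG_CODEC_CONFIG packet (Annex-B format).
static void splitCodecConfig(byte[] csd, MediaFormat format) {
    List<Integer> starts = new ArrayList<>();
    for (int i = 0; i + 3 < csd.length; i++) {
        if (csd[i] == 0 && csd[i + 1] == 0 && csd[i + 2] == 0 && csd[i + 3] == 1) {
            starts.add(i);
        }
    }
    starts.add(csd.length);                            // sentinel: end of the buffer
    for (int n = 0; n + 1 < starts.size(); n++) {
        int begin = starts.get(n), end = starts.get(n + 1);
        int nalType = csd[begin + 4] & 0x1f;           // 7 = SPS, 8 = PPS
        ByteBuffer nal = ByteBuffer.wrap(Arrays.copyOfRange(csd, begin, end));
        if (nalType == 7) {
            format.setByteBuffer("csd-0", nal);
        } else if (nalType == 8) {
            format.setByteBuffer("csd-1", nal);
        }
    }
}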
To initialize the ffmpeg muxer for video correctly, I use the following:
int outputBufferIndex = videoCodec.dequeueOutputBuffer(bufferInfo, -1);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
    ByteBuffer outputBuffer = outputBuffers[outputBufferIndex];
    headerData = new byte[bufferInfo.size];
    outputBuffer.get(headerData);
    // JNI call
    WriteVideoHeader(headerData, headerData.length);
    videoCodec.releaseOutputBuffer(outputBufferIndex, false);
}
In JNI, I use something like this:
jint Java_com_an_FileWriterEx_WriteVideoHeader(JNIEnv * env, jobject thiz, jbyteArray data, jint datasize)
{
    jboolean isCopy;
    jbyte* rawjBytes = (*env)->GetByteArrayElements(env, data, &isCopy);
    WriteVideoHeaderInternal(env, m_pFormatCtx, m_pVideoStream, rawjBytes, datasize);
    (*env)->ReleaseByteArrayElements(env, data, rawjBytes, 0);
    return 0;
}
jint WriteVideoHeaderInternal(JNIEnv * env, AVFormatContext* pFormatCtx, AVStream* pVideoStream, jbyte* data, jint datasize)
{
    jboolean bNoError = JNI_TRUE;
    jbyte* pExtDataBuffer = av_malloc(datasize);
    if (!pExtDataBuffer)
    {
        LOGI("av alloc error\n");
        bNoError = JNI_FALSE;
    }
    if (bNoError)
    {
        memcpy(pExtDataBuffer, data, datasize * sizeof(jbyte));
        pVideoStream->codec->extradata = pExtDataBuffer;
        pVideoStream->codec->extradata_size = datasize;
    }
    return bNoError ? 0 : -1;
}
For the parsing of the codec config data, it is wrong to assume that the last 8 bytes are the PPS data. The data must be parsed according to the start code and nal_unit_type.

how to get the underlying buffer of EGLImage?

I want to implement OMX_UseEGLImage in my native OpenMAX component on Android, but how do I get the underlying buffer associated with an EGLImage specified by eglImage?
The client API will create an EGLImage and call OMX_UseEGLImage to notify my native OpenMAX component to use the eglImage:
eglImage = eglCreateImageKHR(
m_egl_display,
m_egl_context,
EGL_GL_TEXTURE_2D_KHR,
(EGLClientBuffer)(egl_buffer->texture_id),
&attrib);
OMX_UseEGLImage(hComponent,ppBufferHdr,nPortIndex,pAppPrivate,eglImage);
The problem is: how can I use eglImage? Is there any way to get the underlying buffer associated with eglImage?
I think that the OMX_UseEGLImage call is only applicable to a render component.
For example, consider two components, a Decoder and a Render, with tunneled communication. The Decoder output port is connected to the Render input port via a tunnel. The Decoder output port is the buffer supplier.
At the transition from OMX_StateLoaded to OMX_StateIdle:
Decoder creates a native buffer:
android::GraphicBuffer * buffer = new android::GraphicBuffer();
android_native_buffer_t * native_buffer = buffer->getNativeBuffer();
Decoder creates an EGLImage:
EGLImageKHR egl_image = eglCreateImageKHR((EGLClientBuffer)native_buffer)
Decoder calls, on the tunneled port: OMX_UseEGLImage(&buffer_header, egl_image)
Render allocates a buffer_header and remembers the egl_image
In the state OMX_StateIdle:
The decoder knows the correspondence between the native buffer, buffer_header and egl_image.
The render knows the correspondence between the buffer_header and egl_image.
In the state OMX_StateExecuting:
Decoder writes frames into the native buffer and calls OMX_EmptyThisBuffer(buffer_header) on the tunneled port
Render calls glEGLImageTargetTexture2DOES(egl_image) to draw the frames
At the transition from OMX_StateIdle to OMX_StateLoaded:
Decoder calls OMX_FreeBuffer(buffer_header) on the tunneled port
Render frees the buffer_header
Decoder calls eglDestroyImageKHR(egl_image)
Decoder deletes the native_buffer
