I'm using the Android MediaCodec API to decode a video stored on the file system. I get an output buffer that looks legitimate (with a plausible BufferInfo.offset and size). Its format is reported as 256, which I took to mean JPEG. I tried decoding it with BitmapFactory.decodeByteArray, but the result was null.
Does anyone know the correct way to ascertain the format of the output buffer? What's the correct way to start decoding the output byte arrays?
The MediaCodec color formats are defined by the MediaCodecInfo.CodecCapabilities class. 256 is used internally, and generally doesn't mean that you have a buffer of JPEG data. The confusion here is likely because you're looking at constants in the ImageFormat class, but those only apply to camera output. (For example, ImageFormat.NV16 is a YCbCr format, while COLOR_Format32bitARGB8888 is RGB, but both have the numeric value 16.)
Some examples of MediaCodec usage, including links to CTS tests that exercise MediaCodec, can be found here. On some devices you will not be able to decode data from the ByteBuffer output, and must instead decode to a Surface.
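As a rough illustration, the reliable place to check what the decoder actually emits is the MediaFormat it reports after an output format change, compared against the MediaCodecInfo.CodecCapabilities constants. A minimal sketch, assuming a configured decoder named decoder and a MediaCodec.BufferInfo named info:

int index = decoder.dequeueOutputBuffer(info, 10000 /* us */);
if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    MediaFormat fmt = decoder.getOutputFormat();
    int color = fmt.getInteger(MediaFormat.KEY_COLOR_FORMAT);
    // Compare against MediaCodecInfo.CodecCapabilities constants;
    // vendor-specific tiled layouts are common on real devices.
    if (color == MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar) {
        // plain planar YUV 4:2:0 (I420)
    }
}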
Related
I am using MediaCodec to decode an H.264 stream on a Samsung S6 (Android 5.1.1), and found that the input buffers given to MediaCodec must start with "0001" (and I don't need to set the PPS/SPS), or ACodec reports an error.
I also tried using MediaExtractor to play an MP4 file. That works fine, but the buffers it hands to MediaCodec do not start with "0001".
I don't know why decoding an H.264 stream has this limitation. Currently I need to parse the stream from the socket and cut the data into small packets (each starting with 0001) before giving them to MediaCodec, but that seems inefficient.
MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, 1024, 1024);
Some specific decoders may also be able to decode H.264 NAL units in the "mp4" format (with NAL unit length prefixes instead of startcodes), but that's not guaranteed across all devices.
It may be that Samsung's version of MediaExtractor returns data in this format because they know their own decoder can handle it. There is precedent for this: Samsung did a similar, nonstandard thing with timestamps in their version of MediaExtractor, see e.g. https://code.google.com/p/android/issues/detail?id=74356.
(Having MediaExtractor return data that only the current device's decoder can handle is wrong in my opinion, though: one may want to use MediaExtractor to read a file but send the compressed data over the network to another device for decoding, and in that case returning data in a nonstandard format breaks things.)
As fadden wrote, MediaCodec operates on full NAL units, so you do need to provide data in this format (even if it feels inefficient). If you receive data over a socket in a format where frame boundaries aren't easily available, that's an issue with your protocol format (implementing RTP reception is not easy, for example!), not with MediaCodec itself. Needing full frames before decoding, rather than feeding arbitrary chunks until a frame is complete, is a very common limitation, and it shouldn't be inefficient unless your own implementation of it is inefficient.
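For illustration only: a common way to cut an Annex-B byte stream into whole NAL units is to scan for the next startcode (00 00 01, optionally preceded by another 00). A minimal sketch; the buffering around it is up to your protocol code:

static int nextStartCode(byte[] buf, int from) {
    for (int i = from; i + 2 < buf.length; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            // include the leading zero of a 4-byte startcode if present
            return (i > from && buf[i - 1] == 0) ? i - 1 : i;
        }
    }
    return -1; // no complete startcode in the buffered data yet
}

Everything between two consecutive startcodes is one NAL unit; group the units of one frame into a single input buffer before queueing it.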
In general, Android will expect NAL units for each input buffer. On some devices I have found that setting csd-0/csd-1 on the MediaFormat for H.264 does not work consistently. But if you feed each of the parameter sets as its own input buffer, MediaCodec will pick it up as a format change:
int outputBufferIndex = NativeDecoder.DequeueOutputBuffer(info, 1000);
if (outputBufferIndex == (int)MediaCodec.InfoOutputFormatChanged) {
    // the parameter sets fed as input buffers surface here as a format change
    Console.WriteLine("Format changed: {0}", NativeDecoder.OutputFormat);
} else if (outputBufferIndex >= 0) {
    CodecOutputBufferAvailable(NativeDecoder, outputBufferIndex, info);
}
Also note that on Nexus and some Samsung devices it is mandatory to set:
formatDescription.SetInteger(MediaFormat.KeyWidth, SelectedPalette.Value.Width);
formatDescription.SetInteger(MediaFormat.KeyHeight, SelectedPalette.Value.Height);
formatDescription.SetInteger(MediaFormat.KeyMaxInputSize, SelectedPalette.Value.Width * SelectedPalette.Value.Height);
I am lucky that in my situation I can query these resolutions, but you can also parse the resolution manually out of the SPS and PPS NAL units.
// NOTE: I am using Xamarin here, but the calls are pretty much the same. I am fairly certain there are bugs in the iOS VideoToolbox Xamarin wrapper, so keep that in mind if you're ever considering Xamarin for video decoding. It's great for everything except anything that's slightly more custom or low-level.
I searched for hours...
I just want a working decode/encode of a recorded movie.
Is this even possible on Android 4.1?
Right now only a few KB get written to my mp4 file, with no errors.
Once this works, I will use KEY_FRAME_RATE and KEY_I_FRAME_INTERVAL to put it in slow motion.
I used a MediaExtractor to configure the MediaCodec.
I see 3 steps (see gist for complete code):
1.
encoder.dequeueInputBuffer(5000);
extractor.readSampleData(inputBuf, offset);
ptsUsec2 = extractor.getSampleTime();
encoder.queueInputBuffer(inputBufIndex, ...);
2.
encoder.dequeueOutputBuffer(info, 5000);
ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
// I write encodedData to a FileOutputStream (to save the MP4)
decoder.queueInputBuffer(inputBufIndex, ...);
3.
decoder.dequeueOutputBuffer(info, 5000);
decoder.releaseOutputBuffer(decoderStatus, ...);
Here is the complete function I modified from Google's EncodeDecodeTest file:
gist
Thanks for the help,
Felix
Some additional information is available on bigflake. In particular, FAQ item #9.
The format of frames coming out of the MediaCodec decoder is not guaranteed to be useful. Many popular devices decode data into a proprietary YUV format, which is why the checkFrame() function in the buffer-to-buffer test can't always verify the results. You'd expect the MediaCodec encoder to be able to accept the frames output by the decoder, but that's not guaranteed.
Coding against API 18+ is generally much easier because you can work with a Surface rather than a ByteBuffer.
Of course, if all you want is slow-motion video, you don't need to decode and re-encode the H.264 stream. All you need to do is alter the presentation time stamps, which are in the .mp4 wrapper. On API 18+, you can extract with MediaExtractor and immediately encode with MediaMuxer, without involving MediaCodec at all. On API 16, MediaMuxer doesn't exist, so you'd need some other way to wrap H.264 as .mp4.
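For what it's worth, that extract-and-remux path might look roughly like this. A minimal sketch, with exception handling omitted; the file paths, the track index, and the 4x slowdown factor are placeholders:

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/sdcard/input.mp4"); // placeholder path
extractor.selectTrack(0); // assuming track 0 is the video track
MediaMuxer muxer = new MediaMuxer("/sdcard/slowmo.mp4",
        MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
int track = muxer.addTrack(extractor.getTrackFormat(0));
muxer.start();
ByteBuffer buffer = ByteBuffer.allocate(1 << 20);
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
while ((info.size = extractor.readSampleData(buffer, 0)) >= 0) {
    info.offset = 0;
    info.presentationTimeUs = extractor.getSampleTime() * 4; // stretch timestamps 4x
    info.flags = (extractor.getSampleFlags() & MediaExtractor.SAMPLE_FLAG_SYNC) != 0
            ? MediaCodec.BUFFER_FLAG_SYNC_FRAME : 0;
    muxer.writeSampleData(track, buffer, info);
    extractor.advance();
}
muxer.stop();
muxer.release();
extractor.release();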
Unless, of course, you have some aversion to variable-frame-rate video, in which case you'll need to re-encode it with the "slow motion" frames repeated (and timestamps adjusted appropriately). The KEY_FRAME_RATE and KEY_I_FRAME_INTERVAL values will not help you -- they're set when the encoder is configured, and have no effect on frame timing.
I'm having a very difficult time with MediaCodec. I've used it previously to decode a raw H.264 stream and learned a significant amount. At least I thought I had.
My stream is h.264 in Annex B format. Looking at the raw data, the structure of my NAL packet code types are as follows:
[0x09][0x09][0x06] and then 8 packets of [0x21].
[0x09][0x09][0x27][0x28][0x06] and then 8 packets of [0x21].
This is not how I am receiving them, though; I am attempting to build a complete Access Unit from these raw NAL unit types.
The first thing that is strange to me is the double [0x09], which is the Access Unit Delimiter packet; I am pretty sure the H.264 spec allows only one AUD per Access Unit. BTW, I am able to record the raw data and play it using ffmpeg, with and without the extra AUD. For now, I detect this case and strip the first AUD off before sending the entire Access Unit to MediaCodec.
The second thing is that I have hardcoded the SPS/PPS byte array messages [0x27/0x28], and I am setting these in the MediaFormat used to initialize the MediaCodec, similar to:
format.setByteBuffer("csd-0", ByteBuffer.wrap( mySPS ));
format.setByteBuffer("csd-1", ByteBuffer.wrap( myPPS ));
My video stream vendor tells me the video is 1280 x 720; however, when I convert it to an mp4 file, the metadata says it's 960 x 720. Another oddity.
Changing these different parameters around, I am still unable to get a valid buffer index in the thread that processes the decoder output (dequeueOutputBuffer returns -1). I have also varied the timeout, to no avail. If I manually send the SPS/PPS as the first packet instead of using the example above, I do get the -3 "output buffers have changed" result, which is meaningless since I am using API 20. But everything else I get back is -1.
I've read about the Emulation Prevention Byte encoding in H.264. I am able to strip this byte out and send the result to MediaCodec; it doesn't seem to make a difference. The documentation for MediaCodec also doesn't explicitly say whether it expects the EPBs to be stripped out or not...?
Other than the video frame resolution, the only other difference from my previous success is the existence of the SEI packet type [0x06]. Should I be doing something special with it?
I know a number of folks who have used MediaCodec have had issues with it, mostly because the documentation is not very good. Can anyone offer any advice as to what I could be doing wrong?
I am successfully using MediaCodec to decode audio; however, when I load a file with 24-bit samples, I have no way of knowing this has occurred. Since the application was assuming 16-bit samples, it fails.
When I print the MediaFormat, I see
{mime=audio/raw, durationUs=239000000, bits-format=6, channel-count=2, channel-mask=0, sample-rate=96000}
I assume that the "bits-format" would be a hint, however this key is not declared in the API, and is not actually emitted when the output format changes. I get
{mime=audio/raw, what=1869968451, channel-count=2, channel-mask=0, sample-rate=96000}
(By the way, what is the "what" key? I notice that if I interpret it as a fourcc, it is "outC"... is it just a flag that this is an output format?)
So what is the best recourse here? If I feed the ByteBuffer straight to the AudioTrack (assuming 16-bit PCM), it plays static, of course.
If I knew the bit depth, I could convert the samples myself!
I understand from other questions that you cannot dictate the output format either.
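(For reference: if the buffer really does turn out to be 24-bit little-endian PCM, the down-conversion itself is straightforward. A minimal sketch, with the packed little-endian sample layout being an assumption:)

static byte[] pcm24to16(byte[] in, int length) {
    byte[] out = new byte[length / 3 * 2];
    for (int i = 0, o = 0; i + 2 < length; i += 3) {
        // a 24-bit little-endian sample is [low, mid, high];
        // keep the top 16 bits: mid becomes the new low byte
        out[o++] = in[i + 1];
        out[o++] = in[i + 2];
    }
    return out;
}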
I am trying to use the MediaCodec API for decoding without using the MediaExtractor API. Instead, I use mp4parser to get the samples from the mp4 files. For now, I am only using H.264/AVC coded video content.
The official documentation of the MediaCodec API states:
buffers do not start and end on arbitrary byte boundaries, this is not a stream of bytes, it's a stream of access units.
Meaning, I have to feed access units to the decoder. However, I am missing some details:
For H.264, one mp4 sample can contain multiple NAL units, each preceded by a (by default) 4-byte field specifying the NAL unit length.
Now my questions:
There can be mp4 samples where codec config NAL units (SPS, PPS) are mixed with NAL units containing coded (parts of) frames. In that case, should I pass the BUFFER_FLAG_CODEC_CONFIG flag in the call to queueInputBuffer()?
There can also be other (additional) NAL units in mp4 samples, like SEI or access unit delimiter NAL units. What about those? Are they a problem?
I have tried various approaches, but all the feedback I get from Android is that the calls to dequeueOutputBuffer() time out (or never return, if I pass -1 as the timeout parameter), so I don't seem to have a way to troubleshoot this.
Any advice on what to do or where to look is of course very welcome as well.
The NAL unit length prefixes need to be converted to Annex-B startcodes (the bytes 0x00, 0x00, 0x00, 0x01) before the data is passed to MediaCodec for decoding. (Some decoders might actually accept the MP4 format straight away, but it's not too common.)
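Since a 4-byte length prefix and a 4-byte startcode occupy the same space, the rewrite can be done in place. A minimal sketch, assuming lengthSizeMinusOne in the avcC atom is 3 (i.e. 4-byte prefixes):

static void avccToAnnexB(byte[] sample) {
    int pos = 0;
    while (pos + 4 <= sample.length) {
        int nalLen = ((sample[pos] & 0xFF) << 24) | ((sample[pos + 1] & 0xFF) << 16)
                | ((sample[pos + 2] & 0xFF) << 8) | (sample[pos + 3] & 0xFF);
        // overwrite the length field with the 00 00 00 01 startcode
        sample[pos] = 0;
        sample[pos + 1] = 0;
        sample[pos + 2] = 0;
        sample[pos + 3] = 1;
        pos += 4 + nalLen;
    }
}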
The SPS/PPS that is stored in the avcC atom in the file also needs to be converted to use Annex-B startcodes. Note that the avcC atom contains a few other fields that you don't need to pass on to the decoder. You can either pass the SPS and PPS packed in one buffer (with startcodes before each of them) with the BUFFER_FLAG_CODEC_CONFIG flag set before sending any actual frames, or pass them (with Annex-B startcodes) in the MediaFormat you use to configure the decoder (either in one ByteBuffer with the key "csd-0", or in two separate keys as "csd-0" and "csd-1").
If your file has got more SPS/PPS inside each frame, you should just be able to pass them as part of the frame, and most decoders should be able to cope with it (especially if it's the same SPS/PPS as before and not a configuration change).
Thus: Pass all NAL units belonging to one sample in one single buffer, but with all NAL unit length headers rewritten to startcodes. And to work with MP4 files that don't happen to have SPS/PPS inside the stream itself, parse the avcC atom (I don't know in which format mp4parser returns this) and pass the SPS and PPS with startcodes to the decoder (either via MediaFormat as "csd-0" or as the first buffer, with BUFFER_FLAG_CODEC_CONFIG set).
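To make the csd route concrete, here is a sketch of the MediaFormat setup. The width, height, and the sps/pps arrays (already prefixed with Annex-B startcodes) are assumed to come from your MP4 parser; surface may be null for ByteBuffer output:

MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
format.setByteBuffer("csd-0", ByteBuffer.wrap(sps)); // 00 00 00 01 + SPS
format.setByteBuffer("csd-1", ByteBuffer.wrap(pps)); // 00 00 00 01 + PPS
decoder.configure(format, surface, null, 0);
decoder.start();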
Getting -1 (INFO_TRY_AGAIN_LATER) is normal; just carry on decoding and you should see something on the screen. As long as it doesn't throw an IllegalStateException, keep decoding.
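To illustrate, the usual drain loop treats -1 as "try again" rather than as a failure. A rough sketch, assuming a started decoder named decoder rendering to a Surface:

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean sawOutputEos = false;
while (!sawOutputEos) {
    int index = decoder.dequeueOutputBuffer(info, 10000 /* us */);
    if (index == MediaCodec.INFO_TRY_AGAIN_LATER) {
        // -1: no output ready yet; keep queueing input and polling
    } else if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        MediaFormat newFormat = decoder.getOutputFormat();
    } else if (index >= 0) {
        decoder.releaseOutputBuffer(index, true /* render to the Surface */);
        if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
            sawOutputEos = true;
        }
    }
}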