I am successfully using MediaCodec to decode audio; however, when I load a file with 24-bit samples, I have no way of knowing that this has happened. Since the application was assuming 16-bit samples, it fails.
When I print the MediaFormat, I see
{mime=audio/raw, durationUs=239000000, bits-format=6, channel-count=2, channel-mask=0, sample-rate=96000}
I assumed that "bits-format" would be the hint, but this key is not declared in the API and is not actually emitted when the output format changes. Instead I get
{mime=audio/raw, what=1869968451, channel-count=2, channel-mask=0, sample-rate=96000}
(By the way, what is the "what" key? If I interpret it as a four-character code, it reads "outC"... is it just a flag marking this as an output format?)
So what is the best recourse here? If I feed the ByteBuffer straight to an AudioTrack configured for 16-bit PCM, it plays static, of course. If I knew the bit depth, I could convert the samples myself!
I understand from other questions that you cannot dictate the output format either.
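In case it helps anyone else: the conversion itself is simple once the bit depth is known. A minimal sketch, assuming the decoder emits packed little-endian 24-bit PCM (three bytes per sample); the helper name is mine, not part of any API:

import java.nio.ByteBuffer;

// Down-converts packed little-endian 24-bit PCM to 16-bit PCM by
// dropping the least significant byte of each sample.
static ByteBuffer pcm24ToPcm16(ByteBuffer src) {
    ByteBuffer dst = ByteBuffer.allocate((src.remaining() / 3) * 2);
    while (src.remaining() >= 3) {
        src.get();            // low byte, discarded
        dst.put(src.get());   // mid byte
        dst.put(src.get());   // high byte (carries the sign)
    }
    dst.flip();
    return dst;
}

The output stays little-endian because the mid and high bytes are written in their original order, so it can be fed to an AudioTrack configured for ENCODING_PCM_16BIT.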
Related
I'm decoding an online AAC stream with MediaCodec and trying to play it with AudioTrack. AudioTrack requires an AudioFormat, which requires knowing the details of the PCM encoding. I was able to get the audio to play correctly by configuring it as ENCODING_PCM_16BIT, but is the output guaranteed to always be 16-bit PCM? I want to avoid making that assumption.
I would expect MediaCodec.Callback.onOutputFormatChanged to provide the encoding information as part of its MediaFormat parameter, as https://stackoverflow.com/a/49812393/2399236 suggests it should, but this is all the information I get out of the media format by calling toString() on it:
aac-drc-heavy-compression=1
sample-rate=48000
aac-drc-boost-level=127
aac-drc-output-loudness=-1
mime=audio/raw
channel-count=2
aac-drc-effect-type=3
aac-drc-cut-level=127
aac-encoded-target-level=-1
aac-max-output-channel_count=8
aac-target-ref-level=64
aac-drc-album-mode=0
Should I just assume that it's 16-bit PCM, or is there any way to get that information?
Note: I'm targeting min SDK 28, currently testing in the debugger on an emulated Pixel 2 (API 30).
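One approach, sketched under the assumption that the decoder fills in MediaFormat.KEY_PCM_ENCODING (the key exists since API 24, but not every codec sets it): read it when present and fall back to 16-bit, which is the conventional default for audio/raw output when the key is absent.

import android.media.AudioFormat;
import android.media.MediaFormat;

// Returns the PCM encoding the decoder reports, defaulting to 16-bit
// when the key is missing.
static int pcmEncodingOf(MediaFormat format) {
    if (format.containsKey(MediaFormat.KEY_PCM_ENCODING)) {
        return format.getInteger(MediaFormat.KEY_PCM_ENCODING);
    }
    return AudioFormat.ENCODING_PCM_16BIT;
}

The returned value can be passed to AudioFormat.Builder.setEncoding(...) when building the AudioTrack.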
I have some automated tests that decode a few m4a files to PCM data using Android's MediaCodec and MediaExtractor. The files are generated with various encoders: fdk-aac, ffmpeg (with fdk or the default aac encoder), iOS.
On Android 9 the test fails for the clips created with ffmpeg, producing empty PCM files. The same clips are decoded fine on older versions of Android.
I double-checked my code, and the decoding process goes as expected (a condensed sketch of the loop follows this list):
I extract compressed data using MediaExtractor.
I enqueue it to the codec.
I dequeue the output buffers from the codec.
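For reference, a condensed version of that loop (API 21+ buffer accessors; error handling trimmed; the file path is a placeholder):

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/path/to/clip.m4a");            // placeholder path
MediaFormat format = extractor.getTrackFormat(0);        // assume track 0 is the audio track
extractor.selectTrack(0);

MediaCodec codec = MediaCodec.createDecoderByType(
        format.getString(MediaFormat.KEY_MIME));
codec.configure(format, null, null, 0);
codec.start();

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean inputDone = false, outputDone = false;
while (!outputDone) {
    if (!inputDone) {
        int inIndex = codec.dequeueInputBuffer(10000);
        if (inIndex >= 0) {
            int size = extractor.readSampleData(codec.getInputBuffer(inIndex), 0);
            if (size < 0) {                              // no more samples: signal EOS
                codec.queueInputBuffer(inIndex, 0, 0, 0,
                        MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                inputDone = true;
            } else {
                codec.queueInputBuffer(inIndex, 0, size,
                        extractor.getSampleTime(), 0);
                extractor.advance();
            }
        }
    }
    int outIndex = codec.dequeueOutputBuffer(info, 10000);
    if (outIndex >= 0) {
        ByteBuffer pcm = codec.getOutputBuffer(outIndex);
        // ... write info.size bytes from pcm to the output file ...
        codec.releaseOutputBuffer(outIndex, false);
        outputDone = (info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0;
    }
}
codec.stop(); codec.release(); extractor.release();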
The issue is that by the time the last available input buffer is enqueued and the output buffer with MediaCodec.BUFFER_FLAG_END_OF_STREAM is dequeued, all output buffers are empty!
Then I noticed that the MediaFormat info extracted from the audio file with MediaExtractor.getTrackFormat(int track) contains an undocumented "encoder-delay" key.
On Android 8 and lower, that key is only present for m4a clips that carry the iTunSMPB tag info. Here's a summary of the values I get for my test files:
iOS-encoded file: 2112 frames
fdkaac with iTunSMPB tag: 2048 frames
fdkaac with ISO delay info: key not present
ffmpeg: key not present
ffmpeg (fdk): key not present
On Android 9, instead, I get the following results:
iOS-encoded file: 2112 frames
fdkaac with iTunSMPB tag: 2048 frames
fdkaac with ISO delay info: 2048 frames
ffmpeg: 45158 frames
ffmpeg (fdk): 90317 frames
It looks like something has changed and MediaExtractor is now able to retrieve the encoder delay for all the files under test. This is good in theory, since the files with no "encoder-delay" info do show a delay in the decoded PCM data (this was a known issue).
But... while the value for the "fdkaac with ISO delay info" case is correct and leads to a valid PCM file with no initial padding (finally!), the values for the ffmpeg-generated files look huge and likely wrong!
I know the real encoder delay values are 1024 for the ffmpeg case and 2048 for the ffmpeg (fdk) case, and I think the huge value for the key in the extracted format is the reason the files come out empty.
In fact, if I try setting the "encoder-delay" key to 0 in the format just before passing it to MediaCodec.configure(...) I get the correct uncompressed data with the expected delay.
My guess at this point is that MediaExtractor's encoder-delay retrieval has a bug, but maybe there's something I am overlooking.
Since ffmpeg is very popular, it's likely that many of my app's users will try importing files generated with it, and so far I can't see a foolproof solution to the issue.
Does anyone have a suggestion / workaround?
I opened an issue on the android issue tracker:
https://issuetracker.google.com/issues/118398811
And for now I just implemented a workaround: when the "encoder-delay" value is present in the MediaFormat object and it's an impossibly high value, I just set it to zero. Something like:
// THRESHOLD is app-defined: any "encoder-delay" far beyond a plausible
// priming delay (a few thousand frames) is treated as bogus.
if (format.containsKey("encoder-delay") && format.getInteger("encoder-delay") > THRESHOLD) {
    format.setInteger("encoder-delay", 0);
}
NB: This means the initial gap will not be trimmed away, but for m4a files that lack such info this is already the case on pre-Android-9 devices.
I searched for hours...
I just want a working decode/encode of a recorded movie.
Is this even possible on Android 4.1?
Right now it writes only a few KB to my mp4 file, with no errors.
Once this works, I will use KEY_FRAME_RATE and KEY_I_FRAME_INTERVAL to turn it into slow motion.
I used a MediaExtractor to configure the MediaCodec.
I see 3 steps (see gist for complete code):
1./
encoder.dequeueInputBuffer(5000);
extractor.readSampleData(inputBuf, offset);
ptsUsec2 = extractor.getSampleTime();
encoder.queueInputBuffer(inputBufIndex, ...);
2./
encoder.dequeueOutputBuffer(info, 5000);
ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
// I write encodedData to a FileOutputStream (to save the MP4)
decoder.queueInputBuffer(inputBufIndex, ...);
3./
decoder.dequeueOutputBuffer(info, 5000);
decoder.releaseOutputBuffer(decoderStatus, ...);
Here is the complete function I modified from Google's EncodeDecodeTest file:
gist
Thanks for the help,
Felix
Some additional information is available on bigflake. In particular, FAQ item #9.
The format of frames coming out of the MediaCodec decoder is not guaranteed to be useful. Many popular devices decode data into a proprietary YUV format, which is why the checkFrame() function in the buffer-to-buffer test can't always verify the results. You'd expect the MediaCodec encoder to be able to accept the frames output by the decoder, but that's not guaranteed.
Coding against API 18+ is generally much easier because you can work with a Surface rather than a ByteBuffer.
Of course, if all you want is slow-motion video, you don't need to decode and re-encode the H.264 stream. All you need to do is alter the presentation time stamps, which are in the .mp4 wrapper. On API 18+, you can extract with MediaExtractor and immediately encode with MediaMuxer, without involving MediaCodec at all. On API 16, MediaMuxer doesn't exist, so you'd need some other way to wrap H.264 as .mp4.
Unless, of course, you have some aversion to variable-frame-rate video, in which case you'll need to re-encode it with the "slow motion" frames repeated (and timestamps adjusted appropriately). The KEY_FRAME_RATE and KEY_I_FRAME_INTERVAL values will not help you -- they're set when the encoder is configured, and have no effect on frame timing.
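To make the remux-only route concrete, here is a sketch along those lines (API 18+). The paths and the 2x slowdown factor are placeholders, and a real implementation would also handle the audio track and errors:

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/path/in.mp4");                 // placeholder
MediaMuxer muxer = new MediaMuxer("/path/slow.mp4",      // placeholder
        MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);

int dstTrack = -1;
for (int i = 0; i < extractor.getTrackCount(); i++) {
    MediaFormat fmt = extractor.getTrackFormat(i);
    if (fmt.getString(MediaFormat.KEY_MIME).startsWith("video/")) {
        extractor.selectTrack(i);
        dstTrack = muxer.addTrack(fmt);
        break;
    }
}
muxer.start();

ByteBuffer buf = ByteBuffer.allocate(1 << 20);
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
while ((info.size = extractor.readSampleData(buf, 0)) >= 0) {
    info.offset = 0;
    info.presentationTimeUs = extractor.getSampleTime() * 2; // 2x slower
    info.flags = extractor.getSampleFlags();                 // keeps the sync-frame flag
    muxer.writeSampleData(dstTrack, buf, info);
    extractor.advance();
}
muxer.stop(); muxer.release(); extractor.release();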
I'm having a very difficult time with MediaCodec. I've used it previously to decode a raw h.264 stream and learned a significant amount. At least I thought I had.
My stream is h.264 in Annex B format. Looking at the raw data, the structure of my NAL unit types is as follows:
[0x09][0x09][0x06] and then 8 packets of [0x21].
[0x09][0x09][0x27][0x28][0x06] and then 8 packets of [0x21].
This is not how I am receiving them, though. I am attempting to build a complete Access Unit from these raw NAL unit types.
The first thing that is strange to me is the double [0x09], which is the Access Unit Delimiter packet. I am pretty sure the h.264 spec allows only one AUD per Access Unit. BTW, I am able to record the raw data and play it with ffmpeg, both with and without the extra AUD. For now, I am detecting this case and stripping the first one off before sending the entire Access Unit to the MediaCodec.
The second thing is, I have hardcoded the SPS/PPS byte arrays [0x27/0x28] and I am setting these in the MediaFormat used to initialize the MediaCodec, similar to:
format.setByteBuffer("csd-0", ByteBuffer.wrap( mySPS ));
format.setByteBuffer("csd-1", ByteBuffer.wrap( myPPS ));
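One detail worth double-checking, since the stream is Annex B: the codec-specific-data section of the MediaCodec documentation expects the csd buffers for video/avc to be NAL units that begin with the 00 00 00 01 start code. If mySPS / myPPS hold only the bare payloads, the prefix would need to be added, roughly:

byte[] startCode = {0x00, 0x00, 0x00, 0x01};
ByteBuffer csd0 = ByteBuffer.allocate(startCode.length + mySPS.length);
csd0.put(startCode).put(mySPS);
csd0.flip();
format.setByteBuffer("csd-0", csd0);
// csd-1 is built the same way from myPPS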
My video stream provider vendor tells me the video is 1280 x 720; however, when I convert it to an mp4 file, the metadata says it's 960 x 720. Another oddity.
Changing these different parameters around, I am still unable to get a valid buffer index in the thread that processes the decoder output (dequeueOutputBuffer returns -1). I have also varied the timeout, to no avail. If I manually send the SPS/PPS as the first packet, NOT using the example above, I do get the -3 "output buffers have changed" result, which is meaningless since I am using API 20. Everything else I get back is -1.
I've read about the Emulation Prevention Byte encoding of h.264. I am able to strip this byte out before sending to the MediaCodec, but it doesn't seem to make a difference. Also, the MediaCodec documentation doesn't explicitly say whether it expects the EPBs to be stripped out or not...?
Other than the video frame resolution, the only other difference from my previous success is the presence of the SEI packet type [0x06]. I'm not sure whether I should be doing something special with it.
I know a number of folks who have used MediaCodec have had issues with it, mostly because the documentation is not very good. Can anyone offer any advice as to what I could be doing wrong?
I'm using the Android MediaCodec library to decode a video stored on the file system. I get an output buffer that looks legit (with a proper BufferInfo offset and size). Its format seems to be 256 (which I took to mean JPEG). I tried decoding it with BitmapFactory.decodeByteArray, but the result was null.
Does anyone know the correct way to ascertain the format of the output buffer? What's the correct way to start decoding the output byte arrays?
The MediaCodec color formats are defined by the MediaCodecInfo.CodecCapabilities class. 256 is used internally, and generally doesn't mean that you have a buffer of JPEG data. The confusion here is likely because you're looking at constants in the ImageFormat class, but those only apply to camera output. (For example, ImageFormat.NV16 is a YCbCr format, while COLOR_Format32bitARGB8888 is RGB, but both have the numeric value 16.)
Some examples of MediaCodec usage, including links to CTS tests that exercise MediaCodec, can be found here. On some devices you will not be able to decode data from the ByteBuffer output, and must instead decode to a Surface.
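As a starting point, here is a sketch of inspecting the decoder's reported color format and, on API 21+, reading the frame through the flexible YUV Image interface instead of parsing the raw ByteBuffer; the codec and buffer-index names are placeholders:

// After dequeueOutputBuffer returns INFO_OUTPUT_FORMAT_CHANGED:
MediaFormat outFormat = codec.getOutputFormat();
int colorFormat = outFormat.getInteger(MediaFormat.KEY_COLOR_FORMAT);

// On API 21+, getOutputImage sidesteps vendor-specific YUV layouts.
Image image = codec.getOutputImage(outputBufferIndex);   // null if not a YUV buffer
if (image != null) {
    Image.Plane[] planes = image.getPlanes();            // Y, U, V planes with strides
    // ... convert the planes to RGB / a Bitmap as needed ...
    image.close();
}
codec.releaseOutputBuffer(outputBufferIndex, false);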