ffmpeg how to save decoded audio data to pcm - android

I have succeeded in decoding audio data from an mp4 using avcodec_decode_audio4. I want to save the decoded frames, so I tried the following:
if (got_frame) {
    int size;
    uint8_t *data;
    int ref = 0;
    ret = swr_convert(swr, &data, frame->nb_samples, (const uint8_t **)frame->extended_data, frame->nb_samples);
    //fwrite(data, 1, frame->nb_samples, fp_audio);
    ref++;
    int szie = av_samples_get_buffer_size(NULL, 2, 1024, AV_SAMPLE_FMT_FLTP, 1);
    for (int i = 0; i < frame->linesize[0] / 4; i++)
    {
        fwrite(frame->data[0] + 4 * i, 1, 4, fp_audio);
        fwrite(frame->data[1] + 4 * i, 1, 4, fp_audio);
        ref++;
    }
    av_frame_unref(frame);
}
but the PCM sounds strange. I also tried writing directly, as follows:
fwrite(frame->data[0], 1, frame->linesize[0], fp_audio);
or:
fwrite(frame->data[0], 1, frame->linesize[0], fp_audio);
fwrite(frame->data[1], 1, frame->linesize[0], fp_audio);
I know that the decoded PCM format is AV_SAMPLE_FMT_FLTP.
Any help would be appreciated.

FLTP is planar float, so in the case of stereo you have two buffers, data[0] and data[1], which are per-channel planes.
For things like .wav you typically want to write interleaved data, i.e. an array where each even entry is the left channel and each odd entry is the right channel. To do that, convert to FLT (without the P). Also note that .wav typically uses int16 rather than float, so for that, convert to S16.
Decoders output planar data because that's how compressed streams typically lay out their data, so for the individual decoders this makes more sense.
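To illustrate the layout change only (this is not FFmpeg API code; in the decoder you would normally let swr_convert do the conversion), here is a small Java sketch with a hypothetical helper name that interleaves two float planes and quantizes them to signed 16-bit PCM:
// Hypothetical helper, for illustration only: 'left' and 'right' correspond to
// the per-channel planes in data[0]/data[1]; the result is the L,R,L,R...
// interleaved int16 layout a .wav file expects.
static short[] planarFloatToInterleavedS16(float[] left, float[] right) {
    short[] out = new short[left.length * 2];
    for (int i = 0; i < left.length; i++) {
        // clamp to [-1, 1] and scale to the 16-bit range
        float l = Math.max(-1f, Math.min(1f, left[i]));
        float r = Math.max(-1f, Math.min(1f, right[i]));
        out[2 * i]     = (short) (l * 32767f);
        out[2 * i + 1] = (short) (r * 32767f);
    }
    return out;
}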

Related

MediaCodec - downsampled audio from 48k Hz to 44.1k Hz still plays at slower speed

So far in my quest to concatenate videos with MediaCodec I've finally managed to resample 48 kHz audio to 44.1 kHz.
I've been testing the joining with two videos: the first has an audio track in 22050 Hz, 2-channel format, the second an audio track in 24000 Hz, 1-channel format. Since my decoder outputs 44100 Hz, 2-channel raw audio for the first video and 48000 Hz, 2-channel raw audio for the second, I resampled the ByteBuffers that the second video's decoder outputs from 48000 Hz down to 44100 Hz using this method:
private byte[] minorDownsamplingFrom48kTo44k(byte[] origByteArray)
{
    int origLength = origByteArray.length;
    int moddedLength = origLength * 147/160;
    //int moddedLength = 187*36;
    int delta = origLength - moddedLength;
    byte[] resultByteArray = new byte[moddedLength];
    int arrayIndex = 0;
    for (int i = 0; i < origLength; i += 44)
    {
        for (int j = i; j < (i+40 > origLength ? origLength : i + 40); j++)
        {
            resultByteArray[arrayIndex] = origByteArray[j];
            arrayIndex++;
        }
        //Log.i("array_iter", i+" "+arrayIndex);
    }
    //smoothArray(resultByteArray, 3);
    return resultByteArray;
}
However, in the output video file, the video plays at a slower speed upon reaching the second video with the downsampled audio track. The pitch is the same and the noise is gone, but the audio samples just play slower.
My output format is actually 22050 Hz 2 channels, following the first video.
EDIT: It's as if the player still plays the audio as if it has a sample rate of 48000 Hz even after it's downsampled to 44100 Hz.
My questions:
How do I mitigate this problem? Because I don't think changing the timestamps works in this case. I just use the decoder-provided timestamps with some offset based on the first video's last timestamp.
Is the issue related to the CSD-0 ByteBuffers?
If MediaCodec has the option of changing the video bitrate on the fly, would a new feature of changing the audio sample rate or channel count on the fly be feasible?
Turns out it was something as simple as limiting the size of my ByteBuffers.
The decoder outputs 8192 bytes (2048 samples).
After downsampling, the data becomes 7524 bytes (1881 samples) - originally 7526 bytes but that amounts to 1881.5 samples, so I rounded it down.
The prime mistake was in this code where I have to bring the sample rate close to the original:
byte[] finalByteBufferContent = new byte[size / 2]; //here
for (int i = 0; i < bufferSize; i += 2) {
    if ((i + 1) * ((int) samplingFactor) > testBufferContents.length) {
        finalByteBufferContent[i] = 0;
        finalByteBufferContent[i + 1] = 0;
    } else {
        finalByteBufferContent[i] = testBufferContents[i * ((int) samplingFactor)];
        finalByteBufferContent[i + 1] = testBufferContents[i * ((int) samplingFactor) + 1];
    }
}
bufferSize = finalByteBufferContent.length;
Where size is the decoder output ByteBuffer's length and testBufferContents is the byte array I use to modify its contents (and is the one that was downsampled to 7524 bytes).
The resulting byte array's length was still 4096 bytes instead of 3762 bytes.
Changing new byte[size / 2] to new byte[testBufferContents.length / 2] resolved that problem.

Media Extractor: Decoder gives wrong Width on Android 4.2

I'm writing a plugin for Unity that decodes and extracts the frames from a video file using MediaExtractor and re-encodes them to a new video file. However, the frames are being decoded into an array of the wrong size (on Android 4.2.2) because the codec thinks the height is 736 when it is actually 720.
for (int i = 0; i < numTracks; ++i)
{
    MediaFormat format = extractor.getTrackFormat(i);
    String mime = format.getString(MediaFormat.KEY_MIME);
    if (mime.startsWith("video/"))
    {
        extractor.selectTrack(i);
        //Decoder
        decoder = MediaCodec.createDecoderByType(mime);
        decoder.configure(format, null, null, 0);
        break;
    }
}
The output buffer index returns INFO_OUTPUT_BUFFERS_CHANGED and then INFO_OUTPUT_FORMAT_CHANGED. Logging this informs me that the decoder thinks the height is 736 instead of the correct 720.
decoder.queueInputBuffer(inputBufIndex, 0, sampleSize, extractor.getSampleTime(), 0);
//Get Outputbuffer Index
int outIndex = decoder.dequeueOutputBuffer(info, 10000);
This works fine on a device running 4.4, the problem is only present on an older 4.2 device. Anyone have any thoughts?
Keep in mind that you need to check the crop fields in MediaFormat as well; the height field is the full height of the output buffer, including potential padding. See e.g. the checkFrame function in https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java. You get the actual content height as format.getInteger("crop-bottom") - format.getInteger("crop-top") + 1.
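For example, a sketch of reading the crop rectangle when the decoder reports INFO_OUTPUT_FORMAT_CHANGED (reusing the decoder and info variables from the question; the crop entries are string keys on these API levels and may be absent on some devices):
int outIndex = decoder.dequeueOutputBuffer(info, 10000);
if (outIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
    MediaFormat outFormat = decoder.getOutputFormat();
    // Buffer dimensions, possibly padded (e.g. 736 instead of 720).
    int width = outFormat.getInteger(MediaFormat.KEY_WIDTH);
    int height = outFormat.getInteger(MediaFormat.KEY_HEIGHT);
    // Actual content dimensions, taken from the crop rectangle when present.
    if (outFormat.containsKey("crop-left") && outFormat.containsKey("crop-right")) {
        width = outFormat.getInteger("crop-right") - outFormat.getInteger("crop-left") + 1;
    }
    if (outFormat.containsKey("crop-top") && outFormat.containsKey("crop-bottom")) {
        height = outFormat.getInteger("crop-bottom") - outFormat.getInteger("crop-top") + 1;
    }
    Log.i("Decoder", "content size " + width + "x" + height);
}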

Android - get duration of AMR audio file programmatically

How do I get duration of an AMR file?
mRecorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
mRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
I want to get the duration of the file after the recording is stopped WITHOUT creating a MediaPlayer and getting the duration from it. For a regular WAV file I simply do:
fileLength / byteRate
but for AMR I don't know the byte rate, and I'm not sure this approach works anyway, since WAV is raw (uncompressed) PCM and AMR is compressed.
Maybe the 3GP container contains information about the content length? The 3GPP file format spec is available if you want to read it.
For a raw .amr file you'd have to traverse all the frames to find the length of the audio, since each frame can be encoded with a different bitrate.
The process for doing this would be:
Skip the first 6 bytes of the file (the AMR signature).
The rest of the file will be audio frames, each of which starts with a one-byte header. Read that byte and look at bits 3..6 (the codec mode). For AMR-NB the valid codec modes are 0..7, which you can map to the size of the frame in bytes using the table below.
Once you know the size of the current frame, skip past it and parse the next frame. Repeat until you reach the end of the file.
Each frame holds 20 ms of audio, so once you've counted the number of frames in the file you can multiply that number by 20 to get the length of the audio in milliseconds.
Frame size table:
Codec mode   Frame size (bytes)
-------------------------------
     0              13
     1              14
     2              16
     3              18
     4              20
     5              21
     6              27
     7              32
(Source)
Java Code:
https://blog.csdn.net/fjh658/article/details/12869073
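In case that link becomes unavailable, here is a minimal Java sketch of the same frame-walking idea (the class and method names are arbitrary; it assumes a raw .amr file starting with the 6-byte "#!AMR\n" signature and containing AMR-NB frames):
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class AmrDuration {
    // Total frame sizes in bytes (header byte included) for codec modes 0..7,
    // followed by the SID mode and reserved modes.
    private static final int[] FRAME_SIZES = { 13, 14, 16, 18, 20, 21, 27, 32, 6, 1, 1, 1, 1, 1, 1, 1 };

    public static long getDurationMs(File file) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            raf.seek(6);                          // skip the "#!AMR\n" signature
            long frames = 0;
            int header;
            while ((header = raf.read()) != -1) {
                int mode = (header >> 3) & 0x0F;  // bits 3..6 hold the codec mode
                raf.skipBytes(FRAME_SIZES[mode] - 1);
                frames++;
            }
            return frames * 20;                   // each AMR frame is 20 ms of audio
        } finally {
            raf.close();
        }
    }
}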
C# Code (From MemoryStream):
private double getAmrDuration(MemoryStream originalAudio)
{
    double duration = -1;
    int[] packedSize = new int[] { 12, 13, 15, 17, 19, 20, 26, 31, 5, 0, 0, 0, 0, 0, 0, 0 };
    long length = originalAudio.Length;
    int pos = 6;
    int frameCount = 0;
    int packedPos = -1;
    byte[] datas = new byte[1];
    while (pos <= length)
    {
        originalAudio.Seek(pos, SeekOrigin.Begin);
        if (originalAudio.Read(datas, 0, 1) != 1)
        {
            duration = length > 0 ? ((length - 6) / 650) : 0;
            break;
        }
        packedPos = (datas[0] >> 3) & 0x0F;
        pos += packedSize[packedPos] + 1;
        frameCount++;
    }
    duration = duration + (frameCount * 20);
    return duration / 1000;
}

Create video from screen grabs in android

I would like to record user interaction in a video that people can then upload to their social media sites.
For example, the Talking Tom Cat android app has a little camcorder icon. The user can press the camcorder icon, then interact with the app, press the icon to stop the recording and then the video is processed/converted ready for upload.
I think I can use setDrawingCacheEnabled(true) to save images but don't know how to add audio or make a video.
Update: After further reading I think I will need to use the NDK and ffmpeg. I prefer not to do this, but, if there are no other options, does anyone know how to do this?
Does anyone know how to do this in Android?
Relevant links...
Android Screen capturing or make video from images
how to record screen video as like Talking Tomcat application does in iphone?
Use the MediaCodec API with CONFIGURE_FLAG_ENCODE to set it up as an encoder. No ffmpeg required :)
You've already found how to grab the screen in the other question you linked to, now you just need to feed each captured frame to MediaCodec, setting the appropriate format flags, timestamp, etc.
EDIT: Sample code for this was hard to find, but here it is, hat tip to Martin Storsjö. Quick API walkthrough:
MediaFormat inputFormat = MediaFormat.createVideoFormat("video/avc", width, height);
inputFormat.setInteger(MediaFormat.KEY_BIT_RATE, bitRate);
inputFormat.setInteger(MediaFormat.KEY_FRAME_RATE, frameRate);
inputFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, colorFormat);
inputFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 75);
inputFormat.setInteger("stride", stride);
inputFormat.setInteger("slice-height", sliceHeight);

encoder = MediaCodec.createByCodecName("OMX.TI.DUCATI1.VIDEO.H264E"); // need to find the name in the media codec list, it is chipset-specific

encoder.configure(inputFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
encoder.start();
encoderInputBuffers = encoder.getInputBuffers();
encoderOutputBuffers = encoder.getOutputBuffers();

byte[] inputFrame = new byte[frameSize];

while ( ... have data ... ) {
    int inputBufIndex = encoder.dequeueInputBuffer(timeout);
    if (inputBufIndex >= 0) {
        ByteBuffer inputBuf = encoderInputBuffers[inputBufIndex];
        inputBuf.clear();

        // HERE: fill in input frame in correct color format, taking strides into account
        // This is an example for I420
        for (int i = 0; i < width; i++) {
            for (int j = 0; j < height; j++) {
                inputFrame[ i * stride + j ] = ...;                                // Y[i][j]
                inputFrame[ i * stride/2 + j/2 + stride * sliceHeight ] = ...;     // U[i][j]
                inputFrame[ i * stride/2 + j/2 + stride * sliceHeight * 5/4 ] = ...; // V[i][j]
            }
        }

        inputBuf.put(inputFrame);

        encoder.queueInputBuffer(
            inputBufIndex,
            0 /* offset */,
            sampleSize,
            presentationTimeUs,
            0);
    }

    int outputBufIndex = encoder.dequeueOutputBuffer(info, timeout);
    if (outputBufIndex >= 0) {
        ByteBuffer outputBuf = encoderOutputBuffers[outputBufIndex];

        // HERE: read the encoded data

        encoder.releaseOutputBuffer(
            outputBufIndex,
            false);
    }
    else {
        // Handle change of buffers, format, etc
    }
}
There are also some open issues.
EDIT: You'd feed the data in as a byte buffer in one of the supported pixel formats, for example I420 or NV12. There is unfortunately no perfect way of determining which formats would work on a particular device; however it is typical for the same formats you can get from the camera to work with the encoder.

Encoding H.264 from camera with Android MediaCodec

I'm trying to get this to work on Android 4.1 (using an upgraded Asus Transformer tablet). Thanks to Alex's response to my previous question, I was already able to write some raw H.264 data to a file, but the file is only playable with ffplay -f h264, and it seems to have lost all information regarding the framerate (extremely fast playback). Also, the color space looks incorrect (at the moment I'm using the camera's default on the encoder side).
public class AvcEncoder {

    private MediaCodec mediaCodec;
    private BufferedOutputStream outputStream;

    public AvcEncoder() {
        File f = new File(Environment.getExternalStorageDirectory(), "Download/video_encoded.264");
        touch(f);
        try {
            outputStream = new BufferedOutputStream(new FileOutputStream(f));
            Log.i("AvcEncoder", "outputStream initialized");
        } catch (Exception e) {
            e.printStackTrace();
        }

        mediaCodec = MediaCodec.createEncoderByType("video/avc");
        MediaFormat mediaFormat = MediaFormat.createVideoFormat("video/avc", 320, 240);
        mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 125000);
        mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 15);
        mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar);
        mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);
        mediaCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        mediaCodec.start();
    }

    public void close() {
        try {
            mediaCodec.stop();
            mediaCodec.release();
            outputStream.flush();
            outputStream.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // called from Camera.setPreviewCallbackWithBuffer(...) in other class
    public void offerEncoder(byte[] input) {
        try {
            ByteBuffer[] inputBuffers = mediaCodec.getInputBuffers();
            ByteBuffer[] outputBuffers = mediaCodec.getOutputBuffers();

            int inputBufferIndex = mediaCodec.dequeueInputBuffer(-1);
            if (inputBufferIndex >= 0) {
                ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
                inputBuffer.clear();
                inputBuffer.put(input);
                mediaCodec.queueInputBuffer(inputBufferIndex, 0, input.length, 0, 0);
            }

            MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
            int outputBufferIndex = mediaCodec.dequeueOutputBuffer(bufferInfo, 0);
            while (outputBufferIndex >= 0) {
                ByteBuffer outputBuffer = outputBuffers[outputBufferIndex];
                byte[] outData = new byte[bufferInfo.size];
                outputBuffer.get(outData);
                outputStream.write(outData, 0, outData.length);
                Log.i("AvcEncoder", outData.length + " bytes written");

                mediaCodec.releaseOutputBuffer(outputBufferIndex, false);
                outputBufferIndex = mediaCodec.dequeueOutputBuffer(bufferInfo, 0);
            }
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
Changing the encoder type to "video/mp4" apparently solves the framerate problem, but since the main goal is to make a streaming service, this is not a good solution.
I'm aware that I dropped some of Alex's code concerning the SPS and PPS NALUs, but I was hoping this would not be necessary since that information was also coming from outData, and I assumed the encoder would format it correctly. If this is not the case, how should I arrange the different types of NALUs in my file/stream?
So, what am I missing here in order to make a valid, working H.264 stream? And which settings should I use to make a match between the camera's colorspace and the encoder's colorspace?
I have a feeling this is more of an H.264-related question than an Android/MediaCodec topic. Or am I still not using the MediaCodec API correctly?
Thanks in advance.
For your fast playback (frame rate) issue, there is nothing you have to do in the encoder. Since this is a streaming solution, the other side has to be told the frame rate in advance, or be given timestamps with each frame; neither is part of the elementary stream. Either a pre-determined frame rate is agreed upon, or you pass it via something like SDP, or you use an existing protocol such as RTSP. In the latter case the timestamps are carried by the transport (e.g. RTP), and the client has to depayload the RTP stream and play it back. This is how elementary streaming works: either fix your frame rate (if you have a fixed-rate encoder) or provide timestamps.
Local PC playback will be fast because the player does not know the FPS. By giving the fps parameter before the input, e.g.
ffplay -fps 30 in.264
you can control the playback on the PC.
As for the file not being playable: does it have an SPS and PPS? You also need NAL start codes enabled (Annex B format). I don't know much about Android, but this is a requirement for any H.264 elementary stream to be playable when it is not in a container and needs to be dumped and played later.
If Android's default output is MP4, Annex B headers will be switched off by default, so perhaps there is a switch to enable them. Or, if you are getting the data frame by frame, just add them yourself.
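On the MediaCodec side, the SPS and PPS typically arrive in the first output buffer, flagged with BUFFER_FLAG_CODEC_CONFIG and already carrying Annex B start codes. A rough sketch of handling that inside the question's offerEncoder() drain loop (spsPpsHeader is an assumed field, not part of the original code):
byte[] outData = new byte[bufferInfo.size];
outputBuffer.get(outData);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
    // Codec config buffer: keep the SPS/PPS and write them once at the start of the stream.
    spsPpsHeader = outData;
    outputStream.write(outData, 0, outData.length);
} else {
    if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_SYNC_FRAME) != 0 && spsPpsHeader != null) {
        // Optionally repeat the config before every keyframe so a client can join mid-stream.
        outputStream.write(spsPpsHeader, 0, spsPpsHeader.length);
    }
    outputStream.write(outData, 0, outData.length);
}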
As for the color format: I would guess the default should work, so try not setting it. If not, try 4:2:2 planar or UYVY/VYUY interleaved formats; cameras usually output one of those (not necessarily, but these are the ones I have encountered most often).
Android 4.3 (API 18) provides an easy solution. The MediaCodec class now accepts input from Surfaces, which means you can connect the camera's Surface preview to the encoder and bypass all the weird YUV format issues.
There is also a new MediaMuxer class that will convert your raw H.264 stream to a .mp4 file (optionally blending in an audio stream).
See the CameraToMpegTest source for an example of doing exactly this. (It also demonstrates the use of an OpenGL ES fragment shader to perform a trivial edit on the video as it's being recorded.)
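A minimal sketch of the MediaMuxer route (API 18+), following the question's variable names; outputPath is assumed, and the MediaMuxer constructor can throw IOException:
MediaMuxer muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
int videoTrack = -1;

// ...inside the drain loop, instead of writing raw NAL units to a file:
int outputBufferIndex = mediaCodec.dequeueOutputBuffer(bufferInfo, 0);
while (outputBufferIndex != MediaCodec.INFO_TRY_AGAIN_LATER) {
    if (outputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        // The muxer takes SPS/PPS from this format, so no manual NAL handling is needed.
        videoTrack = muxer.addTrack(mediaCodec.getOutputFormat());
        muxer.start();
    } else if (outputBufferIndex >= 0) {
        if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) == 0 && videoTrack >= 0) {
            // bufferInfo.presentationTimeUs must carry real timestamps (set them in
            // queueInputBuffer) for the resulting .mp4 to play at the right speed.
            muxer.writeSampleData(videoTrack, outputBuffers[outputBufferIndex], bufferInfo);
        }
        mediaCodec.releaseOutputBuffer(outputBufferIndex, false);
    }
    // (INFO_OUTPUT_BUFFERS_CHANGED would require refreshing outputBuffers; omitted here.)
    outputBufferIndex = mediaCodec.dequeueOutputBuffer(bufferInfo, 0);
}

// When encoding is finished:
muxer.stop();
muxer.release();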
You can convert color spaces like this, if you have set the preview color space to YV12:
public static byte[] YV12toYUV420PackedSemiPlanar(final byte[] input, final byte[] output, final int width, final int height) {
    /*
     * COLOR_TI_FormatYUV420PackedSemiPlanar is NV12.
     * We convert by putting the corresponding U and V bytes together (interleaved).
     */
    final int frameSize = width * height;
    final int qFrameSize = frameSize / 4;

    System.arraycopy(input, 0, output, 0, frameSize); // Y

    for (int i = 0; i < qFrameSize; i++) {
        output[frameSize + i*2] = input[frameSize + i + qFrameSize];   // Cb (U)
        output[frameSize + i*2 + 1] = input[frameSize + i];            // Cr (V)
    }
    return output;
}
Or
public static byte[] YV12toYUV420Planar(byte[] input, byte[] output, int width, int height) {
    /*
     * COLOR_FormatYUV420Planar is I420, which is like YV12 but with U and V reversed.
     * So we just have to swap U and V.
     */
    final int frameSize = width * height;
    final int qFrameSize = frameSize / 4;

    System.arraycopy(input, 0, output, 0, frameSize);                                  // Y
    System.arraycopy(input, frameSize, output, frameSize + qFrameSize, qFrameSize);    // Cr (V)
    System.arraycopy(input, frameSize + qFrameSize, output, frameSize, qFrameSize);    // Cb (U)
    return output;
}
You can query the MediaCodec for the color formats it supports and compare them with what your camera preview can deliver.
The problem is that some MediaCodecs only support proprietary packed YUV formats that you can't get from the preview, particularly 2130706688 = 0x7F000100 = COLOR_TI_FormatYUV420PackedSemiPlanar.
The default preview format is 17 = ImageFormat.NV21, a YCrCb 420 semi-planar layout. Note that the same numeric value also happens to be MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV411Planar, which is a different format, so the preview constant cannot be passed to the codec as-is.
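A sketch of enumerating what an AVC encoder actually advertises (using the pre-API-21 MediaCodecList calls to match the rest of this thread), so you can compare the list against Camera.Parameters.getSupportedPreviewFormats():
for (int i = 0; i < MediaCodecList.getCodecCount(); i++) {
    MediaCodecInfo codecInfo = MediaCodecList.getCodecInfoAt(i);
    if (!codecInfo.isEncoder()) {
        continue;
    }
    for (String type : codecInfo.getSupportedTypes()) {
        if (type.equalsIgnoreCase("video/avc")) {
            MediaCodecInfo.CodecCapabilities caps = codecInfo.getCapabilitiesForType(type);
            for (int colorFormat : caps.colorFormats) {
                // e.g. 19 = COLOR_FormatYUV420Planar, 21 = COLOR_FormatYUV420SemiPlanar,
                // 2130706688 = COLOR_TI_FormatYUV420PackedSemiPlanar
                Log.i("CodecQuery", codecInfo.getName() + " supports color format " + colorFormat);
            }
        }
    }
}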
If you did not explicitly request another pixel format, the camera preview buffers will arrive in a YUV 420 format known as NV21, for which COLOR_FormatYCrYCb is the MediaCodec equivalent.
Unfortunately, as other answers on this page mention, there is no guarantee that on your device, the AVC encoder supports this format. Note that there exist some strange devices that do not support NV21, but I don't know any that can be upgraded to API 16 (hence, have MediaCodec).
Google documentation also claims that YV12 planar YUV must be supported as camera preview format for all devices with API >= 12. Therefore, it may be useful to try it (the MediaCodec equivalent is COLOR_FormatYUV420Planar which you use in your code snippet).
Update: as Andrew Cottrell reminded me, YV12 still needs chroma swapping to become COLOR_FormatYUV420Planar.
