I am developing a WebRTC-based video chat app. The video call itself works, but I want to record the remote video stream using VideoFileRenderer. There are many implementations of the interface; this is the one I am using: https://chromium.googlesource.com/external/webrtc/+/master/sdk/android/api/org/webrtc/VideoFileRenderer.java
It saves the video to a file without problems, but the file is .y4m, not .mp4, so I can only play it on a desktop that has the right codec. When I try to play it with VideoView, it says that it can't play the video, and the video player that ships with Android can't play it either. I can only play it with MXPlayer, VLC, or another application that bundles its own codecs.
to simplify the question:
How can I play video.y4m on native android VideoView?
To simplify further: assume that I don't understand the format of the recorded file. Here is the code I am using to record it:
When start recording:
remoteVideoFileRenderer = new VideoFileRenderer(
When finish recording:
Now the question again: I have a "fileToRecordTo", and this video file can be played in GOM (Windows), VLC (Windows, Mac and Android) and MXPlayer (Android), but I can play it neither with the player that ships with Android (if that worked, I would use it in my app) nor in the native Android VideoView.
Any help would be appreciated.
Video only recording
I had a similar case in my project. At first, I tried WebRTC's default VideoFileRenderer but the video size was too big because no compression is applied.
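To put rough numbers on the size problem: a .y4m file stores every frame as raw I420 pixels (12 bits per pixel), so it grows with resolution and duration regardless of content. The sketch below is only an estimate under stated assumptions: the class and method names are made up for illustration, the 6-byte FRAME marker per frame reflects my understanding of the Y4M layout, the file header is ignored, and the H.264 figure simply assumes the encoder hits its target bit rate.

```java
/** Hypothetical helper (not part of WebRTC): estimates recording sizes. */
final class Y4mSizeEstimator {
    // Assumption: each Y4M frame is preceded by the ASCII marker "FRAME\n".
    private static final int FRAME_MARKER_BYTES = "FRAME\n".length();

    /** Bytes of one I420 frame: a full-resolution Y plane plus two half-resolution chroma planes. */
    static long i420FrameBytes(int width, int height) {
        long luma = (long) width * height;
        long chromaWidth = (width + 1) / 2;
        long chromaHeight = (height + 1) / 2;
        return luma + 2 * chromaWidth * chromaHeight;
    }

    /** Approximate size of an uncompressed .y4m recording, header not counted. */
    static long y4mBytes(int width, int height, int fps, int seconds) {
        long frames = (long) fps * seconds;
        return frames * (i420FrameBytes(width, height) + FRAME_MARKER_BYTES);
    }

    /** Approximate size of an encoded stream that holds its target bit rate (bits per second). */
    static long encodedBytes(int bitRate, int seconds) {
        return (long) bitRate * seconds / 8;
    }
}
```

For 1280x720 at 30 fps, one minute comes out at roughly 2.5 GB raw versus about 45 MB at 6 Mbit/s, which is why encoding before writing matters.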
I found this repository. It really helped in my case.
Here is a step by step guide. I've also made some adjustments.
Add this class to your project. It has lots of options to configure the final video format.
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.os.Handler;
import android.os.HandlerThread;
import android.util.Log;
import android.view.Surface;
import org.webrtc.EglBase;
import org.webrtc.GlRectDrawer;
import org.webrtc.VideoFrame;
import org.webrtc.VideoFrameDrawer;
import org.webrtc.VideoSink;
import java.io.IOException;
import java.nio.ByteBuffer;
class FileEncoder implements VideoSink {
private static final String TAG = "FileRenderer";
private final HandlerThread renderThread;
private final Handler renderThreadHandler;
private int outputFileWidth = -1;
private int outputFileHeight = -1;
private ByteBuffer[] encoderOutputBuffers;
private EglBase eglBase;
private EglBase.Context sharedContext;
private VideoFrameDrawer frameDrawer;
private static final String MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
private static final int FRAME_RATE = 30; // 30fps
private static final int IFRAME_INTERVAL = 5; // 5 seconds between I-frames
private MediaMuxer mediaMuxer;
private MediaCodec encoder;
private MediaCodec.BufferInfo bufferInfo;
private int trackIndex = -1;
private boolean isRunning = true;
private GlRectDrawer drawer;
private Surface surface;
FileEncoder(String outputFile, final EglBase.Context sharedContext) throws IOException {
renderThread = new HandlerThread(TAG + "RenderThread");
renderThread.start(); // getLooper() returns null until the thread has been started
renderThreadHandler = new Handler(renderThread.getLooper());
bufferInfo = new MediaCodec.BufferInfo();
this.sharedContext = sharedContext;
mediaMuxer = new MediaMuxer(outputFile,
private void initVideoEncoder() {
MediaFormat format = MediaFormat.createVideoFormat(MIME_TYPE, 1280, 720);
format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);
try {
encoder = MediaCodec.createEncoderByType(MIME_TYPE);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
renderThreadHandler.post(() -> {
eglBase = EglBase.create(sharedContext, EglBase.CONFIG_RECORDABLE);
surface = encoder.createInputSurface();
drawer = new GlRectDrawer();
} catch (Exception e) {
Log.wtf(TAG, e);
public void onFrame(VideoFrame frame) {
if (outputFileWidth == -1) {
outputFileWidth = frame.getRotatedWidth();
outputFileHeight = frame.getRotatedHeight();
renderThreadHandler.post(() -> renderFrameOnRenderThread(frame));
private void renderFrameOnRenderThread(VideoFrame frame) {
if (frameDrawer == null) {
frameDrawer = new VideoFrameDrawer();
frameDrawer.drawFrame(frame, drawer, null, 0, 0, outputFileWidth, outputFileHeight);
// Release all resources. All already posted frames will be rendered first.
void release() {
isRunning = false;
renderThreadHandler.post(() -> {
if (encoder != null) {
private boolean encoderStarted = false;
private volatile boolean muxerStarted = false;
private long videoFrameStart = 0;
private void drainEncoder() {
if (!encoderStarted) {
encoderOutputBuffers = encoder.getOutputBuffers();
encoderStarted = true;
while (true) {
int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, 10000);
if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not expected for an encoder
encoderOutputBuffers = encoder.getOutputBuffers();
Log.e(TAG, "encoder output buffers changed");
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
// expected once, before the first encoded buffer: the muxer track is added here
MediaFormat newFormat = encoder.getOutputFormat();
Log.e(TAG, "encoder output format changed: " + newFormat);
trackIndex = mediaMuxer.addTrack(newFormat);
if (!muxerStarted) {
muxerStarted = true;
if (!muxerStarted)
} else if (encoderStatus < 0) {
Log.e(TAG, "unexpected result from encoder.dequeueOutputBuffer: " + encoderStatus);
} else { // encoderStatus >= 0
try {
ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
if (encodedData == null) {
Log.e(TAG, "encoderOutputBuffer " + encoderStatus + " was null");
// It's usually necessary to adjust the ByteBuffer values to match BufferInfo.
encodedData.limit(bufferInfo.offset + bufferInfo.size);
if (videoFrameStart == 0 && bufferInfo.presentationTimeUs != 0) {
videoFrameStart = bufferInfo.presentationTimeUs;
bufferInfo.presentationTimeUs -= videoFrameStart;
if (muxerStarted)
mediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo);
isRunning = isRunning && (bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == 0;
encoder.releaseOutputBuffer(encoderStatus, false);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
} catch (Exception e) {
Log.wtf(TAG, e);
private long presTime = 0L;
Now on your Activity/Fragment class
Declare a variable of the above class
FileEncoder recording;
When you receive the stream you want to record(remote or local) you can initialize the recording.
recording = new FileEncoder("path/to/video", rootEglBase.getEglBaseContext());
When the call session ends, you need to stop and release the recording by calling recording.release().
This is enough to record the video but without audio.
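One detail in drainEncoder() above that is easy to miss: the first non-zero presentation timestamp is remembered and subtracted from every later one, so the recorded file starts at (about) zero instead of at whatever clock value the first frame carried. A minimal standalone sketch of just that arithmetic, with a hypothetical class name and no Android dependencies:

```java
/** Hypothetical helper mirroring the videoFrameStart logic in drainEncoder(). */
final class TimestampRebaser {
    // 0 means "no start timestamp captured yet", as in the class above.
    private long firstTimestampUs = 0;

    /** Returns the timestamp relative to the first non-zero timestamp seen. */
    long rebase(long presentationTimeUs) {
        if (firstTimestampUs == 0 && presentationTimeUs != 0) {
            firstTimestampUs = presentationTimeUs;
        }
        return presentationTimeUs - firstTimestampUs;
    }
}
```

A buffer that arrives with timestamp 0 before any real frame (for example codec configuration data) passes through unchanged, because nothing has been captured yet.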
Video & Audio recording
To record the local peer's audio you need to use this class (https://webrtc.googlesource.com/src/+/master/examples/androidapp/src/org/appspot/apprtc/RecordedAudioToFileController.java). But first you need to set up an AudioDeviceModule object:
AudioDeviceModule adm = createJavaAudioDevice();
peerConnectionFactory = PeerConnectionFactory.builder()
private AudioDeviceModule createJavaAudioDevice() {
//Implement AudioRecordErrorCallback
//Implement AudioTrackErrorCallback
return JavaAudioDeviceModule.builder(this)
//Default audio source is Voice Communication which is good for VoIP sessions. You can change to the audio source you want.
Merge audio and video
Add this dependency
implementation 'com.googlecode.mp4parser:isoparser:1.1.22'
Then add this piece to your code when your call finishes. Make sure that video and audio recording are stopped and released properly.
try {
Movie video;
video = MovieCreator.build("path/to/recorded/video");
Movie audio;
audio = MovieCreator.build("path/to/recorded/audio");
Track audioTrack = audio.getTracks().get(0);
Container out = new DefaultMp4Builder().build(video);
FileChannel fc = new FileOutputStream(new File("path/to/final/output")).getChannel();
} catch (IOException e) {
I know this isn't the best solution for recording audio and video in an Android WebRTC video call. If someone knows how to extract audio using WebRTC please add a comment.
I am working on WebRTC. My task is to record audio and video before the call connects. I have recorded only video using the VideoFileRenderer class. How can I record audio and mux it with the video using MediaMuxer?
This is the VideoFileRender class which I am using for recording and muxing.
package org.webrtc;
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.media.MediaRecorder;
import android.os.Handler;
import android.os.HandlerThread;
import android.util.Log;
import android.view.Surface;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule.SamplesReadyCallback;
import java.io.IOException;
import java.nio.ByteBuffer;
public class VideoFileRender implements VideoSink {
private static final String TAG = "VideoFileRender";
private final HandlerThread renderThread;
private final Handler renderThreadHandler;
private final HandlerThread audioHandlerThread;
private final Handler audioThreadHandler;
private int outputFileWidth = -1;
private int outputFileHeight = -1;
private ByteBuffer[] encoderOutputBuffers;
private ByteBuffer[] audioInputBuffers;
private ByteBuffer[] audioOutputBuffers;
private EglBase eglBase;
private EglBase.Context sharedContext;
private VideoFrameDrawer frameDrawer;
// TODO: these ought to be configurable as well
private static final String MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
private static final int FRAME_RATE = 30; // 30fps
private static final int FRAME_INTERVAL = 5; // 5 seconds between I-frames
private MediaMuxer mediaMuxer;
private MediaCodec encoder;
private MediaCodec.BufferInfo bufferInfo, audioBufferInfo;
private int trackIndex = -1;
private int audioTrackIndex;
private boolean isRunning = true;
private GlRectDrawer drawer;
private Surface surface;
private MediaCodec audioEncoder;
private AudioThread audioThread;
// private AudioThread audioThread;
private static final int SAMPLE_RATE = 44100; // 44.1 kHz is the only setting guaranteed to be available on all devices.
private static final int BIT_RATE = 64000;
public static final int SAMPLES_PER_FRAME = 1024; // AAC, bytes/frame/channel
public static final int FRAMES_PER_BUFFER = 25; // AAC, frame/buffer/sec
public VideoFileRender(String outputFile, final EglBase.Context sharedContext, boolean withAudio) throws IOException {
renderThread = new HandlerThread(TAG + "RenderThread");
renderThread.start(); // getLooper() returns null until the thread has been started
renderThreadHandler = new Handler(renderThread.getLooper());
// if (withAudio) {
audioHandlerThread = new HandlerThread(TAG + "AudioThread");
audioHandlerThread.start();
audioThreadHandler = new Handler(audioHandlerThread.getLooper());
/*} else {
audioThread = null;
audioThreadHandler = null;
bufferInfo = new MediaCodec.BufferInfo();
this.sharedContext = sharedContext;
// Create a MediaMuxer. We can't add the video track and start() the muxer here,
// because our MediaFormat doesn't have the Magic Goodies. These can only be
// obtained from the encoder after it has started processing data.
mediaMuxer = new MediaMuxer(outputFile,
audioTrackIndex = withAudio ? -1 : 0;
audioThread = new AudioThread();
private void initVideoEncoder() {
MediaFormat format = MediaFormat.createVideoFormat(MIME_TYPE, outputFileWidth, outputFileHeight);
// Set some properties. Failing to specify some of these can cause the MediaCodec
// configure() call to throw an unhelpful exception.
format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, FRAME_INTERVAL);
// Create a MediaCodec encoder and configure it with our format. Get a Surface
// we can use for input and wrap it with a class that handles the EGL work.
try {
encoder = MediaCodec.createEncoderByType(MIME_TYPE);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
renderThreadHandler.post(() -> {
eglBase = EglBase.create(sharedContext, EglBase.CONFIG_RECORDABLE);
surface = encoder.createInputSurface();
drawer = new GlRectDrawer();
} catch (Exception e) {
Log.wtf(TAG, e);
public void onFrame(VideoFrame frame) {
if (outputFileWidth == -1) {
outputFileWidth = frame.getRotatedWidth();
outputFileHeight = frame.getRotatedHeight();
Log.d(TAG, "onFrame: " + outputFileWidth + " " + outputFileHeight);
renderThreadHandler.post(() -> renderFrameOnRenderThread(frame));
private void renderFrameOnRenderThread(VideoFrame frame) {
if (frameDrawer == null) {
frameDrawer = new VideoFrameDrawer();
frameDrawer.drawFrame(frame, drawer, null, 0, 0, outputFileWidth, outputFileHeight);
// Release all resources. All already posted frames will be rendered first.
public void release() {
isRunning = false;
if (audioThreadHandler != null)
audioThreadHandler.post(() -> {
if (audioEncoder != null) {
audioThread = null;
renderThreadHandler.post(() -> {
muxerStarted = false;
if (encoder != null) {
private boolean encoderStarted = false;
private volatile boolean muxerStarted = false;
private long videoFrameStart = 0;
private void drainEncoder() {
if (!encoderStarted) {
encoderOutputBuffers = encoder.getOutputBuffers();
encoderStarted = true;
while (true) {
int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, 10000);
if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not expected for an encoder
encoderOutputBuffers = encoder.getOutputBuffers();
Log.e(TAG, "encoder output buffers changed");
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
// expected once, before the first encoded buffer: the muxer track is added here
MediaFormat newFormat = encoder.getOutputFormat();
Log.e(TAG, "encoder output format changed: " + newFormat);
trackIndex = mediaMuxer.addTrack(newFormat);
if (trackIndex != -1 && !muxerStarted) {
muxerStarted = true;
if (!muxerStarted)
} else if (encoderStatus < 0) {
Log.e(TAG, "unexpected result from encoder.dequeueOutputBuffer: " + encoderStatus);
} else { // encoderStatus >= 0
try {
ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
if (encodedData == null) {
Log.e(TAG, "encoderOutputBuffer " + encoderStatus + " was null");
// It's usually necessary to adjust the ByteBuffer values to match BufferInfo.
encodedData.limit(bufferInfo.offset + bufferInfo.size);
if (videoFrameStart == 0 && bufferInfo.presentationTimeUs != 0) {
videoFrameStart = bufferInfo.presentationTimeUs;
bufferInfo.presentationTimeUs -= videoFrameStart;
if (muxerStarted)
mediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo);
isRunning = isRunning && (bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == 0;
encoder.releaseOutputBuffer(encoderStatus, false);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
} catch (Exception e) {
Log.wtf(TAG, e);
private long presTime = 0L;
private class AudioThread extends Thread {
public void run() {
if (audioEncoder == null) {
try {
final MediaFormat audioFormat = MediaFormat.createAudioFormat(MIME_TYPE, SAMPLE_RATE, 1);
audioFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
audioFormat.setInteger(MediaFormat.KEY_CHANNEL_MASK, AudioFormat.CHANNEL_IN_MONO);
audioFormat.setInteger(MediaFormat.KEY_BIT_RATE, BIT_RATE);
audioFormat.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
audioEncoder = MediaCodec.createEncoderByType(MIME_TYPE);
audioEncoder.configure(audioFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
audioInputBuffers = audioEncoder.getInputBuffers();
audioOutputBuffers = audioEncoder.getOutputBuffers();
} catch (Exception e) {
try {
final int min_buffer_size = AudioRecord.getMinBufferSize(
if (buffer_size < min_buffer_size)
buffer_size = ((min_buffer_size / SAMPLES_PER_FRAME) + 1) * SAMPLES_PER_FRAME * 2;
AudioRecord audioRecord = new AudioRecord(
MediaRecorder.AudioSource.MIC, SAMPLE_RATE,
AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, buffer_size);
if (audioRecord.getState() != AudioRecord.STATE_INITIALIZED)
audioRecord = null;
if (audioRecord != null) {
final ByteBuffer buf = ByteBuffer.allocateDirect(SAMPLES_PER_FRAME);
int readBytes;
for (; isRunning; ) {
// read audio data from internal mic
readBytes = audioRecord.read(buf, SAMPLES_PER_FRAME);
if (readBytes > 0) {
// set audio data to encoder
encode(buf, readBytes, getPTSUs());
} catch (Exception e) {
private void encode(ByteBuffer buffer, int length, long presentationTimeUs) {
final ByteBuffer[] inputBuffers = audioEncoder.getInputBuffers();
final int inputBufferIndex = audioEncoder.dequeueInputBuffer(10000);
if (inputBufferIndex >= 0) {
final ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
if (buffer != null) {
// if (DEBUG) Log.v(TAG, "encode:queueInputBuffer");
if (length <= 0) {
audioEncoder.queueInputBuffer(inputBufferIndex, 0, 0,
presentationTimeUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
} else {
audioEncoder.queueInputBuffer(inputBufferIndex, 0, length,
presentationTimeUs, 0);
} else if (inputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
// wait for MediaCodec encoder is ready to encode
// nothing to do here because MediaCodec#dequeueInputBuffer(TIMEOUT_USEC)
// will wait for maximum TIMEOUT_USEC(10msec) on each call
private long prevOutputPTSUs = 0;
protected long getPTSUs() {
long result = System.nanoTime() / 1000L;
if (result < prevOutputPTSUs)
result = (prevOutputPTSUs - result) + result;
return result;
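For reference, the intent of getPTSUs() is to hand the muxer timestamps that never go backwards. Note that the expression (prevOutputPTSUs - result) + result is just prevOutputPTSUs, and nothing in the code shown updates prevOutputPTSUs. A minimal sketch of one way to keep timestamps strictly increasing; the class name and the choice to bump by 1 microsecond are assumptions, not part of the code above:

```java
/** Hypothetical helper: produces strictly increasing presentation timestamps. */
final class MonotonicPts {
    private long lastPtsUs = Long.MIN_VALUE;

    /** Returns candidateUs, or lastPtsUs + 1 if the candidate would not advance time. */
    long next(long candidateUs) {
        long pts = (candidateUs <= lastPtsUs) ? lastPtsUs + 1 : candidateUs;
        lastPtsUs = pts; // remember what was handed out, so later calls can compare
        return pts;
    }

    /** Convenience wrapper around the monotonic system clock, in microseconds. */
    long now() {
        return next(System.nanoTime() / 1000L);
    }
}
```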
I expect this to create an mp4 file with both audio and video, but it creates an mp4 file with only video.
Please help me! I used this example in https://github.com/pchab/AndroidRTC to stream video and audio from one Android device to another. The example uses two libraries, libjingle_peerConnection and the SocketIo client, but I don't know how to save the streaming data in H.264 format.
After many attempts and a lot of hard work on this project, I found a solution for saving the video as mp4 without any problems.
Add this VideoFileRenderer.java to your project:
package org.webrtc;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import android.os.Handler;
import android.os.HandlerThread;
import android.util.Log;
import android.view.Surface;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule.SamplesReadyCallback;
import java.io.IOException;
import java.nio.ByteBuffer;
public class VideoFileRenderer implements VideoSink, SamplesReadyCallback {
private static final String TAG = "VideoFileRenderer";
private final HandlerThread renderThread;
private final Handler renderThreadHandler;
private final HandlerThread audioThread;
private final Handler audioThreadHandler;
private int outputFileWidth = -1;
private int outputFileHeight = -1;
private ByteBuffer[] encoderOutputBuffers;
private ByteBuffer[] audioInputBuffers;
private ByteBuffer[] audioOutputBuffers;
private EglBase eglBase;
private EglBase.Context sharedContext;
private VideoFrameDrawer frameDrawer;
// TODO: these ought to be configurable as well
private static final String MIME_TYPE = "video/avc"; // H.264 Advanced Video Coding
private static final int FRAME_RATE = 30; // 30fps
private static final int IFRAME_INTERVAL = 5; // 5 seconds between I-frames
private MediaMuxer mediaMuxer;
private MediaCodec encoder;
private MediaCodec.BufferInfo bufferInfo, audioBufferInfo;
private int trackIndex = -1;
private int audioTrackIndex;
private boolean isRunning = true;
private GlRectDrawer drawer;
private Surface surface;
private MediaCodec audioEncoder;
private AudioDeviceModule audioDeviceModule;
public VideoFileRenderer(String outputFile, final EglBase.Context sharedContext, boolean withAudio) throws IOException {
renderThread = new HandlerThread(TAG + "RenderThread");
renderThread.start(); // getLooper() returns null until the thread has been started
renderThreadHandler = new Handler(renderThread.getLooper());
if (withAudio) {
audioThread = new HandlerThread(TAG + "AudioThread");
audioThread.start();
audioThreadHandler = new Handler(audioThread.getLooper());
} else {
audioThread = null;
audioThreadHandler = null;
bufferInfo = new MediaCodec.BufferInfo();
this.sharedContext = sharedContext;
// Create a MediaMuxer. We can't add the video track and start() the muxer here,
// because our MediaFormat doesn't have the Magic Goodies. These can only be
// obtained from the encoder after it has started processing data.
mediaMuxer = new MediaMuxer(outputFile,
audioTrackIndex = withAudio ? -1 : 0;
private void initVideoEncoder() {
MediaFormat format = MediaFormat.createVideoFormat(MIME_TYPE, outputFileWidth, outputFileHeight);
// Set some properties. Failing to specify some of these can cause the MediaCodec
// configure() call to throw an unhelpful exception.
format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, FRAME_RATE);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);
// Create a MediaCodec encoder and configure it with our format. Get a Surface
// we can use for input and wrap it with a class that handles the EGL work.
try {
encoder = MediaCodec.createEncoderByType(MIME_TYPE);
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
renderThreadHandler.post(() -> {
eglBase = EglBase.create(sharedContext, EglBase.CONFIG_RECORDABLE);
surface = encoder.createInputSurface();
drawer = new GlRectDrawer();
} catch (Exception e) {
Log.wtf(TAG, e);
public void onFrame(VideoFrame frame) {
if (outputFileWidth == -1) {
outputFileWidth = frame.getRotatedWidth();
outputFileHeight = frame.getRotatedHeight();
renderThreadHandler.post(() -> renderFrameOnRenderThread(frame));
private void renderFrameOnRenderThread(VideoFrame frame) {
if (frameDrawer == null) {
frameDrawer = new VideoFrameDrawer();
frameDrawer.drawFrame(frame, drawer, null, 0, 0, outputFileWidth, outputFileHeight);
// Release all resources. All already posted frames will be rendered first.
public void release() {
isRunning = false;
if (audioThreadHandler != null)
audioThreadHandler.post(() -> {
if (audioEncoder != null) {
renderThreadHandler.post(() -> {
if (encoder != null) {
private boolean encoderStarted = false;
private volatile boolean muxerStarted = false;
private long videoFrameStart = 0;
private void drainEncoder() {
if (!encoderStarted) {
encoderOutputBuffers = encoder.getOutputBuffers();
encoderStarted = true;
while (true) {
int encoderStatus = encoder.dequeueOutputBuffer(bufferInfo, 10000);
if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not expected for an encoder
encoderOutputBuffers = encoder.getOutputBuffers();
Log.e(TAG, "encoder output buffers changed");
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
// expected once, before the first encoded buffer: the muxer track is added here
MediaFormat newFormat = encoder.getOutputFormat();
Log.e(TAG, "encoder output format changed: " + newFormat);
trackIndex = mediaMuxer.addTrack(newFormat);
if (audioTrackIndex != -1 && !muxerStarted) {
muxerStarted = true;
if (!muxerStarted)
} else if (encoderStatus < 0) {
Log.e(TAG, "unexpected result from encoder.dequeueOutputBuffer: " + encoderStatus);
} else { // encoderStatus >= 0
try {
ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
if (encodedData == null) {
Log.e(TAG, "encoderOutputBuffer " + encoderStatus + " was null");
// It's usually necessary to adjust the ByteBuffer values to match BufferInfo.
encodedData.limit(bufferInfo.offset + bufferInfo.size);
if (videoFrameStart == 0 && bufferInfo.presentationTimeUs != 0) {
videoFrameStart = bufferInfo.presentationTimeUs;
bufferInfo.presentationTimeUs -= videoFrameStart;
if (muxerStarted)
mediaMuxer.writeSampleData(trackIndex, encodedData, bufferInfo);
isRunning = isRunning && (bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == 0;
encoder.releaseOutputBuffer(encoderStatus, false);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
} catch (Exception e) {
Log.wtf(TAG, e);
private long presTime = 0L;
private void drainAudio() {
if (audioBufferInfo == null)
audioBufferInfo = new MediaCodec.BufferInfo();
while (true) {
int encoderStatus = audioEncoder.dequeueOutputBuffer(audioBufferInfo, 10000);
if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not expected for an encoder
audioOutputBuffers = audioEncoder.getOutputBuffers();
Log.w(TAG, "encoder output buffers changed");
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
// expected once, before the first encoded buffer: the muxer track is added here
MediaFormat newFormat = audioEncoder.getOutputFormat();
Log.w(TAG, "encoder output format changed: " + newFormat);
audioTrackIndex = mediaMuxer.addTrack(newFormat);
if (trackIndex != -1 && !muxerStarted) {
muxerStarted = true;
if (!muxerStarted)
} else if (encoderStatus < 0) {
Log.e(TAG, "unexpected result from encoder.dequeueOutputBuffer: " + encoderStatus);
} else { // encoderStatus >= 0
try {
ByteBuffer encodedData = audioOutputBuffers[encoderStatus];
if (encodedData == null) {
Log.e(TAG, "encoderOutputBuffer " + encoderStatus + " was null");
// It's usually necessary to adjust the ByteBuffer values to match BufferInfo.
encodedData.limit(audioBufferInfo.offset + audioBufferInfo.size);
if (muxerStarted)
mediaMuxer.writeSampleData(audioTrackIndex, encodedData, audioBufferInfo);
isRunning = isRunning && (audioBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) == 0;
audioEncoder.releaseOutputBuffer(encoderStatus, false);
if ((audioBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
} catch (Exception e) {
Log.wtf(TAG, e);
public void onWebRtcAudioRecordSamplesReady(JavaAudioDeviceModule.AudioSamples audioSamples) {
if (!isRunning)
audioThreadHandler.post(() -> {
if (audioEncoder == null) try {
audioEncoder = MediaCodec.createEncoderByType("audio/mp4a-latm");
MediaFormat format = new MediaFormat();
format.setString(MediaFormat.KEY_MIME, "audio/mp4a-latm");
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, audioSamples.getChannelCount());
format.setInteger(MediaFormat.KEY_SAMPLE_RATE, audioSamples.getSampleRate());
format.setInteger(MediaFormat.KEY_BIT_RATE, 64 * 1024);
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
audioEncoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
audioInputBuffers = audioEncoder.getInputBuffers();
audioOutputBuffers = audioEncoder.getOutputBuffers();
} catch (IOException exception) {
Log.wtf(TAG, exception);
int bufferIndex = audioEncoder.dequeueInputBuffer(0);
if (bufferIndex >= 0) {
ByteBuffer buffer = audioInputBuffers[bufferIndex];
byte[] data = audioSamples.getData();
audioEncoder.queueInputBuffer(bufferIndex, 0, data.length, presTime, 0);
presTime += data.length * 125 / 12; // 1000000 microseconds / 48000hz / 2 bytes
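The constant 125 / 12 in the last line only holds for 48 kHz mono 16-bit PCM (1,000,000 / 48,000 / 2). If the sample rate or channel count differs, the timestamps drift. A hedged sketch of a more general counter: the class name is hypothetical, 16-bit PCM is assumed, and the sample rate and channel count would come from audioSamples.getSampleRate() and audioSamples.getChannelCount():

```java
/** Hypothetical helper: derives audio presentation timestamps from PCM byte counts. */
final class AudioPtsCounter {
    private static final int BYTES_PER_SAMPLE = 2; // assumption: 16-bit PCM

    private final int sampleRateHz;
    private final int channelCount;
    private long framesQueued = 0; // one frame = one sample per channel

    AudioPtsCounter(int sampleRateHz, int channelCount) {
        this.sampleRateHz = sampleRateHz;
        this.channelCount = channelCount;
    }

    /**
     * Returns the timestamp (in microseconds) for a buffer of byteCount bytes,
     * then advances the counter past that buffer.
     */
    long nextPtsUs(int byteCount) {
        long ptsUs = framesQueued * 1_000_000L / sampleRateHz;
        framesQueued += byteCount / ((long) BYTES_PER_SAMPLE * channelCount);
        return ptsUs;
    }
}
```

Counting frames and converting once per buffer avoids accumulating the rounding error that a repeated integer addition of microseconds would introduce.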
Then add this recorder implementation, MediaRecorderImpl.java:
package com.vedja.hassan.kavandeh_master.utils;
import android.support.annotation.Nullable;
import android.util.Log;
import com.vedja.hassan.kavandeh_master.utils.utils.EglUtils;
import org.webrtc.VideoFileRenderer;
import org.webrtc.VideoTrack;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;
import java.io.File;
public class MediaRecorderImpl {
private final Integer id;
private final VideoTrack videoTrack;
private final AudioSamplesInterceptor audioInterceptor;
private VideoFileRenderer videoFileRenderer;
private boolean isRunning = false;
private File recordFile;
public MediaRecorderImpl(Integer id, @Nullable VideoTrack videoTrack, @Nullable AudioSamplesInterceptor audioInterceptor) {
this.id = id;
this.videoTrack = videoTrack;
this.audioInterceptor = audioInterceptor;
public void startRecording(File file) throws Exception {
recordFile = file;
if (isRunning)
isRunning = true;
//noinspection ResultOfMethodCallIgnored
if (videoTrack != null) {
videoFileRenderer = new VideoFileRenderer(
audioInterceptor != null
if (audioInterceptor != null)
audioInterceptor.attachCallback(id, videoFileRenderer);
} else {
Log.e(TAG, "Video track is null");
if (audioInterceptor != null) {
//TODO(rostopira): audio only recording
throw new Exception("Audio-only recording not implemented yet");
public File getRecordFile() { return recordFile; }
public void stopRecording() {
isRunning = false;
if (audioInterceptor != null)
if (videoTrack != null && videoFileRenderer != null) {
videoFileRenderer = null;
private static final String TAG = "MediaRecorderImpl";
Use the above class with this code:
final AudioSamplesInterceptor inputSamplesInterceptor = new AudioSamplesInterceptor();
private OutputAudioSamplesInterceptor outputSamplesInterceptor = null;
private final SparseArray<MediaRecorderImpl> mediaRecorders = new SparseArray<>();
void startRecordingToFile(String path, Integer id, @Nullable VideoTrack videoTrack, @Nullable AudioChannel audioChannel) throws Exception {
AudioSamplesInterceptor interceptor = null;
if (audioChannel == AudioChannel.INPUT)
interceptor = inputSamplesInterceptor;
else if (audioChannel == AudioChannel.OUTPUT) {
if (outputSamplesInterceptor == null)
outputSamplesInterceptor = new OutputAudioSamplesInterceptor(audioDeviceModule);
interceptor = outputSamplesInterceptor;
MediaRecorderImpl mediaRecorder = new MediaRecorderImpl(id, videoTrack, interceptor);
mediaRecorder.startRecording(new File(path));
mediaRecorders.append(id, mediaRecorder);
void stopRecording(Integer id) {
MediaRecorderImpl mediaRecorder = mediaRecorders.get(id);
if (mediaRecorder != null) {
File file = mediaRecorder.getRecordFile();
if (file != null) {
ContentValues values = new ContentValues(3);
values.put(MediaStore.Video.Media.TITLE, file.getName());
values.put(MediaStore.Video.Media.MIME_TYPE, "video/mp4");
values.put(MediaStore.Video.Media.DATA, file.getAbsolutePath());
getContentResolver().insert(MediaStore.Video.Media.EXTERNAL_CONTENT_URI, values);
Finally, use this:
try {
VedjaSharedPreference sharedPreference = new VedjaSharedPreference(getContext());
final File dir = new File(sharedPreference.getStringParam(StaticParameter.SAVING_URL) + "/audio/");
dir.mkdirs(); //create folders where write files
final File file = new File(dir, "Vedja-".concat(String.valueOf(System.currentTimeMillis())).concat(".mp3"));
VideoTrack videoTrack = null;
MediaStreamTrack track = slaveManagerActivity.remoteStream.videoTracks.get(0);
if (track instanceof VideoTrack)
videoTrack = (VideoTrack) track;
AudioChannel audioChannel = AudioChannel.OUTPUT;
slaveManagerActivity.startRecordingToFile(file.getPath(), 1, videoTrack, audioChannel);
} catch (Exception e) {
throw new RuntimeException(
"Failed to open video file for output: ", e);
After copying this code, some classes may not be found in your project. You can search for those classes on the internet.
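A note on how the VideoFileRenderer above starts the muxer: MediaMuxer only accepts addTrack() before start(), so each drain method adds its own track and starts the muxer only if the other track is already there (audioTrackIndex is preset to 0 when recording without audio). The same gating logic as a small standalone sketch, with hypothetical names and no Android classes:

```java
/** Hypothetical helper: decides when a two-track muxer may be started. */
final class MuxerStartGate {
    private int videoTrackIndex = -1;
    private int audioTrackIndex;
    private boolean started = false;

    MuxerStartGate(boolean withAudio) {
        // Without audio there is nothing to wait for, so mark the audio side as ready.
        this.audioTrackIndex = withAudio ? -1 : 0;
    }

    /** Call after adding the video track; returns true if the muxer should be started now. */
    synchronized boolean onVideoTrackAdded(int index) {
        videoTrackIndex = index;
        return tryStart();
    }

    /** Call after adding the audio track; returns true if the muxer should be started now. */
    synchronized boolean onAudioTrackAdded(int index) {
        audioTrackIndex = index;
        return tryStart();
    }

    synchronized boolean isStarted() {
        return started;
    }

    // Returns true exactly once: the first time both tracks are known.
    private boolean tryStart() {
        if (!started && videoTrackIndex != -1 && audioTrackIndex != -1) {
            started = true;
            return true;
        }
        return false;
    }
}
```

If the muxer were started as soon as the video track is added, the audio track could no longer be added afterwards, and the output would contain video only.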
This project has a VideoFileRenderer class; you can use this renderer to save the video to a file.
I am trying to adapt the code found in ExtractDecodeEditEncodeMuxTest.java in order to extract audio and video from a mp4 recorded via Cordova's device.capture.captureVideo, decode the audio, edit the decoded audio samples, encode the audio, and mux the audio back with the video and save as an mp4 again.
My first attempt is simply to extract, decode, encode and mux audio without trying to edit any of the audio samples - if I can do this I am fairly certain that I can edit the decoded samples as desired. I don't need to edit the video, so I assume I can simply use MediaExtractor to extract and mux the video track.
However, the problem I am having is that I cannot seem to get the audio decoding/encoding process right. What keeps happening is that the muxer creates the mp4 from the extracted video track and the extracted -> decoded -> encoded audio track, but while the video plays fine, the audio starts with a short burst of noise, then what seems like the last couple seconds of audio data playing normally (but at the beginning of the video), then silence for the rest of the video.
Some of the relevant fields:
private MediaFormat audioFormat;
private MediaFormat videoFormat;
private int videoTrackIndex = -1;
private int audioTrackIndex = -1;
private static final int MAX_BUFFER_SIZE = 256 * 1024;
// parameters for the audio encoder
private static final String OUTPUT_AUDIO_MIME_TYPE = "audio/mp4a-latm"; // Advanced Audio Coding
private static final int OUTPUT_AUDIO_CHANNEL_COUNT = 2; // Must match the input stream. not using this, getting from input format
private static final int OUTPUT_AUDIO_BIT_RATE = 128 * 1024;
private static final int OUTPUT_AUDIO_AAC_PROFILE = MediaCodecInfo.CodecProfileLevel.AACObjectHE; //not using this, getting from input format
private static final int OUTPUT_AUDIO_SAMPLE_RATE_HZ = 44100; // Must match the input stream
private static final String TAG = "vvsLog";
private static final Boolean DEBUG = false;
private static final Boolean INFO = true;
/** How long to wait for the next buffer to become available. */
private static final int TIMEOUT_USEC = 10000;
private String videoPath;
The code configuring the decoder, encoder and muxer:
MediaCodecInfo audioCodecInfo = selectCodec(OUTPUT_AUDIO_MIME_TYPE);
if (audioCodecInfo == null) {
// Don't fail CTS if they don't have an AAC codec (not here, anyway).
Log.e(TAG, "Unable to find an appropriate codec for " + OUTPUT_AUDIO_MIME_TYPE);
MediaExtractor videoExtractor = null;
MediaExtractor audioExtractor = null;
MediaCodec audioDecoder = null;
MediaCodec audioEncoder = null;
MediaMuxer muxer = null;
try {
* Video
* just need to configure the extractor, no codec processing required
videoExtractor = createExtractor(originalAssetPath);
String vidMimeStartsWith = "video/";
int videoInputTrack = getAndSelectTrackIndex(videoExtractor, vidMimeStartsWith);
videoFormat = videoExtractor.getTrackFormat(videoInputTrack);
* Audio
* needs an extractor plus an audio decoder and encoder
audioExtractor = createExtractor(originalAssetPath);
String audMimeStartsWith = "audio/";
int audioInputTrack = getAndSelectTrackIndex(audioExtractor, audMimeStartsWith);
audioFormat = audioExtractor.getTrackFormat(audioInputTrack);
MediaFormat outputAudioFormat = MediaFormat.createAudioFormat(OUTPUT_AUDIO_MIME_TYPE,
outputAudioFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, audioFormat.getInteger(MediaFormat.KEY_AAC_PROFILE));
outputAudioFormat.setInteger(MediaFormat.KEY_BIT_RATE, OUTPUT_AUDIO_BIT_RATE);
// Create a MediaCodec for the decoder, based on the extractor's format, configure and start it.
audioDecoder = createAudioDecoder(audioFormat);
// Create a MediaCodec for the desired codec, then configure it as an encoder and start it.
audioEncoder = createAudioEncoder(audioCodecInfo, outputAudioFormat);
//create muxer to overwrite original asset path
muxer = createMuxer(originalAssetPath);
//add the video and audio tracks
* need to wait to add the audio track until after the first encoder output buffer is created
* since the encoder changes the MediaFormat at that time
* and the muxer needs the correct format, including the correct Coded Specific Data (CSD) ByteBuffer
The monster doExtractDecodeEditEncodeMux method:
private void doExtractDecodeEditEncodeMux(
MediaExtractor videoExtractor,
MediaExtractor audioExtractor,
MediaCodec audioDecoder,
MediaCodec audioEncoder,
MediaMuxer muxer) {
ByteBuffer videoInputBuffer = ByteBuffer.allocate(MAX_BUFFER_SIZE);
MediaCodec.BufferInfo videoBufferInfo = new MediaCodec.BufferInfo();
ByteBuffer[] audioDecoderInputBuffers = null;
ByteBuffer[] audioDecoderOutputBuffers = null;
ByteBuffer[] audioEncoderInputBuffers = null;
ByteBuffer[] audioEncoderOutputBuffers = null;
MediaCodec.BufferInfo audioDecoderOutputBufferInfo = null;
MediaCodec.BufferInfo audioEncoderOutputBufferInfo = null;
audioDecoderInputBuffers = audioDecoder.getInputBuffers();
audioDecoderOutputBuffers = audioDecoder.getOutputBuffers();
audioEncoderInputBuffers = audioEncoder.getInputBuffers();
audioEncoderOutputBuffers = audioEncoder.getOutputBuffers();
audioDecoderOutputBufferInfo = new MediaCodec.BufferInfo();
audioEncoderOutputBufferInfo = new MediaCodec.BufferInfo();
* sanity checks
int videoExtractedFrameCount = 0;
int audioExtractedFrameCount = 0;
int audioDecodedFrameCount = 0;
int audioEncodedFrameCount = 0;
long lastPresentationTimeVideoExtractor = 0;
long lastPresentationTimeAudioExtractor = 0;
long lastPresentationTimeAudioDecoder = 0;
long lastPresentationTimeAudioEncoder = 0;
// We will get these from the decoders when notified of a format change.
MediaFormat decoderOutputAudioFormat = null;
// We will get these from the encoders when notified of a format change.
MediaFormat encoderOutputAudioFormat = null;
// We will determine these once we have the output format.
int outputAudioTrack = -1;
// Whether things are done on the video side.
boolean videoExtractorDone = false;
// Whether things are done on the audio side.
boolean audioExtractorDone = false;
boolean audioDecoderDone = false;
boolean audioEncoderDone = false;
// The audio decoder output buffer to process, -1 if none.
int pendingAudioDecoderOutputBufferIndex = -1;
boolean muxing = false;
* need to wait to add the audio track until after the first encoder output buffer is created
* since the encoder changes the MediaFormat at that time
* and the muxer needs the correct format, including the correct Coded Specific Data (CSD) ByteBuffer
* muxer.start();
* muxing = true;
MediaMetadataRetriever retrieverTest = new MediaMetadataRetriever();
String degreesStr = retrieverTest.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION);
if (degreesStr != null) {
Integer degrees = Integer.parseInt(degreesStr);
if (degrees >= 0) {
while (!videoExtractorDone || !audioEncoderDone) {
if (INFO) {
Log.d(TAG, String.format("ex:%d at %d | de:%d at %d | en:%d at %d ",
audioExtractedFrameCount, lastPresentationTimeAudioExtractor,
audioDecodedFrameCount, lastPresentationTimeAudioDecoder,
audioEncodedFrameCount, lastPresentationTimeAudioEncoder
* Extract and mux video
while (!videoExtractorDone && muxing) {
try {
videoBufferInfo.size = videoExtractor.readSampleData(videoInputBuffer, 0);
} catch (Exception e) {
if (videoBufferInfo.size < 0) {
videoBufferInfo.size = 0;
videoExtractorDone = true;
} else {
videoBufferInfo.presentationTimeUs = videoExtractor.getSampleTime();
lastPresentationTimeVideoExtractor = videoBufferInfo.presentationTimeUs;
videoBufferInfo.flags = videoExtractor.getSampleFlags();
muxer.writeSampleData(videoTrackIndex, videoInputBuffer, videoBufferInfo);
* Extract, decode, watermark, encode and mux audio
/** Extract audio from file and feed to decoder. **/
while (!audioExtractorDone && (encoderOutputAudioFormat == null || muxing)) {
int decoderInputBufferIndex = audioDecoder.dequeueInputBuffer(TIMEOUT_USEC);
if (decoderInputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
if (DEBUG) {
Log.d(TAG, "audio decoder: returned input buffer: " + decoderInputBufferIndex);
ByteBuffer decoderInputBuffer = audioDecoderInputBuffers[decoderInputBufferIndex];
int size = audioExtractor.readSampleData(decoderInputBuffer, 0);
long presentationTime = audioExtractor.getSampleTime();
lastPresentationTimeAudioExtractor = presentationTime;
if (DEBUG) {
Log.d(TAG, "audio extractor: returned buffer of size " + size);
Log.d(TAG, "audio extractor: returned buffer for time " + presentationTime);
if (size >= 0) {
audioExtractorDone = !audioExtractor.advance();
if (audioExtractorDone) {
if (DEBUG) Log.d(TAG, "audio extractor: EOS");
// We extracted a frame, let's try something else next.
* Poll output frames from the audio decoder.
* Do not poll if we already have a pending buffer to feed to the encoder.
while (!audioDecoderDone && pendingAudioDecoderOutputBufferIndex == -1 && (encoderOutputAudioFormat == null || muxing)) {
int decoderOutputBufferIndex =
audioDecoderOutputBufferInfo, TIMEOUT_USEC);
if (decoderOutputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
if (DEBUG) Log.d(TAG, "no audio decoder output buffer");
if (decoderOutputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
if (DEBUG) Log.d(TAG, "audio decoder: output buffers changed");
audioDecoderOutputBuffers = audioDecoder.getOutputBuffers();
if (decoderOutputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
decoderOutputAudioFormat = audioDecoder.getOutputFormat();
if (DEBUG) {
Log.d(TAG, "audio decoder: output format changed: "
+ decoderOutputAudioFormat);
if (DEBUG) {
Log.d(TAG, "audio decoder: returned output buffer: "
+ decoderOutputBufferIndex);
if (DEBUG) {
Log.d(TAG, "audio decoder: returned buffer of size "
+ audioDecoderOutputBufferInfo.size);
ByteBuffer decoderOutputBuffer =
if ((audioDecoderOutputBufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG)
!= 0) {
if (DEBUG) Log.d(TAG, "audio decoder: codec config buffer");
audioDecoder.releaseOutputBuffer(decoderOutputBufferIndex, false);
if (DEBUG) {
Log.d(TAG, "audio decoder: returned buffer for time "
+ audioDecoderOutputBufferInfo.presentationTimeUs);
if (DEBUG) {
Log.d(TAG, "audio decoder: output buffer is now pending: "
+ pendingAudioDecoderOutputBufferIndex);
pendingAudioDecoderOutputBufferIndex = decoderOutputBufferIndex;
// We extracted a pending frame, let's try something else next.
// Feed the pending decoded audio buffer to the audio encoder.
while (pendingAudioDecoderOutputBufferIndex != -1) {
if (DEBUG) {
Log.d(TAG, "audio decoder: attempting to process pending buffer: "
+ pendingAudioDecoderOutputBufferIndex);
int encoderInputBufferIndex = audioEncoder.dequeueInputBuffer(TIMEOUT_USEC);
if (encoderInputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
if (DEBUG) Log.d(TAG, "no audio encoder input buffer");
if (DEBUG) {
Log.d(TAG, "audio encoder: returned input buffer: " + encoderInputBufferIndex);
ByteBuffer encoderInputBuffer = audioEncoderInputBuffers[encoderInputBufferIndex];
int size = audioDecoderOutputBufferInfo.size;
long presentationTime = audioDecoderOutputBufferInfo.presentationTimeUs;
lastPresentationTimeAudioDecoder = presentationTime;
if (DEBUG) {
Log.d(TAG, "audio decoder: processing pending buffer: "
+ pendingAudioDecoderOutputBufferIndex);
if (DEBUG) {
Log.d(TAG, "audio decoder: pending buffer of size " + size);
Log.d(TAG, "audio decoder: pending buffer for time " + presentationTime);
if (size >= 0) {
ByteBuffer decoderOutputBuffer =
decoderOutputBuffer.limit(audioDecoderOutputBufferInfo.offset + size);
audioDecoder.releaseOutputBuffer(pendingAudioDecoderOutputBufferIndex, false);
pendingAudioDecoderOutputBufferIndex = -1;
if ((audioDecoderOutputBufferInfo.flags
& MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
if (DEBUG) Log.d(TAG, "audio decoder: EOS");
audioDecoderDone = true;
// We enqueued a pending frame, let's try something else next.
// Poll frames from the audio encoder and send them to the muxer.
while (!audioEncoderDone && (encoderOutputAudioFormat == null || muxing)) {
int encoderOutputBufferIndex = audioEncoder.dequeueOutputBuffer(
audioEncoderOutputBufferInfo, TIMEOUT_USEC);
if (encoderOutputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
if (DEBUG) Log.d(TAG, "no audio encoder output buffer");
if (encoderOutputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
if (DEBUG) Log.d(TAG, "audio encoder: output buffers changed");
audioEncoderOutputBuffers = audioEncoder.getOutputBuffers();
if (encoderOutputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
encoderOutputAudioFormat = audioEncoder.getOutputFormat();
if (DEBUG) {
Log.d(TAG, "audio encoder: output format changed");
if (outputAudioTrack >= 0) {
Log.e(TAG,"audio encoder changed its output format again?");
if (DEBUG) {
Log.d(TAG, "audio encoder: returned output buffer: "
+ encoderOutputBufferIndex);
Log.d(TAG, "audio encoder: returned buffer of size "
+ audioEncoderOutputBufferInfo.size);
ByteBuffer encoderOutputBuffer =
if ((audioEncoderOutputBufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG)
!= 0) {
if (DEBUG) Log.d(TAG, "audio encoder: codec config buffer");
// Simply ignore codec config buffers.
audioEncoder.releaseOutputBuffer(encoderOutputBufferIndex, false);
if (DEBUG) {
Log.d(TAG, "audio encoder: returned buffer for time "
+ audioEncoderOutputBufferInfo.presentationTimeUs);
if (audioEncoderOutputBufferInfo.size != 0) {
lastPresentationTimeAudioEncoder = audioEncoderOutputBufferInfo.presentationTimeUs;
audioTrackIndex, encoderOutputBuffer, audioEncoderOutputBufferInfo);
if ((audioEncoderOutputBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM)
!= 0) {
if (DEBUG) Log.d(TAG, "audio encoder: EOS");
audioEncoderDone = true;
audioEncoder.releaseOutputBuffer(encoderOutputBufferIndex, false);
// We enqueued an encoded frame, let's try something else next.
if (!muxing && (encoderOutputAudioFormat != null)) {
Log.d(TAG, "muxer: adding video track.");
videoTrackIndex = muxer.addTrack(videoFormat);
Log.d(TAG, "muxer: adding audio track.");
audioTrackIndex = muxer.addTrack(encoderOutputAudioFormat);
Log.d(TAG, "muxer: starting");
muxing = true;
* Done processing audio and video
Log.d(TAG,"encoded and decoded audio frame counts should match. decoded:"+audioDecodedFrameCount+" encoded:"+audioEncodedFrameCount);
Log.d(TAG,"decoded frame count should be less than extracted frame coun. decoded:"+audioDecodedFrameCount+" extracted:"+audioExtractedFrameCount);
Log.d(TAG,"no audio frame should be pending "+pendingAudioDecoderOutputBufferIndex);
PluginResult result = new PluginResult(PluginResult.Status.OK, videoPath);
I am seeing this ACodec error for the first several hundred audio frames extracted:
11-25 20:49:58.497 9807-13101/com.vvs.VVS430011 E/ACodec﹕ OMXCodec::onEvent, OMX_ErrorStreamCorrupt
11-25 20:49:58.497 9807-13101/com.vvs.VVS430011 W/AHierarchicalStateMachine﹕ Warning message AMessage(what = 'omx ', target = 8) = {
int32_t type = 0
int32_t node = 7115
int32_t event = 1
int32_t data1 = -2147479541
int32_t data2 = 0
} unhandled in root state.
Here's a pastebin of the entire logcat, which includes some sanity check logs in the format of:
D/vvsLog﹕ ex:{extracted frame #} at {presentationTime} | de:{decoded frame #} at {presentationTime} | en:{encoded frame #} at {presentationTime}
The presentationTime of encoded and decoded frames seems to be incrementing too quickly while those OMX_ErrorStreamCorrupt messages are appearing. When they stop, the presentationTime for the decoded and encoded frames seems to return to "normal", and also seems to match up with the actual "good" audio I hear at the beginning of the video - the "good" audio being from the end of the original audio track.
I am hoping someone with a lot more experience with these low-level Android multimedia APIs than I have can help me understand why this is happening. Keep in mind I am well aware that this code is not optimized, does not run in separate threads, etc. I will refactor to clean things up once I have a working example of the basic extract->decode->edit->encode->mux process.
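As a rough way to quantify "incrementing too quickly", the timestamps from the sanity-check log can be compared against the expected duration of one audio frame. The sketch below is plain Java and only an illustration: the class and method names are hypothetical, and it assumes AAC-LC frames of 1024 samples at the 44100 Hz rate used above (HE-AAC streams use a different frame length, so the expected value would change).

```java
/** Hypothetical helper for checking audio presentation timestamps (not part of the original code). */
final class AudioPtsCheck {

    /** Duration of one audio frame in microseconds, rounded down. */
    static long expectedFrameDurationUs(int samplesPerFrame, int sampleRateHz) {
        return samplesPerFrame * 1_000_000L / sampleRateHz;
    }

    /**
     * Returns the index of the first timestamp whose distance from its
     * predecessor differs from the expected delta by more than the
     * tolerance, or -1 if every delta looks regular.
     */
    static int firstIrregularIndex(long[] ptsUs, long expectedDeltaUs, long toleranceUs) {
        for (int i = 1; i < ptsUs.length; i++) {
            long delta = ptsUs[i] - ptsUs[i - 1];
            if (Math.abs(delta - expectedDeltaUs) > toleranceUs) {
                return i;
            }
        }
        return -1;
    }
}
```

Under these assumptions each decoded frame should advance the timestamp by about 23219 microseconds, so feeding the logged decoder or encoder timestamps into firstIrregularIndex would show where they start to drift.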
Turns out the above code works fine - as long as you're not trying to mux the same file you're extracting, at the same time.
I had a previous version of this that extracted, then muxed tracks to the same file, and forgot to change that in this version.
This little method saved the day lol.
private String getMuxedAssetPath() {
    String muxedAssetPath = Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_DCIM) + "/" + CAMERA_DIRECTORY + "/muxedAudioVideo.mp4";
    File file = new File(muxedAssetPath);
    if (!file.exists()) {
        try {
            // Create the separate output file so the muxer never writes to the file being extracted.
            file.createNewFile();
        } catch (IOException e) {
            muxedAssetPath = null;
        }
    }
    return muxedAssetPath;
}
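The same idea, keeping the muxer's output separate from the file being extracted, can be sketched without any Android classes. This is only an illustration: MuxPaths, outputFor, and requireDistinct are hypothetical names, and the check compares absolute paths only, so it would not detect two different paths that resolve to the same file through a symlink.

```java
import java.io.File;

/** Hypothetical helper that derives a muxer output path distinct from the input path. */
final class MuxPaths {

    /** Builds a sibling file name with a suffix before the extension, e.g. clip.mp4 -> clip-muxed.mp4. */
    static File outputFor(File input, String suffix) {
        String name = input.getName();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        // A null parent is accepted by this File constructor and yields a relative path.
        return new File(input.getParentFile(), base + suffix + ext);
    }

    /** Fails fast if the extractor input and the muxer output point at the same path. */
    static void requireDistinct(File input, File output) {
        if (input.getAbsoluteFile().equals(output.getAbsoluteFile())) {
            throw new IllegalArgumentException(
                    "muxer output must not be the file being extracted: " + input);
        }
    }
}
```

Calling requireDistinct just before creating the muxer would have surfaced the mistake described above immediately, instead of producing a file with corrupted audio.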
I'm developing a Java RTP streaming app for a company project, which should be capable of joining a multicast server and receiving RTP packets. I then use an H264 depacketizer to recreate a complete frame from the NAL FU packets (appending data until the End bit and Marker bit are set).
I want to decode and display a raw h264 video byte stream on Android, and therefore I'm currently using the MediaCodec classes with a hardware decoder configured.
The application is up and running on Jelly Bean (API 17). The resolutions I need to decode are:
480P at 30/60 FPS
720P/I at 30/60 FPS
1080P/I at 30/60 FPS
Recently, due to a system upgrade, we are porting the app to Android L (version 5.0.2). The app is not capable of playing high-resolution videos such as 720p at 60 fps and 1080p at 60 fps.
For debugging purposes I started feeding elementary H264 frames, with their sizes read from the dump file, to MediaCodec and found that the video is lagging.
There are timestamps on the sample video I used, and it seems that advancing by 1 second in the rendered video takes more than 1 second of real time.
Below is my sample code and links to the sample video:
h264 video https://www.dropbox.com/s/cocjhhovihm8q25/dump60fps.h264?dl=0
h264 framesize https://www.dropbox.com/s/r146d5zederrne1/dump60fps.size?dl=0
Also, as this is my first question on Stack Overflow, please bear with me on the bad code formatting and direct references.
public class MainActivity extends Activity {
static final String TAG = "MainActivity";
private PlayerThread mPlayer = null;
private static final String MIME_TYPE = "video/avc";
private byte[] mSPSPPSFrame = new byte [3000];
private byte[] sps = new byte[37];
File videoFile = null;
File videoFile1 = null;
TextView tv ;
FileInputStream videoFileStream = null;
FileInputStream videoFileStream1 = null;
int[] tall = null ;
SpeedControlCallback mspeed = new SpeedControlCallback();
int mStreamLen = 0;
FrameLayout game;
RelativeLayout rl ;
public void onCreate(Bundle savedInstanceState) {
//mVideoSurfaceView = (SurfaceView)findViewById(R.id.videoSurfaceView);
SurfaceView first = (SurfaceView) findViewById(R.id.firstSurface);
first.getHolder().addCallback(new SurfaceHolder.Callback() {
public void surfaceCreated(SurfaceHolder surfaceHolder) {
Log.d(TAG, "First surface created!");
public void surfaceChanged(SurfaceHolder surfaceHolder, int i, int i2, int i3) {
Log.d(TAG, "surfaceChanged()");
if (mPlayer == null) {
mPlayer = new PlayerThread(surfaceHolder.getSurface());
public void surfaceDestroyed(SurfaceHolder surfaceHolder) {
Log.d(TAG, "First surface destroyed!");
tv = (TextView) findViewById(R.id.textview);
videoFile = new File("/data/local/tmp/dump60fps.h264");
videoFile1 = new File("/data/local/tmp/dump60fps.size");
private class PlayerThread extends Thread {
private Surface surface;
public PlayerThread(Surface surface) {
this.surface = surface;
public void run() {
try {
decodeVideo(0, 1920,1080, 50, surface);
} catch (IOException e) {
} catch (InterruptedException e) {
} catch (Throwable e) {
private void decodeVideo(int testinput, int width, int height,
int threshold, Surface surface) throws Throwable {
MediaCodec codec = null;
MediaFormat mFormat;
final long kTimeOutUs = 10000;
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean sawInputEOS = false;
boolean sawOutputEOS = false;
MediaFormat oformat = null;
int errors = -1;
long presentationTimeUs = 0L;
boolean mVideoStart = false;
byte[] byteArray = new byte[65525*5*3];
int i;
int sizeInBytes = 0, index, sampleSize = 0;
try {
byte[] bytes = new byte[(int) videoFile1.length()];
FileInputStream fis = new FileInputStream(videoFile1);
String[] valueStr = new String(bytes).trim().split("\\s+");
tall = new int[valueStr.length];
mStreamLen = valueStr.length;
Log.e(TAG, "++++++ Total Frames ++++++"+mStreamLen);
for ( i = 0; i < valueStr.length; i++) {
tall[i] = Integer.parseInt(valueStr[i]);
} catch (IOException e1) {
index =1;
try {
videoFileStream = new FileInputStream(videoFile);
} catch (FileNotFoundException e1) {
if (mVideoStart == false) {
try {
sizeInBytes = videoFileStream.read(mSPSPPSFrame, 0,37);
Log.e(TAG, "VideoEngine configure ."+sizeInBytes);
//for (i = 0 ; i < sizeInBytes; i++){
// Log.e(TAG, "VideoEngine ."+mSPSPPSFrame[i]);}
} catch (IOException e1) {
sampleSize = sizeInBytes;
mFormat = MediaFormat.createVideoFormat(MIME_TYPE, 1920,1080);
mFormat.setByteBuffer("csd-0", ByteBuffer.wrap( mSPSPPSFrame,0, sizeInBytes));
codec = MediaCodec.createDecoderByType(MIME_TYPE);
codec.configure(mFormat, surface /*surface*/ , null /* crypto */, 0 /* flags */);
// index = 0;
while (!sawOutputEOS && errors < 0) {
if (!sawInputEOS) {
int inputBufIndex = codec.dequeueInputBuffer(kTimeOutUs);
//Log.d(TAG, String.format("Archana Dqing the input buffer with BufIndex #: %d",inputBufIndex));
if (inputBufIndex >= 0) {
ByteBuffer dstBuf = codec.getInputBuffers()[inputBufIndex];
* Read data from file and copy to the input ByteBuffer
try {
sizeInBytes = videoFileStream.read(byteArray, 0,
tall[index] /*+ 4*/);
sampleSize = tall[index]/*+ 4*/;
} catch (IOException e) {
if (sizeInBytes <= 0) {
0 /* offset */,
sawInputEOS = true;
else {
dstBuf.put(byteArray, 0, sizeInBytes);
if (mVideoStart == false) mVideoStart = true;
0 /* offset */,
mVideoStart ? 0:MediaCodec.BUFFER_FLAG_CODEC_CONFIG );
//Log.d(TAG, String.format(" After queueing the buffer to decoder with inputbufindex and samplesize #: %d ,%d ind %d",inputBufIndex,sampleSize,index));
int res = codec.dequeueOutputBuffer(info, kTimeOutUs);
//Log.d(TAG, String.format(" Getting the information about decoded output buffer flags,offset,PT,size #: %d %d %d %d",info.flags,info.offset,info.presentationTimeUs,info.size));
//Log.d(TAG, String.format(" Getting the output of decoder in res #: %d",res));
if (res >= 0) {
int outputBufIndex = res;
//Log.d(TAG, "Output PTS "+info.presentationTimeUs);
codec.releaseOutputBuffer(outputBufIndex, true /* render */);
//Log.d(TAG, String.format(" releaseoutputbuffer index= #: %d",outputBufIndex));
if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
Log.d(TAG, "saw output EOS.");
sawOutputEOS = true;
} else if (res == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
Log.d(TAG, "output buffers have changed.");
} else if (res == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
oformat = codec.getOutputFormat();
Log.d(TAG, "output format has changed to " + oformat);
public boolean onCreateOptionsMenu(Menu menu) {
getMenuInflater().inflate(R.menu.activity_main, menu);
return true;
Here are a couple of workarounds and observations for the problem with the above sample test.
Instead of feeding one full frame to the decoder input, I tried feeding single NAL units one at a time. Playback was still slow and could not match 60 fps.
Google has changed the implementation of the Surface BufferQueue from asynchronous to synchronous. Hence, when we call MediaCodec.dequeueBuffer to get decoded data, the server side (SurfaceTexture::dequeueBuffer) will wait for a buffer to be queued, and the client side waits for that, so SurfaceTextureClient::dequeueBuffer will not return until a buffer has actually been queued on the server side. In asynchronous mode, by contrast, a new GraphicBuffer is allocated.
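To compare the two feeding strategies from the first workaround (one full frame per input buffer versus one NAL unit per input buffer), the frame data has to be split at NAL boundaries. The following is a minimal sketch in plain Java, assuming the dump is an Annex B byte stream in which each NAL unit is preceded by a 00 00 01 or 00 00 00 01 start code; AnnexBSplitter and its methods are hypothetical names, not part of the code above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that locates NAL units in an Annex B H.264 byte stream. */
final class AnnexBSplitter {

    /** Returns the offsets at which a start code (3- or 4-byte form) begins within data[0..length). */
    static List<Integer> startCodeOffsets(byte[] data, int length) {
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i + 3 <= length) {
            if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
                // A zero byte directly before 00 00 01 is treated as part of a 4-byte start code.
                int start = (i > 0 && data[i - 1] == 0) ? i - 1 : i;
                offsets.add(start);
                i += 3;
            } else {
                i++;
            }
        }
        return offsets;
    }

    /** NAL unit type: the low 5 bits of the first byte after the start code (7 = SPS, 8 = PPS, 5 = IDR slice). */
    static int nalType(byte[] data, int startCodeOffset) {
        int i = startCodeOffset;
        while (data[i] == 0) {
            i++; // skip the zero bytes of the start code
        }
        return data[i + 1] & 0x1F; // data[i] is the 0x01 byte that ends the start code
    }
}
```

The slice between two consecutive offsets is one NAL unit including its start code, so the same buffer can be queued either whole or one slice at a time.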
I'm a bit new when it comes to MediaCodec (and video encoding/decoding in general), so correct me if anything I say here is wrong.
I want to play the raw h264 output of MediaCodec with VLC/ffplay. I need this to play because my end goal is to stream some live video to a computer, and MediaMuxer only produces a file on disk rather than something I can stream with (very) low latency to a desktop. (I'm open to other solutions, but I have not found anything else that fits the latency requirement.)
Here is the code I'm using to encode the video and write it to a file (it's based on the MediaCodec example found here, only with the MediaMuxer part removed):
package com.jackos2500.droidtop;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.opengl.EGL14;
import android.opengl.EGLConfig;
import android.opengl.EGLContext;
import android.opengl.EGLDisplay;
import android.opengl.EGLExt;
import android.opengl.EGLSurface;
import android.opengl.GLES20;
import android.os.Environment;
import android.util.Log;
import android.view.Surface;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
public class StreamH264 {
private static final String TAG = "StreamH264";
private static final boolean VERBOSE = true; // lots of logging
// where to put the output file (note: /sdcard requires WRITE_EXTERNAL_STORAGE permission)
private static final File OUTPUT_DIR = Environment.getExternalStorageDirectory();
public static int MEGABIT = 1000 * 1000;
private static final int IFRAME_INTERVAL = 10;
private static final int TEST_R0 = 0;
private static final int TEST_G0 = 136;
private static final int TEST_B0 = 0;
private static final int TEST_R1 = 236;
private static final int TEST_G1 = 50;
private static final int TEST_B1 = 186;
private MediaCodec codec;
private CodecInputSurface inputSurface;
private BufferedOutputStream out;
private MediaCodec.BufferInfo bufferInfo;
public StreamH264() {
private void prepareEncoder() throws IOException {
bufferInfo = new MediaCodec.BufferInfo();
MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
format.setInteger(MediaFormat.KEY_BIT_RATE, 2 * MEGABIT);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, IFRAME_INTERVAL);
codec = MediaCodec.createEncoderByType("video/avc");
codec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
inputSurface = new CodecInputSurface(codec.createInputSurface());
File dst = new File(OUTPUT_DIR, "test.264");
out = new BufferedOutputStream(new FileOutputStream(dst));
private void releaseEncoder() throws IOException {
if (VERBOSE) Log.d(TAG, "releasing encoder objects");
if (codec != null) {
codec = null;
if (inputSurface != null) {
inputSurface = null;
if (out != null) {
out = null;
public void stream() throws IOException {
try {
for (int i = 0; i < (30 * 5); i++) {
// Feed any pending encoder output into the file.
// Generate a new frame of input.
inputSurface.setPresentationTime(computePresentationTimeNsec(i, 30));
// Submit it to the encoder. The eglSwapBuffers call will block if the input
// is full, which would be bad if it stayed full until we dequeued an output
// buffer (which we can't do, since we're stuck here). So long as we fully drain
// the encoder before supplying additional input, the system guarantees that we
// can supply another frame without blocking.
if (VERBOSE) Log.d(TAG, "sending frame " + i + " to encoder");
// send end-of-stream to encoder, and drain remaining output
} finally {
// release encoder, muxer, and input Surface
private void drainEncoder(boolean endOfStream) throws IOException {
final int TIMEOUT_USEC = 10000;
if (VERBOSE) Log.d(TAG, "drainEncoder(" + endOfStream + ")");
if (endOfStream) {
if (VERBOSE) Log.d(TAG, "sending EOS to encoder");
ByteBuffer[] outputBuffers = codec.getOutputBuffers();
while (true) {
int encoderStatus = codec.dequeueOutputBuffer(bufferInfo, TIMEOUT_USEC);
if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
// no output available yet
if (!endOfStream) {
break; // out of while
} else {
if (VERBOSE) Log.d(TAG, "no output available, spinning to await EOS");
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
// not expected for an encoder
outputBuffers = codec.getOutputBuffers();
} else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
// should happen before receiving buffers, and should only happen once
MediaFormat newFormat = codec.getOutputFormat();
Log.d(TAG, "encoder output format changed: " + newFormat);
} else if (encoderStatus < 0) {
Log.w(TAG, "unexpected result from encoder.dequeueOutputBuffer: " + encoderStatus);
// let's ignore it
} else {
ByteBuffer encodedData = outputBuffers[encoderStatus];
if (encodedData == null) {
throw new RuntimeException("encoderOutputBuffer " + encoderStatus + " was null");
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
// The codec config data was pulled out and fed to the muxer when we got
// the INFO_OUTPUT_FORMAT_CHANGED status. Ignore it.
bufferInfo.size = 0;
if (bufferInfo.size != 0) {
// adjust the ByteBuffer values to match BufferInfo (not needed?)
encodedData.limit(bufferInfo.offset + bufferInfo.size);
byte[] data = new byte[bufferInfo.size];
if (VERBOSE) Log.d(TAG, "sent " + bufferInfo.size + " bytes to file");
codec.releaseOutputBuffer(encoderStatus, false);
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
if (!endOfStream) {
Log.w(TAG, "reached end of stream unexpectedly");
} else {
if (VERBOSE) Log.d(TAG, "end of stream reached");
break; // out of while
private void generateSurfaceFrame(int frameIndex) {
frameIndex %= 8;
int startX, startY;
if (frameIndex < 4) {
// (0,0) is bottom-left in GL
startX = frameIndex * (1280 / 4);
startY = 720 / 2;
} else {
startX = (7 - frameIndex) * (1280 / 4);
startY = 0;
GLES20.glClearColor(TEST_R0 / 255.0f, TEST_G0 / 255.0f, TEST_B0 / 255.0f, 1.0f);
GLES20.glScissor(startX, startY, 1280 / 4, 720 / 2);
GLES20.glClearColor(TEST_R1 / 255.0f, TEST_G1 / 255.0f, TEST_B1 / 255.0f, 1.0f);
private static long computePresentationTimeNsec(int frameIndex, int frameRate) {
final long ONE_BILLION = 1000000000;
return frameIndex * ONE_BILLION / frameRate;
* Holds state associated with a Surface used for MediaCodec encoder input.
* <p>
* The constructor takes a Surface obtained from MediaCodec.createInputSurface(), and uses that
* to create an EGL window surface. Calls to eglSwapBuffers() cause a frame of data to be sent
* to the video encoder.
* <p>
* This object owns the Surface -- releasing this will release the Surface too.
private static class CodecInputSurface {
private static final int EGL_RECORDABLE_ANDROID = 0x3142;
private EGLDisplay mEGLDisplay = EGL14.EGL_NO_DISPLAY;
private EGLContext mEGLContext = EGL14.EGL_NO_CONTEXT;
private EGLSurface mEGLSurface = EGL14.EGL_NO_SURFACE;
private Surface mSurface;
* Creates a CodecInputSurface from a Surface.
public CodecInputSurface(Surface surface) {
if (surface == null) {
throw new NullPointerException();
mSurface = surface;
* Prepares EGL. We want a GLES 2.0 context and a surface that supports recording.
private void eglSetup() {
mEGLDisplay = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY);
if (mEGLDisplay == EGL14.EGL_NO_DISPLAY) {
throw new RuntimeException("unable to get EGL14 display");
int[] version = new int[2];
if (!EGL14.eglInitialize(mEGLDisplay, version, 0, version, 1)) {
throw new RuntimeException("unable to initialize EGL14");
// Configure EGL for recording and OpenGL ES 2.0.
int[] attribList = {
EGLConfig[] configs = new EGLConfig[1];
int[] numConfigs = new int[1];
EGL14.eglChooseConfig(mEGLDisplay, attribList, 0, configs, 0, configs.length,
numConfigs, 0);
checkEglError("eglCreateContext RGB888+recordable ES2");
// Configure context for OpenGL ES 2.0.
int[] attrib_list = {
mEGLContext = EGL14.eglCreateContext(mEGLDisplay, configs[0], EGL14.EGL_NO_CONTEXT,
attrib_list, 0);
// Create a window surface, and attach it to the Surface we received.
int[] surfaceAttribs = {
mEGLSurface = EGL14.eglCreateWindowSurface(mEGLDisplay, configs[0], mSurface,
surfaceAttribs, 0);
* Discards all resources held by this class, notably the EGL context. Also releases the
* Surface that was passed to our constructor.
public void release() {
if (mEGLDisplay != EGL14.EGL_NO_DISPLAY) {
EGL14.eglDestroySurface(mEGLDisplay, mEGLSurface);
EGL14.eglDestroyContext(mEGLDisplay, mEGLContext);
mSurface = null;
* Makes our EGL context and surface current.
public void makeCurrent() {
EGL14.eglMakeCurrent(mEGLDisplay, mEGLSurface, mEGLSurface, mEGLContext);
* Calls eglSwapBuffers. Use this to "publish" the current frame.
public boolean swapBuffers() {
boolean result = EGL14.eglSwapBuffers(mEGLDisplay, mEGLSurface);
return result;
* Sends the presentation time stamp to EGL. Time is expressed in nanoseconds.
public void setPresentationTime(long nsecs) {
EGLExt.eglPresentationTimeANDROID(mEGLDisplay, mEGLSurface, nsecs);
* Checks for EGL errors. Throws an exception if one is found.
private void checkEglError(String msg) {
int error;
if ((error = EGL14.eglGetError()) != EGL14.EGL_SUCCESS) {
throw new RuntimeException(msg + ": EGL error: 0x" + Integer.toHexString(error));
However, the file produced from this code does not play with VLC or ffplay. Can anyone tell me what I'm doing wrong? I believe it is due to an incorrect format (or total lack) of headers required for the playing of raw h264, as I have had success playing .264 files downloaded from the internet with ffplay. Also, I'm not sure exactly how I'm going to stream this video to a computer, so if somebody could give me some suggestions as to how I might do that, I would be very grateful! Thanks!
You should be able to play back a raw H264 stream (as you wrote, other raw .264 files play back just fine with VLC or ffplay), but you are missing the parameter sets. These are passed in two different ways, and you happen to be missing both. First they are returned in MediaFormat when you get MediaCodec.INFO_OUTPUT_FORMAT_CHANGED (which you don't handle, you just log a message about it), secondly they are returned in a buffer with MediaCodec.BUFFER_FLAG_CODEC_CONFIG set (which you ignore by setting the size to 0). The simplest solution here is to remove the special case handling of MediaCodec.BUFFER_FLAG_CODEC_CONFIG, and it should all work just fine.
The code you've based it on does things this way in order to test all the different ways of doing things - where you copied it from, the parameter sets were carried in the MediaFormat from MediaCodec.INFO_OUTPUT_FORMAT_CHANGED. If you wanted to use that in your case with a raw H264 bytestream, you could write the byte buffers with keys csd-0 and csd-1 from the MediaFormat and keep ignoring the buffers with MediaCodec.BUFFER_FLAG_CODEC_CONFIG set.
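The second approach (writing the csd-0 and csd-1 buffers yourself) can be sketched as follows. This is a plain-Java illustration rather than the exact code to drop in: RawH264Writer is a hypothetical class, and the two ByteBuffers are assumed to be the ones obtained from the encoder's output MediaFormat under the keys csd-0 and csd-1, each already prefixed with a 00 00 00 01 start code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

/** Hypothetical writer that puts the parameter sets ahead of the encoded frames in a raw H.264 stream. */
final class RawH264Writer {
    private final OutputStream out;
    private boolean parameterSetsWritten = false;

    RawH264Writer(OutputStream out) {
        this.out = out;
    }

    /** Writes SPS (csd-0) and PPS (csd-1) once, before any frame data. */
    void writeParameterSets(ByteBuffer csd0, ByteBuffer csd1) {
        if (parameterSetsWritten) {
            return;
        }
        writeBuffer(csd0);
        writeBuffer(csd1);
        parameterSetsWritten = true;
    }

    /** Writes size bytes of one encoded output buffer, starting at offset. */
    void writeFrame(ByteBuffer encoded, int offset, int size) {
        if (!parameterSetsWritten) {
            throw new IllegalStateException("parameter sets must be written before frames");
        }
        ByteBuffer view = encoded.duplicate(); // leave the caller's position and limit untouched
        view.clear();
        view.position(offset);
        view.limit(offset + size);
        writeBuffer(view);
    }

    private void writeBuffer(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        try {
            out.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this arrangement, writeParameterSets would be called when INFO_OUTPUT_FORMAT_CHANGED is reported, and writeFrame for every output buffer that does not have BUFFER_FLAG_CODEC_CONFIG set, using bufferInfo.offset and bufferInfo.size.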
You cannot play just raw h264. It does not have any information about the format. You can also find several great examples here. In order to stream, you need to implement a streaming protocol such as RTSP (in the case of real-time streaming) or the more flexible HLS (if real time is not required).