I'm manually reading an RTP/H264 stream and passing the H264 frames to the Android MediaCodec. I use the marker bit as the frame boundary. The MediaCodec is tied to an OpenGL texture (SurfaceTexture).
In general everything works fine, but the decoder appears to buffer frames. If I put a frame into the decoder, it is not rendered to the texture immediately. Only after I put 2-3 more frames into the decoder is the first frame rendered to the texture.
I'm implementing against Android 4.4.4.
private static final int INFINITE_TIMEOUT = -1;
private static final int TIMEOUT_OUTPUT_BUFFER_MEDIA_CODEC = 1000;
...
int bufferIndex = codec.dequeueInputBuffer(INFINITE_TIMEOUT);
if (bufferIndex < 0) {
    throw new RuntimeException("Error");
}
ByteBuffer inputBuffer = inputBuffers[bufferIndex];
inputBuffer.clear();
// Copy H264 data to inputBuffer
h264Frame.fill(inputBuffer);
codec.queueInputBuffer(bufferIndex, 0, inputBuffer.position(), 0 /* presentationTimeUs */, 0 /* flags */);
drainOutputBuffers();
...
and
private boolean drainOutputBuffers() {
    MediaCodec.BufferInfo buffInfo = new MediaCodec.BufferInfo();
    int outputBufferIndex = codec.dequeueOutputBuffer(buffInfo, TIMEOUT_OUTPUT_BUFFER_MEDIA_CODEC);
    if (outputBufferIndex >= 0) {
        // Render the decoded frame to the SurfaceTexture
        codec.releaseOutputBuffer(outputBufferIndex, true);
        return true;
    }
    switch (outputBufferIndex) {
        case MediaCodec.INFO_TRY_AGAIN_LATER:
            LOG.debug("Could not dequeue output buffer. Try again later");
            break;
        case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
            LOG.warn("The output format has changed.");
            break;
        case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
            LOG.warn("The output buffers have changed.");
            break;
        default:
            LOG.warn("The output buffer index was negative: {}", outputBufferIndex);
    }
    return false;
}
On the rendering side I use the "onFrameAvailable" callback to check whether I have to update the texture on the OpenGL thread. The flag I use for this check is guarded by a lock (synchronized).
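For reference, a minimal sketch of that pattern (frameAvailable and frameLock are illustrative names, not my exact code):

private final Object frameLock = new Object();
private boolean frameAvailable = false;

surfaceTexture.setOnFrameAvailableListener(new SurfaceTexture.OnFrameAvailableListener() {
    @Override
    public void onFrameAvailable(SurfaceTexture st) {
        synchronized (frameLock) {
            frameAvailable = true; // set on the callback thread
        }
    }
});

// On the OpenGL thread, once per draw:
synchronized (frameLock) {
    if (frameAvailable) {
        surfaceTexture.updateTexImage(); // latch the newest decoded frame into the texture
        frameAvailable = false;
    }
}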
I suspect that the presentation timestamp may influence rendering, but I set it to 0, so I assume the frame should be rendered without a delay.
I'd like to have the frame rendered to the texture without having to put additional frames.
From the MediaCodec documentation
The Executing state has three sub-states: Flushed, Running and
End-of-Stream. Immediately after start() the codec is in the Flushed
sub-state, where it holds all the buffers. As soon as the first input
buffer is dequeued, the codec moves to the Running sub-state, where it
spends most of its life. When you queue an input buffer with the
end-of-stream marker, the codec transitions to the End-of-Stream
sub-state. In this state the codec no longer accepts further input
buffers, but still generates output buffers until the end-of-stream is
reached on the output. You can move back to the Flushed sub-state at
any time while in the Executing state using flush().
You need to "queue an input buffer with the end-of-stream marker". Do this with the first frame you feed to the decoder (make sure it is a keyframe).
This point is to tell the decoder not to expect anymore frames and therefore begin playback immediately. Otherwise it's normal to feed 3 or 4 frames before seeing anything. This an expectation of all MPEG decoders and is not Android-related.
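A minimal sketch of that call, reusing the variables from the question above (note that after signalling end-of-stream the codec accepts no further input until flush() is called):

int inIndex = codec.dequeueInputBuffer(INFINITE_TIMEOUT);
ByteBuffer inBuf = inputBuffers[inIndex];
inBuf.clear();
h264Frame.fill(inBuf);
// Queue the keyframe with the end-of-stream flag so the decoder renders it immediately
codec.queueInputBuffer(inIndex, 0, inBuf.position(), 0,
        MediaCodec.BUFFER_FLAG_END_OF_STREAM);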
If anyone is looking at this in 2021:
MediaCodec may or may not hold frames (I-frames, I mean); it depends entirely on the device. If you want to be sure a frame is released, for example when you plan to decode a single frame, you can call
mediaCodec.stop()
every time you put data into the input buffer so that it releases the frame. Afterwards, you have to configure and start the MediaCodec again for the next frame.
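A rough sketch of that per-frame stop/start pattern (note that after stop() the codec returns to the Uninitialized state, so it must be configured again before the next start(); format, surface and frameBytes are placeholders):

int inIndex = mediaCodec.dequeueInputBuffer(-1);
ByteBuffer in = mediaCodec.getInputBuffer(inIndex);
in.put(frameBytes);
mediaCodec.queueInputBuffer(inIndex, 0, frameBytes.length, 0, 0);

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = mediaCodec.dequeueOutputBuffer(info, 100_000);
if (outIndex >= 0) {
    mediaCodec.releaseOutputBuffer(outIndex, true); // render to the surface
}

// Per the suggestion above: stop so the decoder does not keep holding frames,
// then reconfigure and restart before feeding the next frame.
mediaCodec.stop();
mediaCodec.configure(format, surface, null, 0);
mediaCodec.start();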
The MediaCodec decoder buffers 6-7 frames before outputting the first decoded frame. It seems to be a flaw in MediaCodec, and it is a problem for streaming applications.
So far my debugging shows that decoding H264 with MediaCodec has a 6-7 frame delay at the start of the stream.
Related
I'm working with Android MediaCodec and use it for real-time H264 encoding and decoding of frames from the camera. I use MediaCodec in synchronous mode and render the output to the Surface of the decoder. Everything works fine except that I have a long latency relative to real time: it takes 1.5-2 seconds and I'm very confused about why that is.
I measured the total time of the encoding and decoding processes and it stays around 50-65 milliseconds, so I think the problem isn't there.
I tried changing the configuration of the encoder but it didn't help; currently it's configured like this:
val formatEncoder = MediaFormat.createVideoFormat("video/avc", 1920, 1080)
formatEncoder.setInteger(MediaFormat.KEY_FRAME_RATE, 30)
formatEncoder.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5)
formatEncoder.setInteger(MediaFormat.KEY_BIT_RATE, 1920 * 1080)
formatEncoder.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface)
val encoder = MediaCodec.createEncoderByType("video/avc")
encoder.configure(formatEncoder, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
val inputSurface = encoder.createInputSurface() // I use it to send frames from camera to encoder
encoder.start()
Changing the configuration of the decoder also didn't help at all; currently it's configured like this:
val formatDecoder = MediaFormat.createVideoFormat("video/avc", 1920, 1080)
val decoder = MediaCodec.createDecoderByType("video/avc")
decoder.configure(formatDecoder , outputSurface, null, 0) // I use outputSurface to render decoded frames into it
decoder.start()
I use the following timeouts when waiting for available encoder/decoder buffers. I tried to reduce their values, but it didn't help, so I left them like this:
var TIMEOUT_IN_BUFFER = 10000L // microseconds
var TIMEOUT_OUT_BUFFER = 10000L // microseconds
I also measured the time it takes the inputSurface to consume a frame; it takes 0.03-0.05 milliseconds, so that isn't a bottleneck. Actually I measured all the places where a bottleneck could be, but I didn't find anything, so I think the problem is in the encoder or decoder itself or in their configurations, or maybe I should use some special routine for sending frames to encoding/decoding.
I also tried to use a HW-accelerated codec and it's the only thing that helped me: with it the latency drops to ~500-800 milliseconds, but that still doesn't work for real-time streaming.
It seems to me that the encoder or decoder buffers several frames before it starts displaying them on the surface, which eventually leads to the latency. If that is really the case, how can I disable this buffering or reduce its duration?
Please help me, I've been stuck on this problem for about half a year and have no idea how to reduce the latency. I'm sure it's possible, because popular apps like Telegram, Viber, WhatsApp etc. work fine and without latency, so what's the secret here?
UPD 07.07.2021:
I still haven't found a solution to get rid of the latency. I've tried changing H264 profiles and increasing and decreasing the I-frame interval, bitrate, and framerate, but the result is the same. The only thing that helps a little to reduce the latency is downgrading the resolution from 1920x1080 to e.g. 640x480, but this "solution" doesn't suit me because I want to encode/decode real-time video at 1920x1080.
UPD 08.07.2021:
I found out that if I change the values of TIMEOUT_IN_BUFFER and TIMEOUT_OUT_BUFFER from 10_000L to 100_000L, it decreases the latency a bit but considerably increases the delay before the first frame is shown after starting the encoding/decoding process.
It's possible your encoder is producing B frames -- bidirectionally predicted ("interpolated") frames. They increase quality and latency, and are great for movies, but no good for low-latency applications.
Key frames = I (intra-coded frames)
Predicted frames = P (difference from previous frames)
Bidirectionally predicted frames = B (interpolated from frames before and after)
A sequence of frames including B frames might look like this (display order, numbered 1-17):
I  B  B  B  P  B  B  B  P  B  B  B  P  B  B  B  I
1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17
The encoder must encode each P frame, and the decoder must decode it, before the preceding B frames make any sense. So in this example the frames get encoded out of order like this:
1 5 2 3 4 9 6 7 8 13 10 11 12 17 14 15 16
In this example the decoder can't handle frame 2 until the encoder has sent frame 5.
On the other hand, this sequence without B frames allows coding and decoding the frames in order.
IPPPPPPPPPPIPPPPPPPPP
Try using the Constrained Baseline Profile setting. It's designed for low latency and low power use, and it suppresses B frames. I think this will work.
mediaFormat.setInteger(
        MediaFormat.KEY_PROFILE,
        MediaCodecInfo.CodecProfileLevel.AVCProfileConstrainedBaseline);
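Since not every codec advertises every profile (and the AVCProfileConstrainedBaseline constant only exists on newer API levels), it may be worth checking the codec's capabilities first. A hedged sketch, where encoder is the already-created MediaCodec:

MediaCodecInfo.CodecCapabilities caps =
        encoder.getCodecInfo().getCapabilitiesForType("video/avc");
boolean constrainedBaselineSupported = false;
for (MediaCodecInfo.CodecProfileLevel pl : caps.profileLevels) {
    if (pl.profile == MediaCodecInfo.CodecProfileLevel.AVCProfileConstrainedBaseline) {
        constrainedBaselineSupported = true;
        break;
    }
}
if (constrainedBaselineSupported) {
    mediaFormat.setInteger(MediaFormat.KEY_PROFILE,
            MediaCodecInfo.CodecProfileLevel.AVCProfileConstrainedBaseline);
}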
I believe the Android H264 decoder has latency (at least in most cases I've tried). That's probably why the Android developers added PARAMETER_KEY_LOW_LATENCY in API level 30.
However, I could decrease the delay by a few frames by querying for output a few more times.
Reason: no idea; it's just the result of tedious trial and error.
int inputIndex = m_codec.dequeueInputBuffer(-1); // Pass in -1 here because we don't have a playback time reference
if (inputIndex >= 0) {
    ByteBuffer buffer;
    if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.LOLLIPOP) {
        buffer = m_codec.getInputBuffer(inputIndex);
    } else {
        ByteBuffer[] bbuf = m_codec.getInputBuffers();
        buffer = bbuf[inputIndex];
        buffer.clear(); // the legacy buffers are not cleared automatically
    }
    buffer.put(frame);
    // tell the decoder to process the frame
    m_codec.queueInputBuffer(inputIndex, 0, frame.length, 0, 0);
}

// Query for output more than once per input frame; on some devices this
// releases decoded frames a bit earlier.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outputIndex = m_codec.dequeueOutputBuffer(info, 0);
if (outputIndex >= 0) {
    m_codec.releaseOutputBuffer(outputIndex, true);
}
outputIndex = m_codec.dequeueOutputBuffer(info, 0);
if (outputIndex >= 0) {
    m_codec.releaseOutputBuffer(outputIndex, true);
}
outputIndex = m_codec.dequeueOutputBuffer(info, 0);
if (outputIndex >= 0) {
    m_codec.releaseOutputBuffer(outputIndex, true);
}
You need to configure customized low-latency parameters (or KEY_LOW_LATENCY, if it is supported) for the different CPU vendors. This is a common problem on Android phones.
Check this code https://github.com/moonlight-stream/moonlight-android/blob/master/app/src/main/java/com/limelight/binding/video/MediaCodecHelper.java
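For devices on API 30 and newer the standard knob looks roughly like this (a sketch; decoder, outputSurface, width and height are placeholders, and older devices need the vendor-specific keys that the linked MediaCodecHelper collects):

MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
if (Build.VERSION.SDK_INT >= 30) {
    MediaCodecInfo.CodecCapabilities caps =
            decoder.getCodecInfo().getCapabilitiesForType("video/avc");
    if (caps.isFeatureSupported(MediaCodecInfo.CodecCapabilities.FEATURE_LowLatency)) {
        format.setInteger(MediaFormat.KEY_LOW_LATENCY, 1); // hint the decoder to output as soon as possible
    }
}
decoder.configure(format, outputSurface, null, 0);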
Refer to this link; I just add a simple delay when an output buffer is available:
ByteBuffer buffer = outputBuffers[outIndex];
Log.v("DecodeActivity", "We can't use this buffer but render it due to the API limit, " + buffer);

// We use a very simple clock to keep the video FPS, or the video
// playback will be too fast
while (info.presentationTimeUs / 1000 > System.currentTimeMillis() - startMs) {
    try {
        sleep(10);
    } catch (InterruptedException e) {
        e.printStackTrace();
        break;
    }
}
decoder.releaseOutputBuffer(outIndex, true);
But when I feed it the frames of a 25fps video, the decoded video looks like only 10fps (many frames appear to be dropped).
However, if I add a frame counter to check the fps, it's really 25fps, and if I add a MediaMuxer to mux the frames from the input buffer, the result plays back fine, which means the frames are not actually dropped.
So it's weird that the frames are there but not shown on screen; if I remove the delay, the playback is very fast (almost 50fps).
I just found that the issue is caused by TextureView; after changing TextureView to SurfaceView, it works fine now.
But it's still not clear why TextureView performance is so bad.
I have a problem with RTMP streaming of an Android surface to a client application. My solution has very high latency, because my surface does not produce frames 60 times a second; it can produce them at any time (once in 30 seconds, for example). So I want to show each newly produced frame to the client immediately.
Android pushes every frame, and that looks fine. The client app (jwplayer or vlc) receives it, but it waits for something. It starts showing video only after receiving a number of frames, but I need to see every incoming frame on the client side as soon as it is received.
How it works now:
I have a Surface object obtained from the MediaCodec class. MediaCodec is configured for H264 video encoding.
MediaCodec mEncoder;
.....
MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT, colorFormat);
format.setInteger(MediaFormat.KEY_BIT_RATE, videoBitrate);
format.setInteger(MediaFormat.KEY_FRAME_RATE, videoFramePerSecond);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, iframeInterval);
try {
    mEncoder = MediaCodec.createEncoderByType("video/avc");
} catch (IOException e) {
    e.printStackTrace();
}
mEncoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
mSurface = mEncoder.createInputSurface();
if (mSurfaceCallback != null) {
    mSurfaceCallback.onSurfaceCreated(mSurface);
}
mEncoder.start();
Sometimes Android draws to the surface. I can't control the rate of these drawings, and I can't draw anything to that surface myself. Whenever something changes on the surface, MediaCodec produces a new ByteBuffer with an H264 frame, and I send this frame over RTMP.
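The loop that picks up the encoded frames and hands them to the RTMP sender looks roughly like this (a simplified sketch; sendOverRtmp() and stopped are placeholders for my actual sender and state):

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
while (!stopped) {
    int index = mEncoder.dequeueOutputBuffer(info, 10_000); // 10 ms timeout
    if (index >= 0) {
        ByteBuffer encoded = mEncoder.getOutputBuffer(index);
        byte[] frame = new byte[info.size];
        encoded.position(info.offset);
        encoded.get(frame, 0, info.size);
        boolean keyFrame = (info.flags & MediaCodec.BUFFER_FLAG_KEY_FRAME) != 0;
        sendOverRtmp(frame, info.presentationTimeUs, keyFrame);
        mEncoder.releaseOutputBuffer(index, false);
    } else if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        // SPS/PPS become available here via mEncoder.getOutputFormat()
    }
}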
On the client side I have an HTML page with jwplayer:
<pre id="myElement"></pre>

<script>
var playerInstance = jwplayer("myElement");
playerInstance.setup({
    file: "rtmp://127.0.0.1:1935/live/stream",
    height: 800,
    width: 480,
    autostart: true,
    controls: false,
    rtmp: {
        bufferlength: 0.1
    }
});
</script>
I've tried changing iframeInterval, the encoding fps, and bufferlength. Nothing really helps.
Is there any possibility to show incoming frames immediately?
What do you mean? If I understood correctly, you have:
vlc (client) ---- RTMP protocol ---- Android (producer)
You encode video from something (maybe a camera) using MediaCodec, and in VLC there is latency, right?
First: what are you using, the direct input buffers or MediaCodec.Callback()?
With the callback you can check every frame in onOutputBufferAvailable and measure the time from one frame to the next; this will show you whether the problem is on the Android side.
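A sketch of that timing check with the asynchronous API (illustrative; setCallback() has to be called before configure()):

codec.setCallback(new MediaCodec.Callback() {
    long lastFrameTimeMs = 0;

    @Override
    public void onOutputBufferAvailable(MediaCodec mc, int index, MediaCodec.BufferInfo info) {
        long now = System.currentTimeMillis();
        if (lastFrameTimeMs != 0) {
            Log.d("Timing", "ms since previous encoded frame: " + (now - lastFrameTimeMs));
        }
        lastFrameTimeMs = now;
        // ... send the buffer, then release it
        mc.releaseOutputBuffer(index, false);
    }

    @Override
    public void onInputBufferAvailable(MediaCodec mc, int index) { }

    @Override
    public void onError(MediaCodec mc, MediaCodec.CodecException e) { }

    @Override
    public void onOutputFormatChanged(MediaCodec mc, MediaFormat format) { }
});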
Then you can try to resolve the frame transfer problem: you can use Wireshark to check the frame sending timing; maybe this is a network problem.
Also, VLC and other players try to fill an internal buffer and only start showing video after that. Try turning off the VLC buffer (https://forum.videolan.org/viewtopic.php?t=40408). It is also common for VLC to wait for an IDR frame. You can set the interval for sending IDR frames in code:
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, iframeInterval);
iframeInterval is in seconds (try setting it to 1 second; this will increase the stream size).
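Another knob, sketched here as an addition to the answer above: instead of permanently shortening the I-frame interval, the running encoder can be asked for an IDR frame on demand, for example when a new client connects:

Bundle params = new Bundle();
params.putInt(MediaCodec.PARAMETER_KEY_REQUEST_SYNC_FRAME, 0); // value is ignored
mEncoder.setParameters(params);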
Sorry for my bad english
You can hopefully generate video frames at a constant rate, even more than 20 fps, to produce smooth video with acceptable latency. The H264 encoder will handle a stable picture (one changing once in ~30 sec) gracefully, and when there is no change the frame size will be minimal.
I am developing an H.264 decoder using the MediaCodec API. I am trying to call the MediaCodec Java API from the JNI layer inside a function like:
void Decompress(const unsigned char *encodedInputdata, unsigned int inputLength, unsigned char **outputDecodedData, int &width, int &height) {
    // encodedInputdata is encoded H.264 remote stream
    // .....
    // outputDecodedData = call JNI function of MediaCodec Java API to decode
    // .....
}
Later I will send outputDecodedData to my existing video rendering pipeline and render it on a Surface.
I hope I will be able to write a Java function to decode the input stream, but these would be the challenges:
This resource states that:
...you can't do anything with the decoded video frame but render them
to surface
Here a Surface has been passed to decoder.configure(format, surface, null, 0) to render the output ByteBuffer on the surface, and it is claimed that "We can't use this buffer but render it due to the API limit".
So, will I be able to send the output ByteBuffer to the native layer, cast it to unsigned char*, and pass it to my rendering pipeline, instead of passing a Surface to configure()?
I see two fundamental problems with your proposed function definition.
First, MediaCodec operates on access units (NAL units for H.264), not arbitrary chunks of data from a stream, so you need to pass in one NAL unit at a time. Once the chunk is received, the codec may want to wait for additional frames to arrive before producing any output. You cannot in general pass in one frame of input and wait to receive one frame of output.
Second, as you noted, the ByteBuffer output is YUV-encoded in one of several color formats. The format varies from device to device; Qualcomm devices notably use their own proprietary format. (It has been reverse-engineered, though, so if you search around you can find some code to unravel it.)
The common workaround is to send the video frames to a SurfaceTexture, which converts them to GLES "external" textures. These can be manipulated in various ways, or rendered to a pbuffer and extracted with glReadPixels().
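A rough sketch of that workaround (the EGL context setup and the shader that draws the external texture are omitted; width and height come from the decoder's output format):

int[] tex = new int[1];
GLES20.glGenTextures(1, tex, 0);
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, tex[0]);

SurfaceTexture surfaceTexture = new SurfaceTexture(tex[0]);
Surface decoderSurface = new Surface(surfaceTexture);
decoder.configure(format, decoderSurface, null, 0);
decoder.start();

// After releaseOutputBuffer(index, true) and the onFrameAvailable callback:
surfaceTexture.updateTexImage(); // latch the frame into the external texture
// Render the external texture into an offscreen pbuffer with a GLES shader, then:
ByteBuffer pixels = ByteBuffer.allocateDirect(width * height * 4)
        .order(ByteOrder.nativeOrder());
GLES20.glReadPixels(0, 0, width, height, GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, pixels);
// pixels now holds RGBA data that can be handed to native code.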
I'm using AudioRecord to record the audio stream during a camera capture process on an Android device.
Since I want to process the frame data and handle the audio/video samples myself, I do not use MediaRecorder.
I run AudioRecord in another thread, calling read() to gather the raw audio data.
Once I get a data stream, I feed it into a MediaCodec configured as an AAC audio encoder.
Here is some of my code for the audio recorder/encoder:
m_encode_audio_mime = "audio/mp4a-latm";
m_audio_sample_rate = 44100;
m_audio_channels = AudioFormat.CHANNEL_IN_MONO;
m_audio_channel_count = (m_audio_channels == AudioFormat.CHANNEL_IN_MONO ? 1 : 2);
int audio_bit_rate = 64000;
int audio_data_format = AudioFormat.ENCODING_PCM_16BIT;
m_audio_buffer_size = AudioRecord.getMinBufferSize(m_audio_sample_rate, m_audio_channels, audio_data_format) * 2;
m_audio_recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, m_audio_sample_rate,
        m_audio_channels, audio_data_format, m_audio_buffer_size);
m_audio_encoder = MediaCodec.createEncoderByType(m_encode_audio_mime);
MediaFormat audio_format = new MediaFormat();
audio_format.setString(MediaFormat.KEY_MIME, m_encode_audio_mime);
audio_format.setInteger(MediaFormat.KEY_BIT_RATE, audio_bit_rate);
audio_format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, m_audio_channel_count);
audio_format.setInteger(MediaFormat.KEY_SAMPLE_RATE, m_audio_sample_rate);
audio_format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
audio_format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, m_audio_buffer_size);
m_audio_encoder.configure(audio_format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
I found that the first call to AudioRecord.read() takes longer to return, while successive read() calls have time intervals closer to the real duration of the audio data.
For example, my audio format is 44100Hz, 16-bit, 1 channel, and the buffer size of the AudioRecord is 16384 bytes, so a full buffer corresponds to 185.76 ms. When I record the system time before each call of read() and subtract a base time, I get the following sequence:
time before each read(): 0ms, 345ms, 543ms, 692ms, 891ms, 1093ms, 1244ms, ...
I feed these raw data to the audio encoder with the above time values as PTS, and the encoder outputs encoded audio samples with the following PTS:
encoder output PTS: 0ms, 185ms, 371ms, 557ms, 743ms, 928ms, ...
It looks like the encoder treats each chunk of data as covering the same time period. I believe the encoder works correctly, since I give it raw data of the same size (16384 bytes) every time. However, if I use the encoder output PTS as the input to the muxer, I get a video in which the audio content runs faster than the video content.
I want to ask that:
Is it expected that the first call of AudioRecord.read() blocks longer? I'm sure the call takes more than 300ms while it only records 16384 bytes, which corresponds to about 186ms of audio. Is this an issue that depends on the device / Android version?
What should I do to achieve audio/video synchronization? My workaround is to measure the delay of the first call of read(), then shift the PTS of the audio samples by that delay. Is there a better way to handle this?
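The workaround I mention looks roughly like this (a sketch; queuePcmToEncoder() stands in for my encoder feeding code, and deriving the PTS from the total number of samples read so far would be another option):

long baseUs = System.nanoTime() / 1000;
long firstReadExtraUs = 0; // measured once, then subtracted from every PTS
boolean first = true;
byte[] pcm = new byte[m_audio_buffer_size];

while (recording) {
    int bytesRead = m_audio_recorder.read(pcm, 0, pcm.length);
    long nowUs = System.nanoTime() / 1000;
    // duration actually covered by this buffer: 16-bit mono at m_audio_sample_rate
    long bufferDurationUs = 1_000_000L * (bytesRead / 2) / m_audio_sample_rate;
    if (first) {
        firstReadExtraUs = (nowUs - baseUs) - bufferDurationUs;
        first = false;
    }
    long ptsUs = (nowUs - baseUs) - bufferDurationUs - firstReadExtraUs;
    queuePcmToEncoder(pcm, bytesRead, ptsUs);
}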
Convert the mono input to stereo. I was pulling my hair out for some time before I realised the AAC encoder exposed by MediaCodec only works with stereo input.
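A sketch of what that conversion might look like for 16-bit PCM, assuming the recorder stays mono and each sample is duplicated into both channels, together with declaring two channels to the encoder:

audio_format.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 2);

// Duplicate each 16-bit little-endian mono sample into left and right.
static byte[] monoToStereo(byte[] mono, int length) {
    byte[] stereo = new byte[length * 2];
    for (int i = 0; i < length; i += 2) { // 2 bytes per 16-bit sample
        stereo[2 * i]     = mono[i];      // left, low byte
        stereo[2 * i + 1] = mono[i + 1];  // left, high byte
        stereo[2 * i + 2] = mono[i];      // right, low byte
        stereo[2 * i + 3] = mono[i + 1];  // right, high byte
    }
    return stereo;
}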