Strange performance of avcodec_decode_video2 - android

I am developing an Android video player. I use ffmpeg in native code to decode video frames. In the native code, I have a thread called decode_thread that calls avcodec_decode_video2():
int decode_thread(void *arg) {
avcodec_decode_video2(codecCtx, pFrame, &frameFinished,pkt);
}
I have another thread called display_thread that uses aNativeWindow to display a decoded frame on a SurfaceView.
The problem is that if I let decode_thread run continuously without a delay, the performance of avcodec_decode_video2() drops significantly: sometimes it takes about 0.1 seconds to decode a frame. However, if I put a delay in decode_thread, like this:
int decode_thread(void *arg) {
avcodec_decode_video2(codecCtx, pFrame, &frameFinished,pkt);
usleep(20*1000);
}
With the delay, avcodec_decode_video2() performs really well, taking about 0.001 seconds per frame. However, putting a delay in decode_thread is not a good solution because it affects playback. Could anyone explain this behavior of avcodec_decode_video2() and suggest a solution?

It seems implausible that the decoding function itself gets faster just because your thread sleeps. Most likely the decoding thread is being preempted by another thread, so the time you measure includes time during which your thread was not running at all. When you add a call to usleep, your thread gives up the CPU voluntarily and a context switch to another thread happens there. When your decoding thread is scheduled again, it starts with a full CPU time slice and is no longer interrupted in the middle of avcodec_decode_video2.
What should you do? You surely want to decode packets a little ahead of when you display them. The performance of avcodec_decode_video2 certainly isn't constant, and if you try to stay just one frame ahead, you might not have enough time to decode one of the frames.
I'd create a producer-consumer queue of decoded frames with an upper limit on its size. The decoder thread is the producer: it should run until it fills the queue, and then wait until there's room for another frame. The display thread is the consumer: it takes frames from this queue and displays them.
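A bounded queue like that could be sketched with pthreads roughly as follows. This is only a minimal illustration, not code from the player in the question: frame_queue, frame_queue_push, frame_queue_pop and the QUEUE_CAPACITY value are names and numbers made up for this sketch, and the frames are stored as opaque pointers (they could be AVFrame pointers).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8 /* hypothetical limit on frames waiting for display */

/* Bounded FIFO of opaque frame pointers shared by the two threads. */
typedef struct {
    void *items[QUEUE_CAPACITY];
    size_t head;              /* index of the oldest queued frame */
    size_t count;             /* number of frames currently queued */
    pthread_mutex_t lock;
    pthread_cond_t not_full;  /* signalled when a slot becomes free */
    pthread_cond_t not_empty; /* signalled when a frame becomes available */
} frame_queue;

void frame_queue_init(frame_queue *q) {
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer (decoder thread): blocks while the queue is full. */
void frame_queue_push(frame_queue *q, void *frame) {
    assert(frame != NULL);
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAPACITY)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = frame;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer (display thread): blocks while the queue is empty. */
void *frame_queue_pop(frame_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    void *frame = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return frame;
}
```

The decoder thread would call frame_queue_push after each successfully decoded frame, and the display thread would call frame_queue_pop when it is time to show the next one. Because the producer blocks on a condition variable when the queue is full, no fixed usleep is needed in the decode loop.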

Related

How to wait for an asynchronous AudioTrack.write to finish?

I'd like to play back audio that is synthesized at 1/50 s increments. With the asynchronous streaming interface of AudioTrack my plan is to basically do the following:
while (!done)
{
frame = synthesize();
audio.waitForWrite(); // XXX
audio.write(frame, 0, frameSize, WRITE_NON_BLOCKING);
}
audio.waitForWrite(); // XXX
However, there is no waitForWrite or similar method on AudioTrack that I could use here. If I just do a non-blocking write, the second frame will replace the first one mid-playback: say synthesis of a 20 ms frame takes 5 ms, then the first frame will play for 5 ms and then get replaced by the second one, and so on, which is clearly not what I want.
On the other hand, if I use blocking writes, then I can't synthesize the next frame while the previous one is already playing.

You misunderstand streaming mode. A write call doesn't take as long as the audio takes to play; it copies the data to another buffer. In blocking mode, it waits until the entire buffer has been copied, but not until it has been played. In non-blocking mode, it copies as much as it can right now and returns immediately. There is no need to wait for the audio to be played in either mode, and no reason to.
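As a rough illustration of those copy semantics, here is a toy model in plain C. It is not AudioTrack's actual implementation: ring_buffer, ring_write_nonblocking, ring_play and RING_CAPACITY are names invented for this sketch. It only shows that a non-blocking write appends whatever currently fits behind the data already queued, rather than replacing it.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 8 /* hypothetical buffer size, in samples */

typedef struct {
    short data[RING_CAPACITY];
    size_t read_pos; /* next sample the playback side will consume */
    size_t used;     /* samples written but not yet played */
} ring_buffer;

/* Appends as many samples as currently fit and returns that count at once. */
size_t ring_write_nonblocking(ring_buffer *rb, const short *src, size_t n) {
    assert(rb->used <= RING_CAPACITY);
    size_t space = RING_CAPACITY - rb->used;
    size_t to_copy = n < space ? n : space;
    for (size_t i = 0; i < to_copy; i++)
        rb->data[(rb->read_pos + rb->used + i) % RING_CAPACITY] = src[i];
    rb->used += to_copy;
    return to_copy;
}

/* Playback side: consumes up to n samples, freeing room for later writes. */
size_t ring_play(ring_buffer *rb, short *dst, size_t n) {
    size_t to_play = n < rb->used ? n : rb->used;
    for (size_t i = 0; i < to_play; i++)
        dst[i] = rb->data[(rb->read_pos + i) % RING_CAPACITY];
    rb->read_pos = (rb->read_pos + to_play) % RING_CAPACITY;
    rb->used -= to_play;
    return to_play;
}
```

In this model a caller that gets back fewer samples than it passed in simply offers the remainder again later, which is the usual way to handle a short non-blocking write.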

SurfaceTexture's onFrameAvailable() method always called too late

I'm trying to get the following MediaExtractor example to work:
http://bigflake.com/mediacodec/ - ExtractMpegFramesTest.java (requires 4.1, API 16)
The problem I have is that outputSurface.awaitNewImage(); seems to always throw RuntimeException("frame wait timed out"), which is thrown whenever the mFrameSyncObject.wait(TIMEOUT_MS) call times out. No matter what I set TIMEOUT_MS to be, onFrameAvailable() always gets called right after the timeout occurs. I tried with 50ms and with 30000ms and it's the same.
It seems like the onFrameAvailable() call can't be delivered while the thread is busy, and once the timeout happens, which ends the thread's code execution, the onFrameAvailable() call can finally be processed.
Has anyone managed to get this example to work, or does anyone know how MediaExtractor is supposed to work with GL textures?
Edit: I tried this on devices running Android 4.4 and 4.1.1, and the same happens on both.
Edit 2:
Got it working on 4.4 thanks to fadden. The issue was that the ExtractMpegFramesWrapper.runTest() method called th.join(), which blocked the main thread and prevented the onFrameAvailable() call from being processed. Once I commented out th.join(), it worked on 4.4. I guess ExtractMpegFramesWrapper.runTest() itself was supposed to run on yet another thread so that the main thread didn't get blocked.
There was also a small issue on 4.1.2 when calling codec.configure(); it gave the error:
A/ACodec(2566): frameworks/av/media/libstagefright/ACodec.cpp:1041 CHECK(def.nBufferSize >= size) failed.
A/libc(2566): Fatal signal 11 (SIGSEGV) at 0xdeadbaad (code=1), thread 2625 (CodecLooper)
Which I solved by adding the following before the call:
format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 0);
However the problem I have now on both 4.1.1 (Galaxy S2 GT-I9100) and 4.1.2 (Samsung Galaxy Tab GT-P3110) is that they both always set info.size to 0 for all frames. Here is the log output:
loop
input buffer not available
no output from decoder available
loop
input buffer not available
no output from decoder available
loop
input buffer not available
no output from decoder available
loop
input buffer not available
no output from decoder available
loop
submitted frame 0 to dec, size=20562
no output from decoder available
loop
submitted frame 1 to dec, size=7193
no output from decoder available
loop
[... skipped 18 lines ...]
submitted frame 8 to dec, size=6531
no output from decoder available
loop
submitted frame 9 to dec, size=5639
decoder output format changed: {height=240, what=1869968451, color-format=19, slice-height=240, crop-left=0, width=320, crop-bottom=239, crop-top=0, mime=video/raw, stride=320, crop-right=319}
loop
submitted frame 10 to dec, size=6272
surface decoder given buffer 0 (size=0)
loop
[... skipped 1211 lines ...]
submitted frame 409 to dec, size=456
surface decoder given buffer 1 (size=0)
loop
sent input EOS
surface decoder given buffer 0 (size=0)
loop
surface decoder given buffer 1 (size=0)
loop
surface decoder given buffer 0 (size=0)
loop
surface decoder given buffer 1 (size=0)
loop
[... skipped 27 lines all with size=0 ...]
surface decoder given buffer 1 (size=0)
loop
surface decoder given buffer 0 (size=0)
output EOS
Saving 0 frames took ? us per frame // edited to avoid division-by-zero error
So no images get saved. However, the same code and video work on 4.3. The video I am using is an .mp4 file with the "H264 - MPEG-4 AVC (avc1)" video codec and the "MPEG AAC Audio (mp4a)" audio codec.
I also tried other video formats, but they seem to die even sooner on 4.1.x, while both work on 4.3.
Edit 3:
I did as you suggested, and it seems to save the frame images correctly. Thank you.
Regarding KEY_MAX_INPUT_SIZE, I tried not setting, or setting it to 0, 20, 200, ... 200000000, all with the same result of info.size=0.
I am now unable to render to a SurfaceView or TextureView in my layout. I tried replacing this line:
mSurfaceTexture = new SurfaceTexture(mTextureRender.getTextureId());
with this, where textureView is a TextureView defined in my XML layout:
mSurfaceTexture = textureView.getSurfaceTexture();
mSurfaceTexture.attachToGLContext(mTextureRender.getTextureId());
but it throws a weird error with getMessage()==null on the second line. I couldn't find any other way to get it to draw on a View of some kind. How can I change the decoder to display the frames on a Surface/SurfaceView/TextureView instead of saving them?

The way SurfaceTexture works makes this a bit tricky to get right.
The docs say the frame-available callback "is called on an arbitrary thread". The SurfaceTexture class has a bit of code that does the following when initializing (line 318):
if (this thread has a looper) {
handle events on this thread
} else if (there's a "main" looper) {
handle events on the main UI thread
} else {
no events for you
}
The frame-available events are delivered to your app through the usual Looper / Handler mechanism. That mechanism is just a message queue, which means the thread needs to be sitting in the Looper event loop waiting for them to arrive. The trouble is, if you're sleeping in awaitNewImage(), you're not watching the Looper queue. So the event arrives, but nobody sees it. Eventually awaitNewImage() times out, and the thread returns to watching the event queue, where it immediately discovers the pending "new frame" message.
So the trick is to make sure that frame-available events arrive on a different thread from the one sitting in awaitNewImage(). In the ExtractMpegFramesTest example, this is done by running the test in a newly-created thread (see the ExtractMpegFramesWrapper class), which does not have a Looper. (For some reason the thread that executes CTS tests has a looper.) The frame-available events arrive on the main UI thread.
Update (for "edit 3"): I'm a bit sad that ignoring the "size" field helped, but pre-4.3 it's hard to predict how devices will behave.
If you just want to display the frame, pass the Surface you get from the SurfaceView or TextureView into the MediaCodec decoder configure() call. Then you don't have to mess with SurfaceTexture at all -- the frames will be displayed as you decode them. See the two "Play video" activities in Grafika for examples.
If you really want to go through a SurfaceTexture, you need to change CodecOutputSurface to render to a window surface rather than a pbuffer. (The off-screen rendering is done so we can use glReadPixels() in a headless test.)

Android Camera onPreviewFrame frame rate not consistent

I am trying to encode a 30 frames per second video using MediaCodec, fed from the Camera's preview callback (onPreviewFrame). The video that I encode always plays back too fast, which is not what I want.
So I tried to check the number of frames coming in from my camera's preview by adding an int frameCount variable to count them. I was expecting 30 frames per second because I set up my camera's preview for 30 fps (as shown below). The result I get back is not the same.
I let the onPreviewFrame callback run for 10 seconds, and frameCount came back as only about 100 frames. This is bad because I was expecting 300 frames. Are my camera parameters set up correctly? Is this a limitation of Android's Camera preview callback? And if it is, is there any other camera callback that can return the camera's image data (NV21, YUV, YV12) at 30 frames per second?
Thanks for reading and taking the time to help out. I would appreciate any comments and opinions.
Here is an example of a video encoded using the Camera's onPreviewFrame:
http://www.youtube.com/watch?v=I1Eg2bvrHLM&feature=youtu.be
Camera.Parameters parameters = mCamera.getParameters();
parameters.setPreviewFormat(ImageFormat.NV21);
parameters.setPictureSize(previewWidth,previewHeight);
parameters.setPreviewSize(previewWidth, previewHeight);
// parameters.setPreviewFpsRange(30000,30000);
parameters.setPreviewFrameRate(30);
mCamera.setParameters(parameters);
mCamera.setPreviewCallback(previewCallback);
mCamera.setPreviewDisplay(holder);

No, the Android camera does not guarantee a stable frame rate, especially at 30 FPS. For example, it may choose a longer exposure in low-light conditions.
But there are some ways we, app developers, can make things worse.
First, by using setPreviewCallback() instead of setPreviewCallbackWithBuffer(). This may cause unnecessary pressure on the garbage collector.
Second, if onPreviewFrame() arrives on the main (UI) thread, any UI activity directly delays the arrival of camera frames. To keep onPreviewFrame() on a separate thread, you should open() the camera on a secondary Looper thread. I explained in detail how this can be achieved here: Best use of HandlerThread over other similar classes.
Third, check that your per-frame processing time is less than 20 ms.

Accurate POSIX thread timing using NDK

I'm writing a simple NDK OpenSL ES audio app that records the user's touches on a virtual piano keyboard and then plays them back forever over a set loop. After much experimenting and reading, I've settled on using a separate POSIX thread to achieve this. As you can see in the code, it subtracts any processing time taken from the sleep time in order to make the interval of each loop as close to the desired sleep interval as possible (in this case 5000000 nanoseconds):
void init_timing_loop() {
pthread_t fade_in;
pthread_create(&fade_in, NULL, timing_loop, (void*)NULL);
}
void* timing_loop(void* args) {
while (1) {
clock_gettime(CLOCK_MONOTONIC, &timing.start_time_s);
tic_counter(); // simple logic gates that cycle the current tic
play_all_parts(); // for-loops through all parts and plays any notes (From an OpenSL buffer) that fall on the current tic
clock_gettime(CLOCK_MONOTONIC, &timing.finish_time_s);
timing.diff_time_s.tv_nsec = (5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec));
nanosleep(&timing.diff_time_s, NULL);
}
return NULL;
}
The problem is that although this improves the results, they are still quite inconsistent. Sometimes notes are delayed by perhaps as much as 50 ms at a time, which makes for very wonky playback.
Is there a better way of approaching this? To debug I ran the following code:
gettimeofday(&timing.curr_time, &timing.tzp);
__android_log_print(ANDROID_LOG_DEBUG, "timing_loop", "gettimeofday: %d %d",
timing.curr_time.tv_sec, timing.curr_time.tv_usec);
Which gives a fairly consistent readout - that doesn't reflect the playback inaccuracies whatsoever. Are there other forces at work with Android preventing accurate timing? Or is OpenSL ES a potential issue? All the buffer data is loaded into memory - could there be bottlenecks there?
Happy to post more OpenSL code if needed... but at this stage I'm trying to figure out if this thread loop is accurate or if there's a better way to do it.

You should take the seconds into account when using clock_gettime as well: timing.start_time_s.tv_nsec can be greater than timing.finish_time_s.tv_nsec, because tv_nsec starts again from zero whenever tv_sec is incremented. Instead of
timing.diff_time_s.tv_nsec =
(5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec));
try something like
#define NS_IN_SEC 1000000000LL
(timing.finish_time_s.tv_sec * NS_IN_SEC + timing.finish_time_s.tv_nsec) -
(timing.start_time_s.tv_sec * NS_IN_SEC + timing.start_time_s.tv_nsec)
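A seconds-aware version of that subtraction could be wrapped in small helpers along these lines. This is a sketch, not the asker's code: elapsed_ns and remaining_sleep are hypothetical names, the arithmetic is done in 64-bit integers on the assumption that time_t may be only 32 bits wide on some targets, and the sleep time is clamped to zero for the case where the work overran the period.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_IN_SEC 1000000000LL

/* Nanoseconds from start to finish, counting whole seconds as well. */
int64_t elapsed_ns(const struct timespec *start, const struct timespec *finish) {
    return (int64_t)(finish->tv_sec - start->tv_sec) * NS_IN_SEC
         + (int64_t)(finish->tv_nsec - start->tv_nsec);
}

/* Time left in a period after the work took `elapsed` ns; never negative. */
struct timespec remaining_sleep(int64_t period_ns, int64_t elapsed) {
    assert(period_ns >= 0);
    int64_t left = period_ns - elapsed;
    if (left < 0)
        left = 0; /* the work overran the period: do not sleep at all */
    struct timespec ts;
    ts.tv_sec = (time_t)(left / NS_IN_SEC);
    ts.tv_nsec = (long)(left % NS_IN_SEC);
    return ts;
}
```

In the loop from the question, the two clock_gettime readings would go into elapsed_ns, and the result of remaining_sleep(5000000, ...) would be passed to nanosleep. Both tv_sec and tv_nsec of the sleep interval are set explicitly, so nanosleep never sees an uninitialized or negative field.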

AudioTrack restarting even after it is stopped

I created a simple application that generates a square wave of given frequency and plays it using AudioTrack in STREAM mode (STREAM_MUSIC). Everything seems to be working fine and the sound plays okay, however when the stream is finished I get messages in the log:
W/AudioTrack( 7579): obtainBuffer() track 0x14c228 disabled, restarting ...
Even after calling the stop() function I still get these.
I believe I properly set the AudioTrack buffer size, based on minimal size required by AudioTrack (in my case 6x1024). I feed it with smaller buffers of 1024 shorts.
Is it okay that I'm getting these and should I leave it like that?

OK, I think the problem is solved. The warning is generated when the buffer is not completely filled with data in time (buffer underrun). I have no idea what the timeout is, but if you experience this, make sure that:
You don't call the play method until you have some data in the buffer.
You can generate the data fast enough to beat the timeout.
After you are finished feeding the buffer with data, before you call stop() method, make sure that the "last" buffer was completely filled with data before timeout.
I dealt with the last issue by always waiting a little (until timeout) then sending 1 buffer full of zeroes and finally calling the stop() function.
Keep in mind that you must always send the buffer in smaller chunks, even if you have the big chunk ready. It still bothers me a bit that I'm not 100% sure if that is the right way but the errors are gone so I guess I can live with that :)

I've found that even when the buffer is technically long enough, and filled with bytes, if they aren't properly formatted (audio shorts converted to a byte array) it will still throw you that error.

I was getting that warning when I instantiated the AudioTrack and called audioTrack.play(), and there was a slight delay between the play() call and the audioTrack.write(). If I called play() right before write(), the warning disappeared.

I solved it like this:
if (mAudioTrack.getPlayState()!=AudioTrack.PLAYSTATE_PLAYING)
mAudioTrack.play();
mAudioTrack.write(b, 0, sz * 2);
mAudioTrack.stop();
mAudioTrack.flush();
