I'm using the Android NDK MediaCodec API to decode an MP4 stream of varying resolutions.
I am using the built-in functionality to render to a surface, calling AMediaCodec_releaseOutputBuffer with the render flag set to true.
I have found that every time the resolution changes, several frames of the old resolution are output on a surface the size of the new resolution. It might look something like this for a step up in resolution:
+------------------+-------------+
| frame of old res | |
| displayed too | |
| small | |
+------------------+ |
| |
| size of new resolution |
+--------------------------------+
After a few frames, the video looks normal again. I suspect it's something to do with the surface changing in size when an input buffer is queued with a new resolution, rather than when the first output buffer of that resolution is dequeued.
Has anyone encountered this issue before?
Thanks for your help.
I found the following in the MediaCodec documentation:
For some video formats it is also possible to change the picture size mid-stream. To do this for H.264, the new Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) values must be packaged together with an Instantaneous Decoder Refresh (IDR) frame in a single buffer, which then can be enqueued as a regular input buffer. The client will receive an INFO_OUTPUT_FORMAT_CHANGED return value from dequeueOutputBuffer() or onOutputBufferAvailable() just after the picture-size change takes place and before any frames with the new size have been returned.
Whenever I receive a new SPS and PPS I make sure not to pass them to the decoder straight away, but rather to prepend them to the next I-Frame.
This solved the problem - resolution changes are seamless :)
Does this particular decoder indicate support for MediaCodecInfo.CodecCapabilities.FEATURE_AdaptivePlayback?
I haven't encountered this issue myself (since I haven't tried decoding streams that change resolution), but as far as I know, this feature flag was introduced to signal proper handling for this use case. If you see this behaviour on a decoder with this feature flag, I would consider it a bug (or I haven't understood the actual implications of this feature flag properly).
Related
I am trying to save image sequences with fixed framerates (preferably up to 30) on an android device with FULL capability for camera2 (Galaxy S7), but I am unable to a) get a steady framerate, b) reach even 20fps (with jpeg encoding). I already included the suggestions from Android camera2 capture burst is too slow.
The minimum frame duration for JPEG is 33.33 milliseconds (for resolutions below 1920x1080) according to
characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP).getOutputMinFrameDuration(ImageFormat.JPEG, size);
and the stallduration is 0ms for every size (similar for YUV_420_888).
My capture builder looks as follows:
captureBuilder.set(CaptureRequest.CONTROL_AE_MODE, CONTROL_AE_MODE_OFF);
captureBuilder.set(CaptureRequest.SENSOR_EXPOSURE_TIME, _exp_time);
captureBuilder.set(CaptureRequest.CONTROL_AE_LOCK, true);
captureBuilder.set(CaptureRequest.SENSOR_SENSITIVITY, _iso_value);
captureBuilder.set(CaptureRequest.LENS_FOCUS_DISTANCE, _foc_dist);
captureBuilder.set(CaptureRequest.CONTROL_AF_MODE, CONTROL_AF_MODE_OFF);
captureBuilder.set(CaptureRequest.CONTROL_AWB_MODE, _wb_value);
// https://stackoverflow.com/questions/29265126/android-camera2-capture-burst-is-too-slow
captureBuilder.set(CaptureRequest.EDGE_MODE,CaptureRequest.EDGE_MODE_OFF);
captureBuilder.set(CaptureRequest.COLOR_CORRECTION_ABERRATION_MODE, CaptureRequest.COLOR_CORRECTION_ABERRATION_MODE_OFF);
captureBuilder.set(CaptureRequest.NOISE_REDUCTION_MODE, CaptureRequest.NOISE_REDUCTION_MODE_OFF);
captureBuilder.set(CaptureRequest.CONTROL_AF_TRIGGER, CaptureRequest.CONTROL_AF_TRIGGER_CANCEL);
// Orientation
int rotation = getWindowManager().getDefaultDisplay().getRotation();
captureBuilder.set(CaptureRequest.JPEG_ORIENTATION,ORIENTATIONS.get(rotation));
Focus distance is set to 0.0 (inf), iso is set to 100, exposure-time 5ms. Whitebalance can be set to OFF/AUTO/ANY VALUE, it does not impact the times below.
I start the capture session with the following command:
session.setRepeatingRequest(_capReq.build(), captureListener, mBackgroundHandler);
Note: It does not make a difference if I request RepeatingRequest or RepeatingBurst..
In the preview (only texture surface attached), everything is at 30fps.
However, as soon as I attach an image reader (listener running on HandlerThread) which I instantiate like follows (without saving, only measuring time between frames):
reader = ImageReader.newInstance(_img_width, _img_height, ImageFormat.JPEG, 2);
reader.setOnImageAvailableListener(readerListener, mBackgroundHandler);
With time-measuring code:
ImageReader.OnImageAvailableListener readerListener = new ImageReader.OnImageAvailableListener() {
#Override
public void onImageAvailable(ImageReader myreader) {
Image image = null;
image = myreader.acquireNextImage();
if (image == null) {
return;
}
long curr = image.getTimestamp();
Log.d("curr- _last_ts", "" + ((curr - last_ts) / 1000000) + " ms");
last_ts = curr;
image.close();
}
}
I get periodically repeating time differences like this:
99 ms - 66 ms - 66 ms - 99 ms - 66 ms - 66 ms ...
I do not understand why these take double or triple the time that the stream configuration map advertised for jpeg? The exposure time is well below the frame duration of 33ms. Is there some other internal processing happening that I am not aware of?
I tried the same for the YUV_420_888 format, which resulted in constant time-differences of 33ms. The problem I have here is that the cellphone lacks the bandwidth to store the images fast enough (I tried the method described in How to save a YUV_420_888 image?). If you know of any method to compress or encode these images fast enough myself, please let me know.
Edit: From the documentation of getOutputStallDuration: "In other words, using a repeating YUV request would result in a steady frame rate (let's say it's 30 FPS). If a single JPEG request is submitted periodically, the frame rate will stay at 30 FPS (as long as we wait for the previous JPEG to return each time). If we try to submit a repeating YUV + JPEG request, then the frame rate will drop from 30 FPS." Does this imply that I need to periodically request a single capture()?
Edit2: From https://developer.android.com/reference/android/hardware/camera2/CaptureRequest.html: "The necessary information for the application, given the model above, is provided via the android.scaler.streamConfigurationMap field using getOutputMinFrameDuration(int, Size). These are used to determine the maximum frame rate / minimum frame duration that is possible for a given stream configuration.
Specifically, the application can use the following rules to determine the minimum frame duration it can request from the camera device:
Let the set of currently configured input/output streams be called S.
Find the minimum frame durations for each stream in S, by looking it up in android.scaler.streamConfigurationMap using getOutputMinFrameDuration(int, Size) (with its respective size/format). Let this set of frame durations be called F.
For any given request R, the minimum frame duration allowed for R is the maximum out of all values in F. Let the streams used in R be called S_r.
If none of the streams in S_r have a stall time (listed in getOutputStallDuration(int, Size) using its respective size/format), then the frame duration in F determines the steady state frame rate that the application will get if it uses R as a repeating request."
The JPEG output is by way not the fastest way to fetch frames. You can accomplish this a lot faster by drawing the frames directly onto a Quad using OpenGL.
For burst capture, a faster solution would be capturing the images to RAM without encoding them, then encoding and saving them asynchronously.
On this website you can find a lot of excellent code related to android multimedia in general.
This specific program uses OpenGL to fetch the pixel data from an MPEG video. It's not difficult to use the camera as input instead of a video. You can basically use the texture used in the CodecOutputSurface class from the mentioned program as output texture for your capture request.
A possible solution I found consists of using and dumping YUV without encoding it as JPEG in combination with a micro Sd-card that is able to save up to 95Mb per second. (I had the misconception that YUV images would be larger, so with a cellphone that has full support for the camera2-pipeline, the write speed should be the limiting factor.
With this setup, I was able to achieve the following stable rates:
1920x1080, 15fps (approx. 4Mb * 15 == 60Mb/sec)
960x720, 30fps. (approx. 1.5Mb * 30 == 45Mb/sec)
I then encode the images offline from YUV to PNG using a python script.
I'm using ffmpeg in Android project via JNI to decode real-time H264 video stream. On the Java side I'm only sending the the byte arrays into native module. Native code is running a loop and checking data buffers for new data to decode. Each data chunk is processed with:
int bytesLeft = data->GetSize();
int paserLength = 0;
int decodeDataLength = 0;
int gotPicture = 0;
const uint8_t* buffer = data->GetData();
while (bytesLeft > 0) {
AVPacket packet;
av_init_packet(&packet);
paserLength = av_parser_parse2(_codecPaser, _codecCtx, &packet.data, &packet.size, buffer, bytesLeft, AV_NOPTS_VALUE, AV_NOPTS_VALUE, AV_NOPTS_VALUE);
bytesLeft -= paserLength;
buffer += paserLength;
if (packet.size > 0) {
decodeDataLength = avcodec_decode_video2(_codecCtx, _frame, &gotPicture, &packet);
}
else {
break;
}
av_free_packet(&packet);
}
if (gotPicture) {
// pass the frame to rendering
}
The system works pretty well until incoming video's resolution changes. I need to handle transition between 4:3 and 16:9 aspect ratios. While having AVCodecContext configured as follows:
_codecCtx->flags2|=CODEC_FLAG2_FAST;
_codecCtx->thread_count = 2;
_codecCtx->thread_type = FF_THREAD_FRAME;
if(_codec->capabilities&CODEC_FLAG_LOW_DELAY){
_codecCtx->flags|=CODEC_FLAG_LOW_DELAY;
}
I wasn't able to continue decoding new frames after video resolution change. The got_picture_ptr flag that avcodec_decode_video2 enables when whole frame is available was never true after that.
This ticket made me wonder if the issue isn't connected with multithreading. Only useful thing I've noticed is that when I change thread_type to FF_THREAD_SLICE the decoder is not always blocked after resolution change, about half of my attempts were successfull. Switching to single-threaded processing is not possible, I need more computing power. Setting up the context to one thread does not solve the problem and makes the decoder not keeping up with processing incoming data.
Everything work well after app restart.
I can only think of one workoround (it doesn't really solve the problem): unloading and loading the whole library after stream resolution change (e.g as mentioned in here). I don't think it's good tho, it will propably introduce other bugs and take a lot of time (from user's viewpoint).
Is it possible to fix this issue?
EDIT:
I've dumped the stream data that is passed to decoding pipeline. I've changed the resolution few times while stream was being captured. Playing it with ffplay showed that in moment when resolution changed and preview in application froze, ffplay managed to continue, but preview is glitchy for a second or so. You can see full ffplay log here. In this case video preview stopped when I changed resolution to 960x720 for the second time. (Reinit context to 960x720, pix_fmt: yuv420p in log).
I'm writing an Android application, and in it, I have a VirtualDisplay to mirror what is on the screen and I then send the frames from the screen to an instance of a MediaCodec. It works, but, I want to add a way of specifying the FPS of the encoded video, but I'm unsure how to do so.
From what I've read and experimented with, dropping encoded frames (based on the presentation times) doesn't work well as it ends up with blocky/artifact ridden video as opposed to a smooth video at a lower framerate. Other reading suggests that the only way to do what I want (limit the FPS) would be to limit the incoming FPS to the MediaCodec, but the VirtualDisplay just receives a Surface which is constructed from the MediaCodec as below
mSurface = <instance of MediaCodec>.createInputSurface();
mVirtualDisplay = mMediaProjection.createVirtualDisplay(
"MyDisplay",
screenWidth,
screenHeight,
screenDensity,
DisplayManager.VIRTUAL_DISPLAY_FLAG_AUTO_MIRROR,
mSurface,
null,
null);
I've also tried subclassing Surface and limit the frames that are fed to the MediaCodec via the unlockCanvasAndPost(Canvas canvas) but the function never seems to be called on my instance, so, there may be some weirdness in how I extended Surface and the interaction with the Parcel as writeToParcel function is called on my instance, but that is the only function that is called in my instance (that I can tell).
Other reading suggests that I can go from encoder -> decoder -> encoder and limit the rate in which the second encoder is fed frames, but that's a lot of extra computation that I'd rather not do if I can avoid it.
Has anyone successfully limited the rate at which a VirtualDisplay feeds its Surface? Any help would be greatly appreciated!
Starting off with what you can't do...
You can't drop content from the encoded stream. Most of the frames in the encoded stream are essentially "diffs" from other frames. Without knowing how the frames interact, you can't safely drop content, and will end up with that corrupted macroblock look.
You can't specify the frame rate to the MediaCodec encoder. It might stuff that into metadata somewhere, but the only thing that really matters to the codec is the frames you're feeding into it, and the presentation time stamps associated with each frame. The encoder will not drop frames.
You can't do anything useful by subclassing Surface. The Canvas operations are only used for software rendering, which is unrelated to feeding in frames from a camera or virtual display.
What you can do is send the frames to an intermediate Surface, and then choose whether or not to forward them to the MediaCodec's input Surface. One approach would be to create a SurfaceTexture, construct a Surface from it, and pass that to the virtual display. When the SurfaceTexture's frame-available callback fires, you either ignore it, or render the texture onto the MediaCodec input Surface with GLES.
Various examples can be found in Grafika and on bigflake, none of which are an exact fit, but all of the necessary EGL and GLES classes are there.
You can reference the code sample from saki4510t's ScreenRecordingSample or RyanRQ's ScreenRecoder, they are all use the additional EGL Texture between the virtual display and media encoder, and the first one can keep at least 15 fps for the output video. You can search the keyword createVirtualDisplay from their code base for more details.
The video decoding code of an app is typical, just like the example code in the MediaCodec document. Nothing special. The configuration statement is like the following:
myMediaCodec.configure(myMediaFormat, mySurface, null, 0);
Everything works fine. However, if I change the above code to the following to decode the video to a buffer instead of a surface:
myMediaCodec.configure(myMediaFormat, null, null, 0);
then the following code:
int iOutputBufferIndex = myMediaCodec.dequeueOutputBuffer(myBufferInfo, 100000);
will always return MediaCodec.INFO_TRY_AGAIN_LATER. Even more strangly, any subsequent call of myMediaCodec.stop() or myMediaCodec.release() will hang (i.e. the call never returns or generates an exception).
This happens on a generic (AGPTek) tablet (Allwinner A31S, 1.5GHz Cortex A7 Quad Core). On a simulator and another tablet (Asus Memo Pad), everything works fine.
I am asking for any tip to help get around this problem.
Do you provide one single input buffer worth of data before trying this, or do you pass as many packets as you can before dequeueInputBuffer also blocks or returns INFO_TRY_AGAIN_LATER? A decoder might not output data after only one packet of input (if the decoder has got some delay), but if it works with Suface output it should probably behave in the same way there.
If that (queueing as many input buffers as possible) doesn't work, I would say that this sounds like a decoder bug.
In my Android application, I am encoding some media in webm (vp8) format using MediaCodec. The encoding is working as expected. However, I need to ensure that I create a sync frame once in a while. Here is what I do:
encoder.queueInputBuffer(..., MediaCodec.BUFFER_FLAG_SYNC_FRAME);
Later in the code, I check for sync frame:
encoder.dequeueOutputBuffer(bufferInfo, 0);
boolean isSyncFrame = (bufferInfo.flags & MediaCodec.BUFFER_FLAG_SYNC_FRAME);
The problem is that isSyncFrame never gets a true value.
I am wondering if I am making a mistake in my encoding configuration. May be there is a better way to tell the encoder to create a sync frame once in a while.
I hope it is not a bug in MediaCodec. Thank you in advance for your help.
There is no (current as of Android 4.3) way to request an on-demand sync frame using MediaCodec encoders. This is partly due to OMX, the underlying codec implementation in Android, that does not provide a way to specify which input frame should be encoded as a sync frame; although it has a way to trigger a sync frame "in the near future".
feisal's answer is the only currently supported way to control sync frames, but you have to do it at configuration time.
==edit re: jesup
You can trigger a sync frame in the near future using MediaCodec.setParameter:
Bundle params = new Bundle();
params.putInt(MediaCodec.PARAMETER_KEY_REQUEST_SYNC_FRAME, 0);
mCodec.setParameters(syncFrame);
Unfortunately, there is no (reliable) way to tell in MediaCodec if an encoded buffer is a sync frame other than doing it on your own by inspecting the byte-codes.
you can set the rate of I-frames in the MediaFormat object of your encoder by setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, int secs_between_iframes );