I am trying to get specific frames at specific times as images from a movie using MediaExtractor and MediaCodec. I can do it successfully if:
I use extractor.seekTo(time, MediaExtractor.SEEK_TO_PREVIOUS_SYNC); however, this only gives the nearest sync frame, not the target frame.
I sequentially extract all frames using extractor.advance(), but I need only the target frame, not all of them.
So, I try the following:
extractor.seekTo(time, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
while (extractor.getSampleTime() < time /* target time */) extractor.advance();
This provides the correct frame, but for some reason the image is corrupted. It looks like the correct image (the one I get from the successful cases), but with some pixelation and a strange haze.
The while-loop is the only thing that differs between the successful cases and the corrupted ones. What can I do to advance the MediaExtractor to a specific time (not just a sync time) without getting a corrupted image?
Thanks to fadden's comment, I have to keep feeding the decoder, since only the I-frame has the full picture while the P and B frames carry differences (this is how compression is achieved). So I need to start with an I-frame (which is the same as a sync frame) and keep feeding the following frames to the decoder until I reach the target frame, in order to receive the full image.
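In code, the fix looks roughly like this (a sketch only: extractor and decoder are assumed to be already set up for the video track, targetTimeUs stands for the time I want, and error handling is omitted):

extractor.seekTo(targetTimeUs, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean inputDone = false;
boolean targetRendered = false;
while (!targetRendered) {
    if (!inputDone) {
        int inIndex = decoder.dequeueInputBuffer(10000);
        if (inIndex >= 0) {
            ByteBuffer inBuf = decoder.getInputBuffer(inIndex);
            int size = extractor.readSampleData(inBuf, 0);
            if (size < 0) {
                decoder.queueInputBuffer(inIndex, 0, 0, 0,
                        MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                inputDone = true;
            } else {
                // Feed every sample from the sync frame onward, even the ones
                // before the target; the decoder needs them to build the target frame.
                decoder.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                extractor.advance();
            }
        }
    }
    int outIndex = decoder.dequeueOutputBuffer(info, 10000);
    if (outIndex >= 0) {
        // Render (or read out) only the frame at/after the target time; drop the rest.
        boolean isTarget = info.presentationTimeUs >= targetTimeUs;
        decoder.releaseOutputBuffer(outIndex, isTarget);
        if (isTarget) targetRendered = true;
    }
}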
Related
The basic issue I am trying to solve is to delay what is sent to a virtual display by a second or so. So basically, I am trying to shift all frames by 1 second after the initial recording. Note that a surface is used as an input and another surface is used as an output through this virtual display. My initial hunch is to explore a few ideas, given that modification of the Android framework or use of non-public APIs is fine. Java or native C/C++ is fine.
a) I tried delaying frames posted to the virtual display or output surface by a second or two in SurfaceFlinger. This does not work as it causes all surfaces to be delayed by the same amount of time (synchronous processing of frames).
b) MediaCodec takes a surface as input to encode and then produces the encoded data. Is there any way to use MediaCodec such that it does not actually encode and only produces unencoded raw frames? That seems unlikely. Moreover, how does MediaCodec do this under the hood? Does it process things frame by frame? If I can extrapolate the method, I might be able to extract frames one by one from my input surface and create a ring buffer delayed by the amount of time I require.
c) How do software decoders, such as FFmpeg, actually do this on Android? I assume they take in a surface, but how would they extract and process frames one by one?
Note that I can certainly encode and decode to retrieve the frames and post them, but I want to avoid actually decoding.
I also found this: Getting a frame from SurfaceView
It seems like option d) could be using a SurfaceTexture, but I would like to avoid the process of encoding/decoding.
As I understand it, you have a virtual display that is sending its output to a Surface. If you just use a SurfaceView for output, frames output by the virtual display appear on the physical display immediately. The goal is to introduce one second of latency between when the virtual display generates a frame and when the Surface consumer receives it, so that (again using SurfaceView as an example) the physical display shows everything a second late.
The basic concept is easy enough: send the virtual display output to a SurfaceTexture, and save each frame into a circular buffer; meanwhile another thread is reading frames out of the tail end of the circular buffer and displaying them. The trouble with this is what @AdrianCrețu pointed out in the comments: one second of full-resolution screen data at 60fps will occupy a significant fraction of the device's memory. Not to mention that copying that much data around will be fairly expensive, and some devices might not be able to keep up.
(It doesn't matter whether you do it in the app or in SurfaceFlinger... the data for up to 60 screen-sized frames has to be held somewhere for a full second.)
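To put rough numbers on that (assuming a 2560x1600 ARGB8888 display updating at 60fps):

2560 x 1600 pixels x 4 bytes per pixel ≈ 16.4 MB per frame
16.4 MB per frame x 60 frames ≈ 983 MB to hold one second of raw output

which is where the ~1 GByte/sec figure below comes from.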
You can reduce the volume of data in various ways:
Reduce the resolution. Scaling 2560x1600 to 1280x800 removes 3/4 of the pixels. The loss of quality should be difficult to notice on most displays, but it depends on what you're viewing.
Reduce the color depth. Switching from ARGB8888 to RGB565 will cut the size in half. This will be noticeable though.
Reduce the frame rate. You're generating the frames for the virtual display, so you can choose to update it more slowly. Animation is still reasonably smooth at 30fps, halving the memory requirements.
Apply image compression, e.g. PNG or JPEG. Fairly effective, but too slow without hardware support.
Encode inter-frame differences. If not much is changing from frame to frame, the incremental changes can be very small. Desktop-mirroring technologies like VNC do this. Somewhat slow to do in software.
A video codec like AVC will both compress frames and encode inter-frame differences. That's how you get 1GByte/sec down to 10Mbit/sec and still have it look pretty good.
Consider, for example, the "continuous capture" example in Grafika. It feeds the Camera output into a MediaCodec encoder, and stores the H.264-encoded output in a ring buffer. When you hit "capture", it saves the last 7 seconds. This could just as easily play the camera feed with a 7-second delay, and it only needs a few megabytes of memory to do it.
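A rough sketch of that arrangement, adapted for a virtual display instead of the camera (projection, running, and circularBuffer are assumed to exist; Grafika's CircularEncoderBuffer is one real implementation of the ring buffer, and the add() call here is just a placeholder for it):

// Configure an AVC encoder that takes its input from a Surface.
MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 800);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
Surface inputSurface = encoder.createInputSurface();
encoder.start();

// Point the virtual display at the encoder's input surface.
VirtualDisplay display = projection.createVirtualDisplay("delayed-display",
        1280, 800, 160, DisplayManager.VIRTUAL_DISPLAY_FLAG_PUBLIC,
        inputSurface, null, null);

// Drain loop (on its own thread): keep only the encoded output in the
// ring buffer; a decoder pulls frames from the other end one second later.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
while (running) {
    int index = encoder.dequeueOutputBuffer(info, 10000);
    if (index >= 0) {
        ByteBuffer encoded = encoder.getOutputBuffer(index);
        circularBuffer.add(encoded, info.flags, info.presentationTimeUs);  // placeholder API
        encoder.releaseOutputBuffer(index, false);
    }
}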
The "screenrecord" command can dump H.264 output or raw frames across the ADB connection, though in practice ADB is not fast enough to keep up with raw frames (even on tiny displays). It's not doing anything you can't do from an app (now that we have the mediaprojection API), so I wouldn't recommend using it as sample code.
If you haven't already, it may be useful to read through the graphics architecture doc.
What I'm trying to do
I made a video player app by using the source code from the following link.
https://github.com/kylelo/VideoPlayerGH
I want to implement some methods to calculate the complexity of each frame, so that I can do some image processing after the calculation.
So the first step is to get the bitmap or pixel values from the video frame for analysis before it renders on the screen. I have used glReadPixels() to copy the pixel values into a new ByteBuffer in the draw() function. I can get the RGBA values successfully, but the frame rate dropped from 60 fps to 20 fps on my device (HTC Butterfly S), even though I haven't done any image processing on it yet...
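This is roughly what that readback looks like (width and height are the surface dimensions; the buffer is allocated once and reused). As I understand it, glReadPixels() is a synchronous readback that stalls the GPU pipeline, which is likely where the frame-rate drop comes from:

// Allocated once, outside draw():
ByteBuffer pixelBuf = ByteBuffer.allocateDirect(width * height * 4);
pixelBuf.order(ByteOrder.LITTLE_ENDIAN);

// Inside draw(), after the frame has been rendered:
pixelBuf.rewind();
GLES20.glReadPixels(0, 0, width, height,
        GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, pixelBuf);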
My question is
Is there a more efficient way to accomplish this task? Even working on other layers of the Android system is fine.
I really need some hints on it...
Because I am new to Android, if I have any concept wrong, please tell me! I really appreciate everyone's help!
I'm trying to copy a part of a video and save it as a GIF to disk. The video can be local or remote, and the copy should be 2 seconds max. I don't need to save every single frame, only every other frame (12-15 fps). I have the "frames to gif" part working, but the "get the frames" part is not great.
Here is what I tried so far:
- MediaMetadataRetriever: too slow (~1 s per frame on a Nexus 4), and only works with local files
- FFmpegMediaMetadataRetriever: same latency, but works with remote video
- TextureView.getBitmap(): I'm using a ScheduledExecutorService and every 60 ms I grab the Bitmap (while the video is playing), roughly as in the sketch below. It works well with a small size, getBitmap(100, 100), but for bigger ones (> 400) the whole process becomes really slow. And the docs say "Do not invoke this method from a drawing method" anyway.
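For reference, the polling approach from the last bullet looks something like this (textureView and the frame list are assumed to be fields; the names are placeholders):

ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
final List<Bitmap> frames = Collections.synchronizedList(new ArrayList<Bitmap>());

executor.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        // Grab the currently displayed frame; keep the size small,
        // since large getBitmap() calls get expensive.
        Bitmap frame = textureView.getBitmap(100, 100);
        if (frame != null) {
            frames.add(frame);
        }
    }
}, 0, 60, TimeUnit.MILLISECONDS);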
It seems that the best solution would be to access every frame while decoding, and save them. I tried OpenCV for Android but couldn't find an API to grab a frame at a specific time.
Now, I'm looking into those samples to understand how to use MediaCodec, but while running ExtractMpegFramesTest.java, I can't seem to extract any frame ("no output from decoder available").
Am I on the right track? Any other suggestion?
edit: went further with ExtractMpegFramesTest.java, thanks for this post.
edit 2: just to clarify, what I'm trying to achieve here is to play a video, and press a button to start capturing the frames.
My app should show a bunch of animated images (within a list or grid, ~10-20 images on a screen depending on the screen size). Each of these images can contain a lot of frames (up to 150). They are actually GIFs originally; I get them as encoded mp4 files (to reduce the data size transferred over the network) and decode them on the device.

For decoding I use android.media.MediaMetadataRetriever (it doesn't work well on Samsung devices) and I do it the following way: when I show the current frame, I put a new decoding task - actually a task to get the next frame - into a priority queue. Priority depends on the time by which I need the next frame. If a task "expires" (isn't processed by a worker before the expected time), I just put the next task into the queue, and so on. But my algorithm doesn't work well - animations are very slow and jerky (and I can't rule out that my implementation has some bugs).
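Each decoding task essentially boils down to a getFrameAtTime() call, something like this (filePath and frameTimeUs are placeholders; OPTION_CLOSEST returns the exact frame rather than the nearest sync frame, at the cost of decoding forward from the preceding sync frame each time):

MediaMetadataRetriever retriever = new MediaMetadataRetriever();
retriever.setDataSource(filePath);
Bitmap frame = retriever.getFrameAtTime(frameTimeUs, MediaMetadataRetriever.OPTION_CLOSEST);
retriever.release();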
So, here are the questions:
1) Are there any other possibly better ways to decode mp4?
2) Could someone give me advice on which algorithm I should use to decode mp4 files efficiently, so that there aren't any animation lags?
3) I can't use a simple AnimationDrawable because of out-of-memory errors, so how can I effectively manage the frame cache?
Thanks!
The main question is whether it is possible to somehow get around the frame index checking that FFmpeg does when writing a frame to a file.
Now I will explain my exact problem so you can understand better what I need or maybe think of an alternative solution.
Problem no. 1: I am getting video streams from two independent cameras and, for some reason, I want to save them in the same video file: first the frames from the first camera and then the frames from the second. When writing the frames from the second camera, av_write_frame returns the error code -22 and fails to add the frame. That's because the writing context expects a frame index that follows the index of the previously written frame (the last frame from camera 1), but it receives a frame with index 0, the first frame from the second camera.
Problem no. 2: Consider the following problem independently of the first one.
I am trying to save a video stream to a file, but the resulting frame rate is double the real speed. Since I couldn't find any working solution to slow the frame rate down, I thought of writing every frame twice into the video file. But that makes no difference to the frame rate.
I also tried a different approach to the frame rate problem, but it also failed (question here).
Any kind of working solution would be highly appreciated.
Also, it's important that I can't use console commands; I need C code, as I have to integrate this functionality into an Android application that is automated.