My Android app does live video processing using OpenGL. I'm trying to save it to video using MediaMuxer and MediaCodex.
The performance it not good enough. Each cycle the screen is updated, and it is saved to file. The screen is smooth, the video file is horrible. By this I mean major motion blur when it changes quickly and the frame-rate appears to be 1/2 or 1/3rd of what it should be.
It seems to be a limitation due to clamping of settings internally. I can't get it to spit out a video with a bit rate greater than 288KBPS. I think it is not clamping the requested parameters because there is no difference in frame rate for 1024x1024, 480x480, and 240x240. If it was having trouble keeping up, it should at least improve when the number of pixels drops by a factor > 10.
The app is here : https://play.google.com/store/apps/details?id=com.matthewjmouellette.snapdat.
I would love to post a code sample, but my program is 10K lines of code, with a lot of relevant code just for this problem.
Any help would be greatly appreciated.
EDIT:
I've tried like 10+ different things. I'm out of ideas right now. I wish I could just save the video uncompressed, the hard-drive should be able to keep up with a small enough image and medium fps.
It seems to be that the encoding method just doesn't work for my video. The frames differ to much, to try to "move" one part of the frame, as a sort of encoding. Instead I need full frames throughout. I am thinking something along the lines of M-JPEG would work really well. JPEGs tend to take 1/10th the size of a bitmap. It should allow a reasonable size, with almost no processing power required by the CPU, since it is image compression not video compression which we are doing. I wish I had a good library for this.
Related
The basic issue I am trying to solve is to delay what is sent to a virtual display by a second or so. So basically, I am trying to shift all frames by 1 second after the initial recording. Note that a surface is used as an input and another surface is used as an output through this virtual display. My initial hunch is to explore a few ideas, given that modification of the Android framework or use of non-public APIs is fine. Java or native C/C++ is fine.
a) I tried delaying frames posted to the virtual display or output surface by a second or two in SurfaceFlinger. This does not work as it causes all surfaces to be delayed by the same amount of time (synchronous processing of frames).
b) MediaCodec uses a surface as an input to encode, and then produce the decoded data. Is there anyway to use MediaCodec such that it does not actually encode and only produce unencoded raw frames? Seems unlikely. Moreover, how does MediaCodec do this under the hood? Process things frame by frame. If I can extrapolate the method I might be able to extract frame by frame from my input surface and create a ring buffer delayed by the amount of time I require.
c) How do software decoders, such as FFmpeg, actually do this in Android? I assume they take in a surface but how would they extrapolate and process frame by frame
Note that I can certainly encode and decode to retrieve the frames and post them but I want to avoid actually decoding. Note that modifying the Android framework or using non-public APIs is fine.
I also found this: Getting a frame from SurfaceView
It seems like option d) could be using a SurfaceTexture but I would like to avoid the process of encoding/decoding.
As I understand it, you have a virtual display that is sending its output to a Surface. If you just use a SurfaceView for output, frames output by the virtual display appear on the physical display immediately. The goal is to introduce one second of latency between when the virtual display generates a frame and when the Surface consumer receives it, so that (again using SurfaceView as an example) the physical display shows everything a second late.
The basic concept is easy enough: send the virtual display output to a SurfaceTexture, and save the frame into a circular buffer; meanwhile another thread is reading frames out of the tail end of the circular buffer and displaying them. The trouble with this is what #AdrianCrețu pointed out in the comments: one second of full-resolution screen data at 60fps will occupy a significant fraction of the device's memory. Not to mention that copying that much data around will be fairly expensive, and some devices might not be able to keep up.
(It doesn't matter whether you do it in the app or in SurfaceFlinger... the data for up to 60 screen-sized frames has to be held somewhere for a full second.)
You can reduce the volume of data in various ways:
Reduce the resolution. Scaling 2560x1600 to 1280x800 removes 3/4 of the pixels. The loss of quality should be difficult to notice on most displays, but it depends on what you're viewing.
Reduce the color depth. Switching from ARGB8888 to RGB565 will cut the size in half. This will be noticeable though.
Reduce the frame rate. You're generating the frames for the virtual display, so you can choose to update it more slowly. Animation is still reasonably smooth at 30fps, halving the memory requirements.
Apply image compression, e.g. PNG or JPEG. Fairly effective, but too slow without hardware support.
Encode inter-frame differences. If not much is changing from frame to frame, the incremental changes can be very small. Desktop-mirroring technologies like VNC do this. Somewhat slow to do in software.
A video codec like AVC will both compress frames and encode inter-frame differences. That's how you get 1GByte/sec down to 10Mbit/sec and still have it look pretty good.
Consider, for example, the "continuous capture" example in Grafika. It feeds the Camera output into a MediaCodec encoder, and stores the H.264-encoded output in a ring buffer. When you hit "capture", it saves the last 7 seconds. This could just as easily play the camera feed with a 7-second delay, and it only needs a few megabytes of memory to do it.
The "screenrecord" command can dump H.264 output or raw frames across the ADB connection, though in practice ADB is not fast enough to keep up with raw frames (even on tiny displays). It's not doing anything you can't do from an app (now that we have the mediaprojection API), so I wouldn't recommend using it as sample code.
If you haven't already, it may be useful to read through the graphics architecture doc.
Does anyone know how to load a high resolution video in android programmatically, such as 3000 x 3000, to display only a portion of this video, such as 1000 x 1000?
I tried to use the MediaPlayer android sdk official in a TextureView, but this method has media size limitations, I think, because the video plays but and texture view is black ..
I appreciate the help.
Media size limitations are for a reason. 3k x 3k resolution video is huge for such small device like phone. Consider that you are not able to decrypt only small portion of video frame, that is not like video works. So you need do decrypt whole big frame (wchich is calculated from I-Frame and following P-frames ) than take screenshot of it after that take part which you are interested in and present on your textureView, and all of this in realtime. Think about device memory and CPU. In my opinion it's not possible with such small resources
I tried searching in a ton of places about doing this, with no results. I did read that the only (as far as I know) way to obtain image frames was to use a ImageReader, which gives me a Image to work with. However, a lot of work must be done before I have a nice enough image (converting Image to byte array, then converting between formats - YUV_420_888 to ARGB_8888 - using RenderScript, then turning it into a Bitmap and rotating it manually - or running the application on landscape mode). By this point a lot of processing is made, and I haven't even started the actual processing yet (I plan on running some native code on it). Additionally, I tried to lower the resolution, with no success, and there is a significant delay when drawing on the surface.
Is there a better approach to this? Any help would be greatly appreciated.
Im not sure what exactly you are doing with the images, but a lot of times only a grayscale image is actually needed (again depending on your exact goal) If your camera outputs YUV, the grayscale information is in the Y channel. The nice thing,is you don't need to convert to numerous colorspaces and working with only one layer (as opposed to three) decreases the size of your data set greatly.
If you need color images then this wouldn't help
I have been posting quite some questions recently without getting a response. Hope this one gets.
Iv decided to go with mp4 as its compression is much better than gif sequences. Trying movie.decodestream(InputStream) helps to get the stream. But when movie.duration() is aquired, a null pointer exception is thrown. Going around the web, I found that duration works if the movies are frame by frame(with durations for each).
So, is the mp4 sequence a bad way to read the stream and get the duration?
Is there any way to convert the mp4 into a "frame by frame" sequence?
What is required is an easy way to play some 30 frame animations on screen.
The problems of each solution Iv found are: Jpeg sequences are pretty much memory consuming, Gif sequences when used have jagged edges and bigger file sizes and mp4 sequences cannot be acquired for frame by frame playback.
- Since the animations have a dimension of 800*600, I guess Jpeg sequences are out of the question, or please do tell if there are any good options for this.
- Presently Im using gif sequences,with half the dimension. The edges are jagged and are displayed very bad in different devices.
-Mp4 Sequences are just too good. But I need the control over the frames so that I can play it forward and reverse, and display the first and last frames at the resting stage of the animation(the animations are triggered on touch).
From experiments and from reading other posts like this one it seems that it's hard to process high resolution images on Android because there is a limit on how much memory the VM will allow to allocate.
Loading a 8MP camera pictures takes around 20 MB of memory.
I understand that the easy solution is to downsample the image when loading it (BitmapFactory offers such an option) but I still would like to process the image in full resolution: the camera shoots 8MP, why would I only use 4MP and reduce the quality.
Does anyone know good workarounds for that?
In a resource-constrained environment I think that your only solution is to divide and conquer: e.g. caching/tiling (as in: tiles)
Instead of loading and processing the image all at once you load/save manageable chunks of the image from a raw data file to do your processing. This is not trivial and could get really complex depending on the type of processing you want to do, but it's the only way if you don't want to comprise on image quality.
Indeed, this is hard. But in case image is in some continuous raster format, you can mmap it
( see java.nio.ByteBuffer ) - this way you get byte buffer without allocating it.
2 things:
Checkout the gallery in Honeycomb. It does this tiled based rendering. You can zoom in on an image and you see then that the current part is higher res then the other parts. If you pan around you see it rendering.
When using native code (NDK) there is not a resource limit. So you could try to load all the data native and somehow get parts of it using JNI, but I doubt it's better then the gallery of honeycom.