We're working on an Android app that needs to resize (reduce the frame size of) and compress videos. We tested the code sample below, and it's currently slow:
https://github.com/hoolrory/AndroidVideoSamples/blob/master/CommonVideoLibrary/src/com/roryhool/commonvideolibrary/VideoResampler.java
The output video frame size is reduced (e.g., 480x320), and the bit rate is also reduced to achieve compression. The final video looks very good and the compression ratios are good, too. It's just that the process is slow. I tested on a Galaxy S4 running Android 4.4 and a Galaxy Note 5 running Android 6.0. The latter is faster, but not by much. On the Galaxy S4, a 30-second video takes about a minute to compress (on average).
The code above decodes the input video onto an input surface, reduces the frame size, and outputs to an output surface. MediaMuxer is used to mux in the audio. The example uses an MPEG-4 container and an H.264 encoder. Some relevant questions:
Are there some parameters we can use to speed up the compression?
How is the video compression speed affected by the target bit rate and frame size, if any?
We didn't use FFmpeg. Would that be faster?
Any pointers or hints, even if not related to the code sample above, would be highly appreciated.
Thank you very much!
Omar
Your problem is with how you synchronously wait for events on one of the components (encoder or decoder). Either rebuild the code to run with asynchronous callbacks, or lower the timeouts.
See https://stackoverflow.com/a/37513916/3115956 for a longer explanation with more references, and https://github.com/mstorsjo/android-decodeencodetest for an example on how to use the asynchronous mode effectively.
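For reference, here is a minimal sketch of what the decoder side looks like in asynchronous mode (API 21+). `inputFormat` and `encoderInputSurface` are placeholders for the format from your MediaExtractor and the encoder's input surface:

    // Minimal sketch of asynchronous mode (API 21+). The codec calls back
    // as soon as a buffer is ready, so neither codec blocks on the other.
    MediaCodec decoder = MediaCodec.createDecoderByType("video/avc");
    decoder.setCallback(new MediaCodec.Callback() {
        @Override
        public void onInputBufferAvailable(MediaCodec codec, int index) {
            // Fill codec.getInputBuffer(index) with the next sample from
            // MediaExtractor and queue it with queueInputBuffer().
        }

        @Override
        public void onOutputBufferAvailable(MediaCodec codec, int index,
                MediaCodec.BufferInfo info) {
            // Render the decoded frame to the encoder's input surface.
            codec.releaseOutputBuffer(index, /* render= */ true);
        }

        @Override
        public void onOutputFormatChanged(MediaCodec codec, MediaFormat format) { }

        @Override
        public void onError(MediaCodec codec, MediaCodec.CodecException e) { }
    });
    // setCallback() must be called before configure().
    decoder.configure(inputFormat, encoderInputSurface, null, 0);
    decoder.start();

The encoder side gets the same treatment; with both codecs driven by callbacks, frames flow through the pipeline as fast as the hardware allows instead of being paced by polling timeouts.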
The basic issue I am trying to solve is to delay what is sent to a virtual display by a second or so. So basically, I am trying to shift all frames by one second after the initial recording. Note that a surface is used as the input and another surface is used as the output through this virtual display. My initial hunch is to explore a few ideas, given that modifying the Android framework or using non-public APIs is fine. Java or native C/C++ is fine.
a) I tried delaying frames posted to the virtual display or output surface by a second or two in SurfaceFlinger. This does not work as it causes all surfaces to be delayed by the same amount of time (synchronous processing of frames).
b) MediaCodec takes a surface as input for encoding and then produces the encoded data. Is there any way to use MediaCodec such that it does not actually encode and only produces unencoded raw frames? Seems unlikely. Moreover, how does MediaCodec do this under the hood? Does it process things frame by frame? If I can extrapolate the method, I might be able to extract frames one by one from my input surface and create a ring buffer delayed by the amount of time I require.
c) How do software decoders, such as FFmpeg, actually do this on Android? I assume they take in a surface, but how would they extrapolate and process frame by frame?
Note that I can certainly encode and decode to retrieve the frames and post them, but I want to avoid actually decoding.
I also found this: Getting a frame from SurfaceView
It seems like option d) could be using a SurfaceTexture but I would like to avoid the process of encoding/decoding.
As I understand it, you have a virtual display that is sending its output to a Surface. If you just use a SurfaceView for output, frames output by the virtual display appear on the physical display immediately. The goal is to introduce one second of latency between when the virtual display generates a frame and when the Surface consumer receives it, so that (again using SurfaceView as an example) the physical display shows everything a second late.
The basic concept is easy enough: send the virtual display output to a SurfaceTexture, and save each frame into a circular buffer; meanwhile another thread is reading frames out of the tail end of the circular buffer and displaying them. The trouble with this is what @AdrianCrețu pointed out in the comments: one second of full-resolution screen data at 60fps will occupy a significant fraction of the device's memory. Not to mention that copying that much data around will be fairly expensive, and some devices might not be able to keep up.
(It doesn't matter whether you do it in the app or in SurfaceFlinger... the data for up to 60 screen-sized frames has to be held somewhere for a full second.)
You can reduce the volume of data in various ways:
Reduce the resolution. Scaling 2560x1600 to 1280x800 removes 3/4 of the pixels. The loss of quality should be difficult to notice on most displays, but it depends on what you're viewing.
Reduce the color depth. Switching from ARGB8888 to RGB565 will cut the size in half. This will be noticeable though.
Reduce the frame rate. You're generating the frames for the virtual display, so you can choose to update it more slowly. Animation is still reasonably smooth at 30fps, halving the memory requirements.
Apply image compression, e.g. PNG or JPEG. Fairly effective, but too slow without hardware support.
Encode inter-frame differences. If not much is changing from frame to frame, the incremental changes can be very small. Desktop-mirroring technologies like VNC do this. Somewhat slow to do in software.
A video codec like AVC will both compress frames and encode inter-frame differences. That's how you get 1GByte/sec down to 10Mbit/sec and still have it look pretty good.
Consider, for example, the "continuous capture" example in Grafika. It feeds the Camera output into a MediaCodec encoder, and stores the H.264-encoded output in a ring buffer. When you hit "capture", it saves the last 7 seconds. This could just as easily play the camera feed with a 7-second delay, and it only needs a few megabytes of memory to do it.
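A rough sketch of that ring-buffer idea (the names here are illustrative, not Grafika's actual classes; only encoded output is stored, and eviction is simple time-based trimming):

    // Keep only the last WINDOW_US microseconds of *encoded* frames: a few
    // megabytes instead of gigabytes of raw screen data.
    static class EncodedFrame {
        final byte[] data;
        final long ptsUs;
        final int flags;
        EncodedFrame(byte[] data, long ptsUs, int flags) {
            this.data = data; this.ptsUs = ptsUs; this.flags = flags;
        }
    }

    private final ArrayDeque<EncodedFrame> ring = new ArrayDeque<>();
    private static final long WINDOW_US = 1_000_000L; // keep the last second

    // Call this for each buffer drained from the MediaCodec encoder.
    void onEncodedFrame(ByteBuffer buf, MediaCodec.BufferInfo info) {
        byte[] copy = new byte[info.size];
        buf.position(info.offset);
        buf.get(copy);
        ring.addLast(new EncodedFrame(copy, info.presentationTimeUs, info.flags));
        // Evict frames older than the window. Real code must also ensure
        // playback starts at a sync frame (BUFFER_FLAG_SYNC_FRAME).
        while (info.presentationTimeUs - ring.peekFirst().ptsUs > WINDOW_US) {
            ring.removeFirst();
        }
    }

The delayed consumer simply feeds frames from the head of the deque into a decoder one second behind the producer.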
The "screenrecord" command can dump H.264 output or raw frames across the ADB connection, though in practice ADB is not fast enough to keep up with raw frames (even on tiny displays). It's not doing anything you can't do from an app (now that we have the mediaprojection API), so I wouldn't recommend using it as sample code.
If you haven't already, it may be useful to read through the graphics architecture doc.
I am trying to create a video from a series of images on Android.
I have come across these three options: MediaCodec, ffmpeg using the NDK, and JCodec. Can someone let me know which of them is best and easiest? I didn't find any proper documentation, so can somebody please post a working example?
If you are talking about Android 4.3 (API 18)+, in general you need to get an input surface from the encoder, copy your image to the texture that comes along with the surface, set the correct timestamp, and send it back to the encoder. Do this
fps (frames per second) * resulting video duration in seconds
times. The encoded bitstream coming out of the encoder should then go to the muxer, so you finally get an MP4 file.
It requires rather a lot of coding :)
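For orientation, here is a minimal sketch of the encoder side of that approach (API 18+; the resolution and bit-rate values are example numbers, and the per-frame GL work is summarized in comments):

    // Configure an H.264 encoder that takes its input from a Surface.
    MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
    format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    format.setInteger(MediaFormat.KEY_BIT_RATE, 4_000_000);
    format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

    MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
    encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
    Surface inputSurface = encoder.createInputSurface();
    encoder.start();

    // Per frame: draw the image onto inputSurface with OpenGL ES, set the
    // timestamp via EGLExt.eglPresentationTimeANDROID(display, surface,
    // frameIndex * 1_000_000_000L / fps), call eglSwapBuffers(), then
    // drain the encoder's output into MediaMuxer.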
I suggest you try the free Intel Media Pack: https://software.intel.com/en-us/articles/intel-inde-media-pack-for-android-tutorials
It has a sample, JpegSubstituteEffect, that lets you create videos from images. The idea is to take a dummy video (black frames and silent audio) and substitute every black frame by copying in your images. It could easily be extended to create a video from a series of images. I know a couple of applications in Google Play that do the same thing using the Media Pack.
I tried JCodec 1.7 for Android. It is very simple compared to the other two options, and it works. There is a class SequenceEncoder in the android package that accepts Bitmap instances and encodes them into video. I ended up cloning this class into my app to override some of the settings, e.g. fps. The problem with JCodec is that performance is dismal: encoding a single 720x480 frame takes about 45 seconds. I wanted to do time-lapse videos, possibly at full HD, and initially figured any encoder would do, since I wasn't expecting a single frame to take more than a second to encode (the minimal interval between frames in my app is 3 seconds). As you can guess, at 45 seconds per frame JCodec is not a fit.
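For anyone who still wants to try it, the flow is tiny. In newer JCodec releases the Android helper class is called AndroidSequenceEncoder (the SequenceEncoder mentioned above was its predecessor); a sketch assuming that API:

    import java.io.File;
    import java.io.IOException;
    import java.util.List;
    import android.graphics.Bitmap;
    import org.jcodec.api.android.AndroidSequenceEncoder;

    void encode(List<Bitmap> frames, File out) throws IOException {
        AndroidSequenceEncoder enc =
                AndroidSequenceEncoder.createSequenceEncoder(out, 30); // 30 fps
        for (Bitmap frame : frames) {
            enc.encodeImage(frame); // pure-Java encode: simple but slow
        }
        enc.finish(); // finalizes the MP4
    }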
I am monitoring your question for other answers that may be helpful.
The MediaCodec/MediaMuxer way seems OK, but it is insanely complex. I need to learn quite a bit about OpenGL ES, video formats, and some Android voodoo to get this going. Oh, and it only works on the latest crop of phones, 4.3+. That's a real shame for Google, with all of their claims to fame. I found some Stack Overflow discussions on the topic. Two sub-paths exist: the OpenGL way, which is device independent, and another way that involves transcoding your RGB Bitmap data to YUV. The catch with YUV is that there are three flavours of it, depending on the device hardware: planar, semi-planar, and semi-planar compressed (and I am not sure a fourth one isn't coming in the future, so...).
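For reference, those flavours map to constants in MediaCodecInfo.CodecCapabilities; which one a given encoder wants is exactly what makes the YUV path device-dependent:

    // Standard flavours:
    int planar     = MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar;     // I420
    int semiPlanar = MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar; // NV12
    // Vendor-specific variants that show up in practice:
    int qcom = MediaCodecInfo.CodecCapabilities.COLOR_QCOM_FormatYUV420SemiPlanar;
    int ti   = MediaCodecInfo.CodecCapabilities.COLOR_TI_FormatYUV420PackedSemiPlanar;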
Here are a couple of useful links on the OpenGL way:
CTS test - https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/ExtractDecodeEditEncodeMuxTest.java
Many CTS like tests - http://bigflake.com/mediacodec/#EncodeDecodeTest
Two of them seem particularly relevant and useful:
EncodeAndMuxTest - http://bigflake.com/mediacodec/EncodeAndMuxTest.java.txt
CameraToMpegTest - http://bigflake.com/mediacodec/CameraToMpegTest.java.txt (I believe this to be closest to what I need; I just need to understand all the OpenGL voodoo and get my Bitmap in as a texture filling the entire frame, i.e. projections, cameras, and whatnot come into play)
The ffmpeg route does not seem direct enough either. Something in C++ accepting stuff from Java... I guess some awkward conversion of the Bitmap to byte[]/ByteBuffer will be needed first; CPU-intensive and slow. I actually have JPEG byte[] data, but I am not sure that will be of use to the ffmpeg library. I am also not sure whether ffmpeg takes advantage of the GPU or other hardware acceleration, so it may well end up at 10 seconds per frame and literally bake the phone.
FFmpeg can handle this task. You first need to compile the FFmpeg library for Android (refer to the article "How to use Vitamio with your own FFmpeg build").
You can refer to the samples in FFmpeg to figure out how to implement your task.
On Android, implement the task in C/C++, then use JNI to integrate the C++ code into your app.
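A sketch of the Java side of such a bridge (the library and method names here are hypothetical; the native side would be implemented against FFmpeg's libav* APIs):

    public class FfmpegBridge {
        static {
            System.loadLibrary("ffmpegbridge"); // your compiled native .so
        }

        // Implemented in C/C++; walks the image files and encodes them
        // into a video at the given frame rate.
        public static native int encodeImages(String[] imagePaths,
                                              String outputPath, int fps);
    }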
My Android app does live video processing using OpenGL. I'm trying to save it to video using MediaMuxer and MediaCodec.
The performance is not good enough. Each cycle, the screen is updated and saved to file. The screen is smooth; the video file is horrible. By this I mean major motion blur when it changes quickly, and a frame rate that appears to be 1/2 or 1/3 of what it should be.
It seems to be a limitation due to internal clamping of settings. I can't get it to spit out a video with a bit rate greater than 288 kbps. I don't think it is a performance problem, because there is no difference in frame rate between 1024x1024, 480x480, and 240x240; if it were having trouble keeping up, it should at least improve when the number of pixels drops by a factor of more than 10. So it seems the requested parameters are being clamped.
The app is here : https://play.google.com/store/apps/details?id=com.matthewjmouellette.snapdat.
I would love to post a code sample, but my program is 10K lines of code, with a lot of relevant code just for this problem.
Any help would be greatly appreciated.
EDIT:
I've tried 10+ different things and I'm out of ideas right now. I wish I could just save the video uncompressed; the storage should be able to keep up with a small enough image size and a medium fps.
It seems the encoding method just doesn't work for my video. The frames differ too much for the encoder's trick of "moving" parts of the previous frame to work, so I need full frames throughout. I am thinking something along the lines of M-JPEG would work really well. JPEGs tend to be about 1/10th the size of a bitmap, which should give a reasonable file size with very little CPU work, since it is image compression rather than video compression we'd be doing. I wish I had a good library for this.
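For what it's worth, the per-frame half of that idea is trivial on Android, and a raw concatenation of JPEGs is essentially an MJPEG elementary stream, which desktop tools such as ffmpeg can wrap into a playable container (a sketch; the method name is illustrative):

    // Compress one frame to JPEG and append it to the output stream.
    void writeFrame(Bitmap frame, OutputStream out) throws IOException {
        // Intra-frame only: every frame stands alone, so fast motion
        // cannot smear across frames the way it does with H.264.
        frame.compress(Bitmap.CompressFormat.JPEG, 80, out);
    }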
I want to create live painting video as export feature for a painting application.
I can create a video from a series of images using a library (FFmpeg or MediaCodec). But this would require too much processing power to compare the images and encode the video.
While drawing, I know exactly which pixels changed, so I could save a lot of processing if I could pass this information to FFmpeg instead of having FFmpeg figure it out from the images.
Is there a way to efficiently encode the video for this purpose?
It should not require "too much processing power" for MediaCodec: devices are capable of writing video in real time, and some of them can write full HD video. There's another thing: each MediaCodec encoder requires pixel data in a specific format, so you should query the API for supported capabilities before using it. It will also be tricky to make your app work with MediaCodec on all devices if it produces only one pixel format, because probably not all devices will support it (in other words, different vendors have different MediaCodec implementations).
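For example, the supported input color formats of the H.264 encoders can be queried like this (using the pre-API-21 MediaCodecList calls, since they work everywhere):

    // List every H.264 encoder and the color formats it accepts.
    for (int i = 0; i < MediaCodecList.getCodecCount(); i++) {
        MediaCodecInfo info = MediaCodecList.getCodecInfoAt(i);
        if (!info.isEncoder()) continue;
        for (String type : info.getSupportedTypes()) {
            if (!type.equalsIgnoreCase("video/avc")) continue;
            MediaCodecInfo.CodecCapabilities caps =
                    info.getCapabilitiesForType(type);
            for (int colorFormat : caps.colorFormats) {
                Log.d("Caps", info.getName() + " supports " + colorFormat);
            }
        }
    }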
I am trying to develop an application in which a BeagleBone platform captures video images from a camera connected to it and sends them (through an Internet socket) to an Android application, so that the application displays the video images.
I have read that OpenCV may be a very good option for capturing the images from a camera, but I am not sure how the images can then be sent through a socket.
On the other end, I think the video images received by the Android application could be treated as simple images. With this in mind, I think I can refresh the image every second or so.
I am not sure I am on the right track with this implementation, so I would really appreciate any suggestions and help you could provide.
Thanks in advance, Gus.
The folks at OpenROV have done something like what you describe. Instead of using a custom Android app, which is certainly possible, they simply used a web browser to display the captured images.
https://github.com/OpenROV/openrov-software
This application uses OpenCV to perform the capture and analysis, a Node.js application to transmit the data over socket.io to the web browser, and a web client to display the video. An architecture description of how this works is given here:
http://www.youtube.com/watch?v=uvnAYDxbDUo
You can also look at running something like mjpg-streamer:
http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1051&context=cpesp
Note that displaying the video stream as a series of images can have a big performance impact. For example, if you are not careful how you encode each frame, you can more than double the traffic between the two systems: ARGB takes 32 bits per pixel while YUV takes 12, so even accounting for frame compression, you are still more than doubling the storage per frame. Also, rendering ARGB is much, much slower than rendering YUV, since most Android phones have hardware-optimized YUV rendering (the GPU can blit YUV directly into display memory). In addition, the frame-by-frame approach usually makes one take the easy way out and render a Bitmap on a Canvas, which works if you are content with something on the order of 10-15 fps, but can never get to 60 fps, and reaches a peak (not sustained) of 30 fps only on very few phones.
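As a concrete sketch of that Bitmap-on-Canvas path (the length-prefixed framing is an assumption about the sender's protocol):

    // Read length-prefixed JPEG frames from the socket and draw them.
    void receiveLoop(Socket socket, SurfaceHolder holder) throws IOException {
        DataInputStream in = new DataInputStream(socket.getInputStream());
        while (true) {
            int len = in.readInt();          // frame length prefix
            byte[] jpeg = new byte[len];
            in.readFully(jpeg);
            Bitmap frame = BitmapFactory.decodeByteArray(jpeg, 0, len);
            Canvas canvas = holder.lockCanvas();
            try {
                canvas.drawBitmap(frame, 0, 0, null);
            } finally {
                holder.unlockCanvasAndPost(canvas);
            }
        }
    }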
If you have a hardware MPEG encoder on the Beaglebone board, you should use it to encode and stream the video. This would allow you to directly pass the MPEG stream to the standard Android media player for rendering. Of course, using the standard media player will not allow you to process the video stream in real time, so depending on your scenario this might not be an option for you.