I have been using the Android NDK Camera sample; with it one can read frames in the AIMAGE_FORMAT_YUV_420_888 format via the yuvreader_ inside DrawFrame at 30 Hz. I validated that 30 Hz is achieved by recording the timestamp in each image and printing it. I am using a Samsung Galaxy S9.
I am now trying to obtain JPEG images instead of the YUV ones, also at 30 Hz, but have not yet succeeded and was wondering if someone could help.
From what I understood, the capture session in this sample creates requests for both a "preview" and a "still capture", where YUV is used for the preview and JPEG for the still capture. What I did was set jpgReader_ as the preview reader as well, and then check the timestamps of the frames captured in the ImageCallback here (I commented out the step of writing to file and just called AImage_delete(image) to free the buffer instead). However, the result I get is frames at intervals of 33, 66, 99 and 133 ms, fairly evenly distributed, so many frames get skipped.
Any ideas of what the problem could be?
Many camera devices cannot produce 30 JPEG images per second; that's why the camera API explicitly uses the YUV (or private) format for preview and video, and why a typical video recording session involves an H.264 or VP8 encoder rather than a stream of JPEGs.
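For reference, here is a minimal sketch of that split using the Java camera2 API (the NDK sample mirrors the same structure); device, session, handler and the image sizes are placeholder assumptions, and exception handling is omitted. The YUV reader is the target of the repeating request and so runs at the full frame rate, while the JPEG reader is only attached to occasional one-shot still-capture requests:

// Sketch only: the YUV reader drives the ~30 fps repeating request,
// the JPEG reader is used only for one-shot still captures.
// Attach OnImageAvailableListener callbacks to both readers as in the sample.
ImageReader yuvReader = ImageReader.newInstance(1280, 720,
        ImageFormat.YUV_420_888, /*maxImages=*/4);
ImageReader jpegReader = ImageReader.newInstance(4032, 3024,
        ImageFormat.JPEG, /*maxImages=*/2);

CaptureRequest.Builder preview =
        device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW);
preview.addTarget(yuvReader.getSurface());      // streamed continuously
session.setRepeatingRequest(preview.build(), null, handler);

CaptureRequest.Builder still =
        device.createCaptureRequest(CameraDevice.TEMPLATE_STILL_CAPTURE);
still.addTarget(jpegReader.getSurface());       // one shot, much slower
session.capture(still.build(), null, handler);

If you really need 30 compressed frames per second, keeping the YUV stream and feeding it to a video (H.264/VP8) encoder is the path the API is designed for.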
We're working on an Android App that requires resizing (frame size) and compressing videos. We tested the code sample below and it's currently slow:
https://github.com/hoolrory/AndroidVideoSamples/blob/master/CommonVideoLibrary/src/com/roryhool/commonvideolibrary/VideoResampler.java
The output video frame size is reduced (e.g., to 480x320), and the bit rate is also reduced to achieve compression. The final video looks very good, and the compression ratios are good too. It's just that the process is slow. I tested on a Galaxy S4 running Android 4.4 and a Galaxy Note 5 running Android 6.0. The latter is faster, but not by much. On the Galaxy S4, a 30-second video takes about a minute to compress (on average).
The code above decodes the input video onto an input surface, reduces the frame size, and outputs to an output surface. MediaMuxer is used to mux in the audio. The example uses an MPEG-4 container and the H.264 encoder. Some relevant questions:
Are there some parameters we can use to speed up the compression?
How is the video compression speed affected by the target bit rate and frame size, if any?
We didn't use FFmpeg. Is that faster?
Any pointers or hints, even if not related to the code sample above, would be highly appreciated.
Thank you very much!
Omar
Your problem is with how you synchronously wait for events on one of the components (encoder or decoder). Either rebuild the code to run with asynchronous callbacks, or lower the timeouts.
See https://stackoverflow.com/a/37513916/3115956 for a longer explanation with more references, and https://github.com/mstorsjo/android-decodeencodetest for an example on how to use the asynchronous mode effectively.
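As a rough illustration of what the asynchronous mode looks like (this is a generic sketch, not the code from the linked project; extractor, format and outputSurface are assumed to be already set up, exception handling is omitted, and the same pattern applies to the encoder side):

// Sketch: MediaCodec in asynchronous mode (API 21+), no blocking dequeue calls.
MediaCodec decoder = MediaCodec.createDecoderByType(
        format.getString(MediaFormat.KEY_MIME));
decoder.setCallback(new MediaCodec.Callback() {
    @Override
    public void onInputBufferAvailable(MediaCodec codec, int index) {
        ByteBuffer buf = codec.getInputBuffer(index);
        int size = extractor.readSampleData(buf, 0);
        if (size < 0) {
            codec.queueInputBuffer(index, 0, 0, 0,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM);
        } else {
            codec.queueInputBuffer(index, 0, size,
                    extractor.getSampleTime(), 0);
            extractor.advance();
        }
    }
    @Override
    public void onOutputBufferAvailable(MediaCodec codec, int index,
            MediaCodec.BufferInfo info) {
        // Rendering to the encoder's input surface hands the frame over
        // without any copies or polling.
        codec.releaseOutputBuffer(index, /*render=*/true);
    }
    @Override
    public void onOutputFormatChanged(MediaCodec codec, MediaFormat newFormat) { }
    @Override
    public void onError(MediaCodec codec, MediaCodec.CodecException e) { }
});
decoder.configure(format, outputSurface, null, 0);   // setCallback before configure
decoder.start();

With both codecs driven this way there is no fixed timeout to tune; each component runs as fast as the hardware allows.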
The basic issue I am trying to solve is to delay what is sent to a virtual display by a second or so. So basically, I am trying to shift all frames by 1 second after the initial recording. Note that a surface is used as an input and another surface is used as an output through this virtual display. My initial hunch is to explore a few ideas, given that modification of the Android framework or use of non-public APIs is fine. Java or native C/C++ is fine.
a) I tried delaying frames posted to the virtual display or output surface by a second or two in SurfaceFlinger. This does not work as it causes all surfaces to be delayed by the same amount of time (synchronous processing of frames).
b) MediaCodec takes a surface as input for encoding and then produces the encoded data. Is there any way to use MediaCodec so that it does not actually encode and only produces the unencoded raw frames? That seems unlikely. Moreover, how does MediaCodec do this under the hood? Does it process things frame by frame? If I can extrapolate that method, I might be able to extract frames one by one from my input surface and create a ring buffer delayed by the amount of time I require.
c) How do software decoders, such as FFmpeg, actually do this on Android? I assume they take in a surface, but how would they get at and process it frame by frame?
Note that I can certainly encode and decode to retrieve the frames and post them but I want to avoid actually decoding. Note that modifying the Android framework or using non-public APIs is fine.
I also found this: Getting a frame from SurfaceView
It seems like option d) could be using a SurfaceTexture but I would like to avoid the process of encoding/decoding.
As I understand it, you have a virtual display that is sending its output to a Surface. If you just use a SurfaceView for output, frames output by the virtual display appear on the physical display immediately. The goal is to introduce one second of latency between when the virtual display generates a frame and when the Surface consumer receives it, so that (again using SurfaceView as an example) the physical display shows everything a second late.
The basic concept is easy enough: send the virtual display output to a SurfaceTexture, and save each frame into a circular buffer; meanwhile another thread reads frames out of the tail end of the circular buffer and displays them. The trouble with this is what @AdrianCrețu pointed out in the comments: one second of full-resolution screen data at 60fps will occupy a significant fraction of the device's memory. Not to mention that copying that much data around will be fairly expensive, and some devices might not be able to keep up.
(It doesn't matter whether you do it in the app or in SurfaceFlinger... the data for up to 60 screen-sized frames has to be held somewhere for a full second.)
You can reduce the volume of data in various ways:
Reduce the resolution. Scaling 2560x1600 to 1280x800 removes 3/4 of the pixels. The loss of quality should be difficult to notice on most displays, but it depends on what you're viewing.
Reduce the color depth. Switching from ARGB8888 to RGB565 will cut the size in half. This will be noticeable though.
Reduce the frame rate. You're generating the frames for the virtual display, so you can choose to update it more slowly. Animation is still reasonably smooth at 30fps, halving the memory requirements.
Apply image compression, e.g. PNG or JPEG. Fairly effective, but too slow without hardware support.
Encode inter-frame differences. If not much is changing from frame to frame, the incremental changes can be very small. Desktop-mirroring technologies like VNC do this. Somewhat slow to do in software.
A video codec like AVC will both compress frames and encode inter-frame differences. That's how you get 1GByte/sec down to 10Mbit/sec and still have it look pretty good.
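(For scale: 2560x1600 pixels x 4 bytes x 60 fps is roughly 983 MB of raw RGBA data per second, which is where the 1 GByte/sec figure comes from; 10 Mbit/sec is on the order of 1/800th of that.)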
Consider, for example, the "continuous capture" example in Grafika. It feeds the Camera output into a MediaCodec encoder, and stores the H.264-encoded output in a ring buffer. When you hit "capture", it saves the last 7 seconds. This could just as easily play the camera feed with a 7-second delay, and it only needs a few megabytes of memory to do it.
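A rough sketch of the delayed-playback variant of that idea (the class and method names here are made up for illustration): keep the encoded packets together with their presentation timestamps, and only hand each one to a decoder once it is a second old. Because the packets are H.264, the whole one-second window fits in a few megabytes at most.

import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import android.media.MediaCodec;

// Sketch: delay buffer for encoded (H.264) packets, ordered by timestamp.
class DelayedFrameBuffer {
    static final long DELAY_US = 1_000_000L;          // 1 second of delay
    static class Packet {
        final byte[] data;
        final long ptsUs;
        final int flags;
        Packet(byte[] data, long ptsUs, int flags) {
            this.data = data; this.ptsUs = ptsUs; this.flags = flags;
        }
    }
    private final ArrayDeque<Packet> queue = new ArrayDeque<>();

    // Called from the encoder's output side with each encoded buffer.
    synchronized void push(ByteBuffer buf, MediaCodec.BufferInfo info) {
        byte[] copy = new byte[info.size];
        buf.position(info.offset);
        buf.get(copy, 0, info.size);
        queue.addLast(new Packet(copy, info.presentationTimeUs, info.flags));
    }

    // Called by the playback thread; returns the oldest packet that is at
    // least DELAY_US old, or null if nothing is due yet.
    synchronized Packet pollDue(long nowUs) {
        Packet head = queue.peekFirst();
        if (head != null && nowUs - head.ptsUs >= DELAY_US) {
            return queue.pollFirst();
        }
        return null;
    }
}

The playback thread feeds each due packet into a video decoder whose output surface is the SurfaceView (or whatever consumer) you want to show the delayed image on.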
The "screenrecord" command can dump H.264 output or raw frames across the ADB connection, though in practice ADB is not fast enough to keep up with raw frames (even on tiny displays). It's not doing anything you can't do from an app (now that we have the mediaprojection API), so I wouldn't recommend using it as sample code.
If you haven't already, it may be useful to read through the graphics architecture doc.
I am trying to get specific frames at specific times as images from a movie using MediaExtractor and MediaCodec. I can do it successfully if:
I use extractor.seekTo(time, MediaExtractor.SEEK_TO_PREVIOUS_SYNC); however, this only gives the nearest sync frame, not the target frame.
I sequentially extract all frames using extractor.advance(); but I need only the target frame, not all of them.
So, I try the following:
extractor.seekTo(time, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
while (extractor.getSampleTime() < time /* target time */) extractor.advance();
This provides the correct frame, but for some reason the image is corrupted. It looks like the correct image (the one I get from the successful cases), but with some pixelation and a strange haze.
The while-loop is the only thing that is different between the successful cases and the corrupted ones. What to do to advance MediaExtractor to a specific time (not just sync time) without getting a corrupted image?
Thanks to fadden's comment: the I-frame has the full picture while the P and B frames hold only differences (this is how compression is achieved), so I have to keep feeding the decoder. That is, I need to start at an I-frame (which is the same as the sync frame) and keep feeding the following frames to the decoder until it can produce the full image for the target frame.
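A condensed sketch of that flow (extractor, decoder and targetUs are assumed to be already set up: the extractor with the video track selected, the decoder configured for that track and started): seek to the preceding sync frame, feed every sample from there onwards into the decoder, and only keep the decoded frame whose presentation time reaches the target.

// Sketch: decode from the previous sync frame up to the target time.
extractor.seekTo(targetUs, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean gotTarget = false;
while (!gotTarget) {
    int inIndex = decoder.dequeueInputBuffer(10_000);
    if (inIndex >= 0) {
        ByteBuffer buf = decoder.getInputBuffer(inIndex);
        int size = extractor.readSampleData(buf, 0);
        if (size >= 0) {
            decoder.queueInputBuffer(inIndex, 0, size,
                    extractor.getSampleTime(), 0);
            extractor.advance();
        } else {
            decoder.queueInputBuffer(inIndex, 0, 0, 0,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM);
        }
    }
    int outIndex = decoder.dequeueOutputBuffer(info, 10_000);
    if (outIndex >= 0) {
        // Frames before the target are decoded but dropped; only the
        // target frame is rendered (or read back as an image).
        gotTarget = info.presentationTimeUs >= targetUs;
        decoder.releaseOutputBuffer(outIndex, /*render=*/gotTarget);
    }
}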
I am facing a programming problem.
I am trying to encode video from camera frames that I have merged with frames retrieved from another layer (like a bitmap or GL surface).
At 320x240 I can do the merge in real time at an acceptable frame rate (~10 fps), but when I increase the frame size I get less than 6 fps.
That is understandable, since the cost of my merging function depends on the number of pixels.
So what I am asking is: how can I store these arrays of frames for later processing (encoding)?
I don't know how to store such large arrays.
Just a quick calculation:
If I need to store 10 frames per second, and each frame is 960x720 pixels, then for a 40-second video I need to store 40 x 10 x 960 x 720 x 1.5 (the Android YUV factor) ≈ 415 MB.
That is far too much for the heap.
Any ideas?
You can simply record the camera input as video - 40 seconds will not be a large file even at 720p resolution - and then, offline, decode, merge, and encode again. The big trick is that MediaRecorder uses the hardware encoder, so encoding will be really fast. Being compressed, the video can be written to the SD card or local file system in real time, and reading it back for decoding is not an issue either.
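A rough sketch of that recording step with the classic Camera + MediaRecorder APIs (the size, bit rate and output path are example values; camera is an opened android.hardware.Camera, previewSurfaceHolder is the SurfaceHolder you already preview on, and exception handling is omitted):

// Sketch: record the camera to a compressed MP4 using the hardware encoder.
camera.unlock();                                  // hand the camera to MediaRecorder
MediaRecorder recorder = new MediaRecorder();
recorder.setCamera(camera);
recorder.setVideoSource(MediaRecorder.VideoSource.CAMERA);
recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H264);
recorder.setVideoSize(960, 720);
recorder.setVideoFrameRate(30);
recorder.setVideoEncodingBitRate(4_000_000);      // ~4 Mbit/s, example value
recorder.setOutputFile("/sdcard/merge_source.mp4");   // placeholder path
recorder.setPreviewDisplay(previewSurfaceHolder.getSurface());
recorder.prepare();                               // throws IOException
recorder.start();
// ... record the ~40 seconds, then:
recorder.stop();
recorder.release();
camera.lock();

At that bit rate the 40-second file is around 20 MB on flash storage instead of hundreds of megabytes on the heap; afterwards you decode it (MediaExtractor + MediaCodec), merge each frame with the overlay, and re-encode.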
I am trying to develop an application in which a Beaglebone platform captures video images from a camera connected to it and then sends them (through an internet socket) to an Android application, so that the application shows the video images.
I have read that OpenCV may be a very good option for capturing the images from a camera, but I am not sure how the images can then be sent through a socket.
On the other end, I think the video images received by the Android application could be treated as simple still images. With this in mind, I think I can refresh the image every second or so.
I am not sure if I am in the right way for the implementation, so I really appreciate any suggestion and help you could provide.
Thanks in advance, Gus.
The folks at OpenROV have done something like you've said. Instead of using a custom Android app, which is certainly possible, they've simply used a web browser to display the images captured.
https://github.com/OpenROV/openrov-software
This application uses OpenCV to perform the capture and analysis, a Node.js application to transmit the data over socket.io to the web browser, and a web client to display the video. A description of how this architecture works is given here:
http://www.youtube.com/watch?v=uvnAYDxbDUo
You can also look at running something like mjpg-streamer:
http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1051&context=cpesp
Note that displaying the video stream as a set of images can have a big performance impact. For example, if you are not careful about how you encode each frame, you can more than double the traffic between the two systems. ARGB takes 32 bits to encode a pixel while YUV takes 12 bits, so even accounting for frame compression you are still roughly doubling the storage per frame. Also, rendering ARGB is much, much slower than rendering YUV, as most Android phones have hardware-optimized YUV rendering (the GPU can blit YUV directly into display memory). In addition, rendering separate frames as an approach usually makes one take the easy way and draw a Bitmap on a Canvas, which works if you are content with something on the order of 10-15 fps, but will never get to 60 fps, and can reach a peak (not sustained) of 30 fps only on very few phones.
If you have a hardware MPEG encoder on the Beaglebone board, you should use it to encode and stream the video. This would allow you to directly pass the MPEG stream to the standard Android media player for rendering. Of course, using the standard media player will not allow you to process the video stream in real time, so depending on your scenario this might not be an option for you.
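If you do go the stream-plus-platform-player route, the playback side can be as small as this (the RTSP URL is a placeholder for whatever the Beaglebone actually serves, surfaceView is a SurfaceView from the layout, and exception handling is omitted):

// Sketch: play a network video stream with the standard Android MediaPlayer.
MediaPlayer player = new MediaPlayer();
player.setDataSource("rtsp://beaglebone.local:8554/live");   // placeholder URL
player.setDisplay(surfaceView.getHolder());
player.setOnPreparedListener(new MediaPlayer.OnPreparedListener() {
    @Override
    public void onPrepared(MediaPlayer mp) {
        mp.start();
    }
});
player.prepareAsync();   // prepare off the UI thread, start when ready

This gives smooth hardware-decoded playback but, as noted above, no per-frame access on the Android side.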