I am developing an Android application that streams live video from the phone to a PC. I am capturing video frame by frame in Camera.onPreviewFrame() and sending the acquired byte[] YUV data to the server over a socket.
This approach is working fine; the only problem I am facing is the frame rate. It is currently 4-5 fps and I want to achieve 15-16 fps.
To achieve this, I am thinking of compressing the YUV data. Currently my app gives me frames at a resolution of 320x240. I want to scale them down so that I can reduce the number of bytes to send over the network. Is there any library or algorithm which can do this?
Is there any other way of streaming live video from an Android phone to a PC?
I recommend resizing the YUV data (note that the maximum preview resolution differs from phone to phone). A rough pipeline, with a sketch of the resize step below:
YUV -> another color space (RGBA, BGRA, ARGB, etc.)
resize the RGBA image using OpenCV or your own math
do your processing
resize (up) on the other end
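Rather than round-tripping through RGBA, here is a hedged sketch of the resize step applied directly to the NV21 preview frame (nearest-neighbour, fixed factor of two; the method name is illustrative):

    // Nearest-neighbour downscale of an NV21 frame by 2 in each dimension.
    // Input:  NV21 data of size width*height*3/2
    // Output: NV21 data of size (width/2)*(height/2)*3/2
    public static byte[] downscaleNv21ByTwo(byte[] src, int width, int height) {
        int outW = width / 2, outH = height / 2;
        byte[] dst = new byte[outW * outH * 3 / 2];

        // Luma plane: keep every second pixel of every second row.
        for (int y = 0; y < outH; y++) {
            int srcRow = (y * 2) * width;
            int dstRow = y * outW;
            for (int x = 0; x < outW; x++) {
                dst[dstRow + x] = src[srcRow + x * 2];
            }
        }

        // Interleaved VU plane: one VU pair per 2x2 luma block, so keep every
        // second pair of every second chroma row.
        int srcChroma = width * height;
        int dstChroma = outW * outH;
        for (int y = 0; y < outH / 2; y++) {
            int srcRow = srcChroma + (y * 2) * width;
            int dstRow = dstChroma + y * outW;
            for (int x = 0; x < outW / 2; x++) {
                dst[dstRow + x * 2]     = src[srcRow + x * 4];     // V
                dst[dstRow + x * 2 + 1] = src[srcRow + x * 4 + 1]; // U
            }
        }
        return dst;
    }

Halving each dimension cuts the payload to a quarter, which on its own may be enough to move from 4-5 fps toward the 15-16 fps target before you even consider compression.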
I've got an Android application which does motion detection and video recording. It supports both the Camera and Camera2 APIs in order to provide backwards compatibility. I'm using an ImageReader with the Camera2 API in order to do motion detection. I'm currently requesting JPEG format images, which are very slow. I understand that requesting YUV images would be faster, but is it true that the YUV format varies depending on which device is being used? I just wanted to check before I give up on optimizing this.
All devices will support NV21 and YV12 formats for the old camera API (since API 12), and for camera2, all devices will support YUV_420_888.
YUV_420_888 is a flexible YUV format, so it can represent multiple underlying formats (including NV21 and YV12). So you'll need to check the pixel and row strides in the Images from the ImageReader to ensure you're reading through the 3 planes of data correctly.
If you need full frame rate, you need to work in YUV - JPEG has a lot of encoding overhead and generally won't run faster than 2-10fps, while YUV will run at 30fps at least at preview resolutions.
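For what it's worth, here is a hedged sketch of reading one plane of a YUV_420_888 Image while honouring the pixel and row strides mentioned above (the helper name is mine, not from any library):

    import android.media.Image;
    import java.nio.ByteBuffer;

    // Copies one plane (0 = Y, 1 = U, 2 = V) into a tightly packed byte array.
    public static byte[] readPlane(Image image, int planeIndex) {
        Image.Plane plane = image.getPlanes()[planeIndex];
        ByteBuffer buffer = plane.getBuffer();
        int rowStride = plane.getRowStride();
        int pixelStride = plane.getPixelStride();

        // Chroma planes are subsampled by 2 in each dimension for YUV_420_888.
        int width = (planeIndex == 0) ? image.getWidth() : image.getWidth() / 2;
        int height = (planeIndex == 0) ? image.getHeight() : image.getHeight() / 2;

        byte[] out = new byte[width * height];
        byte[] row = new byte[rowStride];
        for (int y = 0; y < height; y++) {
            buffer.position(y * rowStride);
            // The final row may be shorter than rowStride.
            int length = Math.min(rowStride, buffer.remaining());
            buffer.get(row, 0, length);
            for (int x = 0; x < width; x++) {
                out[y * width + x] = row[x * pixelStride];
            }
        }
        return out;
    }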
I solved this problem by using the luminance (Y) values only, the format for which doesn't vary between devices. For the purposes of motion detection, a black and white image is fine. This also gets around the problem on API Level 21 where some of the U and V data is missing when using the ImageReader.
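A minimal illustration of that luminance-only comparison between two consecutive frames; the threshold values are placeholders, not anything prescribed:

    // Returns true if enough Y (luma) pixels changed between two equal-sized frames.
    public static boolean motionDetected(byte[] prevY, byte[] currY) {
        final int PIXEL_THRESHOLD = 25;      // per-pixel luma difference (assumed)
        final double RATIO_THRESHOLD = 0.01; // fraction of changed pixels (assumed)

        int changed = 0;
        for (int i = 0; i < currY.length; i++) {
            int diff = Math.abs((currY[i] & 0xFF) - (prevY[i] & 0xFF));
            if (diff > PIXEL_THRESHOLD) {
                changed++;
            }
        }
        return changed > currY.length * RATIO_THRESHOLD;
    }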
I'm using PreviewDisplay to create a custom camera app, and the onPreviewFrame callback to manipulate each frame (in my case, sending an image to the server once every pre-defined number of frames while keeping the video preview smooth for the user).
The highest resolution returned by getSupportedPreviewSizes is lower than the best resolution of images captured by the built-in camera application.
Is there any way to get frames at the best resolution achieved by the built-in camera application?
Try getSupportedVideoSizes(), also getPreferredPreviewSizeForVideo().
Note that in some cases, the camera may be able to produce higher res frames, and pipe them to the hardware encoder for video recording, but not have bandwidth to push them to onPreviewFrame() callback.
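A small sketch of querying those two methods on the old android.hardware.Camera API; note that getSupportedVideoSizes() may return null when the camera does not distinguish video sizes from preview sizes:

    import android.hardware.Camera;
    import android.util.Log;
    import java.util.List;

    public static void logVideoSizes() {
        Camera camera = Camera.open();
        try {
            Camera.Parameters params = camera.getParameters();
            List<Camera.Size> videoSizes = params.getSupportedVideoSizes();   // may be null
            Camera.Size preferred = params.getPreferredPreviewSizeForVideo(); // may be null
            if (videoSizes != null) {
                for (Camera.Size s : videoSizes) {
                    Log.d("CameraSizes", "video size: " + s.width + "x" + s.height);
                }
            }
            if (preferred != null) {
                Log.d("CameraSizes", "preferred preview size for video: "
                        + preferred.width + "x" + preferred.height);
            }
        } finally {
            camera.release();
        }
    }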
I am working on an Android project that processes video frames, and I need to handle every frame before displaying it. The processing includes scaling frames up from 1920x1080 to 2560x1440, color space conversion, and some necessary RGB-based image processing, and all of this work has to finish within 33-40 ms.
I have optimized the YUV->RGB conversion and the other processing with ARM NEON, and they work well. But I first have to scale each frame up from 1080p to 2K, and that is now the performance bottleneck.
My question is how to efficiently scale an image up from 1080p to 2K within 20 ms. I don't have much experience with scaling algorithms, so any suggestions are helpful.
Could I use ARM NEON to optimize the existing algorithm?
The hardware environment:
CPU: Samsung Exynos 5420
Memory: 3GB
Display: 2560x1600 px
Update:
I will describe my decoding process: I use MediaCodec to decode the normal (H.264) video to YUV (NV12); the default decoder is hardware-backed and very fast. Then I use ARM NEON to convert NV12 to RGBW and send the RGBW frame to SurfaceFlinger for display. I just use a normal SurfaceView rather than a GLSurfaceView.
The bottleneck is how to scale up YUV from 1080p to 2K fast.
I find that examples work well, so allow me to lead with this example program that uses OpenGL shaders to convert from YUV -> RGB: http://www.fourcc.org/source/YUV420P-OpenGL-GLSLang.c
What I envision for your program is:
Hardware video decodes H.264 stream -> YUV array
Upload that YUV array as a texture to OpenGL; actually, you will upload 3 different textures-- Y, U, and V
Run a fragment shader that converts those Y, U, and V textures into an RGB(W) image; this will produce a new texture in video memory
Run a new fragment shader against the texture generated in the previous step in order to scale the image
There might be a bit of a learning curve involved here, but I think it's workable, given your problem description. Take it one step at a time: get the OpenGL framework in place, try uploading just the Y texture and writing a naive fragment shader that just emits a grayscale pixel based on the Y sample, then move onto correctly converting the image, then get a really naive upsampler working, then put a more sophisticated upsampler into service.
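For reference, here is a minimal sketch of the conversion shader from the third step, embedded as a Java string the way Android GLES 2.0 code usually carries shader source; it assumes the Y, U and V planes were uploaded as three single-channel (luminance) textures and uses approximate BT.601 coefficients:

    // Hypothetical fragment shader: samples three single-channel textures and
    // converts YUV to RGB. Uniform/varying names are illustrative.
    static final String YUV_TO_RGB_FRAGMENT_SHADER =
            "precision mediump float;\n"
            + "varying vec2 vTexCoord;\n"
            + "uniform sampler2D yTexture;\n"
            + "uniform sampler2D uTexture;\n"
            + "uniform sampler2D vTexture;\n"
            + "void main() {\n"
            + "    float y = texture2D(yTexture, vTexCoord).r;\n"
            + "    float u = texture2D(uTexture, vTexCoord).r - 0.5;\n"
            + "    float v = texture2D(vTexture, vTexCoord).r - 0.5;\n"
            + "    float r = y + 1.402 * v;\n"
            + "    float g = y - 0.344 * u - 0.714 * v;\n"
            + "    float b = y + 1.772 * u;\n"
            + "    gl_FragColor = vec4(r, g, b, 1.0);\n"
            + "}\n";

The upscale pass in the fourth step can be as simple as drawing the resulting RGB texture into a larger viewport with GL_LINEAR filtering, or a more elaborate shader if bilinear quality isn't good enough.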
I'd recommend OpenGL ES too, mainly because of the project I'm currently working on, which also plays video. For me, the display is 1920x1080, so the texture I'm using is 2048x1024. I get approximately 35 fps on a quad-core ARM7.
Use a GLSurfaceView and your own custom renderer. If you're using ffmpeg, then once you've decoded your video frames, use sws_scale to scale the frame and then just upload it into the OpenGL texture. The larger your texture/display, the lower the fps, because a lot of time is spent uploading large images to the GPU every frame.
What you need to research for decoding your video input depends on your requirements. For me, I had to compile ffmpeg for Android and start from there.
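To make the "upload it into the OpenGL texture" step concrete, here is a hedged sketch of a GLSurfaceView.Renderer that allocates the texture once and re-uploads the decoded RGBA frame on each draw with glTexSubImage2D (class and field names are mine):

    import android.opengl.GLES20;
    import android.opengl.GLSurfaceView;
    import java.nio.ByteBuffer;
    import javax.microedition.khronos.egl.EGLConfig;
    import javax.microedition.khronos.opengles.GL10;

    public class FrameRenderer implements GLSurfaceView.Renderer {
        private final int width, height;
        private int textureId;
        private volatile ByteBuffer latestFrame;  // RGBA pixels from the decoder/sws_scale

        public FrameRenderer(int width, int height) {
            this.width = width;
            this.height = height;
        }

        public void setFrame(ByteBuffer rgba) {   // called from the decoding thread
            latestFrame = rgba;
        }

        @Override
        public void onSurfaceCreated(GL10 gl, EGLConfig config) {
            int[] tex = new int[1];
            GLES20.glGenTextures(1, tex, 0);
            textureId = tex[0];
            GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureId);
            GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
            GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
            // Allocate storage once; per-frame data goes in via glTexSubImage2D.
            GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_RGBA, width, height, 0,
                    GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, null);
        }

        @Override
        public void onSurfaceChanged(GL10 gl, int w, int h) {
            GLES20.glViewport(0, 0, w, h);
        }

        @Override
        public void onDrawFrame(GL10 gl) {
            ByteBuffer frame = latestFrame;
            if (frame != null) {
                frame.position(0);  // upload from the start of the buffer
                GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, textureId);
                GLES20.glTexSubImage2D(GLES20.GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, frame);
            }
            // ... bind the shader program and draw a full-screen quad here ...
        }
    }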
My apologies for putting this in an answer; I don't have enough points to make a comment.
I'd like to add that you might run into OpenGL texture size limitations. I tried to use OpenGL for the opposite problem, scaling down from the camera in real time. The problem is that the maximum OpenGL texture size is 2048x2048. I'm not sure if this is true for all devices, but the limit held on newer kit like the N7 2013 and the LG2. In the end, I had to write it in the NDK without OpenGL, optimising the hell out of it by hand.
Good luck, though.
I am facing a programming problem.
I am trying to encode video from camera frames that I have merged with frames retrieved from another layer (like a bitmap/GLSurface).
When I use 320x240 I can do the merge in real time at an acceptable frame rate (~10 fps), but when I try to increase the frame size I get less than 6 fps.
That is understandable, as my merging function's cost depends on the number of pixels.
So what I am asking is: how can I store those arrays of frames for later processing (encoding)?
I don't know how to store such large arrays.
Just a quick calculation:
if I need to store 10 frames per second,
and each frame is 960x720 pixels,
then for a 40 second video I need to store 40 x 10 x 960 x 720 x (3/2, the Android YUV factor) bytes, which is roughly 415 MB.
That is too much for the heap.
Any ideas?
You can simply record the camera input as video - 40 seconds will not be a large file even at 720p resolution - and then, offline, you can decode, merge, and encode again. The big trick is that MediaRecorder will use the hardware encoder, so encoding will be really fast. Being compressed, the video can be written to the sdcard or local file system in real time, and reading it back for decoding is not an issue either.
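A minimal sketch of that MediaRecorder path, assuming an already-opened android.hardware.Camera and a preview Surface (size, frame rate, bitrate and output path are placeholders):

    import android.hardware.Camera;
    import android.media.MediaRecorder;
    import android.view.Surface;

    public class ClipRecorder {
        private MediaRecorder recorder;

        public void start(Camera camera, Surface previewSurface, String outputPath) throws Exception {
            camera.unlock();                          // hand the camera over to MediaRecorder
            recorder = new MediaRecorder();
            recorder.setCamera(camera);
            recorder.setVideoSource(MediaRecorder.VideoSource.CAMERA);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
            recorder.setVideoEncoder(MediaRecorder.VideoEncoder.H264);
            recorder.setVideoSize(1280, 720);         // 720p, as discussed above
            recorder.setVideoFrameRate(30);
            recorder.setVideoEncodingBitRate(4000000);
            recorder.setOutputFile(outputPath);       // e.g. a file on the sdcard
            recorder.setPreviewDisplay(previewSurface);
            recorder.prepare();
            recorder.start();                         // encoding runs in hardware
        }

        public void stop(Camera camera) {
            recorder.stop();
            recorder.release();
            camera.lock();                            // take the camera back for preview
        }
    }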
I am trying to develop an application in which a Beaglebone platform captures video images from a camera connected to it and then sends them (through an internet socket) to an Android application, so that the application can display the video images.
I have read that OpenCV may be a very good option for capturing the images from the camera, but I am not sure how the images can then be sent through a socket.
On the other end, I think the video images received by the Android application could be treated as simple images. With this in mind, I think I can refresh the image every second or so.
I am not sure if I am on the right track with this implementation, so I really appreciate any suggestions and help you could provide.
Thanks in advance, Gus.
The folks at OpenROV have done something like what you've described. Instead of using a custom Android app, which is certainly possible, they simply use a web browser to display the captured images.
https://github.com/OpenROV/openrov-software
This application uses OpenCV to perform the capture and analysis, a Node.js application to transmit the data over socket.io to the web browser, and a web client to display the video. A description of how this architecture works is given here:
http://www.youtube.com/watch?v=uvnAYDxbDUo
You can also look at running something like mjpg-streamer:
http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1051&context=cpesp
Note that displaying the video stream as a set of images can have a big performance impact. For example, if you are not careful about how you encode each frame, you can more than double the traffic between the two systems: ARGB takes 32 bits per pixel while YUV takes 12 bits, so even accounting for frame compression you still more than double the storage per frame. Also, rendering ARGB is much, much slower than rendering YUV, as most Android phones have hardware-optimized YUV rendering (the GPU can blit the YUV directly into display memory). In addition, rendering separate frames as an approach usually makes one take the easy way and draw a Bitmap on a Canvas, which works if you are content with something on the order of 10-15 fps, but can never get to 60 fps, and can reach a peak (not sustained) of 30 fps only on very few phones.
If you have a hardware MPEG encoder on the Beaglebone board, you should use it to encode and stream the video. This would allow you to directly pass the MPEG stream to the standard Android media player for rendering. Of course, using the standard media player will not allow you to process the video stream in real time, so depending on your scenario this might not be an option for you.
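If you do go the standard-player route, a minimal sketch of pointing MediaPlayer at a network stream might look like this; the RTSP URL is a placeholder and assumes the Beaglebone side actually exposes such a stream:

    import android.media.MediaPlayer;
    import android.view.SurfaceHolder;

    public static MediaPlayer playStream(SurfaceHolder holder) throws Exception {
        MediaPlayer player = new MediaPlayer();
        player.setDisplay(holder);                              // render into a SurfaceView
        player.setDataSource("rtsp://192.168.1.50:8554/live");  // placeholder address
        player.setOnPreparedListener(mp -> mp.start());         // start once buffered
        player.prepareAsync();                                  // prepare off the UI thread
        return player;
    }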