Real-time image processing with Camera2 - Android

I have searched in a ton of places for a way to do this, with no results. As far as I know, the only way to obtain image frames is to use an ImageReader, which gives me an Image to work with. However, a lot of work must be done before I have a usable image: converting the Image to a byte array, converting between formats (YUV_420_888 to ARGB_8888) using RenderScript, then turning it into a Bitmap and rotating it manually (or running the application in landscape mode). By this point a lot of processing has already happened, and I haven't even started the actual processing yet (I plan on running some native code on it). Additionally, I tried lowering the resolution without success, and there is a significant delay when drawing on the surface.
Is there a better approach to this? Any help would be greatly appreciated.

I'm not sure what exactly you are doing with the images, but a lot of the time only a grayscale image is actually needed (again, depending on your exact goal). If your camera outputs YUV, the grayscale information is in the Y channel. The nice thing is that you don't need to convert between numerous colorspaces, and working with only one plane (as opposed to three) greatly reduces the size of your data set.
If you need color images, then this won't help.
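If grayscale is enough, a minimal sketch of pulling just the Y (luma) plane out of a camera2 YUV_420_888 Image could look like the following (class and method names are my own; it assumes a pixel stride of 1 for the Y plane, which is what devices report in practice):
import android.media.Image;
import java.nio.ByteBuffer;

public final class LumaExtractor {
    // Plane 0 of a YUV_420_888 Image is the full-resolution Y (luma) plane.
    public static byte[] extractLuma(Image image) {
        Image.Plane yPlane = image.getPlanes()[0];
        ByteBuffer buffer = yPlane.getBuffer();
        int width = image.getWidth();
        int height = image.getHeight();
        int rowStride = yPlane.getRowStride();

        byte[] luma = new byte[width * height];
        if (rowStride == width) {
            // Tightly packed: one bulk copy is enough.
            buffer.get(luma, 0, width * height);
        } else {
            // Rows are padded: copy row by row, skipping the padding.
            byte[] row = new byte[rowStride];
            for (int y = 0; y < height; y++) {
                buffer.position(y * rowStride);
                buffer.get(row, 0, Math.min(rowStride, buffer.remaining()));
                System.arraycopy(row, 0, luma, y * width, width);
            }
        }
        return luma;
    }
}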

Related

Wrong rotation of images using Android camera2 API in Google's example code?

I have tried writing my own code for accessing the camera via the camera2 API on Android instead of using Google's example. On the one hand, I've wasted way too much time understanding what exactly is going on, but on the other hand, I have noticed something quite weird:
I want the camera to produce vertical images. However, despite the fact that the ImageReader is initialized with height larger than width, the Image that I get in the onCaptureCompleted has the same size, except it is rotated 90 degrees. I was struggling trying to understand my mistake, so I went exploring Google's code. And what I have found is that their images are rotated 90 degrees as well! They compensate for that by setting a JPEG_ORIENTATION key in the CaptureRequestBuilder (if you comment that single line, the images that get saved will be rotated). This is unrelated to device orientation - in my case, screen rotation is disabled for the app entirely.
The problem is that for the purposes of the app I am making I need non-compressed, precise data from the camera, and since JPEG a) compresses images and b) does so lossily, I cannot use it. Instead I use the YUV_420_888 format, which I later convert to a Bitmap. But while the JPEG_ORIENTATION flag can fix the orientation for JPEG images, it seems to do nothing for YUV ones. So how do I get the images to be correctly rotated?
One obvious solution is to rotate the resulting Bitmap, but I'm unsure by what angle I should rotate it on different devices. And more importantly, what causes such strange behavior?
Update: rotating the Bitmap and scaling it to the proper size takes way too much time for the preview (the context is as follows: I need both high-res images from the camera to process and a downscaled version of these same images for the preview. Let's just say I'm making something similar to QR code recognition). I have even tried using RenderScript to manipulate the image efficiently, but this still takes too long. Also, I've read here that when I set multiple output surfaces simultaneously, the same resolution will be used for all of them, which is quite bad for me.
Android stores every image in landscape orientation, no matter whether it was taken in landscape or in portrait. It also stores metadata that tells you whether the image should be displayed in portrait or landscape.
If you don't rotate the image according to the metadata, you will end up with every image in landscape. I had that problem too (but I wanted compression, so my solution won't work for you).
You need to read the metadata and rotate the image accordingly.
I hope this helps at least a bit.
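For camera2 specifically, the rotation angle usually comes from the SENSOR_ORIENTATION characteristic rather than from EXIF metadata. A rough sketch of reading it and rotating the decoded Bitmap (the helper name is mine, and it assumes the activity is locked to portrait, as in the question):
import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.Matrix;
import android.hardware.camera2.CameraCharacteristics;
import android.hardware.camera2.CameraManager;

public final class RotationHelper {
    // Rotate a Bitmap by the camera's reported sensor orientation.
    public static Bitmap rotateToSensorOrientation(Context context, String cameraId, Bitmap source)
            throws Exception {
        CameraManager manager = (CameraManager) context.getSystemService(Context.CAMERA_SERVICE);
        CameraCharacteristics characteristics = manager.getCameraCharacteristics(cameraId);
        // Typically 90 or 270 on phones, 0 or 180 on some tablets.
        Integer sensorOrientation = characteristics.get(CameraCharacteristics.SENSOR_ORIENTATION);
        int angle = (sensorOrientation != null) ? sensorOrientation : 0;
        if (angle == 0) {
            return source;
        }
        Matrix matrix = new Matrix();
        matrix.postRotate(angle);
        return Bitmap.createBitmap(source, 0, 0, source.getWidth(), source.getHeight(), matrix, true);
    }
}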

Camera2 - most efficient way to obtain a still capture Bitmap

To start with the question: what is the most efficient way to initialize and use an ImageReader with the camera2 API, knowing that I am always going to convert the capture into a Bitmap?
I'm playing around with the Android camera2 samples, and everything is working quite nicely. However, for my purposes I always need to perform some post-processing on captured still images, for which I require a Bitmap object. Presently I am using BitmapFactory.decodeByteArray(...) with the bytes coming from the ImageReader.acquireNextImage().getPlanes()[0].getBuffer() (I'm paraphrasing). While this works acceptably, I still feel like there should be a way to improve performance. The captures are encoded as ImageFormat.JPEG and need to be decoded again to get the Bitmap, which seems redundant. Ideally I'd obtain them in PixelFormat.RGB_888 and just copy that to a Bitmap using Bitmap.copyPixelsFromBuffer(...), but it doesn't seem like initializing an ImageReader with that format has reliable device support. YUV_420_888 could be another option, but looking around SO it seems that it requires jumping through some hoops to decode into a Bitmap. Is there a recommended way to do this?
The question is what you are optimizing for.
JPEG is without doubt the easiest format, supported by all devices. Decoding it to a bitmap is not as redundant as it seems, because encoding the picture into JPEG is usually performed by dedicated hardware. This means minimal bandwidth is used to transmit the image from the sensor to your application, and on some devices this is the only way to get maximum resolution. BitmapFactory.decodeByteArray(...) is often backed by a special hardware decoder too. The major problem with this call is that it may cause an out-of-memory exception, because the output bitmap is too big. So you will find many examples that do subsampled decoding, tuned for the use case where the bitmap must be displayed on the phone screen.
If your device supports the required resolution with RGB_8888, go for it: this needs minimal post-processing. But scaling such an image down may be more CPU-intensive than dealing with JPEG, and memory consumption may be huge. In any case, only a few devices support this format for camera capture.
As for YUV_420_888 and other YUV formats, the advantages over JPEG are even smaller than for RGB.
If you need the best quality image and don't have memory limitations, you should go for RAW images, which are supported on most high-end devices these days. You will need your own conversion algorithm, and will probably have to make different adaptations for different devices, but at least you will have full command of the picture acquisition.
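For reference, the subsampled decoding mentioned above usually looks something like the following (the helper and the target size parameters are illustrative):
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;

public final class JpegDecoder {
    // Decode a JPEG byte array at reduced resolution to keep memory in check.
    public static Bitmap decodeSubsampled(byte[] jpegBytes, int reqWidth, int reqHeight) {
        // First pass: read only the dimensions.
        BitmapFactory.Options options = new BitmapFactory.Options();
        options.inJustDecodeBounds = true;
        BitmapFactory.decodeByteArray(jpegBytes, 0, jpegBytes.length, options);

        // Pick the largest power-of-two sample size that keeps both
        // dimensions at or above the requested size.
        int sampleSize = 1;
        while (options.outWidth / (sampleSize * 2) >= reqWidth
                && options.outHeight / (sampleSize * 2) >= reqHeight) {
            sampleSize *= 2;
        }

        // Second pass: the actual, subsampled decode.
        options.inJustDecodeBounds = false;
        options.inSampleSize = sampleSize;
        return BitmapFactory.decodeByteArray(jpegBytes, 0, jpegBytes.length, options);
    }
}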
After a while I now sort of have an answer to my own question, albeit not a very satisfying one. After much consideration I attempted the following:
Set up a ScriptIntrinsicYuvToRGB RenderScript with the desired output size
Take the Surface of the input Allocation and set it as the target surface for the still capture
Run the RenderScript whenever a new buffer is available and convert the resulting bytes to a Bitmap
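A minimal sketch of that pipeline, with class and method names of my own and the camera/listener wiring left out:
import android.graphics.Bitmap;
import android.graphics.ImageFormat;
import android.renderscript.Allocation;
import android.renderscript.Element;
import android.renderscript.RenderScript;
import android.renderscript.ScriptIntrinsicYuvToRGB;
import android.renderscript.Type;
import android.view.Surface;

public final class YuvToRgbPipeline {
    private final ScriptIntrinsicYuvToRGB script;
    private final Allocation yuvIn;
    private final Allocation rgbOut;
    private final Bitmap bitmap;

    public YuvToRgbPipeline(RenderScript rs, int width, int height) {
        script = ScriptIntrinsicYuvToRGB.create(rs, Element.U8_4(rs));

        Type yuvType = new Type.Builder(rs, Element.YUV(rs))
                .setX(width).setY(height)
                .setYuvFormat(ImageFormat.YUV_420_888)
                .create();
        yuvIn = Allocation.createTyped(rs, yuvType,
                Allocation.USAGE_IO_INPUT | Allocation.USAGE_SCRIPT);

        Type rgbType = new Type.Builder(rs, Element.RGBA_8888(rs))
                .setX(width).setY(height).create();
        rgbOut = Allocation.createTyped(rs, rgbType);

        bitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
    }

    // Hand this Surface to the capture session as the still-capture target.
    public Surface getInputSurface() {
        return yuvIn.getSurface();
    }

    // Call this from the input allocation's OnBufferAvailableListener.
    public Bitmap convertLatestFrame() {
        yuvIn.ioReceive();            // latch the newest camera buffer
        script.setInput(yuvIn);
        script.forEach(rgbOut);       // YUV -> RGBA on the RenderScript side
        rgbOut.copyTo(bitmap);        // pull the result into a Bitmap
        return bitmap;
    }
}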
This actually worked like a charm and was super fast. Then I started noticing weird behavior from the camera, which happened on other devices as well. As it turned out, the camera HAL doesn't really recognize this as a still capture. This means that (a) the flash / exposure routines don't fire when they need to, and (b) if you have initiated a precapture sequence before your capture, auto-exposure will remain locked unless you manage to unlock it using AE_PRECAPTURE_TRIGGER_CANCEL (API >= 23) or some other lock / unlock magic that I couldn't get to work on either device. Unless you're fine with this only working in optimal lighting conditions where no exposure adjustment is necessary, this approach is entirely useless.
I have one more idea, which is to set up an ImageReader with YUV_420_888 output and incorporate the conversion routine from this answer to get RGB pixels from it. However, I'm actually working with Xamarin.Android, and RenderScript user scripts are not supported there. I may be able to hack around that, but it's far from trivial.
For my particular use case I have managed to speed up JPEG decoding to acceptable levels by carefully arranging background tasks with subsampled decodes of the versions I need at multiple stages of my processing, so implementing this likely won't be worth my time any time soon. If anyone is looking for ideas on how to approach something similar, though, that's what you could do.
Change the ImageReader instance to use a different ImageFormat, like this:
ImageReader.newInstance(width, height, ImageFormat.JPEG, 1)

Extract pointclouds WITH colour using the Project Tango; i.e. getting the current camera frame

I am trying to produce a point cloud where each point has a colour. I can get just the point cloud, or I can get the camera to take a picture, but I need them to be as simultaneous as possible. If I could look up an RGB image with a timestamp, or call a function to get the current frame when onXYZijAvailable() is called, I would be done. I could just go over the points, find out where each one intersects the image plane, and take the colour of that pixel.
As it is now I have not found any way to get the pixel info of an image or get coloured points. I have seen AR apps where the camera is connected to the CameraView and then things are rendered on top, but the camera stream is never touched by the application.
According to this post it should be possible to get the data I want and to synchronize the point cloud and the image plane with a simple transformation. This post says something similar. However, I have no idea how to get the RGB data. I can't find any open source projects or tutorials.
The closest I have gotten is finding out when a frame is ready by using this:
public void onFrameAvailable(final int cameraId) {
    if (cameraId == TangoCameraIntrinsics.TANGO_CAMERA_COLOR) {
        // Get the new RGB frame somehow.
    }
}
I am working with the Java API and I would very much like to not delve into JNI and the NDK if at all possible. How can I get the frame that most closely matches the timestamp of my current point cloud?
Thank you for your help.
Update:
I implemented a CPU version of it and even after optimising it a bit I only managed to get 0.5 FPS on a small point cloud. This is also due to the fact that the colours have to be converted from the Android-native NV21 colour format to the GPU-native RGBA format. I could have optimized it further, but I am not going to get a real-time effect with this. The CPU on the Android device simply cannot perform well enough. If you want to do this on more than a few thousand points, go for the extra hassle of using the GPU or do it in post.
Tango normally delivers color pixel data directly to an OpenGLES texture. In Java, you create the destination texture and register it with Tango.connectTextureId(), then in the onFrameAvailable() callback you update the texture with Tango.updateTexture(). Once you have the color image in a texture, you can access it using OpenGLES drawing calls and shaders.
If your goal is to color a Tango point cloud, the most efficient way to do this is in the GPU. That is, instead of pulling the color image out of the GPU and accessing it in Java, you instead pass the point data into the GPU and use OpenGLES shaders to transform the 3D points into 2D texture coordinates and look up the colors from the texture. This is rather tricky to get right if you're doing it for the first time but may be required for acceptable performance.
If you really want direct access to pixel data without using the C API, you need to render the texture into a buffer and then read the color data from the buffer. It's kind of tricky if you aren't used to OpenGL and writing shaders, but there is an Android Studio app that demonstrates that here, and is further described in this answer. This project demonstrates both how to draw the camera texture to the screen, and how to draw to an offscreen buffer and read RGBA pixels.
If you really want direct access to pixel data but decide that the NDK might be less painful than OpenGLES, the C API has TangoService_connectOnFrameAvailable() which gives you pixel data directly, i.e. without going through OpenGLES. Note, however, that the format of the pixel data is NV21, not RGB or RGBA.
I am doing this now by capturing depth with onXYZijAvailable() and images with onFrameAvailable(). I am using native code, but the same should work in Java. For every onFrameAvailable() I get the image data and put it in a preallocated ring buffer. I have 10 slots and a counter/pointer. Each new image increments the counter, which loops back from 9 to 0. The counter is an index into an array of images. I save the image timestamp in a similar ring buffer. When I get a depth image, onXYZijAvailable(), I grab the data and the timestamp. Then I go back through the images, starting with the most recent and moving backwards, until I find the one with the closest timestamp to the depth data. As you mentioned, you know that the image data will not be from the same frame as the depth data because they use the same camera. But, using these two calls (in JNI) I get within +/- 33msec, i.e. the previous or next frame, on a consistent basis.
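A rough Java sketch of that ring-buffer matching (the slot count and field names are illustrative, not taken from the actual code):
public final class FrameRingBuffer {
    private static final int SLOTS = 10;
    private final byte[][] frames = new byte[SLOTS][];
    private final double[] timestamps = new double[SLOTS];
    private int head = 0;   // index of the most recently written slot

    // Called for every color frame: store the pixels and the timestamp.
    public synchronized void push(byte[] frameData, double timestamp) {
        head = (head + 1) % SLOTS;
        frames[head] = frameData;
        timestamps[head] = timestamp;
    }

    // Called when a depth frame arrives: walk backwards from the newest
    // slot and return the frame whose timestamp is closest to the depth one.
    public synchronized byte[] closestTo(double depthTimestamp) {
        byte[] best = null;
        double bestDelta = Double.MAX_VALUE;
        for (int i = 0; i < SLOTS; i++) {
            int index = (head - i + SLOTS) % SLOTS;
            if (frames[index] == null) {
                continue;
            }
            double delta = Math.abs(timestamps[index] - depthTimestamp);
            if (delta < bestDelta) {
                bestDelta = delta;
                best = frames[index];
            }
        }
        return best;
    }
}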
I have not checked how close it would be to just naively use the most recently updated RGB image frame, but that should be pretty close.
Just make sure to use onXYZijAvailable() to drive the timing, because depth updates more slowly than RGB.
I have found that writing individual images to the file system using OpenCV::imwrite() does not keep up with the real-time rate of the camera. I have not tried streaming to a file using a video codec; that should be much faster. Depending on what you plan to do with the data in the end, you will need to be careful how you store your results.

Android MediaMuxer & MediaCodec too slow for saving video

My Android app does live video processing using OpenGL. I'm trying to save the result to video using MediaMuxer and MediaCodec.
The performance is not good enough. Each cycle the screen is updated and a frame is saved to the file. The screen is smooth; the video file is horrible. By this I mean major motion blur when things change quickly, and a frame rate that appears to be 1/2 or 1/3 of what it should be.
It seems to be a limitation due to internal clamping of settings. I can't get it to spit out a video with a bit rate greater than 288 kbps. I don't think the encoder is simply struggling to keep up, because there is no difference in frame rate between 1024x1024, 480x480, and 240x240. If it were having trouble keeping up, it should at least improve when the number of pixels drops by a factor of more than 10.
The app is here : https://play.google.com/store/apps/details?id=com.matthewjmouellette.snapdat.
I would love to post a code sample, but my program is 10K lines of code, with a lot of relevant code just for this problem.
Any help would be greatly appreciated.
EDIT:
I've tried 10+ different things and I'm out of ideas right now. I wish I could just save the video uncompressed; the hard drive should be able to keep up with a small enough image size and a medium fps.
It seems that this encoding approach just doesn't work for my video. The frames differ too much for the encoder's strategy of "moving" parts of a frame between frames; instead I need full frames throughout. I am thinking something along the lines of M-JPEG would work really well. JPEGs tend to take 1/10th the size of a bitmap, so it should give a reasonable file size with almost no CPU load, since it is image compression rather than video compression. I wish I had a good library for this.
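For what it's worth, the bit rate and frame rate are requested up front on the MediaFormat before configure(); whether the device honours them is up to the codec. A sketch with placeholder numbers (not taken from the app in question):
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

public final class EncoderSetup {
    // Request an explicit bit rate / frame rate when configuring the encoder.
    // The codec is free to clamp these, but they must be set before configure().
    public static MediaCodec createVideoEncoder(int width, int height) throws Exception {
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
        format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface); // OpenGL surface input
        format.setInteger(MediaFormat.KEY_BIT_RATE, 4_000_000);        // 4 Mbps requested
        format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
        format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);        // a keyframe every second

        MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
        encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        return encoder;
    }
}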

Real Time Image Processing in Android using the NDK

Using an Android (2.3.3) phone, I can use the camera to retrieve a preview with the onPreviewFrame(byte[] data, Camera camera) method to get the YUV image.
For some image processing, I need to convert this data to an RGB image and show it on the device. Using the basic Java / Android method, this runs at a horrible rate of less than 5 fps...
Now, using the NDK, I want to speed things up. The problem is: How do I convert the YUV array to an RGB array in C? And is there a way to display it (using OpenGL perhaps?) in the native code? Real-time should be possible (the Qualcomm AR demos showed us that).
I cannot use the setTargetDisplay and put an overlay on it!
I know Java, recently started with the Android SDK, and have zero experience in C.
Have you considered using OpenCV's Android port? It can do a lot more than just color conversion, and it's quite fast.
A Google search returned this page for a C implementation of YUV->RGB565. The author even included the JNI wrapper for it.
You can also succeed by staying with Java. I did this for the image detection in the androangelo app.
I used the sample code which you can find here by searching for "decodeYUV".
For processing the frames, the essential thing to consider is the image size. Depending on the device you may get quite large images; on the Galaxy S2, for instance, the smallest supported preview size is 640*480. That is a lot of pixels.
What I did is to use only every second row and every second column after YUV-to-RGB decoding. So I process a 320*240 image, which works quite well and allowed me to reach frame rates of 20 fps (including some noise reduction, a colour conversion from RGB to HSV, and circle detection).
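That every-second-row/column step is simple to apply to the int[] RGB frame that the decodeYUV sample code produces; a sketch (the method name is mine):
public final class Downsampler {
    // Halve an RGB frame in both dimensions by keeping every second row
    // and every second column.
    public static int[] halve(int[] rgb, int width, int height) {
        int outWidth = width / 2;
        int outHeight = height / 2;
        int[] out = new int[outWidth * outHeight];
        for (int y = 0; y < outHeight; y++) {
            int srcRow = (y * 2) * width;
            int dstRow = y * outWidth;
            for (int x = 0; x < outWidth; x++) {
                out[dstRow + x] = rgb[srcRow + x * 2];
            }
        }
        return out;
    }
}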
In addition, you should carefully check the size of the image buffer provided to the setPreview function. If it is too small, garbage collection will spoil everything.
For the result, you can check the calibration screen of the androangelo app. There I overlaid the detected image on top of the camera preview.
