Multiple Android Camera preview buffers for motion detection?

I want to try to do motion detection by comparing consecutive camera preview frames, and I'm wondering if I'm interpreting the android docs correctly. Tell me if this is right:
If I want the camera preview to use buffers I allocate myself, I have to call addCallbackBuffer() at least twice, to get two separate buffers to compare.
Then I have to use the setPreviewCallbackWithBuffer() form of the callback so the preview frames will be filled into the buffers I allocated.
Once I get to at least the second callback, I can do whatever lengthy processing I like to compare the buffers, and the camera will leave me alone, not doing any more callbacks or overwriting my buffers, until I return the oldest buffer back to the camera by calling addCallbackBuffer() once again (and the newest buffer will sit around unchanged for me to use in the next callback for comparison).
That last one is the one I'm least clear on. I won't get errors or anything just because it ran out of buffers, will I? It really will just silently drop preview frames and not do the callback?

Well, I went and implemented the approach above and it actually worked, so I guess I was interpreting the docs correctly :-).
If anyone wants to see my heavily modified CameraPreview code that does this, it is on my web page at:
http://home.comcast.net/~tomhorsley/hardware/scanner/android-scanner.html
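For reference, here is a minimal sketch of the two-buffer scheme described in the question. It is not the author's code from the link above; it assumes the old android.hardware.Camera API and the default NV21 preview format.

```java
import android.graphics.ImageFormat;
import android.hardware.Camera;

/** Sketch only: two preview buffers, comparing the previous frame with the current one. */
public class MotionPreview implements Camera.PreviewCallback {
    private byte[] previousFrame;   // frame from the last callback, still owned by us

    public void start(Camera camera) {
        Camera.Size size = camera.getParameters().getPreviewSize();
        int bufSize = size.width * size.height
                * ImageFormat.getBitsPerPixel(ImageFormat.NV21) / 8;
        // Hand the camera two buffers up front; once it runs out of free buffers
        // it silently drops frames instead of calling back.
        camera.addCallbackBuffer(new byte[bufSize]);
        camera.addCallbackBuffer(new byte[bufSize]);
        camera.setPreviewCallbackWithBuffer(this);
        camera.startPreview();
    }

    @Override
    public void onPreviewFrame(byte[] data, Camera camera) {
        if (previousFrame != null) {
            // Lengthy comparison is safe here: the camera won't overwrite either
            // buffer until we return one via addCallbackBuffer().
            detectMotion(previousFrame, data);
            camera.addCallbackBuffer(previousFrame); // recycle the older buffer
        }
        previousFrame = data; // keep the newest frame for the next comparison
    }

    private void detectMotion(byte[] older, byte[] newer) {
        // In NV21 the first width*height bytes are the Y (luma) plane,
        // so a simple greyscale diff can be done directly on those bytes.
    }
}
```

The key point is that the camera only fires the callback when it has a free buffer, so holding on to the newest frame while returning the older one leaves exactly one previous frame to diff against.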

Related

Double buffering slows frame rendering | systrace analysis

I'm working on a simple 2D game with custom View canvas drawing (postInvalidate()) and hardware acceleration. After weeks of performance analysis I decided to sync my update and drawing operations with the VSYNC pulse via the Choreographer.FrameCallback interface. I'm thinking that's the right way to get smooth movement.
However, I'm still experiencing choppy movement. I analyzed it with systrace and noticed that it has something to do with my BufferQueue. As soon as double buffering sets in, the frame time exceeds 16ms. I made a screenshot of my trace with some explanations:
The whole draw operation waits for SurfaceFlinger (the consumer) to release a buffer so it can dequeue a new, empty buffer of its own.
Can you tell me if this is a regular behavior or what could be the reason for this?
On your graph, you have a note, "SurfaceFlinger misses VSYNC".
However, if you look at the BufferQueue row, you can see that the buffer arrived after the VSYNC deadline. SurfaceFlinger woke up, but there was nothing to do.
Your app then provided an additional buffer, which meant you had two buffers pending. Since you continued to provide a buffer on every VSYNC, the queue never got back down to zero buffers. With the queue stuffed full, every attempt to add additional buffers results in blocking.
FWIW, your BufferQueue is triple-buffered: two are in the queue, one is on the display.
There are a few things you can do:
Have the app drop frames if you've missed the deadline.
Specify a presentation time for the frames so SurfaceFlinger will drop them if the time has passed.
Deliberately drop a frame every once in a while to let the queue empty. (Not the preferred approach.)
#2 only works with GLES on a SurfaceView, so we can ignore that one.
#1 might work for you; you can see an example in Grafika. It essentially says, "if the next VSYNC is firing in less than 2ms, or has already fired, don't bother rendering the current frame." The View/invalidate approach doesn't give you the same fine-grained control that GLES does though, so I'm not sure how well that will work.
The key to smooth animation on a busy device isn't hitting every frame at 60fps. The key is to make your updates based on delta time, so things look smooth even if you drop a frame or two.
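As a rough illustration of suggestion #1 combined with the delta-time advice, something like the sketch below could drive a custom View from the Choreographer, in the spirit of the Grafika trick. GameView and its update() method are hypothetical stand-ins for your own view and game state.

```java
import android.view.Choreographer;

/** Sketch only: skip a draw when we're too close to the next vsync, but always advance by delta time. */
public class FrameDriver implements Choreographer.FrameCallback {
    private static final long FRAME_INTERVAL_NS = 16_666_667L; // ~60 Hz
    private static final long DROP_MARGIN_NS = 2_000_000L;     // 2 ms

    /** Hypothetical view interface: updates game state and triggers a redraw. */
    public interface GameView {
        void update(float dtSeconds);
        void invalidate();
    }

    private final GameView view;
    private long lastFrameTimeNs;

    public FrameDriver(GameView view) {
        this.view = view;
    }

    public void start() {
        Choreographer.getInstance().postFrameCallback(this);
    }

    @Override
    public void doFrame(long frameTimeNanos) {
        // Re-register first so we keep receiving vsync callbacks.
        Choreographer.getInstance().postFrameCallback(this);

        // Advance the simulation by real elapsed time, so movement stays
        // smooth even if we end up skipping a draw.
        if (lastFrameTimeNs != 0) {
            float dtSeconds = (frameTimeNanos - lastFrameTimeNs) / 1e9f;
            view.update(dtSeconds);
        }
        lastFrameTimeNs = frameTimeNanos;

        // If we're already within ~2 ms of the next vsync (or past it),
        // skip this draw and let the BufferQueue drain instead.
        long nowNs = System.nanoTime(); // frameTimeNanos uses the same timebase
        if (nowNs - frameTimeNanos > FRAME_INTERVAL_NS - DROP_MARGIN_NS) {
            return;
        }
        view.invalidate();
    }
}
```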
For additional details on the graphics architecture, see this doc.

Lowest overhead camera to CPU to GPU approach on android

My application needs to do some processing on live camera frames on the CPU, before rendering them on the GPU. There's also some other stuff being rendered on the GPU which is dependent on the results of the CPU processing; therefore it's important to keep everything synchronised so we don't render the frame itself on the GPU until the results of the CPU processing for that frame are also available.
The question is what's the lowest overhead approach for this on android?
The CPU processing in my case just needs a greyscale image, so a YUV format where the Y plane is packed is ideal (and tends to be a good match to the native format of the camera devices too). NV12, NV21 or fully planar YUV would all provide ideal low-overhead access to greyscale, so that would be preferred on the CPU side.
In the original camera API the setPreviewCallbackWithBuffer() was the only sensible way to get data onto the CPU for processing. This had the Y plane separate so was ideal for the CPU processing. Getting this frame available to OpenGL for rendering in a low overhead way was the more challenging aspect. In the end I wrote a NEON color conversion routine to output RGB565 and just use glTexSubImage2d to get this available on the GPU. This was first implemented in the Nexus 1 timeframe, where even a 320x240 glTexSubImage2d call took 50ms of CPU time (poor drivers trying to do texture swizzling I presume - this was significantly improved in a system update later on).
Back in the day I looked into things like eglImage extensions, but they don't seem to be available or well documented enough for user apps. I had a little look into the internal android GraphicsBuffer classes but ideally want to stay in the world of supported public APIs.
The android.hardware.camera2 API showed promise, with the ability to attach both an ImageReader and a SurfaceTexture to a capture session. Unfortunately I can't see any way of ensuring the right sequential pipeline here - holding off calling updateTexImage() until the CPU has processed is easy enough, but if another frame has arrived during that processing then updateTexImage() will skip straight to the latest frame. It also seems that with multiple outputs there will be independent copies of the frames in each of the queues, which ideally I'd like to avoid.
Ideally this is what I'd like:
Camera driver fills some memory with the latest frame
CPU obtains pointer to the data in memory, can read Y data without a copy being made
CPU processes data and sets a flag in my code when frame is ready
When beginning to render a frame, check if a new frame is ready
Call some API to bind the same memory as a GL texture
When a newer frame is ready, release the buffer holding the previous frame back into the pool
I can't see a way of doing exactly that zero-copy style with public API on android, but what's the closest that it's possible to get?
One crazy thing I tried that seems to work, but is not documented: the ANativeWindow NDK API can accept data in NV12 format, even though the appropriate format constant is not one of the ones in the public headers. That allows a SurfaceTexture to be filled with NV12 data by memcpy() to avoid CPU-side colour conversion and any swizzling that happens driver-side in glTexImage2d. That is still an extra copy of the data, though, which feels like it should be unnecessary, and again, as it's undocumented, it might not work on all devices. A supported sequential zero-copy Camera -> ImageReader -> SurfaceTexture or equivalent would be perfect.
The most efficient way to process video is to avoid the CPU altogether, but it sounds like that's not an option for you. The public APIs are generally geared toward doing everything in hardware, since that's what the framework itself needs, though there are some paths for RenderScript. (I'm assuming you've seen the Grafika filter demo that uses fragment shaders.)
Accessing the data on the CPU used to mean slow Camera APIs or working with GraphicBuffer and relatively obscure EGL functions (e.g. this question). The point of ImageReader was to provide zero-copy access to YUV data from the camera.
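For illustration, a minimal sketch of that ImageReader path follows. GreyscaleReader and processLuma are made-up names; the assumed setup is a camera2 capture session that targets the reader's Surface, plus a backgroundHandler for the callback.

```java
import android.graphics.ImageFormat;
import android.media.Image;
import android.media.ImageReader;
import android.os.Handler;
import android.view.Surface;

import java.nio.ByteBuffer;

/** Sketch only: direct CPU access to the camera's Y plane via YUV_420_888. */
public class GreyscaleReader {
    private final ImageReader reader;

    public GreyscaleReader(int width, int height, Handler backgroundHandler) {
        reader = ImageReader.newInstance(width, height, ImageFormat.YUV_420_888, 3);
        reader.setOnImageAvailableListener(r -> {
            Image image = r.acquireLatestImage();
            if (image == null) return;
            try {
                Image.Plane yPlane = image.getPlanes()[0];
                ByteBuffer y = yPlane.getBuffer();     // direct buffer, no copy
                int rowStride = yPlane.getRowStride(); // may be larger than width
                processLuma(y, rowStride, image.getWidth(), image.getHeight());
            } finally {
                image.close(); // return the buffer to the camera pipeline
            }
        }, backgroundHandler);
    }

    /** This Surface is what you add as an output of the camera2 capture session. */
    public Surface getSurface() {
        return reader.getSurface();
    }

    private void processLuma(ByteBuffer y, int rowStride, int width, int height) {
        // CPU-side greyscale processing goes here. For YUV_420_888 the Y plane's
        // pixel stride is 1, so rows are contiguous apart from rowStride padding.
    }
}
```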
You can't really serialize Camera -> ImageReader -> SurfaceTexture as ImageReader doesn't have a "forward the buffer" API. Which is unfortunate, as that would make this trivial. You could try to replicate what SurfaceTexture does, using EGL functions to package the buffer as an external texture, but again you're into non-public GraphicBuffer-land, and I worry about ownership/lifetime issues of the buffer.
I'm not sure how the parallel paths help you (Camera2 -> ImageReader, Camera2 -> SurfaceTexture), as what's being sent to the SurfaceTexture wouldn't have your modifications. FWIW, it doesn't involve an extra copy -- in Lollipop or thereabouts, BufferQueue was updated to allow individual buffers to move through multiple queues.
It's entirely possible there are some fancy new APIs I haven't seen yet, but from what I know your ANativeWindow approach is probably the winner. I suspect you'd be better off with one of the Camera formats (YV12 or NV21) than NV12, but I don't know for sure.
FWIW, you will drop frames if your processing takes too long, but unless your processing is uneven (some frames take much longer than others) you'll have to drop frames no matter what. Getting into the realm of non-public APIs again, you could switch the SurfaceTexture to "synchronous" mode, but if your buffers fill up you're still dropping frames.

Syncing openGL rendering with c++ game loop on android

I am building a game-like application using the Android NDK and OpenGL ES 2.0.
So far I understand the concepts of vertices, shaders, and programs.
The main game loop would be a loop in a single thread as follows
step 1. Read all the user input
step 2. Update game objects (if needed) based on the input
step 3. make draw calls for all the objects
step 4. call glSwapBuffers
then loop back to step 1
But I ran into various confusions regarding sync and threading, so I'm listing all the questions together.
1. Since OpenGL draw calls are asynchronous, the draw calls and glSwapBuffers may get called many times before the GPU has actually rendered even a single frame from the calls of the last iteration of the loop. Will this be problematic? Buffer overflow or tearing?
2. Assuming VSYNC is enabled, does point 1 still cause a problem?
3. Since all calls are async, how do I measure the time spent rendering each frame? glSwapBuffers would return immediately, so how can I know when the frame was actually done?
4. Loading textures will occupy space in RAM. Is checking free memory before loading a texture the standard way, or should I keep loading textures until I reach OUT_OF_MEMORY_ERROR?
5. If I switch to a multithreaded approach, calling just glSwapBuffers at a fixed 60 times per second without any regard to the thread which is processing input and issuing draw calls, then what is supposed to happen?
Also, how do I control the FPS in the game loop? I know the exact FPS depends on a large number of factors, but how can you get close to it?
SwapBuffers() will not be executed out of order. Issuing it after all of the draw commands for the frame is fine. The driver will take care of it; you don't need to sync anything. You can only screw this up by using multiple threads or multiple contexts, but even that would take a lot of effort.
There is no problem with 1, and VSYNC does not directly change anything here.
The calls might be asynchronous, but the driver will not queue up an unlimited amount of work. Sooner or later, it will have to block if you try to issue too many calls in advance. When vsync is on, the typical behavior is that the driver will queue up at most a few frames (or just one, depending on the driver settings), and SwapBuffers() will block when that limit is reached. So the timing statistics you get there are accurate, after the first few frames. Note that this is still much better than completely flushing the queue, as the driver unblocks as soon as the first pending buffer swap has been carried out.
This is a totally new topic, which probably belongs in another question. However: it is very unlikely that you will get any of the current desktop GL implementations to ever generate GL_OUT_OF_MEMORY. The driver will automatically page textures (and other objects) between VRAM and system RAM (and the OS might even page that to disk). The GL also provides no means to query the available memory.
In that scenario, you will need to synchronize manually. That approach does not make the slightest sense and seems like trying to solve a problem which does not exist. If you want your game to use multithreading, still put all the GL rendering (and SwapBuffers) into the same thread. You can use different threads for input processing, sound, physics, scene updates, general game logic and whatever else. But you should just use a single-thread/single-context approach for the GL. That way, it also won't hurt you when SwapBuffers() blocks your render thread, as your game logic and input handling are still done, and the render thread will just render new frames with the newest available data at the frequency the display needs (with vsync on) or as fast as the CPU and GPU can work (if vsync is off).

Concurrency with android GLES glBufferData

I'm trying to use Android and OpenGL ES 2.0 to create a sort-of desert racing game. At least that's the end goal. For the time being I'm really just working with generating an endless desert, through the use of a Perlin noise algorithm. However, I'm coming across a lot of problems with regard to concurrency and synchronization. The program consists of three threads: a "render" thread, a "geometry" thread which essentially sits in the background generating tiles of Perlin noise (eventually sending them through to the render thread to process in its own time), and a "main" thread which updates the camera's position and updates the geometry thread if new Perlin noise tiles need to be created.
The aforementioned Perlin tiles are stored in VBOs and only rendered when they're within a certain distance of the camera. Buffer initialization always begins immediately.
This all works well, without any noticeable problems.
HOWEVER.
When the tiles are uploaded to the GPU through glBufferData() (after processing by the separate geometry thread), the render thread always appears to block. I presume this is because Android implicitly calls glFinish() before the screen buffer is rendered. Obviously, I'd like the data uploading to be performed in the background while everything else is being drawn - even taking place over multiple frames if necessary.
I've looked on google and the only solution I could find is to use glMapBuffer/glMapBufferRange(), but these two methods aren't supported in GLES2.0. Neither are any of the synchronization objects - glFenceSync etc. so...
....
any help?
P.S. I haven't provided any code as I didn't think it was necessary, as the problem seems more theoretical to me. However I can certainly produce some on request.
A screenshot of the game so far:
http://i.stack.imgur.com/Q6S0k.png
Android does not call glFinish() (glFinish() is actually a no-op on IMG's GPUs). The problem is that glBufferData() is not an asynchronous API. What you really want is PBOs, which are only available in OpenGL ES 3.0 and do offer the ability to perform asynchronous copies (including texture uploads).
Are you always using glBufferData()? You should use glBufferSubData() as much as possible to avoid reallocating your VBO every time.
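As a sketch of what that looks like with GLES20 (class name and sizes are illustrative; the point is that the buffer object's storage is allocated once and only updated afterwards):

```java
import android.opengl.GLES20;

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Sketch only: allocate the VBO storage once, then stream tile data into it with glBufferSubData. */
public class TileVbo {
    private final int vboId;
    private final int capacityBytes;

    public TileVbo(int maxVertexFloats) {
        capacityBytes = maxVertexFloats * 4; // 4 bytes per float
        int[] ids = new int[1];
        GLES20.glGenBuffers(1, ids, 0);
        vboId = ids[0];
        GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, vboId);
        // One-time allocation; passing null just reserves storage.
        GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, capacityBytes, null,
                GLES20.GL_DYNAMIC_DRAW);
        GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, 0);
    }

    /** Upload a freshly generated tile; must be called on the GL thread. */
    public void upload(float[] vertices) {
        FloatBuffer data = ByteBuffer.allocateDirect(vertices.length * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer()
                .put(vertices);
        data.position(0);
        GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, vboId);
        // Update in place; no reallocation of the buffer object's storage.
        GLES20.glBufferSubData(GLES20.GL_ARRAY_BUFFER, 0, vertices.length * 4, data);
        GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, 0);
    }
}
```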

Splitting cameracontrol sample OpenCv4Android in two threads

I have imported into Eclipse Juno the OpenCv4Android sample, ver. 2.4.5, called "cameracontrol". It can be found here: Camera Control OpenCv4Android Sample.
Now I want to use this project as the base for mine. I want to process every frame with image-processing techniques, so, in order to improve performance, I want to split the main activity of the project into two classes: one that is merely the activity, and one (a thread) that is responsible for the preview.
How can I do this? Are there any examples of this?
This might not be a complete answer as I'm only learning about this myself at the moment, but I'll provide as much info as I can.
You will likely have to grab the images from the camera yourself and dispatch them to threads. This is because your activity in the example gets called with a frame from the camera and has to return the frame to be displayed (immediately) as the return value. You can't get 2+ frames to process in parallel without showing a blank screen in the meantime or some other hacky stuff. You'll probably want to allocate a (fixed-size) buffer somewhere, then start processing a frame with a worker thread when you get one (this would be the dispatcher). Once your worker thread is done, it notifies the dispatcher, which gives the image to the view. If frames come from the camera while all worker threads are busy (i.e. there are no free slots in the buffer), the frame is dropped. Once a space in the buffer frees up again, the next frame is accepted and processed.
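A rough sketch of that dispatcher/worker-pool idea is below. FrameData, Listener and the pool sizes are hypothetical; the point is the bounded queue that silently drops frames when all slots are busy.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Sketch only: a small worker pool that drops frames when its fixed-size queue is full. */
public class FrameDispatcher {
    public interface Listener {
        void onProcessed(FrameData frame);   // e.g. post the result back to the UI thread
    }

    public static class FrameData {
        public byte[] pixels;                // raw camera bytes, a Mat, etc.
    }

    private final ThreadPoolExecutor pool;
    private final Listener listener;

    public FrameDispatcher(int workers, int queueSlots, Listener listener) {
        this.listener = listener;
        // Bounded queue + DiscardPolicy: submissions beyond capacity are dropped.
        pool = new ThreadPoolExecutor(workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSlots),
                new ThreadPoolExecutor.DiscardPolicy());
    }

    /** Called from the camera callback; never blocks the camera thread. */
    public void submit(final FrameData frame) {
        pool.execute(() -> {
            process(frame);              // heavy image processing off the camera/UI thread
            listener.onProcessed(frame); // the dispatcher hands the image to the view
        });
    }

    private void process(FrameData frame) {
        // OpenCV work goes here.
    }
}
```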
You can look at the code of the initializeCamera function of JavaCameraView and NativeCameraView to get an idea of how to do this (Google should also help, as this is how apps without OpenCV have to do it as well). For me the native camera performs significantly better though (even without heavy processing it's just much smoother), but YMMV...
I can't help with actual details about the implementation since I'm not that far into it myself yet. I hope this provides some ideas though.
