Graphics architecture of Android

As I understand it, a program calls functions from the OpenGL ES library (and of course libEGL). I would like to understand this in more detail: which library calls which other library, and so on down to the GPU? How do SurfaceFlinger and gralloc interact with all of this?
I know there are a lot of pictures depicting the approximate scheme, but there is no clear call tree. I'd be glad of any answer. Maybe there are some useful resources that I have not been able to find.

Your question is too broad, but I will try to clarify some of it.
An application either draws on a Canvas or is an OpenGL ES based app. A Canvas-based app may or may not use hardware rendering. With hardware rendering, and in an OpenGL app, the final image is written to a buffer called a "Surface" using the GPU. The same buffer is written using the CPU in the case of Canvas with software rendering.
There can be multiple such buffers, which are sent to SurfaceFlinger for compositing. SurfaceFlinger, in turn, may or may not use OpenGL (i.e. the GPU) for compositing; it can also offload the compositing task to the Hardware Composer, depending on various conditions.
Gralloc is used to allocate contiguous chunks of memory for graphics purposes.
The final composited buffer is then sent to the display.
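As a minimal sketch of the first path (my own illustration, not from any official source): a GLSurfaceView.Renderer draws with the GPU into a buffer of the view's Surface, and the buffer swap at the end of each frame queues that buffer to SurfaceFlinger.

    import android.opengl.GLSurfaceView;
    import javax.microedition.khronos.egl.EGLConfig;
    import javax.microedition.khronos.opengles.GL10;

    // Each onDrawFrame() renders into the current buffer of the view's
    // Surface; GLSurfaceView then calls eglSwapBuffers(), which queues
    // that buffer to SurfaceFlinger for compositing.
    class MinimalRenderer implements GLSurfaceView.Renderer {
        @Override public void onSurfaceCreated(GL10 gl, EGLConfig config) { }

        @Override public void onSurfaceChanged(GL10 gl, int w, int h) {
            gl.glViewport(0, 0, w, h);
        }

        @Override public void onDrawFrame(GL10 gl) {
            // The GPU writes the clear color into the Surface buffer.
            gl.glClearColor(0f, 0f, 0f, 1f);
            gl.glClear(GL10.GL_COLOR_BUFFER_BIT);
        }
    }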
Edit
How does OpenGL work?
OpenGL is just a specification. GPU vendors provide an implementation of that specification in their GPU drivers. libGLES contains all the function declarations, and it is the graphics driver's job to translate those calls into GPU instructions.
If you want an in-depth understanding of SurfaceFlinger and the Hardware Composer, read the Android Graphics Architecture documentation on the Android source code site.

Related

Lowest overhead camera to CPU to GPU approach on Android

My application needs to do some processing on live camera frames on the CPU, before rendering them on the GPU. There's also some other stuff being rendered on the GPU which is dependent on the results of the CPU processing; therefore it's important to keep everything synchronised so we don't render the frame itself on the GPU until the results of the CPU processing for that frame are also available.
The question is: what's the lowest overhead approach for this on Android?
The CPU processing in my case just needs a greyscale image, so a YUV format where the Y plane is packed is ideal (and tends to be a good match to the native format of the camera devices too). NV12, NV21 or fully planar YUV would all provide ideal low-overhead access to greyscale, so that would be preferred on the CPU side.
In the original camera API, setPreviewCallbackWithBuffer() was the only sensible way to get data onto the CPU for processing. This had the Y plane separate, so it was ideal for the CPU processing. Getting this frame available to OpenGL for rendering in a low-overhead way was the more challenging aspect. In the end I wrote a NEON color conversion routine to output RGB565 and just used glTexSubImage2D to get this available on the GPU. This was first implemented in the Nexus 1 timeframe, where even a 320x240 glTexSubImage2D call took 50ms of CPU time (poor drivers trying to do texture swizzling, I presume - this was significantly improved in a later system update).
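For reference, that old-API path looked roughly like this (a sketch; processGreyscale() and the RGB565 conversion step stand in for my CPU-side work):

    import android.hardware.Camera;
    import android.opengl.GLES20;
    import java.nio.ByteBuffer;

    class PreviewPipeline {
        void start(Camera camera) {
            Camera.Size s = camera.getParameters().getPreviewSize();
            // NV21: width*height luma bytes, then interleaved chroma.
            camera.addCallbackBuffer(new byte[s.width * s.height * 3 / 2]);
            camera.setPreviewCallbackWithBuffer(new Camera.PreviewCallback() {
                @Override public void onPreviewFrame(byte[] data, Camera cam) {
                    processGreyscale(data);      // placeholder CPU step
                    cam.addCallbackBuffer(data); // return buffer to the pool
                }
            });
            camera.startPreview();
        }

        void processGreyscale(byte[] nv21) { /* CPU work on the Y plane */ }

        // On the GL thread: upload converted RGB565 pixels to the texture.
        void upload(int width, int height, ByteBuffer rgb565) {
            GLES20.glTexSubImage2D(GLES20.GL_TEXTURE_2D, 0, 0, 0,
                    width, height, GLES20.GL_RGB,
                    GLES20.GL_UNSIGNED_SHORT_5_6_5, rgb565);
        }
    }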
Back in the day I looked into things like eglImage extensions, but they don't seem to be available or well documented enough for user apps. I had a little look into the internal Android GraphicBuffer classes, but ideally I want to stay in the world of supported public APIs.
The android.hardware.camera2 API held promise, with the ability to attach both an ImageReader and a SurfaceTexture to a capture session. Unfortunately I can't see any way of ensuring the right sequential pipeline here - holding off calling updateTexImage() until the CPU has processed the frame is easy enough, but if another frame has arrived during that processing then updateTexImage() will skip straight to the latest frame. It also seems that with multiple outputs there will be independent copies of the frames in each of the queues, which I'd ideally like to avoid.
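For reference, the two-output wiring I mean is roughly this (a sketch; the callback, handler and texture id are placeholders):

    import android.graphics.ImageFormat;
    import android.graphics.SurfaceTexture;
    import android.hardware.camera2.CameraCaptureSession;
    import android.hardware.camera2.CameraDevice;
    import android.media.ImageReader;
    import android.os.Handler;
    import android.view.Surface;
    import java.util.Arrays;

    class DualOutputSetup {
        void configure(CameraDevice device, int glTextureId,
                       int width, int height,
                       CameraCaptureSession.StateCallback callback,
                       Handler handler) throws Exception {
            // CPU path: YUV_420_888 frames with direct plane access.
            ImageReader reader = ImageReader.newInstance(width, height,
                    ImageFormat.YUV_420_888, 3 /* max in-flight images */);
            // GPU path: a SurfaceTexture bound to an external GL texture.
            SurfaceTexture texture = new SurfaceTexture(glTextureId);
            texture.setDefaultBufferSize(width, height);
            device.createCaptureSession(
                    Arrays.asList(reader.getSurface(), new Surface(texture)),
                    callback, handler);
        }
    }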
Ideally this is what I'd like:
Camera driver fills some memory with the latest frame
CPU obtains pointer to the data in memory, can read Y data without a copy being made
CPU processes data and sets a flag in my code when frame is ready
When beginning to render a frame, check if a new frame is ready
Call some API to bind the same memory as a GL texture
When a newer frame is ready, release the buffer holding the previous frame back into the pool
I can't see a way of doing exactly that zero-copy style with the public API on Android, but what's the closest it's possible to get?
One crazy thing I tried that seems to work, but is not documented: the ANativeWindow NDK API can accept data in NV12 format, even though the appropriate format constant is not one of the ones in the public headers. That allows a SurfaceTexture to be filled with NV12 data by memcpy() to avoid CPU-side colour conversion and any swizzling that happens driver-side in glTexImage2D. That is still an extra copy of the data, though, which feels like it should be unnecessary, and as it's undocumented it might not work on all devices. A supported sequential zero-copy Camera -> ImageReader -> SurfaceTexture or equivalent would be perfect.
The most efficient way to process video is to avoid the CPU altogether, but it sounds like that's not an option for you. The public APIs are generally geared toward doing everything in hardware, since that's what the framework itself needs, though there are some paths for RenderScript. (I'm assuming you've seen the Grafika filter demo that uses fragment shaders.)
Accessing the data on the CPU used to mean slow Camera APIs or working with GraphicBuffer and relatively obscure EGL functions (e.g. this question). The point of ImageReader was to provide zero-copy access to YUV data from the camera.
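A sketch of what that zero-copy access looks like (the process step is a placeholder):

    import android.media.Image;
    import android.media.ImageReader;
    import java.nio.ByteBuffer;

    class YPlaneReader {
        // Read the Y plane of the newest frame without copying the pixels.
        void drain(ImageReader reader) {
            Image image = reader.acquireLatestImage();
            if (image == null) return;
            Image.Plane yPlane = image.getPlanes()[0]; // plane 0 is Y
            ByteBuffer y = yPlane.getBuffer();         // direct view, no copy
            int rowStride = yPlane.getRowStride();     // may exceed width
            process(y, rowStride);                     // placeholder CPU step
            image.close(); // returns the buffer to the camera's queue
        }

        void process(ByteBuffer y, int rowStride) { /* greyscale work */ }
    }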
You can't really serialize Camera -> ImageReader -> SurfaceTexture as ImageReader doesn't have a "forward the buffer" API. Which is unfortunate, as that would make this trivial. You could try to replicate what SurfaceTexture does, using EGL functions to package the buffer as an external texture, but again you're into non-public GraphicBuffer-land, and I worry about ownership/lifetime issues of the buffer.
I'm not sure how the parallel paths help you (Camera2 -> ImageReader, Camera2 -> SurfaceTexture), as what's being sent to the SurfaceTexture wouldn't have your modifications. FWIW, it doesn't involve an extra copy -- in Lollipop or thereabouts, BufferQueue was updated to allow individual buffers to move through multiple queues.
It's entirely possible there's some fancy new APIs I haven't seen yet, but from what I know your ANativeWindow approach is probably the winner. I suspect you'd be better off with one of the Camera formats (YV12 or NV21) than NV12, but I don't know for sure.
FWIW, you will drop frames if your processing takes too long, but unless your processing is uneven (some frames take much longer than others) you'll have to drop frames no matter what. Getting into the realm of non-public APIs again, you could switch the SurfaceTexture to "synchronous" mode, but if your buffers fill up you're still dropping frames.

OpenGL ES settings for 2D games to perform better and/or save battery

I read for example that the depth buffer is often not needed in 2D games and disabling it can increase performance quite a bit. Are there any other features I can disable or settings I can tweak?
I'm talking about OpenGL ES on Android, but I'm quite sure it's similar in the iOS environment. There are several optimizations you can apply when working with OpenGL ES.
Optimization during OpenGL ES context creation
When you create an OpenGL context, several buffers are created:
Color buffer
Depth buffer
Stencil buffer
You can optimize context creation in several ways:
Reduce the color buffer's memory footprint: to do this, use a smaller pixel format; RGB_565 is a good one.
Don't create a depth buffer: you don't need it.
Don't create a stencil buffer: you don't need it.
This kind of optimization mainly reduces memory consumption, and the less memory you have to manage, the faster you will be.
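On Android, for example, GLSurfaceView can request exactly this minimal configuration (a sketch; set the config chooser before the renderer):

    import android.content.Context;
    import android.opengl.GLSurfaceView;

    class MinimalConfigView extends GLSurfaceView {
        MinimalConfigView(Context context) {
            super(context);
            setEGLContextClientVersion(2);
            // Request RGB565 with no alpha, no depth and no stencil:
            // red=5, green=6, blue=5, alpha=0, depth=0, stencil=0 bits.
            setEGLConfigChooser(5, 6, 5, 0, 0, 0);
        }
    }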
Optimization during draw operations
You are probably working with OpenGL ES 2.0+, so you have to write vertex and fragment shaders. There are several things you can do:
Keep these GPU programs as simple as you can. To do this on Android I use the NVIDIA Tegra debugger, but that's because I have a tablet with a Tegra X1 chipset.
Prefer VBOs over client-side buffers (see the sketch below).
Precalculate everything you can: you have to draw every frame at maximum speed, so if something is statically defined (e.g. the projection matrix), calculate it before you start drawing frames.
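A minimal VBO upload for static geometry could look like this (a sketch; the quad vertices are just illustrative):

    import android.opengl.GLES20;
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.FloatBuffer;

    class StaticQuad {
        // Upload static vertex data once, at load time, instead of passing
        // a client-side array to the driver on every draw call.
        int createVbo() {
            float[] quad = { -1f, -1f,   1f, -1f,   -1f, 1f,   1f, 1f };
            FloatBuffer data = ByteBuffer.allocateDirect(quad.length * 4)
                    .order(ByteOrder.nativeOrder()).asFloatBuffer();
            data.put(quad).position(0);

            int[] vbo = new int[1];
            GLES20.glGenBuffers(1, vbo, 0);
            GLES20.glBindBuffer(GLES20.GL_ARRAY_BUFFER, vbo[0]);
            // STATIC_DRAW hints the driver: written once, drawn many times.
            GLES20.glBufferData(GLES20.GL_ARRAY_BUFFER, quad.length * 4,
                    data, GLES20.GL_STATIC_DRAW);
            return vbo[0];
        }
    }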
Actually, there is nothing to do in this case. OpenGL is not a 3D game engine.
By default, every option you are worried about is already disabled.
You won't create a depth buffer, because you don't need one.
You won't use many matrices, because you don't need a projection matrix, a view matrix, and so on.
In short, just don't enable what you don't need.
Don't do something in the typical rendering pipeline if you don't need it.
Don't write long shaders; normally you don't need long shader code in a 2D game.

Accessing the memory of the default framebuffer on Android

I have a setup with OpenGL ES 2.0 and EGL on Android 4.4.2 (API level 19).
My goal is to access the buffer of the window (the default framebuffer in OpenGL terms) directly from the CPU / user space.
I have tried using ANativeWindow_fromSurface to get ANativeWindow from the Surface of a GLSurfaceView. Then trying to get access to the buffer with ANativeWindow_lock fails with status -22. Logcat gives
03-25 10:50:25.363: E/BufferQueue(171): [SurfaceView](this:0xb8d5d978,id:32,api:1,p:6488,c:171) connect: already connected (cur=1, req=2)
From this discussion it seems you can't do that with GLSurfaceView, because EGL has already acquired the surface.
How could you get to the memory of the window? Can you somehow do it through an EGLSurface? I am willing to use android::GraphicBuffer, even though it is not part of the NDK.
If this is not possible, can you use the other direction, by first creating an android::GraphicBuffer and then binding it to an EGLSurface and the displayed window?
Android devices may not have a framebuffer (i.e. /dev/graphics/fb). It's still widely used by the recovery UI, but it's being phased out.
If it does have a framebuffer, it will be opened and held by the Hardware Composer unless the app framework has been shut down. Since you're trying to use the NDK, I assume the framework is still running.
If your NDK code is running as root or system, you can request a top window from SurfaceFlinger. The San Angeles demo provides an example.
Additional information can be found here, here, and here. If you want to work with graphics at a low level, you should also read the graphics architecture doc.
This is not doable with just the NDK API; you will need to pull in some OS headers that are not guaranteed to be stable.
You will need to subclass ANativeWindow, similarly to what is done in frameworks/native/include/ui/FramebufferNativeWindow.h.
However, you may need to construct your own buffer queue using android::GraphicBuffer objects you create yourself, and properly respond to all dequeue() and enqueue() requests.
On enqueue() you will need to sync (the GPU renders asynchronously) and then map the enqueued buffer into CPU memory.
Note that this approach may perform poorly, due to the explicit GPU<->CPU synchronization needed.

How to write a benchmark tool for OpenGL ES on Android

Let's say I wanted to write my own software for Android that would benchmark rendering performance. Something along the lines of 3Dmark basically. What sorts of factors should the different test cases measure? Would it simply be rendering a ton of verts? Running a ton of textures?
Are there any resources out there that are either books or online guides that might help with developing specific test cases that would exercise specific portions of a phone/tablet's GPU?
Thanks,
Besides vertex count and texture sizes, the other major variables you should cover in an OpenGL ES benchmark are the display resolution and the complexity of the shader programs. You might also want to evaluate the compatibility of the OpenGL ES and EGL drivers, including extensions. There is a real need for that on Android.
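As a trivial starting point, here is a sketch of a frame-rate probe (drawTestScene() is a placeholder for whatever workload you want to exercise; glFinish() makes sure you time completed GPU work, at the cost of disturbing pipelining):

    import android.opengl.GLES20;
    import android.opengl.GLSurfaceView;
    import android.util.Log;
    import javax.microedition.khronos.egl.EGLConfig;
    import javax.microedition.khronos.opengles.GL10;

    class BenchRenderer implements GLSurfaceView.Renderer {
        private long frames = 0;
        private long windowStart = 0;

        @Override public void onSurfaceCreated(GL10 gl, EGLConfig c) { }

        @Override public void onSurfaceChanged(GL10 gl, int w, int h) {
            GLES20.glViewport(0, 0, w, h);
        }

        @Override public void onDrawFrame(GL10 gl) {
            drawTestScene();   // vary vertex count, texture sizes, shaders
            GLES20.glFinish(); // block until the GPU has actually finished
            frames++;
            long now = System.nanoTime();
            if (windowStart == 0) windowStart = now;
            if (now - windowStart >= 1_000_000_000L) { // one-second window
                Log.d("Bench", "fps: " + frames);
                frames = 0;
                windowStart = now;
            }
        }

        private void drawTestScene() { /* the workload under test */ }
    }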

What is the best method to render video frames?

What is the best choice for rendering video frames obtained from a decoder bundled into my app (FFmpeg, etc.)?
I would naturally tend to choose OpenGL as mentioned in Android Video Player Using NDK, OpenGL ES, and FFmpeg.
But in OpenGL in Android for video display, a comment notes that OpenGL isn't the best method for rendering video.
What then? The jnigraphics native library? And a non-GL SurfaceView?
Please note that I would like to use a native API for rendering the frames, such as OpenGL or jnigraphics. But Java code for setting up a SurfaceView and such is ok.
PS: MediaPlayer is irrelevant here, I'm talking about decoding and displaying the frames by myself. I can't rely on the default Android codecs.
I'm going to attempt to elaborate on and consolidate the answers here based on my own experiences.
Why openGL
When people think of rendering video with openGL, most are attempting to exploit the GPU to do color space conversion and alpha blending.
For instance, converting YV12 video frames to RGB. Color space conversions like YV12 -> RGB require that you calculate the value of each pixel individually; for a 1280 x 720 frame that is 921,600 pixels, repeated for every frame.
What I've just described is really what SIMD was made for - performing the same operation on multiple pieces of data in parallel. The GPU is a natural fit for color space conversion.
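That conversion is just the standard BT.601 transform, which maps directly onto a fragment shader. A sketch (assuming Y, U and V are uploaded as three single-channel GL_LUMINANCE textures; the names are illustrative):

    // BT.601 YUV -> RGB; the GPU runs this once per output pixel.
    class YuvShader {
        static final String FRAGMENT =
                "precision mediump float;\n" +
                "varying vec2 vTexCoord;\n" +
                "uniform sampler2D yTex;\n" +
                "uniform sampler2D uTex;\n" +
                "uniform sampler2D vTex;\n" +
                "void main() {\n" +
                "    float y = texture2D(yTex, vTexCoord).r;\n" +
                "    float u = texture2D(uTex, vTexCoord).r - 0.5;\n" +
                "    float v = texture2D(vTex, vTexCoord).r - 0.5;\n" +
                "    gl_FragColor = vec4(y + 1.402 * v,\n" +
                "                        y - 0.344 * u - 0.714 * v,\n" +
                "                        y + 1.772 * u,\n" +
                "                        1.0);\n" +
                "}\n";
    }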
Why !openGL
The downside is the process by which you get texture data into the GPU. Consider that for each frame you have to Load the texture data into memory (CPU operation) and then you have to Copy this texture data into the GPU (CPU operation). It is this Load/Copy that can make using openGL slower than alternatives.
If you are playing low resolution videos then I suppose it's possible you won't see the speed difference because your CPU won't bottleneck. However, if you try with HD you will more than likely hit this bottleneck and notice a significant performance hit.
The way this bottleneck has been traditionally worked around is by using Pixel Buffer Objects (allocating GPU memory to store texture Loads). Unfortunately GLES2 does not have Pixel Buffer Objects.
Other Options
For the above reasons, many have chosen to use software decoding combined with available CPU extensions like NEON for color space conversion. An implementation of YUV 2 RGB for NEON exists here. The means by which you draw the frames, SDL vs. OpenGL, should not matter for RGB, since you are copying the same number of pixels in both cases.
You can determine if your target device supports NEON enhancements by running cat /proc/cpuinfo from adb shell and looking for NEON in the features output.
I have gone down the FFmpeg/OpenGLES path before, and it's not very fun.
You might try porting ffplay.c from the FFmpeg project, which has been done before using an Android port of SDL. That way you aren't building your decoder from scratch, and you won't have to deal with the idiosyncrasies of AudioTrack, which is an audio API unique to Android.
In any case, it's a good idea to do as little NDK development as possible and rely on porting, since the ndk-gdb debugging experience is pretty lousy right now in my opinion.
That being said, I think OpenGLES performance is the least of your worries. I found the performance to be fine, although I admit I only tested on a few devices. The decoding itself is fairly intensive, and I wasn't able to do very aggressive buffering (from the SD card) while playing the video.
Actually, I have deployed a custom video player system, and almost all of my work was done on the NDK side. We are getting full-frame video at 720p and above, including our custom DRM system. OpenGL is not your answer: pixel buffer objects are not supported on Android, so you are basically re-uploading your textures every frame, and that defeats OpenGL ES's caching system. You frankly need to shove the video frames through the natively supported Bitmap APIs on Froyo and above; before Froyo, you're hosed. I also wrote a lot of NEON intrinsics for color conversion, rescaling, etc. to increase throughput. I can push 50-60 frames through this model on HD video.
