Crash with SurfaceView in Android NDK when pausing/resuming app fast

When I pause and resume my app very quickly, I get the following errors:
E/BufferQueueProducer( 177): [SurfaceView] connect(P): already connected (cur=1 req=1)
E/libEGL (25863): eglCreateWindowSurface: native_window_api_connect (win=0xb4984508) failed (0xffffffea) (already connected to another API?)
E/libEGL (25863): eglCreateWindowSurface:416 error 3003 (EGL_BAD_ALLOC)
I'm pretty sure that I am stopping and starting my render thread correctly, and this issue only occurs when I pause/resume the app very quickly (for example, when you mash the open-apps button).
Any ideas what might be the cause for eglCreateWindowSurface returning EGL_NO_SURFACE here? My guess would be it has to do with something still being connected to the SurfaceView.

It sounds like you're trying to create an EGLSurface for a Surface that already has one. If speed is an issue it's usually because of the lag in Surface callback handling -- the SurfaceView Surface is partially handled by the Window Manager, which requires inter-process communication.
Perhaps your native code still has a handle to the old SurfaceHolder, and if you moved more slowly the handle would be replaced by an upcoming surfaceCreated()? It's hard to say without knowing exactly what your code does. One way to approach these sorts of problems is by adding logging at all the interesting state change points, and comparing the logs from "slow" pause/resume and "fast" pause/resume.
It should be possible to avoid these situations by managing the SurfaceView state carefully. This appendix to the graphics arch doc talks about the difference between the Activity and SurfaceView lifecycles, and two ways to structure an app to avoid issues.
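A minimal sketch of the "manage the state carefully" idea, assuming the render thread is notified for every surfaceCreated() and surfaceDestroyed(). The name SurfaceGuard and its create/destroy hooks are hypothetical; in a real app they would wrap your own eglCreateWindowSurface and eglDestroySurface calls (the destroy hook would presumably also need to unbind the surface with eglMakeCurrent first, since EGL defers destruction of a surface that is still current). The printf calls are the kind of state-change logging suggested above.

```cpp
#include <cassert>
#include <cstdio>
#include <functional>
#include <utility>

// Hypothetical helper: remembers whether an EGLSurface is currently attached
// to the SurfaceView's Surface and logs every state transition.
class SurfaceGuard {
public:
    // `create` and `destroy` are assumed wrappers around the app's own
    // eglCreateWindowSurface / eglDestroySurface code (not shown here).
    SurfaceGuard(std::function<bool()> create, std::function<void()> destroy)
        : create_(std::move(create)), destroy_(std::move(destroy)) {
        assert(create_ && destroy_);
    }

    // Call when surfaceCreated() arrives.
    bool onSurfaceCreated() {
        std::printf("surfaceCreated: attached=%d\n", attached_ ? 1 : 0);
        // If a stale surface is still attached, release it first; creating a
        // second one is what produces "already connected".
        release();
        attached_ = create_();
        return attached_;
    }

    // Call when surfaceDestroyed() arrives. Safe to call more than once.
    void onSurfaceDestroyed() {
        std::printf("surfaceDestroyed: attached=%d\n", attached_ ? 1 : 0);
        release();
    }

    bool attached() const { return attached_; }

private:
    void release() {
        if (!attached_) return;
        destroy_();
        attached_ = false;
    }

    std::function<bool()> create_;
    std::function<void()> destroy_;
    bool attached_ = false;
};
```

Comparing the printed sequence from a slow pause/resume with the one from a fast pause/resume should show whether a create ever arrives while the previous surface is still attached.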

Related

AOSP / Android 7: How is EGL utilized in detail?

I am trying to understand the Android (7) Graphics System from the system integrators point of view. My main focus is the minimum functionality that needs to be provided by libegl.
I understand that surfaceflinger is the main actor in this domain. Surfaceflinger initializes EGL, creates the actual EGL surface and acts as a consumer for buffers (frames) created by the app. The app, in turn, executes most of the required GLES calls. Obviously, this leads to restrictions, as surfaceflinger and apps live in separate processes, which is not the typical use case for GLES/EGL.
Things I do not understand:
Do apps on Android 7 always render into EGL_KHR_image buffers which are sent to surfaceflinger? This would mean there's always an extra copy step (even when no composition is needed), as far as I understand... Or is there also some kind of optimized fullscreen mode, where apps render directly into the final EGL surface?
Which inter-process sharing mechanisms are used here? My guess is that EGL_KHR_image, used with EGL_NATIVE_BUFFER_ANDROID, defines the exact binary format, so that an image object may be created in each process, where the memory is shared via ashmem. Is this already the complete/correct picture, or am I missing something here?
I'd guess these are the main points I lack confident knowledge about at the moment. I certainly have some follow-up questions (like, how do gralloc/composition fit into this?), but, in keeping with this platform's conventions, I'd like to keep this question as compact as possible. Still, besides the main documentation page, I am missing documentation clearly targeted at system integrators. So further links would be really appreciated.
My current focus is on typical use cases which would cover the vast majority of apps compatible with Android 7. If there are corner cases like long-deprecated compatibility shims, I'd like to ignore them for now.

Android: how to forcefully reproduce "OpenGL context loss" issue?

An Android OpenGL application can lose its GL context while it is in the background. So, to keep things simple, if you get an unexpected Renderer.onSurfaceCreated call - whoa, you're lucky. The system wiped you out and you have to recover all your GL stuff from scratch.
One thing that bothers me, and that the Google documentation seems to stay silent about: how can one reliably and efficiently reproduce the issue during development?

On Android, can I detect screen jank without looking at the screen?

I'm trying to use the output of systrace to detect janky scrolling during automated tests: I want to notice it early, without having to sit there watching.
I spent some time trying to fathom the trace, and found this ebook very helpful: https://www.safaribooksonline.com/library/view/high-performance-android/9781491913994/ch04.html
The most promising hypothesis was checking whether VSYNC-sf ever stopped ticking on phones displaying VSYNC-sf.
On other machines, SurfaceFlinger seems to be started by either HW_SYNC_0 or VSYNC (sometimes one or both of those VSYNCs stop) but SurfaceFlinger also seems to be involved with VsyncOn, which sometimes appears to keep track of whether there are activity buffers outstanding, and sometimes whether there are input events that need delivering. Confusingly, sometimes input events are delivered during half-second pauses when there's no surface flinger activity, no application drawing, and when even the VSYNC and HW_VSYNC signals decide to pause.
Does anyone know what's going on there?
Should I simply expect to see Surface Flinger always busy - not alternately busy and idle with each tick - and always aligned with one or other of the VSYNCs?
I also sometimes see SurfaceFlinger taking longer than a tick to complete its processing - is that the application's fault for having a very complicated display, or is it just something that happens because some queue isn't empty enough?
I'd prefer to miss a possible jank than claim to have found one which isn't there.
Thanks!
Testing Display Performance lists how to use the new framestats command from dumpsys to get this type of information. It will provide information on which frames you've missed, and how many of them you've missed.
It's also worth noting that SurfaceFlinger isn't always busy. It's only active when part of the screen needs to be updated. If nothing on the screen needs updating, then no new rendering occurs, and as such, SurfaceFlinger should be idle.
You can get a bigger-picture view of the Android rendering pipeline with the Rendering Performance 101 video from Android Performance Patterns.
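As a rough sketch of how the framestats output could be checked in an automated test: the function below counts frames whose FrameCompleted timestamp is more than one 60 Hz frame budget after IntendedVsync. The column positions are an assumption based on the originally documented CSV layout (Flags first, IntendedVsync second, FrameCompleted fourteenth, timestamps in nanoseconds, data wrapped in ---PROFILEDATA--- markers); other Android releases may order the columns differently, so check the header row on your device. countJank and JankReport are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Assumed column positions; verify against the header row of your device.
constexpr std::size_t kFlags = 0;
constexpr std::size_t kIntendedVsync = 1;
constexpr std::size_t kFrameCompleted = 13;

struct JankReport {
    int frames = 0;  // rows that were counted
    int janky = 0;   // rows that exceeded the frame budget
};

// `dump` is the captured text of `dumpsys gfxinfo <package> framestats`.
JankReport countJank(const std::string& dump, std::int64_t budgetNs = 16666667) {
    assert(budgetNs > 0);
    JankReport report;
    std::istringstream in(dump);
    std::string line;
    bool inProfile = false;
    while (std::getline(in, line)) {
        if (line.rfind("---PROFILEDATA---", 0) == 0) {
            inProfile = !inProfile;  // markers open and close the CSV section
            continue;
        }
        // Skip everything outside the section, plus the textual header row.
        if (!inProfile || line.empty() || line[0] < '0' || line[0] > '9') continue;

        std::vector<std::int64_t> cols;
        std::istringstream row(line);
        std::string cell;
        while (std::getline(row, cell, ',')) {
            if (!cell.empty()) cols.push_back(std::strtoll(cell.c_str(), nullptr, 10));
        }
        if (cols.size() <= kFrameCompleted) continue;  // malformed or short row
        if (cols[kFlags] != 0) continue;               // flagged rows are outliers
        ++report.frames;
        if (cols[kFrameCompleted] - cols[kIntendedVsync] > budgetNs) ++report.janky;
    }
    return report;
}
```

Since a false positive is worse than a miss here, a test could require several janky frames in a run before failing rather than reacting to a single one.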

Accessing the memory of the default framebuffer on Android

I have a setup with OpenGL ES 2.0 and EGL on Android 4.4.2 (API level 19).
My goal is to access the buffer of the window (the default framebuffer in OpenGL terms) directly from the CPU / user space.
I have tried using ANativeWindow_fromSurface to get ANativeWindow from the Surface of a GLSurfaceView. Then trying to get access to the buffer with ANativeWindow_lock fails with status -22. Logcat gives
03-25 10:50:25.363: E/BufferQueue(171): [SurfaceView](this:0xb8d5d978,id:32,api:1,p:6488,c:171) connect: already connected (cur=1, req=2)
From this discussion it seems you can't do that with GLSurfaceView, because EGL has already acquired the surface.
How could you get to the memory of the window? Can you somehow do it through an EGLSurface? I am willing to use android::GraphicBuffer, even though it is not part of the NDK.
If this is not possible, can you use the other direction, by first creating an android::GraphicBuffer and then binding it to an EGLSurface and the displayed window?
Android devices may not have a framebuffer (i.e. /dev/graphics/fb). It's still widely used by the recovery UI, but it's being phased out.
If it does have a framebuffer, it will be opened and held by the Hardware Composer unless the app framework has been shut down. Since you're trying to use the NDK, I assume the framework is still running.
If your NDK code is running as root or system, you can request a top window from SurfaceFlinger. The San Angeles demo provides an example.
Additional information can be found here, here, and here. If you want to work with graphics at a low level, you should also read the graphics architecture doc.
This is not doable with just the NDK API; you will need to pull in some OS headers that are not guaranteed to be stable.
You will need to subclass ANativeWindow, similarly to what is done in frameworks/native/include/ui/FramebufferNativeWindow.h.
However, you may need to construct your own buffer queue from android::GraphicBuffer objects that you create yourself, and properly respond to all dequeue() and enqueue() requests.
On enqueue() you will need to sync (the GPU renders asynchronously) and then map the enqueued buffer to CPU memory.
Note that this approach may perform poorly, due to the explicit GPU<->CPU sync it needs.

eglSwapBuffers is erratic/slow

I have a problem with very slow rendering on an Android tablet using the NDK and the EGL commands. I have timed calls to eglSwapBuffers, and they take a variable amount of time, frequently exceeding the display's frame period. I know it synchronizes to the refresh, but that is around 60FPS, and the rates here drop well below that.
The only command I issue between calls to swap is glClear, so I know it isn't anything that I'm drawing causing the problem. Even just by clearing the frame rate drops to 30FPS (erratic though).
On the same device a simple GL program in Java easily renders at 60FPS, thus I know it isn't fundamentally a hardware issue. I've looked through the Android Java code for setting up the GL context and can't see any significant difference. I've also played with every config attribute, and while some alter the speed slightly, none (that I can find) change this horrible frame rate drop.
To ensure the event polling wasn't an issue I moved the rendering into a thread. That thread now only does rendering, thus just calls clear and swap repeatedly. The slow performance still persists.
I'm out of ideas what to check and am looking for suggestions as to what the problem might be.
There's really not enough info (such as which device you are testing on and what your exact config was) to answer this with 100% certainty, but this kind of behavior is usually caused by a mismatch between the window and surface pixel formats, e.g. 16-bit (RGB565) vs. 32-bit.
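To check for that kind of mismatch, something like the following sketch could be logged at startup. It compares the window's pixel format with the channel sizes of the chosen EGL config. The enum values are copied from the WINDOW_FORMAT_* constants in android/native_window.h so the snippet compiles off-device; ConfigBits and formatMatchesConfig are hypothetical names, and on a device the inputs would come from ANativeWindow_getFormat and eglGetConfigAttrib (EGL_RED_SIZE and friends).

```cpp
#include <cassert>

// Mirrors WINDOW_FORMAT_RGBA_8888 / RGBX_8888 / RGB_565 from the NDK header.
enum WindowFormat { kRgba8888 = 1, kRgbx8888 = 2, kRgb565 = 4 };

// Channel sizes of the EGL config, as reported by eglGetConfigAttrib.
struct ConfigBits {
    int r;
    int g;
    int b;
    int a;
};

// Returns true when the window format and the EGL config describe the same
// pixel layout, so no per-frame format conversion should be needed.
bool formatMatchesConfig(int windowFormat, const ConfigBits& c) {
    assert(c.r >= 0 && c.g >= 0 && c.b >= 0 && c.a >= 0);
    switch (windowFormat) {
        case kRgba8888: return c.r == 8 && c.g == 8 && c.b == 8 && c.a == 8;
        case kRgbx8888: return c.r == 8 && c.g == 8 && c.b == 8 && c.a == 0;
        case kRgb565:   return c.r == 5 && c.g == 6 && c.b == 5 && c.a == 0;
        default:        return false;  // unknown format: treat as a mismatch
    }
}
```

If the check fails, one common approach (used by the NDK native-activity sample) is to query EGL_NATIVE_VISUAL_ID from the config and pass it to ANativeWindow_setBuffersGeometry before creating the window surface.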
Setting the FB_MULTI_BUFFER=3 environment variable enables multi-buffering on the Freescale i.MX 6 (Sabrelite) board with some recent LTIB builds (without X). Your GFX driver may need something like this.
