I am working on an Android project that processes video frames; I need to handle every frame before displaying it. The processing includes scaling frames up from 1920x1080 to 2560x1440, color space conversion, and some necessary RGB-based image processing, and all of this has to finish within 33-40 ms.
I have optimized the YUV->RGB conversion and the other processing with ARM NEON, and they work well. But first I have to scale each frame up from 1080p to 2K, and that is now the performance bottleneck.
My question is how to efficiently scale an image up from 1080p to 2K within 20 ms. I don't have much experience with scaling algorithms, so any suggestions are helpful.
Could I use ARM NEON to optimize the existing algorithm?
The hardware environment:
CPU: Samsung Exynos 5420
Memory: 3GB
Display: 2560x1600 px
Update:
I will describe my decoding process. I use MediaCodec to decode normal H.264 video to YUV (NV12); the default decoder is hardware-accelerated and very fast. Then I use ARM NEON to convert NV12 to RGBW and send the RGBW frame to SurfaceFlinger for display. I use a normal SurfaceView rather than a GLSurfaceView.
The bottleneck is how to scale up YUV from 1080p to 2K fast.
I find that examples work well, so allow me to lead with this example program that uses OpenGL shaders to convert from YUV -> RGB: http://www.fourcc.org/source/YUV420P-OpenGL-GLSLang.c
What I envision for your program is:
Hardware video decodes H.264 stream -> YUV array
Upload that YUV array as a texture to OpenGL; actually, you will upload 3 different textures-- Y, U, and V
Run a fragment shader that converts those Y, U, and V textures into an RGB(W) image; this will produce a new texture in video memory
Run a new fragment shader against the texture generated in previous step in order to scale the image
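The conversion in step 3 boils down to standard per-pixel BT.601 math. As a plain-Java sketch of what the fragment shader computes (the class and method names here are mine, for illustration):

```java
public class YuvToRgb {
    // BT.601 conversion as used by typical NV12/I420 pipelines.
    // Y nominally in [16,235], U/V in [16,240]; returns {r, g, b} in [0,255].
    public static int[] convert(int y, int u, int v) {
        float yf = 1.164f * (y - 16);
        float r = yf + 1.596f * (v - 128);
        float g = yf - 0.813f * (v - 128) - 0.391f * (u - 128);
        float b = yf + 2.018f * (u - 128);
        return new int[] { clamp(r), clamp(g), clamp(b) };
    }

    private static int clamp(float x) {
        return Math.max(0, Math.min(255, Math.round(x)));
    }
}
```

In the shader this is three multiply-adds per pixel, which the GPU does essentially for free; on the CPU it is exactly the work you have been hand-optimizing with NEON.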
There might be a bit of a learning curve involved here, but I think it's workable, given your problem description. Take it one step at a time: get the OpenGL framework in place, try uploading just the Y texture and writing a naive fragment shader that just emits a grayscale pixel based on the Y sample, then move onto correctly converting the image, then get a really naive upsampler working, then put a more sophisticated upsampler into service.
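For the "really naive upsampler" stage, here is a plain-Java bilinear sketch on a single 8-bit plane (names are mine; in the GL version you get the equivalent for free by sampling a texture with GL_LINEAR filtering):

```java
public class Bilinear {
    // Scales one 8-bit plane (e.g. the Y plane) from sw x sh to dw x dh.
    // Assumes dw, dh >= 2; samples are treated as unsigned bytes.
    public static byte[] scale(byte[] src, int sw, int sh, int dw, int dh) {
        byte[] dst = new byte[dw * dh];
        for (int y = 0; y < dh; y++) {
            float fy = (sh - 1) * (float) y / (dh - 1);
            int y0 = (int) fy, y1 = Math.min(y0 + 1, sh - 1);
            float wy = fy - y0;
            for (int x = 0; x < dw; x++) {
                float fx = (sw - 1) * (float) x / (dw - 1);
                int x0 = (int) fx, x1 = Math.min(x0 + 1, sw - 1);
                float wx = fx - x0;
                // Interpolate horizontally on two rows, then vertically.
                float top = lerp(src[y0 * sw + x0] & 0xFF, src[y0 * sw + x1] & 0xFF, wx);
                float bot = lerp(src[y1 * sw + x0] & 0xFF, src[y1 * sw + x1] & 0xFF, wx);
                dst[y * dw + x] = (byte) Math.round(lerp(top, bot, wy));
            }
        }
        return dst;
    }

    private static float lerp(float a, float b, float w) { return a + (b - a) * w; }
}
```

Running this per pixel over a 2560x1440 output is exactly the kind of CPU work that blows a 20 ms budget, which is why moving the upsampling onto the GPU is attractive here.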
I'd recommend OpenGL ES too, mainly because of the project I'm currently working on, which also plays video. For me, the display is 1920x1080, so the texture I'm using is 2048x1024. I get approximately 35 fps on a quad-core ARM.
Use a GLSurfaceView and your own custom renderer. If you're using FFmpeg, then once you've decoded your video frames, use sws_scale to scale the frame and just upload it into the OpenGL texture. The larger your texture/display, the lower your fps, because a lot of time is spent uploading large images to the GPU every frame.
How to decode your video input depends on your needs, and that is what you will have to research. For me, I had to compile FFmpeg for Android and start from there.
My apologies for putting this in an answer; I don't have enough points to make a comment.
I'd like to add that you might run into OpenGL texture-size limitations. I tried to use OpenGL for the opposite problem: scaling down from the camera in real time. The problem is that the maximum OpenGL texture size was 2048x2048. I'm not sure if this is true for all devices, but this limit held on newer kit like the Nexus 7 (2013) and LG G2. In the end, I had to write it in the NDK without OpenGL, optimizing the hell out of it by hand.
good luck, though.
Related
I would like to know if there is any kind of limitation on the texture size that can be used in Android OpenGL ES 2.0 projects. I understand that a huge 4096x4096 texture is a bit meaningless when it is rendered on a small screen. But what if the requirement is to switch between many textures at run time? And what if I want a texture atlas to do a single quick upload instead of multiple smaller texture uploads? Please let me know your ideas in this regard.
Also, I am sure there has to be a limitation on the size of image a device can process, as device memory is limited. But I would like to know whether the limit is resolution-based or byte-size-based. I mean, if a device has a 1024x1024 limit, can it handle a compressed 2048x2048 texture that occupies roughly the same number of bytes as an uncompressed 1024x1024 one?
Also, roughly what are the typical texture size or resolution limits on devices running Android 2.2 and above?
Finally, are there any best practices for handling high-resolution images in OpenGL ES 2.0 to get the best performance at both load time and run time?
There is a hardware limitation on texture sizes. To look them up manually, you can go to a site such as glbenchmark.com (here displaying details about the Google Galaxy Nexus).
To automatically find the maximum size from your code, you can use something like:
int[] max = new int[1];
gl.glGetIntegerv(GL10.GL_MAX_TEXTURE_SIZE, max, 0); //put the maximum texture size in the array.
(For GL10, but the same method exists for GLES20)
When it comes to processing or editing an image in Android you usually use a Bitmap instance. This holds the uncompressed pixel values of your image and is thus resolution-dependent. However, it is recommended that you use compressed textures in your OpenGL applications, as this improves memory-use efficiency (note that you cannot modify these compressed textures).
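To make the resolution-versus-bytes distinction concrete, here is the arithmetic (4 bytes per pixel for an uncompressed RGBA_8888 bitmap versus 4 bits per pixel for ETC1; the helper names are mine):

```java
public class TextureMemory {
    // Uncompressed RGBA_8888: 4 bytes per pixel.
    public static long rgba8888Bytes(int w, int h) {
        return 4L * w * h;
    }

    // ETC1: 4 bits (half a byte) per pixel, RGB only, no alpha.
    public static long etc1Bytes(int w, int h) {
        return ((long) w * h) / 2;
    }
}
```

So a 2048x2048 ETC1 texture (2 MiB) actually takes half the memory of an uncompressed 1024x1024 RGBA bitmap (4 MiB), yet it still counts as 2048x2048 against GL_MAX_TEXTURE_SIZE: the hardware limit is resolution-based, not byte-based.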
From the previous link:
Texture compression can significantly increase the performance of your
OpenGL application by reducing memory requirements and making more
efficient use of memory bandwidth. The Android framework provides
support for the ETC1 compression format as a standard feature [...]
You should take a look at this document which contains many good practices and hints about texture loading and usage. The author explicitly writes:
Best practice: Use ETC for texture compression.
Best practice: Make sure your geometry and texture resolutions are
appropriate for the size they're displayed at. Don't use a 1k x 1k
texture for something that's at most 500 pixels wide on screen. The
same for geometry.
I am using the GL_OES_EGL_image_external extension to play a video with OpenGL. The problem is that on some devices the video dimensions exceed the maximum OpenGL texture size. Is there any way to deal with this issue dynamically, e.g. by downscaling the frames on the fly, or do I have to reduce the video size beforehand?
If you are really hitting the max texture size in OpenGL ES (FWIW I believe this is about 2048x2048 with recent devices) then you could do a few things:
You could set setVideoScalingMode(VIDEO_SCALING_MODE_SCALE_TO_FIT) on your MediaPlayer. I believe this will scale the video resolution to the size of the SurfaceTexture/Surface it is attached to.
You could alternatively have four videos playing, render them to separate TEXTURE_EXTERNAL_OES targets, and then render those four textures separately in GL. However, that could kill your performance.
If I saw the error message and some context of the code I could maybe provide some more information.
I created a movie player based on FFmpeg. It works fine, and the decoding is quite fast: on an LG P970 (Cortex-A8 with NEON) I average 70 fps with a 640 x 424 video stream, including YUV2RGB conversion. However, there is one bottleneck: drawing on a Canvas.
I use the jnigraphics native library to fill the picture data into a bitmap on the native side, and then I draw this bitmap on a Canvas in a SurfaceView. It is quite a simple and common approach, but the drawing takes 44 ms for a 640 x 424 bitmap, which reduces the fps to 23 and makes this technique unusable... It takes far longer than the whole A/V frame decoding!
Is there any method to draw bitmaps significantly faster? I would prefer to render completely in native code using OpenGL ES 2, but I have read that it could also be slow. So what now?...
How can I render bitmaps as fast as possible?
Draw them in GLES1.x. You do not need GLES2, as you will have no use for shaders (the primary reason for using GLES2.x), at least not in the context of your question. So for simplicity's sake, GLES1.x is ideal. All you need to do is draw the byte buffer to the screen. On my Galaxy S (Vibrant) this takes about 3 ms. The size of the byte[] in my wallpaper is 800x480x3, or 1,152,000 bytes, which is significantly larger than what you are working with.
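As a sanity check on those numbers, this is how you would allocate the direct buffer for such an RGB frame before handing it to a GL upload call (a sketch, assuming 3 bytes per pixel; the class name is mine):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FrameBufferAlloc {
    // GL upload calls require a direct, native-ordered buffer.
    public static ByteBuffer allocRgbFrame(int width, int height) {
        return ByteBuffer.allocateDirect(width * height * 3)
                         .order(ByteOrder.nativeOrder());
    }
}
```

Allocate it once and reuse it every frame; reallocating per frame would add GC and allocation overhead to the 3 ms figure above.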
I believe this guide should point you in the correct direction.
http://qdevarena.blogspot.com/2009/02/how-to-load-texture-in-android-opengl.html
As for the notion of accessing canvas from native code, I would just avoid that altogether and follow an OpenGL implementation by offloading everything to GPU as much as possible.
I recall that during the Replica Island presentation at Google I/O, the designer said that using the OpenGL 'draw_texture' extension glDrawTexfOES was the fastest way to blit to the screen, and significantly faster than drawing regular quads with textures attached (I'm assuming you're using OpenGL).
You can't rotate the texture, but it doesn't sound like you need that.
In my live wallpaper I'm drawing 3 textured quads that cover the whole screen. On the Nexus One I get 40 fps. I'm looking for ways to improve performance.
The quads are blended on top of each other, textures are loaded from RGB_8888 bitmaps. Textures are 1024x1024.
I've got
glDisable(GL_DITHER);
glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_FASTEST);
glDisable(GL_LIGHTING);
glDisable(GL_DEPTH_TEST);
glEnable(GL_TEXTURE_2D);
glEnable(GL_BLEND);
glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);
Things I've tried, all resulting in the same 40fps:
Reduce texture size to 512x512 and 256x256
Use draw_texture extension
Disable blending
Change texture filtering from GL_LINEAR to GL_NEAREST
Use VBOs (desperate try, since there are just 3 quads...)
Run the drawing code in standalone activity (in case being a live wallpaper somehow affects performance)
If I draw 2 layers instead of 3, fps rises to about 45, and drawing just 1 layer sees fps rise to 55. I guess I'm limited by fill rate, since turning the potentially costly features off and on results in the same fps, and the only thing that seems to improve fps is simply drawing less...
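A back-of-the-envelope fill-rate estimate supports that guess (assuming the Nexus One's 480x800 screen; the helper is mine):

```java
public class FillRate {
    // Pixels written per second when drawing `layers` full-screen passes at `fps`.
    public static long pixelsPerSecond(int width, int height, int layers, int fps) {
        return (long) width * height * layers * fps;
    }
}
```

Three full-screen blended layers at 40 fps is about 46 million pixels per second, and the roughly inverse relationship between layer count and fps (3 layers at 40, 1 layer at 55) is the classic signature of being fill-rate bound.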
I'm mulling over the idea of texture compression, but supporting the different compression formats doesn't seem like fun. ETC1 has no alpha channel, which I need, and I'm not sure whether PVRTC and ATITC can even be used from Java and OpenGL ES 1.0 or 1.1.
I'd be glad to hear ideas on what else to try.
I can give pointer to the current version of wallpaper and screenshots if that's of use.
You probably already thought of this, but just in case you didn't:
calling glClear at the start of your frame probably isn't necessary
you could do the first pass with blending disabled
Also, did you try doing it in 1 pass with a multi-texturing approach?
edit: And another thing: writing to the z-buffer is not needed, so either use a context without a z-buffer, or disable depth writes with glDepthMask(GL_FALSE).
glCompressedTexImage2D is available in Java (and the NDK); however, the available compression formats depend on the GPU.
The AndroidManifest.xml File > supports-gl-texture
PowerVR SGX (Nexus S, Galaxy S/Tab, DROID): PVRTC
Adreno (Nexus One, EVO): ATITC
Tegra 2 (Xoom, Atrix): S3TC
If you use these compression formats and want to support a variety of Android devices, you must prepare a set of compressed textures per format, but the GPU's native compressed-texture format should improve rendering performance.
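Declaring which formats your APK ships, via the <supports-gl-texture> manifest element mentioned above, lets Google Play filter out devices that cannot use them. A sketch for a build that bundles all three formats:

```xml
<!-- In AndroidManifest.xml: one element per compressed-texture format shipped. -->
<supports-gl-texture android:name="GL_IMG_texture_compression_pvrtc" />
<supports-gl-texture android:name="GL_AMD_compressed_ATC_texture" />
<supports-gl-texture android:name="GL_EXT_texture_compression_s3tc" />
```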
The Android OpenGL framework min3d is able to draw these objects or scenes at a full 60 fps.
The framework is open source and is available for download and use at: http://code.google.com/p/min3d/
I would recommend comparing your code to it to see what you have done wrong or differently, in order to improve your performance.