As a continuation from my previous question (GLSL : Accessing an array in a for-loop hinders performance), I have encountered an entirely new and annoying problem.
So, I have a shader that performs a black hole effect.
The shader works perfectly on my computer, the Android emulator, and ShaderToy, but for some reason, even though the code is exactly the same, it does not work on my Android device.
The problem occurs when I zoom in too far. For whatever reason, when my zoom reaches a certain point – the whole background zooms in and then zooms out and goes crazy. Like this :
When it should look like this :
However, it does work on my device if I change this :
#ifdef GL_ES
precision mediump float;
#endif
to this :
#ifdef GL_ES
precision highp float;
#endif
The problem with this is that it also decreases my FPS from 60 down to ~40.
I believe the problem is that my Android device's OpenGL version is "OpenGL ES 3.0" according to Gdx.gl.glGetString(GL20.GL_VERSION).
But I cannot figure out how to set my version to OpenGL 2.0 since the AndroidApplicationConfiguration class is giving me little to no options.
I've tried putting <uses-feature android:glEsVersion="0x00020000" android:required="true" /> in the manifest, but it still prints "OpenGL ES 3.0".
And I still don't even know if this is actually the cause of the problem or not, so that's why I'm asking here. Thank you for taking the time to read/answer my question :).
P.S. Here's the Shader code:
#ifdef GL_ES
precision mediump float;
#endif
const int MAX_HOLES = 4;
uniform sampler2D u_sampler2D;
varying vec2 vTexCoord0;
struct BlackHole {
vec2 position;
float radius;
float deformRadius;
};
uniform vec2 screenSize;
uniform vec2 cameraPos;
uniform float cameraZoom;
uniform BlackHole blackHole[MAX_HOLES];
void main() {
vec2 pos = vTexCoord0;
float black = 0.0;
for (int i = 0; i < MAX_HOLES; i++) {
BlackHole hole = blackHole[i];
vec2 position = (hole.position - cameraPos.xy) / cameraZoom + screenSize*0.5;
float radius = hole.radius / cameraZoom;
float deformRadius = hole.deformRadius / cameraZoom;
vec2 deltaPos = vec2(position.x - gl_FragCoord.x, position.y - gl_FragCoord.y);
float dist = length(deltaPos);
float distToEdge = max(deformRadius - dist, 0.0);
float dltR = max(sign(radius - dist), 0.0);
black = min(black+dltR, 1.0);
pos += (distToEdge * normalize(deltaPos) / screenSize);
}
gl_FragColor = (1.0 - black) * texture2D(u_sampler2D, pos) + black * vec4(0, 0, 0, 1);
}
As you have found, the issue is down to a lack of precision in fp16 (mediump), which is fixed by using fp32 (highp). Most maths units have double the throughput for fp16 compared with fp32, which also explains the drop in performance.
Querying the driver's GLES version returns the maximum supported version, not the version of the current EGL context, so what you are seeing is expected.
Also please note that "highp" is optional in OpenGL ES 2.0 fragment shaders, so there is no guarantee that your shader will work on some GPUs in an OpenGL ES 2.0 context. The Mali-4xx series only support fp16 fragment shaders, for example (I think also some of the OpenGL ES 2.0 Vivante GPUs based on past experience).
In OpenGL ES 3.0 highp is mandatory in fragment shaders, so it would be guaranteed to work there.
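One thing that may be worth trying, as a sketch rather than a guaranteed fix: keep mediump as the default and request highp only for the values that feed the position math, guarded by GL_FRAGMENT_PRECISION_HIGH (which GLSL ES 1.00 defines when the fragment stage supports highp). The COORD macro name is made up for this example, the single hole at the origin is a stand-in for the real loop, and whether this recovers any frame rate depends on the GPU.

```glsl
#ifdef GL_ES
// Default precision stays mediump for everything not marked otherwise.
precision mediump float;
#ifdef GL_FRAGMENT_PRECISION_HIGH
// COORD is a hypothetical macro name used only in this sketch.
#define COORD highp
#else
// No highp in the fragment stage: falls back, so the artifact would remain.
#define COORD mediump
#endif
#else
// Desktop GLSL: no qualifier needed.
#define COORD
#endif

uniform COORD vec2 screenSize;
uniform COORD vec2 cameraPos;
uniform COORD float cameraZoom;

void main() {
    // Stand-in for one black hole at world position (0, 0).
    COORD vec2 position = (vec2(0.0) - cameraPos) / cameraZoom + screenSize * 0.5;
    COORD vec2 deltaPos = position - gl_FragCoord.xy;
    COORD float dist = length(deltaPos);
    // White inside a 50-pixel radius, black outside.
    gl_FragColor = vec4(vec3(step(dist, 50.0)), 1.0);
}
```

On hardware without highp fragment support the macro falls back to mediump, so the shader still compiles there but the zoom artifact would presumably remain.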
Related
I would like to be able to pass more per-vertex-data to my own custom shaders in kivy than the usual vertex coords + texture coords. Specifically, I would like to pass a value that says which animation frame should be used in selecting the texture coords.
I found an example (http://shadowmint.blogspot.com/2013/10/kivy-textured-quad-easy-right-no.html), and succeeded in changing the format of the vertices passed to a mesh using an argument to the constructor of the Mesh, like this:
Mesh(mode='triangles', fmt=[('v_pos', 2, 'float'),
                            ('v_tex0', 2, 'float'),
                            ('v_frame_i', 1, 'float')])
I can then set the vertices to be drawn to something like this:
vertices = [x-r,y-r, uvpos[0],uvpos[1],animationFrame,
x-r,y+r, uvpos[0],uvpos[1]+uvsize[1],animationFrame,
x+r,y-r, uvpos[0]+uvsize[0],uvpos[1],animationFrame,
x+r,y+r, uvpos[0]+uvsize[0],uvpos[1]+uvsize[1],animationFrame,
x+r,y-r, uvpos[0]+uvsize[0],uvpos[1],animationFrame,
x-r,y+r, uvpos[0],uvpos[1]+uvsize[1],animationFrame,
]
This works well when I run it on Ubuntu, but when I run it on my Android device the texture either doesn't draw at all, or it looks as if the vertex or texture coordinate data is corrupt or misaligned.
Here is my shader code in case it is relevant. Again, this all behaves as I want it to when I run on Ubuntu, but not when I run on the Android device.
---VERTEX SHADER---
#ifdef GL_ES
precision highp float;
#endif
/* vertex attributes */
attribute vec2 v_pos;
attribute vec2 v_tex0;
attribute float v_frame_i; // for animation
/* uniform variables */
uniform mat4 modelview_mat;
uniform mat4 projection_mat;
uniform vec4 color;
uniform float opacity;
uniform float sqrtNumFrames; // the width/height of the sprite-sheet
uniform float frameWidth;
/* Outputs to the fragment shader */
varying vec4 frag_color;
varying vec2 tc;
void main() {
frag_color = color * vec4(1.0, 1.0, 1.0, opacity);
gl_Position = projection_mat * modelview_mat * vec4(v_pos.xy, 0.0, 1.0);
float f = round(v_frame_i);
tc = v_tex0;
float w = (1.0/sqrtNumFrames);
tc *= w;
tc.x += w*mod(f,sqrtNumFrames); //////////// I think that the problem might
tc.y += w*round(f / sqrtNumFrames); ///////////// be related to this code, here?
}
---FRAGMENT SHADER---
#ifdef GL_ES
precision highp float;
#endif
/* Outputs from the vertex shader */
varying vec4 frag_color;
varying vec2 tc;
/* uniform texture samplers */
uniform sampler2D texture0;
uniform vec2 player_pos;
uniform vec2 window_size; // in pixels
void main (void){
gl_FragColor = frag_color * texture2D(texture0, tc);
}
I wonder if it has to do with the version of GLSL and int/float math (in particular in identifying which image from the sprite sheet to draw; see the comments in the GLSL code). Is one version running on my desktop and another on the device?
Any suggestions for things to experiment with would be much appreciated!
After looking at the log from the running version on the android device (a Moto X phone), I saw that the custom shader was not linking. This appeared to be due to the use of the function round(x), which I replaced with floor(x+0.5) in both cases, and the shader now works on the phone and my desktop properly.
I think the problem is that the versions of GLSL supported on the phone and on my PC are different, but I am not 100% certain about this.
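For reference, round() was only added to the shading language in GLSL ES 3.00 (and desktop GLSL 1.30), so a shader compiled as GLSL ES 1.00 under OpenGL ES 2.0 cannot use it, which would fit the failure in the log. A minimal stand-in could look like this; roundCompat is a made-up name, and note that it always rounds halves up, whereas the built-in leaves the direction for exact .5 values to the implementation.

```glsl
// Hypothetical helper: GLSL ES 1.00 has no round(), so emulate it.
float roundCompat(float x) {
    return floor(x + 0.5);
}

// Usage inside the vertex shader's main():
//     float f = roundCompat(v_frame_i);
```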
I'm attempting to create an alpha radial gradient effect (kind of lighting) using a simple shader.
The effect is created correctly, however the gradient is not smooth.
The precision is set to highp, so I don't really know where to look.
This shader is currently running on Android, using OpenGL ES 2.0.
This is what the gradient currently looks like:
And this is my current shader:
Vertex:
precision highp float;
attribute vec4 vPosition;
attribute vec2 vStaticInterpolation;
varying vec2 interpolator;
void main() {
interpolator = vStaticInterpolation;
gl_Position = vPosition;
}
Fragment:
precision highp float;
uniform float alphaFactor;
varying vec2 interpolator;
float MAX_ALPHA = 0.75;
void main() {
float x = distance(interpolator, vec2(0.0, 0.0));
float alpha = MAX_ALPHA - MAX_ALPHA * x;
alpha = max(alpha, 0.0);
gl_FragColor = vec4(0.925, 0.921, 0.843, alpha);
gl_FragColor.a *= alphaFactor;
}
The shader receives constant attributes for interpolation (from -1.0 to 1.0) in vStaticInterpolation.
The actual color is currently hard-coded in the shader.
It looks to be related to a dithering problem.
This could depend on the OpenGL driver implementation of your mobile device (though I don't know which model you are currently using). In the past it used to be an issue.
Possible tests you could perform are:
Disable OpenGL dithering:
GLES20.glDisable(GLES20.GL_DITHER);
Force an RGB888 surface when you create the surface. This is usually done in the ConfigChooser. From memory, this is the relevant part of my application's code:
new AndroidGL.ConfigChooser(8, 8, 8, 8, depth, stencil) :
I have modified a working grey_scale fragment shader to change all of the non-transparent pixels to purple. For some reason it works great on iOS but on Android the transparent parts of the image are visible (although mostly transparent). Can anybody see what I am doing wrong?
The working grey_scale shader contains the lines that are commented out. I added the last line.
#ifdef GL_ES
precision mediump float;
#endif
varying vec4 v_fragmentColor;
varying vec2 v_texCoord;
void main(void)
{
vec4 c = texture2D(CC_Texture0, v_texCoord);
//gl_FragColor.xyz = vec3(0.2126*c.r + 0.7152*c.g + 0.0722*c.b);
//gl_FragColor.w = c.w;
gl_FragColor = vec4(0.5, 0.0, 0.4, c.w);
}
It turns out that I need to apply the alpha to all of the colors:
gl_FragColor = vec4(0.5*c.w, 0.0*c.w, 0.4*c.w, c.w);
I am not sure why the old method didn't work. The image uses premultiplied alpha, so I guess the cocos renderer assumes it (or was somehow told to use it by TexturePacker). So the shader needs to multiply the color values by the alpha again in order to behave the same way?
A screenshot would be worth a thousand words.
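A possible explanation, assuming the sprite is drawn with the blend function commonly used for premultiplied textures (GL_ONE, GL_ONE_MINUS_SRC_ALPHA; this is an assumption about the renderer state, not something confirmed here): the blend stage then adds the shader's RGB to the background without scaling it by alpha, so a constant purple leaks into pixels whose alpha is zero. The sketch below spells out the arithmetic in comments; CC_Texture0 is taken to be declared by the engine, as in the original shader.

```glsl
#ifdef GL_ES
precision mediump float;
#endif
varying vec2 v_texCoord;
// CC_Texture0 is assumed to be injected by cocos2d-x, as in the question.

void main(void)
{
    vec4 c = texture2D(CC_Texture0, v_texCoord);
    vec3 tint = vec3(0.5, 0.0, 0.4);
    // Assumed blend equation: result = src.rgb + dst.rgb * (1.0 - src.a)
    //   unscaled tint, c.a == 0.0: result = tint + dst  (purple haze appears)
    //   premultiplied, c.a == 0.0: result = 0.0  + dst  (fully transparent)
    gl_FragColor = vec4(tint * c.a, c.a);
}
```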
Are you sure GL_ES is defined? If not, you will have unspecified precision for float (according to specs, it is unspecified for fragment shaders) which can lead even to compilation errors on certain OpenGL ES drivers. I'd play around with that float precision first and see the difference on Android.
I'm not sure about vec3() w/ single parameter. Does the following notation work:
float a = 0.2126*c.r + 0.7152*c.g + 0.0722*c.b;
gl_FragColor.xyz = vec3(a, a, a);
Or this, with a single assignment of gl_FragColor:
float a = 0.2126*c.r + 0.7152*c.g + 0.0722*c.b;
gl_FragColor = vec4(a, a, a, c.w);
On a side note, you may want to declare numeric literals as constants in order not to consume uniform space. More info: Declaring constants instead of literals in vertex shader. Standard practice, or needless rigor?
Like this:
const float COEFF_R = 0.2126;
const float COEFF_G = 0.7152;
const float COEFF_B = 0.0722;
float a = COEFF_R*c.r + COEFF_G*c.g + COEFF_B*c.b;
gl_FragColor = vec4(a, a, a, c.w);
I am using a plasma shader in my Android (libGDX) app, which I found from here:
http://www.bidouille.org/prog/plasma
Here is my shader (slightly modified):
#ifdef GL_ES
#define LOWP lowp
precision mediump float;
#else
#define LOWP
#endif
#define PI 3.1415926535897932384626433832795
uniform float time;
uniform float alpha;
uniform vec2 scale;
void main() {
float v = 0.0;
vec2 c = gl_FragCoord.xy * scale - scale/2.0;
v += sin((c.x+time));
v += sin((c.y+time)/2.0);
v += sin((c.x+c.y+time)/2.0);
c += scale/2.0 * vec2(sin(time/3.0), cos(time/2.0));
v += sin(sqrt(c.x*c.x+c.y*c.y+1.0)+time);
v = v/2.0;
vec3 col = vec3(1, sin(PI*v), cos(PI*v));
gl_FragColor = vec4(col *.5 + .5, alpha);
}
I am rendering it via an ImmediateModeRendererGL20, as a quad.
However, it simply appears to be way too slow for my needs. I am trying to fill almost the whole screen on my Nexus 7 (first gen) with the shader, and I cannot get even close to 60 FPS.
This is really my first real trip onto the GLSL world, and I have no idea how these things usually should perform!
I am wondering how one could optimize this shader? I really don't care about the image quality; I can sacrifice it. I came to the conclusion that some kind of lookup table and/or dropping the resolution of the shader might be what I am after, but I am not quite sure where to begin. I am still very new to GLSL, and "low-level" programming has never really been my cup of tea, but I am eager to at least try!
The ultimate way to speed this up is to pre-calculate the computation-heavy parts (the sin() and cos() calls), bake them into one or more textures, and then read the ready-made values from them. This makes it very fast because the shader no longer performs heavy computations and only consumes pre-calculated values.
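As a rough sketch of the baking idea, the final color step could be replaced by a palette lookup. Everything about u_palette is an assumption made for this example: it is a hypothetical N x 1 texture, filled by the application, whose texel at coordinate u holds vec3(1, sin(2*PI*u), cos(2*PI*u)) * 0.5 + 0.5. The four sin() calls that build v are left as they were; whether the lookup is actually faster than the two trig calls it replaces depends on the GPU.

```glsl
#ifdef GL_ES
precision mediump float;
#endif

uniform float time;
uniform float alpha;
uniform vec2 scale;
// Hypothetical palette texture, see the description above.
uniform sampler2D u_palette;

void main() {
    float v = 0.0;
    vec2 c = gl_FragCoord.xy * scale - scale / 2.0;
    v += sin(c.x + time);
    v += sin((c.y + time) / 2.0);
    v += sin((c.x + c.y + time) / 2.0);
    c += scale / 2.0 * vec2(sin(time / 3.0), cos(time / 2.0));
    v += sin(sqrt(c.x * c.x + c.y * c.y + 1.0) + time);
    v = v / 2.0;
    // sin(PI*v) and cos(PI*v) repeat every 2.0 in v, so map v to u in [0, 1).
    float u = fract(v * 0.5);
    gl_FragColor = vec4(texture2D(u_palette, vec2(u, 0.5)).rgb, alpha);
}
```

With linear filtering, a visible seam where u wraps from 1 back to 0 can probably be avoided by using GL_REPEAT wrapping on a power-of-two palette.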
As was stated in comments, you can optimize performance by moving certain calculations to vertex shader. For example, you can move this line
vec2 c = gl_FragCoord.xy * scale - scale/2.0;
to vertex shader without sacrificing any quality because this is a linear function so interpolation won't distort it.
Do it like this:
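// vertex
...
uniform vec2 scale;
uniform vec2 viewportSize; // assumption: viewport size in pixels, set by the application
varying mediump vec2 c;
const float TWO = 2.0;
...
void main() {
...
gl_Position = ...
// gl_FragCoord only exists in the fragment shader, so rebuild the window-space position here
vec2 fragCoord = (gl_Position.xy / gl_Position.w * 0.5 + 0.5) * viewportSize;
c = fragCoord * scale - scale / TWO;
...
}
// fragment
...
varying mediump vec2 c;
...
void main() {
...
// just use `c` as usual
...
}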
Also, please use constants instead of literals, because literals use your uniform space. While this may not affect performance, it is still bad practice (if you use too many of them you can run out of uniforms on certain GPUs). More on this: Declaring constants instead of literals in vertex shader. Standard practice, or needless rigor?
I have the following code in the Fragment Shader:
precision lowp float;
varying vec2 v_texCoord;
uniform sampler2D s_texture;
uniform bool color_tint;
uniform float color_tint_amount;
uniform vec4 color_tint_color;
void main(){
float gradDistance;
vec4 texColor, gradColor;
texColor = texture2D(s_texture, v_texCoord);
if (color_tint){
gradColor = color_tint_color;
gradColor.a = texColor.a;
texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
}
gl_FragColor = texColor;
}
The code works fine, but interestingly, even when every color_tint value I pass in is false, the above code still causes a serious drop in performance compared to:
void main(){
float gradDistance;
vec4 texColor, gradColor;
texColor = texture2D(s_texture, v_texCoord);
if (false){
gradColor = color_tint_color;
gradColor.a = texColor.a;
texColor = gradColor * color_tint_amount + texColor * (1.0 - color_tint_amount);
}
gl_FragColor = texColor;
}
The latter achieves 40+ fps while the first one manages about 18 fps. I double-checked, and every color_tint passed to the first one is false, so the block should never be executed.
BTW, I am programming the above in Android 2.2 using GLES20.
Does anyone know what's wrong with the shader?
I am not an expert in fragment shaders, but I assume the second one is faster because the entire if statement can be removed at compile time, since it is never true. In the first one, the compiler can't tell that color_tint is always false until runtime, so the shader has to check it and branch every time. Branches can be expensive, especially on graphics hardware, which is often designed for predictable, straight-line execution.
I suggest you try rewriting it to be branchless - Darren's answer has some good suggestions in that direction.
Branches are very slow in fragment shaders; avoid them if possible. Use a color_tint_amount of 0 for no tint. Premultiply the color_tint_color and save a multiply per pixel. Make color_tint_amount = 1.0 - color_tint_amount (so now 1.0 means no gradColor). These shaders run millions upon millions of times a second, so you have to save every cycle you can.
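A minimal branchless sketch of the same tint, assuming the application simply sets color_tint_amount to 0.0 whenever no tint is wanted (so the color_tint uniform is dropped). It uses the built-in mix() and gives the same result as the original formula, alpha included, since the original also copied the texture alpha into gradColor before blending.

```glsl
precision lowp float;
varying vec2 v_texCoord;
uniform sampler2D s_texture;
// 0.0 means "no tint"; this replaces the boolean uniform.
uniform float color_tint_amount;
uniform vec4 color_tint_color;

void main() {
    vec4 texColor = texture2D(s_texture, v_texCoord);
    // mix(a, b, t) = a * (1.0 - t) + b * t; the texture alpha is left untouched.
    texColor.rgb = mix(texColor.rgb, color_tint_color.rgb, color_tint_amount);
    gl_FragColor = texColor;
}
```

The premultiplication suggested above would go one step further by having the application upload color_tint_color already multiplied by the amount, at the cost of changing what the uniforms mean.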