Issue with shader execution flow on PowerVR GPU - Android

I've encountered a problem which I believe is related to how the GLSL compiler on PowerVR GPUs optimizes shaders. On Adreno and Tegra GPUs the fragment shader works just fine, but on PowerVR (Motorola Droid) it produces an incorrect result in a conditional statement.
I've fixed the problem by changing the conditional statement in the fragment shader. Instead of calling return inside the if block, I've added an else block, and it now works correctly on PowerVR.
The logic of both shaders is identical; gl_FragColor is set in both cases.
Please explain this behavior of the PowerVR OpenGL ES driver so I can avoid such problems in the future. Why does it handle conditional statements this way?
Here is the old fragment shader, which works incorrectly on PowerVR GPUs:
precision mediump float;
varying vec3 vNormal;
varying vec3 vViewVec;
varying vec2 vTextureCoord;
uniform sampler2D sTexturePumpkin;
void main(void)
{
    const float sheen = 0.68;
    const float noiseScale = 0.05;
    const float furriness = 10.0;
    const vec4 lightDir = vec4(0.267260, 0.267260, -0.925820, 0.0);
    vec4 color = texture2D(sTexturePumpkin, vTextureCoord/*vec2(0.0,0.0)*/);
    if (vTextureCoord.y > 0.7) { // in this case PowerVR displays incorrect color
        gl_FragColor = color;
        return;
    }
    float diffuse = 0.5 * (1.0 + dot(vNormal, vec3(lightDir.x, lightDir.y, -lightDir.z)));
    float cosView = clamp(dot(normalize(vViewVec), vNormal), 0.0, 1.0);
    float shine = pow(1.0 - cosView * cosView, furriness);
    gl_FragColor = (color + sheen * shine) * diffuse; // in this case PowerVR works correctly
}
The new fragment shader code, which works fine on both Adreno and PowerVR GPUs:
precision mediump float;
varying vec3 vNormal;
varying vec3 vViewVec;
varying vec2 vTextureCoord;
uniform sampler2D sTexturePumpkin;
void main(void)
{
    const float sheen = 0.68;
    const float noiseScale = 0.05;
    const float furriness = 10.0;
    const vec4 lightDir = vec4(0.267260, 0.267260, -0.925820, 0.0);
    vec4 color = texture2D(sTexturePumpkin, vTextureCoord/*vec2(0.0,0.0)*/);
    if (vTextureCoord.y > 0.7) {
        gl_FragColor = color;
    }
    else {
        float diffuse = 0.5 * (1.0 + dot(vNormal, vec3(lightDir.x, lightDir.y, -lightDir.z)));
        float cosView = clamp(dot(normalize(vViewVec), vNormal), 0.0, 1.0);
        float shine = pow(1.0 - cosView * cosView, furriness);
        gl_FragColor = (color + sheen * shine) * diffuse;
    }
}

OK, so after deeper investigation I have found that it's not a bug in the shader compiler but a consequence of how fragment shaders are executed. It is generally a bad idea to write a bare discard; or return; statement with code coming after it: that code can still be executed and cause unpredictable results.
There are some articles that explain this behavior for the discard; statement, and as far as I can see, similar behavior can occur with return; too.
Please read one here: http://people.freedesktop.org/~idr/OpenGL_tutorials/03-fragment-intro.html#infinite-loop
As it says there, in certain cases incorrect usage of discard; can even produce an infinite loop.
The GPU does not execute fragment shaders one fragment at a time; it runs them in parallel, usually in batches of 2x2 pixels. This parallel execution is what can cause the code after return; to run anyway.
So, for a fragment shader to handle if statements correctly, always use an else branch rather than simply exiting the function with return; or discard;. This is exactly what I've done; a minimal sketch of the same guideline applied to discard; follows below.
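For illustration only (a minimal sketch, not code from my project; shouldDrop and someColor are just stand-ins for whatever the real shader computes):

// According to the behavior described above, the write after this block
// may still be executed in unpredictable ways on some GPUs:
if (shouldDrop) {
    discard;
}
gl_FragColor = someColor;

// Safer: keep the alternative path in an explicit else branch.
if (shouldDrop) {
    discard;
} else {
    gl_FragColor = someColor;
}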

Related

opengl-es per-pixel lighting issue

There is a problem that I just can't seem to get a handle on.
I have a fragment shader:
precision mediump float;
uniform vec3 u_AmbientColor;
uniform vec3 u_LightPos;
uniform float u_Attenuation_Constant;
uniform float u_Attenuation_Linear;
uniform float u_Attenuation_Quadradic;
uniform vec3 u_LightColor;
varying vec3 v_Normal;
varying vec3 v_fragPos;
vec4 fix(vec3 v);
void main() {
    vec3 color = vec3(1.0,1.0,1.0);
    vec3 vectorToLight = u_LightPos - v_fragPos;
    float distance = length(vectorToLight);
    vec3 direction = vectorToLight / distance;
    float attenuation = 1.0/(u_Attenuation_Constant +
        u_Attenuation_Linear * distance + u_Attenuation_Quadradic * distance * distance);
    vec3 diffuse = u_LightColor * attenuation * max(normalize(v_Normal) * direction,0.0);
    vec3 d = u_AmbientColor + diffuse;
    gl_FragColor = fix(color * d);
}
vec4 fix(vec3 v){
    float r = min(1.0,max(0.0,v.r));
    float g = min(1.0,max(0.0,v.g));
    float b = min(1.0,max(0.0,v.b));
    return vec4(r,g,b,1.0);
}
I've been following a tutorial I found on the web.
Anyway, the ambientColor and lightColor uniforms are (0.2,0.2,0.2) and (1.0,1.0,1.0) respectively. The v_Normal is calculated in the vertex shader using the inverse-transpose of the model-view matrix.
The v_fragPos is the result of multiplying the position by the plain model-view matrix.
Now, I expect that when I move the light position closer to the cube I render, it will just appear brighter, but the resulting image is very different:
(the little square there is an indicator for the light position)
Now, I just don't understand how this can happen.
I mean, I multiply each of the color components by the SAME value,
so how is it that the result seems to vary so much?
EDIT: I noticed that if I move the camera in front of the cube, the light is just shades of blue. It's the same problem, but maybe it's a clue; I don't know.
The Lambertian reflectance has to be computed with the dot product of the normal vector and the vector to the light source, not with the component-wise product.
See How does the calculation of the light model work in a shader program?
Use the dot function instead of the * (multiplication) operator. Change
vec3 diffuse = u_LightColor * attenuation * max(normalize(v_Normal) * direction,0.0);
to
vec3 diffuse = u_LightColor * attenuation * max(dot(normalize(v_Normal), direction), 0.0);
You can also simplify the code in the fix function: min and max can be replaced with clamp. These functions work component-wise, so they do not have to be called separately for each component:
vec4 fix(vec3 v)
{
    return vec4(clamp(v, 0.0, 1.0), 1.0);
}
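Putting both changes together, the corrected fragment shader might look like this (a sketch assembled from the code above; none of the uniform or varying names are changed):

precision mediump float;
uniform vec3 u_AmbientColor;
uniform vec3 u_LightPos;
uniform float u_Attenuation_Constant;
uniform float u_Attenuation_Linear;
uniform float u_Attenuation_Quadradic;
uniform vec3 u_LightColor;
varying vec3 v_Normal;
varying vec3 v_fragPos;
void main() {
    vec3 color = vec3(1.0, 1.0, 1.0);
    vec3 vectorToLight = u_LightPos - v_fragPos;
    float distance = length(vectorToLight);
    vec3 direction = vectorToLight / distance;
    float attenuation = 1.0 / (u_Attenuation_Constant +
        u_Attenuation_Linear * distance + u_Attenuation_Quadradic * distance * distance);
    // Lambertian term: dot product of the unit normal and the light direction.
    float lambert = max(dot(normalize(v_Normal), direction), 0.0);
    vec3 diffuse = u_LightColor * attenuation * lambert;
    vec3 d = u_AmbientColor + diffuse;
    // clamp is component-wise, so it replaces the per-channel min/max in fix().
    gl_FragColor = vec4(clamp(color * d, 0.0, 1.0), 1.0);
}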

Anisotropic lighting in OpenGL ES 2.0/3.0. Black artifacts

I am trying to implement anisotropic lighting.
Vertex shader:
#version 300 es
uniform mat4 u_mvMatrix;
uniform mat4 u_vMatrix;
in vec4 a_position;
in vec3 a_normal;
...
out lowp float v_DiffuseIntensity;
out lowp float v_SpecularIntensity;
const vec3 lightPosition = vec3(-1.0, 0.0, 5.0);
const lowp vec3 grainDirection = vec3(15.0, 2.8, -1.0);
const vec3 eye_position = vec3(0.0, 0.0, 0.0);
void main() {
    // transform normal orientation into eye space
    vec3 modelViewNormal = mat3(u_mvMatrix) * a_normal;
    vec3 modelViewVertex = vec3(u_mvMatrix * a_position);
    vec3 lightVector = normalize(lightPosition - modelViewVertex);
    lightVector = mat3(u_vMatrix) * lightVector;
    vec3 normalGrain = cross(modelViewNormal, grainDirection);
    vec3 tangent = normalize(cross(normalGrain, modelViewNormal));
    float LdotT = dot(tangent, normalize(lightVector));
    float VdotT = dot(tangent, normalize(mat3(u_mvMatrix) * eye_position));
    float NdotL = sqrt(1.0 - pow(LdotT, 2.0));
    float VdotR = NdotL * sqrt(1.0 - pow(VdotT, 2.0)) - VdotT * LdotT;
    v_DiffuseIntensity = max(NdotL * 0.4 + 0.6, 0.0);
    v_SpecularIntensity = max(pow(VdotR, 2.0) * 0.9, 0.0);
    ...
}
Fragment shader:
...
in lowp float v_DiffuseIntensity;
in lowp float v_SpecularIntensity;
const lowp vec3 default_color = vec3(0.1, 0.7, 0.9);
void main() {
    ...
    lowp vec3 resultColor = (default_color * v_DiffuseIntensity)
        + v_SpecularIntensity;
    outColor = vec4(resultColor, 1.0);
}
Overall, the lighting works well on different devices. But an artifact appears on the SAMSUNG tablet, as shown in the figure:
It seems that the darkest place is becoming completely black. Can anyone please suggest why this is happening? Thanks for any answer/comment!
You've got a couple of expressions that risk undefined behaviour:
sqrt(1.0 - pow(LdotT, 2.0))
sqrt(1.0 - pow(VdotT, 2.0))
The pow(x, y) function is undefined if x is negative. I suspect you're getting away with this because y is 2.0, so the calls are probably optimised to just x * x.
The sqrt(x) function is undefined if x is negative. Mathematically that should never happen here, since the magnitude of the dot product of two normalized vectors can never exceed 1, but computations always carry some error. I think this is what is causing your rendering artifacts.
I'd change those two expressions to:
sqrt(max(0.0, 1.0 - pow(max(0.0, LdotT), 2.0)))
sqrt(max(0.0, 1.0 - pow(max(0.0, VdotT), 2.0)))
The code looks a lot uglier, but it's safer and max(0.0, x) is a pretty cheap operation.
Edit: Just noticed pow(VdotR, 2.0), I'd change that too.
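Applied to the vertex shader above, the guarded version of those lines might look like this (a sketch only; nothing else changes, and the guard on pow(VdotR, 2.0) is my reading of the edit above):

float LdotT = dot(tangent, normalize(lightVector));
float VdotT = dot(tangent, normalize(mat3(u_mvMatrix) * eye_position));
// Clamp the inputs so neither pow nor sqrt ever sees a negative argument.
float NdotL = sqrt(max(0.0, 1.0 - pow(max(0.0, LdotT), 2.0)));
float VdotR = NdotL * sqrt(max(0.0, 1.0 - pow(max(0.0, VdotT), 2.0))) - VdotT * LdotT;
v_DiffuseIntensity = max(NdotL * 0.4 + 0.6, 0.0);
v_SpecularIntensity = max(pow(max(0.0, VdotR), 2.0) * 0.9, 0.0);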

After calling glLinkProgram the app freezes

UPDATED
I'm trying to draw a texture with OpenGL ES 3 and I use instanced drawing in my drawing application. This is my vertex shader:
#version 300 es
precision highp float;
uniform mat3 u_Matrix;
in vec2 Position;
in vec2 TexPosition;
struct Data {
    vec2 pos;
    vec2 scale;
    uint color;
    float rotation;
};
layout(std140) uniform InstanceData {
    Data data[256];
};
out vec4 v_Color;
out vec2 v_TexPosition;
void main() {
    vec2 endPos = Position * data[gl_InstanceID].scale;
    if (data[gl_InstanceID].rotation != 0.0) {
        float cos = cos(data[gl_InstanceID].rotation);
        float sin = sin(data[gl_InstanceID].rotation);
        endPos = vec2(endPos.x * cos - endPos.y * sin,
                      endPos.x * sin + endPos.y * cos) + data[gl_InstanceID].pos;
    } else {
        endPos = endPos + data[gl_InstanceID].pos;
    }
    uint fColor = data[gl_InstanceID].color;
    v_Color.r = float((fColor & 0x00FF0000U) >> 16) / 255.0;
    v_Color.g = float((fColor & 0x0000FF00U) >> 8) / 255.0;
    v_Color.b = float(fColor & 0x000000FFU) / 255.0;
    v_Color.a = float((fColor & 0xFF000000U) >> 24) / 255.0;
    v_TexPosition = TexPosition;
    gl_Position = vec4(u_Matrix * vec3(endPos, 1.0), 1.0);
}
and this is my fragment shader:
#version 300 es
precision highp float;
in vec2 v_TexPosition;
in vec4 v_Color;
uniform sampler2D u_Texture2D;
out vec4 fragColor;
void main() {
    vec4 color = texture(u_Texture2D, v_TexPosition);
    fragColor = vec4(v_Color.rgb, color.a * v_Color.a);
}
When I create the program, attach the shaders and try to link the program, the app freezes on the glLinkProgram call. The shaders and the program have valid ids.
This works normally on some devices (Sony Xperia Z - Android 5.0, Samsung S7 Edge - Android 7, Nexus 5X - Android 7, Nexus 6P - Android 7) but it doesn't work on others (Moto X - Android 5.1, Samsung S5 - Android 6.0). All devices run Android 5.0 or later, and in code I check for OpenGL ES 3 support.
Is there some reason for this? Is it device-specific (and how would I check for that), or did I do something wrong?
I pass data to instanceBuffer this way:
instanceBuffer.putFloat(posX);
instanceBuffer.putFloat(posY);
instanceBuffer.putFloat(scaleX);
instanceBuffer.putFloat(scaleY);
instanceBuffer.putInt(colorARGB);
instanceBuffer.putFloat((float) Math.toRadians(rotate));
instanceBuffer.position(instanceBuffer.position() + 8);
I used a +8 byte offset because under std140 the struct's array stride is rounded up to a multiple of vec4 (16 bytes).
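For reference, this is how I understand the std140 layout of that struct (these offsets are my own reading of the std140 rules, not values I queried from the driver):

struct Data {          // std140 offsets as I understand them:
    vec2 pos;          // offset  0 (8 bytes)
    vec2 scale;        // offset  8 (8 bytes)
    uint color;        // offset 16 (4 bytes)
    float rotation;    // offset 20 (4 bytes)
};                     // size rounded up to a multiple of 16 -> 32 bytes per element,
                       // i.e. the 24 bytes written above plus the +8 of padding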
When I write my struct with only one vec4:
struct Data {
    vec4 posAndScale;
};
and pass data:
instanceBuffer.putFloat(posX);
instanceBuffer.putFloat(posY);
instanceBuffer.putFloat(scaleX);
instanceBuffer.putFloat(scaleY);
this works on all devices.
But when I add one more vec4:
struct Data {
    vec4 posAndScale;
    vec4 color;
};
and pass the data:
instanceBuffer.putFloat(posX);
instanceBuffer.putFloat(posY);
instanceBuffer.putFloat(scaleX);
instanceBuffer.putFloat(scaleY);
instanceBuffer.putFloat(color.r);
instanceBuffer.putFloat(color.g);
instanceBuffer.putFloat(color.b);
instanceBuffer.putFloat(color.a);
the app no longer freezes, but nothing happens when I try to draw. It seems like std140 works differently on some devices, or like the data is not passed to the shader when the struct is written with two vec4s.
OK, I found something of a solution. This works normally since OpenGL ES version 3.1. I think version 3.0 doesn't support struct data which contains a float or int element.
I experienced the same issue. The Nexus 7 (2013) was freezing when I called glLinkProgram(). I found that this only happened when I had if statements in my shader. I was able to change both of my if statements into conditional operators and it worked.
E.g. (cond) ? valueIfTrue : valueIfFalse
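Applied to the rotation branch of the vertex shader above, that rewrite could look roughly like this (a sketch only; I'm assuming the rest of main() stays the same):

vec2 endPos = Position * data[gl_InstanceID].scale;
float angle = data[gl_InstanceID].rotation;
float c = cos(angle);
float s = sin(angle);
// Conditional operator instead of the if/else block.
endPos = (angle != 0.0)
    ? vec2(endPos.x * c - endPos.y * s, endPos.x * s + endPos.y * c) + data[gl_InstanceID].pos
    : endPos + data[gl_InstanceID].pos;

Note that this evaluates cos and sin even when the rotation is zero, which is usually a fair trade in a vertex shader.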

Bilinear filter on Android using OpenGL ES 2.0

Here is my code. It runs well on PC/Windows, but looks jagged on Android 4.4.2 when I magnify the image.
#ifdef GL_ES
precision highp float;
#endif
varying vec4 v_fragmentColor;
varying vec2 v_texCoord;
uniform float u_width; //width of image
uniform float u_height; //height of image
void main()
{
    float texelSizeX = 1.0/u_width;
    float texelSizeY = 1.0/u_height;
    //four pixels' color
    vec4 p0q0 = texture2D(CC_Texture0, v_texCoord);
    vec4 p1q0 = texture2D(CC_Texture0, v_texCoord + vec2(texelSizeX, 0));
    vec4 p0q1 = texture2D(CC_Texture0, v_texCoord + vec2(0, texelSizeY));
    vec4 p1q1 = texture2D(CC_Texture0, v_texCoord + vec2(texelSizeX, texelSizeY));
    //bilinear interpolation
    float a = fract(v_texCoord.s * u_width);
    float b = fract(v_texCoord.t * u_height);
    vec4 color_q0 = mix(p0q0, p1q0, a);
    vec4 color_q1 = mix(p0q1, p1q1, a);
    vec4 color = mix(color_q0, color_q1, b);
    gl_FragColor = v_fragmentColor * color;
}
I'm sorry that I cannot post pictures. I debugged the code with VS2012, and the image looks smooth there.
But when I run the program on Android, the image is full of jaggies. I don't know why.
Obvious question: why are you doing bilinear filtering in your shader and not just using the built-in hardware bilinear filtering? I'm sure there's a good reason, but telling us that might help you avoid a lot of questions along the lines of "have you set your filtering mode appropriately?"
That being said, it's likely to be a precision problem. You probably want to round v_texCoord to the exact sampling site. I'd guess that you have GL_NEAREST filtering set to disable the hardware bilinear filtering, but due to precision problems v_texCoord + vec2(texelSizeX, 0) may then sample the same texel rather than the next one along when v_texCoord is close to 0, or the sample taken at v_texCoord may be the next texel along when it's close to 1, or something along those lines.
OpenGL considers the centre of a texel to be its location. So if you were in 1d you could do something like:
r_texCoord.x = v_texCoord.x - mod(v_texCoord.x, 1.0/u_width) + 0.5/u_width;
Or if you were happy to use integral texture coordinates rather than the normal OpenGL [0.0, 1.0) range then you could simplify slightly because floor (and indeed ceil) already knows how to move you to an integral boundary:
(floor(v_texCoord.x) + 0.5) / u_width
... both of which are dependent reads so performance will suffer quite a bit.
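As a rough sketch of how the first variant could slot into the shader above (this is my reading of the suggestion, not tested code; the snapped coordinate drives the fetches while the weights still come from the original coordinate):

// Snap v_texCoord to the centre of its texel so the +texelSize offsets
// reliably land on the neighbouring texels.
vec2 texelSize = vec2(1.0 / u_width, 1.0 / u_height);
vec2 snapped = v_texCoord - mod(v_texCoord, texelSize) + 0.5 * texelSize;
vec4 p0q0 = texture2D(CC_Texture0, snapped);
vec4 p1q0 = texture2D(CC_Texture0, snapped + vec2(texelSize.x, 0.0));
vec4 p0q1 = texture2D(CC_Texture0, snapped + vec2(0.0, texelSize.y));
vec4 p1q1 = texture2D(CC_Texture0, snapped + vec2(texelSize.x, texelSize.y));
// Interpolation weights are still derived from the unsnapped coordinate.
float a = fract(v_texCoord.s * u_width);
float b = fract(v_texCoord.t * u_height);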

Very slow fract operation on Galaxy SII and SIII

My terrain uses a shader which itself uses four different textures. It runs fine on Windows and Linux machines, but on Android it gets only ~25 FPS on both Galaxies. I thought the textures were the problem, but no; as it turns out, the problem is the part where I divide the texture coordinates and use fract to get tiled coordinates. Without it, I get 60 FPS.
// Material data.
//uniform vec3 uAmbient;
//uniform vec3 uDiffuse;
//uniform vec3 uLightPos[8];
//uniform vec3 uEyePos;
//uniform vec3 uFogColor;
uniform sampler2D terrain_blend;
uniform sampler2D grass;
uniform sampler2D rock;
uniform sampler2D dirt;
varying vec2 varTexCoords;
//varying vec3 varEyeNormal;
//varying float varFogWeight;
//------------------------------------------------------------------
// Name: fog
// Desc: applies calculated fog weight to fog color and mixes with
// specified color.
//------------------------------------------------------------------
//vec4 fog(vec4 color) {
// return mix(color, vec4(uFogColor, 1.0), varFogWeight);
//}
void main(void)
{
    /*vec3 N = normalize(varEyeNormal);
    vec3 L = normalize(uLightPos[0]);
    vec3 H = normalize(L + normalize(uEyePos));
    float df = max(0.0, dot(N, L));
    vec3 col = uAmbient + uDiffuse * df;*/
    // Take color information from textures and tile them.
    vec2 tiledCoords = varTexCoords;
    //vec2 tiledCoords = fract(varTexCoords / 0.05); // <========= HERE!!!!!!!!!
    //vec4 colGrass = texture2D(grass, tiledCoords);
    vec4 colGrass = texture2D(grass, tiledCoords);
    //vec4 colDirt = texture2D(dirt, tiledCoords);
    vec4 colDirt = texture2D(dirt, tiledCoords);
    //vec4 colRock = texture2D(rock, tiledCoords);
    vec4 colRock = texture2D(rock, tiledCoords);
    // Take color information from the non-tiled blend map.
    vec4 colBlend = texture2D(terrain_blend, varTexCoords);
    // Find the inverse of the sum of all the blend weights.
    float inverse = 1.0 / (colBlend.r + colBlend.g + colBlend.b);
    // Scale each color by its corresponding weight.
    colGrass *= colBlend.r * inverse;
    colDirt *= colBlend.g * inverse;
    colRock *= colBlend.b * inverse;
    vec4 final = colGrass + colDirt + colRock;
    //final = fog(final);
    gl_FragColor = final;
}
Note: there's some more code for light calculation and fog, but it isn't used. I indicated the line that, when uncommented, causes massive lag. I tried using floor and calculating fractional part manually, but lag is the same. What might be wrong?
EDIT: Now here's what I don't understand.
This:
vec2 tiledCoords = fract(varTexCoords * 2.0);
Runs great.
This:
vec2 tiledCoords = fract(varTexCoords * 10.0);
Runs average on SIII.
This:
vec2 tiledCoords = fract(varTexCoords * 20.0);
Lags...
This:
vec2 tiledCoords = fract(varTexCoords * 100.0);
Well 5FPS is still better than I expected...
So what gives? Why is this happening? To my understanding this shouldn't make any difference. But it does. And a huge one.
I would run your code through a profiler (look at the Mali-400 tools), but by the looks of it, you are killing the texture cache. For the first pixel computed, all four texture look-ups are fetched, and the contiguous data is also pulled into the texture cache. For the next pixel, you are no longer accessing data that is in the cache but looking quite far away (10, 20, etc. texels), which completely defeats the purpose of such a cache.
This is of course a guess; without proper profiling it is hard to tell.
EDIT: #harism also pointed you in that direction.
