I have a fragment shader for skin blurring (smoothing). The input is YUV data split across three textures (a Y texture, a U texture, and a V texture). With a frame resolution of 1280x720 running on a Galaxy A6, it takes 80-120 ms per frame.
I found the bottleneck at the greenValue() function call. It takes most of the time when called as in the code below. If I only call "sampleColor = greenValue(blurCoordinates[0]);\n" ... (instead of "sampleColor += greenValue(blurCoordinates[0]);\n" ...), it runs very fast.
+ "float greenValue(vec2 coord)\n"
+ "{\n"
+ " return texture2D(y_tex, coord).r - 0.344 * (texture2D(u_tex, coord).r - 0.5) - 0.714 * (texture2D(v_tex, coord).r - 0.5);\n"
+ "}\n";
+ "// some code .... if (current pixel is skin color)"
+ "vec2 blurCoordinates[20];\n"
+ "blurCoordinates[0] = interp_tc.xy + singleStepOffset * vec2(0.0, -10.0);\n" +
+ "blurCoordinates[1] = interp_tc.xy + singleStepOffset * vec2(0.0, 10.0);\n" +
+ "blurCoordinates[2] = interp_tc.xy + singleStepOffset * vec2(-10.0, 0.0);\n" +
+ "blurCoordinates[3] = interp_tc.xy + singleStepOffset * vec2(10.0, 0.0);\n" +
+ "blurCoordinates[4] = interp_tc.xy + singleStepOffset * vec2(5.0, -8.0);\n" +
+ "blurCoordinates[5] = interp_tc.xy + singleStepOffset * vec2(5.0, 8.0);\n" +
+ "blurCoordinates[6] = interp_tc.xy + singleStepOffset * vec2(-5.0, 8.0);\n" +
+ "blurCoordinates[7] = interp_tc.xy + singleStepOffset * vec2(-5.0, -8.0);\n" +
+ "blurCoordinates[8] = interp_tc.xy + singleStepOffset * vec2(8.0, -5.0);\n" +
+ "blurCoordinates[9] = interp_tc.xy + singleStepOffset * vec2(8.0, 5.0);\n" +
+ "blurCoordinates[10] = interp_tc.xy + singleStepOffset * vec2(-8.0, 5.0);\n" +
+ "blurCoordinates[11] = interp_tc.xy + singleStepOffset * vec2(-8.0, -5.0);\n" +
+ "blurCoordinates[12] = interp_tc.xy + singleStepOffset * vec2(0.0, -6.0);\n" +
+ "blurCoordinates[13] = interp_tc.xy + singleStepOffset * vec2(0.0, 6.0);\n" +
+ "blurCoordinates[14] = interp_tc.xy + singleStepOffset * vec2(6.0, 0.0);\n" +
+ "blurCoordinates[15] = interp_tc.xy + singleStepOffset * vec2(-6.0, 0.0);\n" +
+ "blurCoordinates[16] = interp_tc.xy + singleStepOffset * vec2(-4.0, -4.0);\n" +
+ "blurCoordinates[17] = interp_tc.xy + singleStepOffset * vec2(-4.0, 4.0);\n" +
+ "blurCoordinates[18] = interp_tc.xy + singleStepOffset * vec2(4.0, -4.0);\n" +
+ "blurCoordinates[19] = interp_tc.xy + singleStepOffset * vec2(4.0, 4.0);\n";
+ "// some code ...."
+ "float sampleColor = centralColor.g * 20.0;\n"
+ "sampleColor += greenValue(blurCoordinates[0]);\n"
+ "sampleColor += greenValue(blurCoordinates[1]);\n"
+ "sampleColor += greenValue(blurCoordinates[2]);\n"
+ "sampleColor += greenValue(blurCoordinates[3]);\n"
+ "sampleColor += greenValue(blurCoordinates[4]);\n"
+ "sampleColor += greenValue(blurCoordinates[5]);\n"
+ "sampleColor += greenValue(blurCoordinates[6]);\n"
+ "sampleColor += greenValue(blurCoordinates[7]);\n"
+ "sampleColor += greenValue(blurCoordinates[8]);\n"
+ "sampleColor += greenValue(blurCoordinates[9]);\n"
+ "sampleColor += greenValue(blurCoordinates[10]);\n"
+ "sampleColor += greenValue(blurCoordinates[11]);\n"
+ "sampleColor += greenValue(blurCoordinates[12]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[13]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[14]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[15]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[16]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[17]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[18]) * 2.0;\n"
+ "sampleColor += greenValue(blurCoordinates[19]) * 2.0;\n";
Is there a better solution to this problem, or any way to optimize this shader?
Update
If the frame resolution is 1280x720, the number of texture2D() calls is 1280 x 720 x 20 x 3 = 55,296,000 calls per frame.
If I replace u_tex and v_tex with y_tex (as a test), it's very fast. So maybe accessing three different textures is the bottleneck.
You're doing 60 texture samples per pixel on an entry-level phone! Yes, it's going to be slow.
Not sure which device variant you have, but assuming it's the Mali-T830 MP1 one, it can do 1 texture sample per clock cycle. Assuming it runs at around 550 MHz for the sake of argument and easy maths, then:
55M samples per frame / 550M samples per second ≈ 100 ms per frame
... which is about what you are seeing. This shader is just far too complex for this device.
Flipping this around, you can set a budget based on what you want to achieve, e.g. at 60 FPS:
550M / 60 = 9.16M cycles/frame.
9.16M / (1280 * 720) = 9.95 samples per pixel.
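The same arithmetic, as a small reusable sketch; the clock rate and the 1-sample-per-cycle throughput are assumptions carried over from above, not measured values:
// Budget estimate sketch; clockHz and samplesPerCycle are assumptions
// (550 MHz and 1 texture sample per cycle for a Mali-T830 MP1, as above).
static double textureSamplesPerPixelBudget(long clockHz, double samplesPerCycle,
                                           int targetFps, int width, int height) {
    double cyclesPerFrame = (double) clockHz / targetFps;        // 550M / 60 ≈ 9.16M
    return cyclesPerFrame * samplesPerCycle / ((double) width * height);
}
// textureSamplesPerPixelBudget(550_000_000L, 1.0, 60, 1280, 720) ≈ 9.9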
For YUV data, try to pull it directly from an external sampler. That way you can sample directly from the original YUV data, color conversion included; on a newer device this will be much faster than hand-rolling your own sampling code, as most GPUs can sample YUV natively. (Note: it won't help much on a Mali-T830, though.)
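As a rough sketch of that path (assumed names: video_tex for the sampler uniform, videoTexId for the GL texture backing an Android SurfaceTexture, program for the linked program; the varying interp_tc is reused from the shader above):
// Fragment shader sampling the decoder/camera output directly; the driver does
// the YUV -> RGB conversion inside the sampler, so one texture2D() call replaces
// the three per-tap fetches of greenValue().
String fragmentShader =
        "#extension GL_OES_EGL_image_external : require\n"
      + "precision mediump float;\n"
      + "varying vec2 interp_tc;\n"
      + "uniform samplerExternalOES video_tex;\n"
      + "void main() {\n"
      + "  gl_FragColor = texture2D(video_tex, interp_tc);\n"
      + "}\n";

// Host side: bind the external texture before drawing.
GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, videoTexId);
GLES20.glUniform1i(GLES20.glGetUniformLocation(program, "video_tex"), 0);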
Is there a better solution to this problem, or any way to optimize this shader?
Learn to use a profiler - it helps avoid guesswork.
For Mali GPUs, see the Streamline profiler in Arm Mobile Studio (https://developer.arm.com/mobile-studio).
If I only call "sampleColor = greenValue(blurCoordinates[0]);\n" ... (instead of "sampleColor += greenValue(blurCoordinates[0]);\n" ...), it runs very fast.
Yes, the compiler will optimize out 19 of the 20 samples, so you're doing 5% of the work and would expect it to be roughly 20x faster.
If I replace u_tex and v_tex with y_tex (as a test), it's very fast.
This makes the three texture operations identical, so you remove 2 of the 3 texture samples and the compiler can merge a lot of the maths.
Related
I'm trying to apply a Gaussian blur to a live video preview and then add a dilation effect to it.
I've tried combining the code, but it doesn't seem to work right:
it's just doing the dilation effect, and I don't see the blur.
// Gaussian Blur code
+" vec2 singleStepOffset = vec2("+UNIFORM_TEXELWIDTH+", "+UNIFORM_TEXELHEIGHT+");\n"
+" int multiplier = 0;\n"
+" vec2 blurStep = vec2(0,0);\n"
+" vec2 blurCoordinates[9];"
+" for(int i = 0; i < 9; i++) {\n"
+" multiplier = (i - 4);\n"
+" blurStep = float(multiplier) * singleStepOffset;\n"
+" blurCoordinates[i] = "+VARYING_TEXCOORD+".xy + blurStep;\n"
+" }\n"
+" vec3 sum = vec3(0,0,0);\n"
+" vec4 color = texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[4]);\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[0]).rgb * 0.05;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[1]).rgb * 0.09;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[2]).rgb * 0.12;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[3]).rgb * 0.15;\n"
+" sum += color.rgb * 0.18;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[5]).rgb * 0.15;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[6]).rgb * 0.12;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[7]).rgb * 0.09;\n"
+" sum += texture2D("+UNIFORM_TEXTURE0+", blurCoordinates[8]).rgb * 0.05;\n"
+" gl_FragColor = vec4(sum, color.a);\n"
// Dilation Effect
+" vec2 step = vec2("+UNIFORM_TEXELWIDTH+", "+UNIFORM_TEXELHEIGHT+");\n"
+" vec4 stepIntensity[dilationSize];\n"
+" for(int i = 0; i < dilationSize; i++) {\n"
+" stepIntensity[i] = texture2D("+UNIFORM_TEXTURE0+", "+VARYING_TEXCOORD+" + step * float(i - dilationRadius));\n"
+" }\n"
+" vec4 minValue = vec4(1.0);\n"
+" for(int i = 0; i < dilationSize; i++) {\n"
+" minValue = min(minValue, stepIntensity[i]);\n"
+" }\n"
+"gl_FragColor = minValue;\n"
I expected it to blur the image and then dilate it, but it just dilates the image.
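As written, the dilation loop samples UNIFORM_TEXTURE0 again and the final gl_FragColor assignment overwrites the blur result, which is why only the dilation shows. One way to get both is two passes: render the blur into an offscreen texture, then run the dilation shader with that texture as its input. A minimal host-side sketch, where fboId, fboTexId, blurProgram, dilateProgram, width, height, and drawQuad() are placeholder names, not from the original code:
// Pass 0: create an FBO backed by a texture the same size as the preview.
int[] fboId = new int[1], fboTexId = new int[1];
GLES20.glGenFramebuffers(1, fboId, 0);
GLES20.glGenTextures(1, fboTexId, 0);
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, fboTexId[0]);
GLES20.glTexImage2D(GLES20.GL_TEXTURE_2D, 0, GLES20.GL_RGBA, width, height, 0,
        GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, null);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_S, GLES20.GL_CLAMP_TO_EDGE);
GLES20.glTexParameteri(GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_T, GLES20.GL_CLAMP_TO_EDGE);
GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, fboId[0]);
GLES20.glFramebufferTexture2D(GLES20.GL_FRAMEBUFFER, GLES20.GL_COLOR_ATTACHMENT0,
        GLES20.GL_TEXTURE_2D, fboTexId[0], 0);

// Pass 1: blur the camera frame into the FBO.
GLES20.glUseProgram(blurProgram);
drawQuad();    // with the camera frame bound as UNIFORM_TEXTURE0

// Pass 2: dilate the blurred result to the default framebuffer.
GLES20.glBindFramebuffer(GLES20.GL_FRAMEBUFFER, 0);
GLES20.glUseProgram(dilateProgram);
GLES20.glBindTexture(GLES20.GL_TEXTURE_2D, fboTexId[0]);    // blurred frame as UNIFORM_TEXTURE0
drawQuad();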
Here I am trying to create a video editor with top and bottom text. I can set two texts, top and bottom, with different sizes, colors, and fonts; I convert each text to an image and add it as a watermark.
But when the resolution of the video decreases, the size of the watermark image increases, and vice versa. I am using ffmpeg commands. Currently I am calculating it as shown below.
if (video_width <= 300) {
Log.e("less than", "300");
water_resolution_bt = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_bt_width / 2 + 30) + ":" + getDPsFromPixels(getApplicationContext(), tv_bt_height / 2 + 15));
water_resolution = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_top_width / 2 + 30) + ":" + getDPsFromPixels(getApplicationContext(), tv_top_height / 2 + 15));
} else if (video_width > 300 && video_width <= 400) {
Log.e("less than", "400");
water_resolution_bt = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_bt_width / 2 + 130) + ":" + getDPsFromPixels(getApplicationContext(), tv_bt_height / 2 + 70));
water_resolution = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_top_width / 2 + 130) + ":" + getDPsFromPixels(getApplicationContext(), tv_top_height / 2 + 70));
} else if (video_width > 400 && video_width <= 600) {
Log.e("btw ", "400 and 600");
water_resolution_bt = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_bt_width + 80) + ":" + getDPsFromPixels(getApplicationContext(), tv_bt_height + 40));
water_resolution = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_top_width + 80) + ":" + getDPsFromPixels(getApplicationContext(), tv_top_height + 40));
} else if (video_width > 600 && video_width <= 1000) {
Log.e("btw ", "600 and 1000");
water_resolution_bt = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_bt_width + 100) + ":" + getDPsFromPixels(getApplicationContext(), tv_bt_height + 100));
water_resolution = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_top_width + 100) + ":" + getDPsFromPixels(getApplicationContext(), tv_top_height + 100));
} else if (video_width > 1000) {
Log.e("grthr than ", "1000");
water_resolution = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_top_width * 2 + 20) + ":" + getDPsFromPixels(getApplicationContext(), tv_top_height * 2 + 20));
water_resolution_bt = String.valueOf(getDPsFromPixels(getApplicationContext(), tv_bt_width * 2 + 20) + ":" + getDPsFromPixels(getApplicationContext(), tv_bt_height * 2 + 20));
}
This does not give accurate results. Can anyone suggest another calculation method? I have also tried the "drawtext" method; it has the same issue.
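One alternative, sketched below, is to size the watermark as a fixed fraction of the video width and keep the text image's own aspect ratio, so it scales with any resolution; the 0.30 fraction is an assumed design choice and the variable names reuse the ones from the code above.
// A minimal sketch (the 0.30 fraction is an assumption, not from the original code):
// make the watermark a fixed fraction of the video width and derive its height
// from the text image's own aspect ratio, so it shrinks and grows with the video.
float fractionOfWidth = 0.30f;
int waterWidth  = Math.round(video_width * fractionOfWidth);
int waterHeight = Math.round(waterWidth * ((float) tv_top_height / tv_top_width));
String water_resolution = waterWidth + ":" + waterHeight;
// Repeat with tv_bt_width / tv_bt_height for water_resolution_bt.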
My code is
vec4 textureColor = texture2D(uTextureSampler, vTextureCoord);
if(textureColor.r* 0.299 + textureColor.g * 0.587 + textureColor.b * 0.114 < 0.1) {
gl_FragColor = vec4(0.0, 0.0, 0.0, 0.0);
} else {
gl_FragColor = vec4(textureColor.r, textureColor.g, textureColor.b, textureColor.w);
}
My problem is how to judge whether a pixel is black. How can I do that? Should I convert RGB to HSV?
return "precision mediump float; \n"+
" varying highp vec2 " + VARYING_TEXTURE_COORD + ";\n" +
" \n" +
" uniform sampler2D " + TEXTURE_SAMPLER_UNIFORM + ";\n" +
" \n" +
" void main()\n" +
" {\n" +
" vec3 keying_color = vec3(0.0, 0.0, 0.0);\n" +
" float thresh = 0.45; // [0, 1.732]\n" +
" float slope = 0.1; // [0, 1]\n" +
" vec3 input_color = texture2D(" + TEXTURE_SAMPLER_UNIFORM + ", " + VARYING_TEXTURE_COORD + ").rgb;\n" +
" float d = abs(length(abs(keying_color.rgb - input_color.rgb)));\n" +
" float edge0 = thresh * (1.0 - slope);\n" +
" float alpha = smoothstep(edge0, thresh, d);\n" +
" gl_FragColor = vec4(input_color,alpha);\n" +
" }";
The keying_color variable stores the actual color we want to replace. It uses the classic RGB model, but intensity is not expressed as a 0-255 integer; it is a float value in the range 0-1 (so 0 = 0.0, 255 = 1.0, 122 ≈ 0.478…). In our case, the green color has the value (0.647, 0.941, 0.29), but if you are using a different video, measure the color yourself.
Note: Make sure you have the right color. Some color measurement software automatically converts colors to slightly different formats, such as AdobeRGB.
So where’s the magic?
We load the current pixel color into input_color, then calculate the difference between the input and the keying color. Based on this difference, an alpha value is calculated and used for that pixel.
You can control how strict the comparison is using the slope and threshold values. It is a bit more complicated, but the most basic rule is: the higher the threshold, the bigger the tolerance. For example, with thresh = 0.45 and slope = 0.1, edge0 = 0.405, so pixels whose distance d from keying_color is below 0.405 become fully transparent (alpha = 0), pixels farther than 0.45 stay fully opaque (alpha = 1), and values in between ramp smoothly.
So, we are done, right? Nope.
You can look at this link: http://blog.csdn.net/u012847940/article/details/47441923
I built a video effect application using an OpenGL ES fragment shader.
But this application produces a 1282 error, which in my case occurs at glUniformMatrix4fv().
GLES20.glViewport(0, 0, getWidth(), getHeight());
GLES20.glClearColor(0.0f, 1.0f, 0.0f, 1.0f);
GLES20.glClear(GLES20.GL_DEPTH_BUFFER_BIT | GLES20.GL_COLOR_BUFFER_BIT);
GLES20.glUseProgram(program);
GLToolbox.checkGlError("glUseProgram");
GLES20.glActiveTexture(GLES20.GL_TEXTURE0);
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, textureID[0]);
triangleVertices.position(TRIANGLE_VERTICES_DATA_POS_OFFSET);
GLES20.glVertexAttribPointer(aPositionHandle, 3, GLES20.GL_FLOAT, false,
TRIANGLE_VERTICES_DATA_STRIDE_BYTES, triangleVertices);
GLToolbox.checkGlError("glVertexAttribPointer aPosition");
GLES20.glEnableVertexAttribArray(aPositionHandle);
GLToolbox.checkGlError("glEnableVertexAttribArray aPositionHandle");
triangleVertices.position(TRIANGLE_VERTICES_DATA_UV_OFFSET);
GLES20.glVertexAttribPointer(aTextureHandle, 3, GLES20.GL_FLOAT, false,
TRIANGLE_VERTICES_DATA_STRIDE_BYTES, triangleVertices);
GLToolbox.checkGlError("glVertexAttribPointer aTextureHandle");
GLES20.glEnableVertexAttribArray(aTextureHandle);
GLToolbox.checkGlError("glEnableVertexAttribArray aTextureHandle");
Matrix.setIdentityM(mVPMatrix, 0);
GLES20.glUniformMatrix4fv(uMVPMatrixHandle, 1, false, mVPMatrix, 0);
GLToolbox.checkGlError("uMVPMatrixHandle glUniformMatrix4fv");
GLES20.glUniformMatrix4fv(uSTMatrixHandle, 1, false, sTMatrix, 0);
GLToolbox.checkGlError("uSTMatrixHandle glUniformMatrix4fv");
GLES20.glDrawArrays(GLES20.GL_TRIANGLE_STRIP, 0, 4);
GLToolbox.checkGlError("glDrawArrays");
GLES20.glFinish();
I traced the problem to the line gl_FragColor = vec4(color.rgb * weight, color.a); in my fragment shader source.
String shader = "#extension GL_OES_EGL_image_external : require\n"
+ "precision mediump float;\n"
+ "vec2 seed;\n"
+ "varying vec2 vTextureCoord;\n"
+ "uniform samplerExternalOES tex_sampler_0;\n"
+ "uniform samplerExternalOES tex_sampler_1;\n\n"
+ "float scale;\n"
+ "float stepX;\n"
+ "float stepY;\n\n"
+ "float rand(vec2 loc) {\n"
+ " float theta1 = dot(loc, vec2(0.9898, 0.233));\n"
+ " float theta2 = dot(loc, vec2(12.0, 78.0));\n"
+ " float value = cos(theta1) * sin(theta2) + sin(theta1) * cos(theta2);\n"
+ " float temp = mod(197.0 * value, 1.0) + value;\n"
+ " float part1 = mod(220.0 * temp, 1.0) + temp;\n"
+ " float part2 = value * 0.5453;\n"
+ " float part3 = cos(theta1 + theta2) * 0.43758;\n"
+ " float sum = (part1 + part2 + part3);\n" +
"\n"
+ " return fract(sum)*scale;\n"
+ "}\n"
+ "\n"
+ "void main() {\n"
+ seedString[0]
+ seedString[1]
+ scaleString
+ stepX
+ stepY
+ "\n"
+ " float noise = texture2D(tex_sampler_1, vTextureCoord + vec2(-stepX, -stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(-stepX, stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(stepX, -stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(stepX, stepY)).r * 0.224;\n"
+ " noise += 0.4448;\n"
+ " noise *= scale;\n"
+ "\n"
+ " vec4 color = texture2D(tex_sampler_0, vTextureCoord);\n"
+ " float energy = 0.33333 * color.r + 0.33333 * color.g + 0.33333 * color.b;\n"
+ " float mask = (1.0 - sqrt(energy));\n"
+ " float weight = 1.0 - 1.333 * mask * noise;\n"
+ "\n"
+ " gl_FragColor = vec4(color.rgb * weight, color.a);\n"
+ "}\n";
The problem is that the original fragment shader was working on Android 4.4, but it does not work on Android 6.0.
So I changed the gl_FragColor = vec4(color.rgb * weight, color.a); line in my fragment shader source:
String shader = "#extension GL_OES_EGL_image_external : require\n"
+ "precision mediump float;\n"
+ "vec2 seed;\n"
+ "varying vec2 vTextureCoord;\n"
+ "uniform samplerExternalOES tex_sampler_0;\n"
+ "uniform samplerExternalOES tex_sampler_1;\n\n"
+ "float scale;\n"
+ "float stepX;\n"
+ "float stepY;\n\n"
+ "float rand(vec2 loc) {\n"
+ " float theta1 = dot(loc, vec2(0.9898, 0.233));\n"
+ " float theta2 = dot(loc, vec2(12.0, 78.0));\n"
+ " float value = cos(theta1) * sin(theta2) + sin(theta1) * cos(theta2);\n"
+ " float temp = mod(197.0 * value, 1.0) + value;\n"
+ " float part1 = mod(220.0 * temp, 1.0) + temp;\n"
+ " float part2 = value * 0.5453;\n"
+ " float part3 = cos(theta1 + theta2) * 0.43758;\n"
+ " float sum = (part1 + part2 + part3);\n" +
"\n"
+ " return fract(sum)*scale;\n"
+ "}\n"
+ "\n"
+ "void main() {\n"
+ seedString[0]
+ seedString[1]
+ scaleString
+ stepX
+ stepY
+ "\n"
+ " float noise = texture2D(tex_sampler_1, vTextureCoord + vec2(-stepX, -stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(-stepX, stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(stepX, -stepY)).r * 0.224;\n"
+ " noise += texture2D(tex_sampler_1, vTextureCoord + vec2(stepX, stepY)).r * 0.224;\n"
+ " noise += 0.4448;\n"
+ " noise *= scale;\n"
+ "\n"
+ " vec4 color = texture2D(tex_sampler_0, vTextureCoord);\n"
+ " float energy = 0.33333 * color.r + 0.33333 * color.g + 0.33333 * color.b;\n"
+ " float mask = (1.0 - sqrt(energy));\n"
+ " float weight = 1.0 - 1.333 * mask * noise;\n"
+ "\n"
+ " // gl_FragColor = vec4(color.rgb * weight, color.a);\n"
+ " gl_FragColor = vec4(color.rgb, color.a);\n"
+ " gl_FragColor = gl_FragColor\n"
+ " + vec4(rand(vTextureCoord + seed),\n"
+ " rand(vTextureCoord + seed),\n"
+ " rand(vTextureCoord + seed), 1);\n"
+ "}\n";
I want to know the cause of this problem and its solution.
Thank you.
Well, 1282 in your case is probably GL_INVALID_OPERATION.
I think your uniform location is invalid in one of these lines:
GLES20.glUniformMatrix4fv(uMVPMatrixHandle, 1, false, mVPMatrix, 0);
//or
GLES20.glUniformMatrix4fv(uSTMatrixHandle, 1, false, sTMatrix, 0);
Check your glGetUniformLocation() calls where you get uMVPMatrixHandle and uSTMatrixHandle; the problem might be there. Make sure each name matches the one declared in your vertex shader.
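A minimal sketch of that check, assuming the uniforms are declared as "uMVPMatrix" and "uSTMatrix" in the vertex shader (substitute the actual names from your shader source):
int uMVPMatrixHandle = GLES20.glGetUniformLocation(program, "uMVPMatrix");
int uSTMatrixHandle = GLES20.glGetUniformLocation(program, "uSTMatrix");
GLToolbox.checkGlError("glGetUniformLocation");
if (uMVPMatrixHandle == -1 || uSTMatrixHandle == -1) {
    // -1 means the name was not found in the linked program - either it is
    // misspelled or the compiler removed the uniform because it is never used.
    throw new RuntimeException("Could not get uniform location");
}
Note that glUniformMatrix4fv silently ignores a location of -1, so a 1282 (GL_INVALID_OPERATION) here usually means the location was queried from a different program than the one currently bound with glUseProgram(), or that no program is in use at all.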
I want to do a fisheye effect on Android using OpenGL ES 2.0. I can do it without OpenGL, but that is not what I want, because it is inefficient and does not support video textures. I also tested the fisheye effect using the Android Media Effects API, but the result does not look good.
I also found a fisheye fragment shader, shown below:
private static final String FISHEYE_FRAGMENT_SHADER =
"precision mediump float;\n" +
"uniform sampler2D u_Texture;\n" +
"uniform vec2 vScale;\n" +
"const float alpha = float(4.0 * 2.0 + 0.75);\n" +
"varying vec2 v_TexCoordinate;\n" +
"void main() {\n" +
" float bound2 = 0.25 * (vScale.x * vScale.x + vScale.y * vScale.y);\n" +
" float bound = sqrt(bound2);\n" +
" float radius = 1.15 * bound;\n" +
" float radius2 = radius * radius;\n" +
" float max_radian = 0.5 * 3.14159265 - atan(alpha / bound * sqrt(radius2 - bound2));\n" +
" float factor = bound / max_radian;\n" +
" float m_pi_2 = 1.570963;\n" +
" vec2 coord = v_TexCoordinate - vec2(0.5, 0.5);\n" +
" float dist = length(coord * vScale);\n" +
" float radian = m_pi_2 - atan(alpha * sqrt(radius2 - dist * dist), dist);\n" +
" float scalar = radian * factor / dist;\n" +
" vec2 new_coord = coord * scalar + vec2(0.5, 0.5);\n" +
" gl_FragColor = texture2D(u_Texture, new_coord);\n" +
"}\n";
This is what I want, but I don't know how to use it. Can someone give me a clue?
Android OpenGL ES does (normally) support video textures. It's not strictly part of the OpenGL ES API, but you can normally import video surfaces as EGL external images via an Android SurfaceTexture.
There are lots of similar questions on the web, but this SO question should provide a useful starting point:
Android. How play video on Surface(OpenGL)
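For completeness, a minimal sketch of that path; MediaPlayer is just one possible frame producer (a camera or MediaCodec output Surface works the same way), and the fisheye shader's sampler2D u_Texture would need to become a samplerExternalOES declared with "#extension GL_OES_EGL_image_external : require":
// Create a GL texture and expose it to the video producer through a SurfaceTexture.
int[] tex = new int[1];
GLES20.glGenTextures(1, tex, 0);
GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, tex[0]);

SurfaceTexture surfaceTexture = new SurfaceTexture(tex[0]);
Surface surface = new Surface(surfaceTexture);
mediaPlayer.setSurface(surface);    // the player decodes video frames into the texture

// On the GL thread, before drawing each frame, latch the newest video frame;
// the fisheye fragment shader then samples it through its external sampler.
surfaceTexture.updateTexImage();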