RenderScript Documentation and Advice - Android

I have been following this guide on how to use RenderScript on Android:
http://www.jayway.com/2014/02/11/renderscript-on-android-basics/
My code is below (I have a wrapper class for the script):
public class PixelCalcScriptWrapper {
    private Allocation inAllocation;
    private Allocation outAllocation;
    RenderScript rs;
    ScriptC_pixelsCalc script;

    public PixelCalcScriptWrapper(Context context) {
        rs = RenderScript.create(context);
        script = new ScriptC_pixelsCalc(rs, context.getResources(), R.raw.pixelscalc);
    }

    public void setInAllocation(Bitmap bmp) {
        inAllocation = Allocation.createFromBitmap(rs, bmp);
    }

    public void setOutAllocation(Bitmap bmp) {
        outAllocation = Allocation.createFromBitmap(rs, bmp);
    }

    public void forEach_root() {
        script.forEach_root(inAllocation, outAllocation);
    }
}
This method calls the script:
public Bitmap processBmp(Bitmap bmp, Bitmap bmpCopy) {
    pixelCalcScriptWrapper.setInAllocation(bmp);
    pixelCalcScriptWrapper.setOutAllocation(bmpCopy);
    pixelCalcScriptWrapper.forEach_root();
    return bmpCopy;
}
and here is my script:
#pragma version(1)
#pragma rs java_package_name(test.foo)

void root(const uchar4 *in, uchar4 *out, uint32_t x, uint32_t y) {
    float3 pixel = convert_float4(in[0]).rgb;
    if (pixel.z < 128) {
        pixel.z = 0;
    } else {
        pixel.z = 255;
    }
    if (pixel.y < 128) {
        pixel.y = 0;
    } else {
        pixel.y = 255;
    }
    if (pixel.x < 128) {
        pixel.x = 0;
    } else {
        pixel.x = 255;
    }
    out->xyz = convert_uchar3(pixel);
}
Now, where can I find documentation about all this?
For example, I have these questions:
1) What does convert_float4(in[0]) do?
2) What does the .rgb return in convert_float4(in[0]).rgb?
3) What is float3?
4) I don't know where to start with this line: out->xyz = convert_uchar3(pixel);
5) I am assuming the in and out parameters are the Allocations that were passed in? What are x and y?

The conversion functions are documented here:
http://developer.android.com/guide/topics/renderscript/reference/rs_convert.html#android_rs:convert
What does this convert_float4(in[0]) do?
convert_float4 converts a uchar4 to a float4; .rgb turns it into a float3 made of the first three elements.
What does the rgb return?
RenderScript vector types have .r .g .b .a or .x .y .z .w representing the first, second, third and fourth element respectively. You can use any combination (e.g. .xy or .xwy).
What is float3?
float3 is a "vector type": like a float, but holding three of them.
There are float2, float3 and float4 vector types of float (and likewise uchar4, int4, etc.).
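To tie these together, here is a small self-contained kernel of my own (purely illustrative, in the same style as the script above) that exercises the conversions and swizzles:
void root(const uchar4 *in, uchar4 *out, uint32_t x, uint32_t y) {
    float4 f = convert_float4(in[0]); // uchar4 (0..255) -> float4 holding the same values
    float3 rgb = f.rgb;               // .rgb: a float3 of the first three elements
                                      // (any combination works: f.xy, f.xwy, ...)
    rgb = rgb * 0.5f;                 // arithmetic applies element-wise to the whole vector
    out->xyz = convert_uchar3(rgb);   // convert back to uchar3 for the first three elements
    out->w = in->w;                   // pass the fourth (alpha) element through unchanged
}
Each vector element behaves exactly like the corresponding scalar type; the vector forms simply let one statement operate on several channels at once.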
http://developer.android.com/guide/topics/renderscript/reference/overview.html might be helpful
I hope this helps.

1) In the kernel, the in pointer is a 4-element unsigned char, that is, it represents a pixel color with R, G, B and A values in the 0-255 range. So convert_float4 simply casts each of the four uchars to a float. In this particular code, it probably doesn't make much sense to work with floats, since you're doing a simple threshold and could just as well have worked with the uchar data directly. Floats are better suited to image-processing algorithms that genuinely need the extra precision (for example, blurring an image).
2) The .rgb suffix is a shorthand to return only the first three values of the float4, i.e. the R, G, and B values. If you had used only .r it would give you the first value as a regular float, if you had used .g it would give you the second value as a float, etc... These three values are then assigned to that float3 variable, which now represents the pixel with only three color channels (that is, no A alpha channel).
3) See #2.
4) convert_uchar3 is another cast, converting the float3 pixel variable back to a uchar3. You are assigning the three values to the x, y, and z elements in that order. This is probably a good time to mention that x, y and z are completely interchangeable with r, g and b: that statement could just as well have used out->rgb, and it would actually have been more readable that way. Note that out is a uchar4, and by doing this you are assigning only the first three "rgb"/"xyz" elements of that pointer; the fourth element is left undefined here (a variant that defines it is sketched after answer 5 below).
5) Yes, in is the input pixel, out is the output pixel. Then x and y are the x, and y coordinates of the pixel in the overall image. This kernel function is going to be called once for every pixel in the image/allocation you're working with, and so it's usually good to know what coordinate you're at when processing an image. In this particular example since it's only thresholding all pixels in the same way, the coordinates are irrelevant.
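To make answers 4) and 5) concrete, here is a variant of the asker's kernel (my own sketch, not from the original post) that defines all four output elements and also makes use of the coordinates:
void root(const uchar4 *in, uchar4 *out, uint32_t x, uint32_t y) {
    float3 pixel = convert_float4(in[0]).rgb;
    // step(edge, v) returns 0.0 where v < edge and 1.0 elsewhere, element-wise,
    // so this one line is the same threshold as the three if/else blocks above
    pixel = step((float3)128.f, pixel) * 255.f;
    if (y < 100) {
        pixel = 255.f - pixel; // y is the row index: invert only the first 100 rows
    }
    out->rgb = convert_uchar3(pixel);
    out->a = in->a; // define the fourth (alpha) element instead of leaving it undefined
}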
Good documentation on RenderScript is very hard to find. I would strongly recommend you take a look at these two videos, though, as they will give you a much better sense of how RenderScript works:
AnDevCon: A Deep Dive into RenderScript
Google I/O 2013 - High Performance Applications with RenderScript
Keep in mind that both videos are a couple of years old, so some minor details may have changed in more recent APIs, but overall those are probably the best sources of information for RS.


RenderScript low performance on Samsung Galaxy S8

Context
I have an Android app that takes a picture, blurs it, removes the blur based on a mask, and applies a final layer (not relevant here). The last two steps, removing the blur based on a mask and applying a final layer, are done repeatedly, each time with a new mask (150 masks in total).
The output gets drawn on a canvas (SurfaceView). This way the app effectively creates a view of the image with an animated blur.
Technical details & code
All of these image processing steps are achieved with RenderScript.
I'm leaving out the code for step 1, blurring the picture, since this is irrelevant for the problem I'm facing.
Step 2: removing the blur based on a mask
I have a custom kernel that takes an in Allocation as an argument and holds two global variables, which are Allocations as well.
All three Allocations get their data from bitmaps via Allocation.copyFrom(bitmap).
Step 3: applying a final layer
Here I have a custom kernel as well, which takes an in Allocation as an argument and holds three global variables: one Allocation and two floats.
How these kernels work is irrelevant to this question, but just to be sure, I've included simplified snippets below.
Another thing to note is that I am following all best practices regarding Allocations, RenderScript, and my SurfaceView to ensure performance is at its best.
So common mistakes such as creating a new RenderScript instance each time, or not re-using Allocations when possible, can safely be ruled out.
blurMask.rs
#pragma version(1)
#pragma rs java_package_name(com.example.rs)
#pragma rs_fp_relaxed

// Extra allocations
rs_allocation allocBlur;
rs_allocation allocBlurMask;

/*
 * RenderScript kernel that performs a masked blur manipulation.
 * Blur pseudo-code: out = original * blurMask + blur * (1.0 - blurMask)
 * -> Execute this for all channels
 */
uchar4 __attribute__((kernel)) blur_mask(uchar4 inOriginal, uint32_t x, uint32_t y) {
    // Manually fetch the current element from the blur and mask allocations
    uchar4 inBlur = rsGetElementAt_uchar4(allocBlur, x, y);
    uchar4 inBlurMask = rsGetElementAt_uchar4(allocBlurMask, x, y);

    // Normalize to 0.0 -> 1.0
    float4 inOriginalNorm = rsUnpackColor8888(inOriginal);
    float4 inBlurNorm = rsUnpackColor8888(inBlur);
    float4 inBlurMaskNorm = rsUnpackColor8888(inBlurMask);

    inBlurNorm.rgb = inBlurNorm.rgb * 0.7f + 0.3f;

    float4 outNorm = inOriginalNorm;
    outNorm.rgb = inOriginalNorm.rgb * inBlurMaskNorm.rgb + inBlurNorm.rgb * (1.0f - inBlurMaskNorm.rgb);
    return rsPackColorTo8888(outNorm);
}
myKernel.rs
#pragma version(1)
#pragma rs java_package_name(com.example.rs)
#pragma rs_fp_relaxed

// Extra allocations
rs_allocation allocExtra;

// Randoms; values are set from Kotlin, the values below just act as a min placeholder.
float randB = 0.1f;
float randW = 0.75f;

/*
 * RenderScript kernel that performs a manipulation.
 */
uchar4 __attribute__((kernel)) my_kernel(uchar4 inOriginal, uint32_t x, uint32_t y) {
    // Manually fetch the current element from the extra allocation
    uchar4 inExtra = rsGetElementAt_uchar4(allocExtra, x, y);

    // Normalize to 0.0 -> 1.0
    float4 inOriginalNorm = rsUnpackColor8888(inOriginal);
    float4 inExtraNorm = rsUnpackColor8888(inExtra);

    float4 outNorm = inOriginalNorm;
    if (inExtraNorm.r > 0.0f) {
        outNorm.rgb = inOriginalNorm.rgb * 0.7f + 0.3f;
        // Separate per-channel operations since we are using inExtraNorm.r everywhere
        outNorm.r = outNorm.r * inExtraNorm.r + inOriginalNorm.r * (1.0f - inExtraNorm.r);
        outNorm.g = outNorm.g * inExtraNorm.r + inOriginalNorm.g * (1.0f - inExtraNorm.r);
        outNorm.b = outNorm.b * inExtraNorm.r + inOriginalNorm.b * (1.0f - inExtraNorm.r);
    }
    else if (inExtraNorm.g > 0.0f) {
        ...
    }
    return rsPackColorTo8888(outNorm);
}
Problem
So the app works great on a range of devices, even low-end ones. I manually cap the FPS at 15, but when I remove this cap I get results ranging from 15-20 FPS on low-end devices to 35-40 FPS on high-end devices.
The Samsung Galaxy S8 is where my problem occurs. For some reason I only manage to get around 10 FPS. If I use adb to force RenderScript to use CPU instead:
adb shell setprop debug.rs.default-CPU-driver 1
I get around 12-15 FPS, but obviously I want it to run on the GPU.
An important, weird thing I noticed
If I trigger a touch event, no matter where (even outside the app), performance dramatically increases to around 35-40 FPS. As soon as I lift my finger from the screen, the FPS drops back to around 10.
NOTE: drawing the result on the SurfaceView can be excluded as a factor, since the results are the same when running only the RenderScript computation without drawing the result.
Questions
So I have more than one question really:
What could be the reason behind the low performance?
Why would a touch event improve this performance so dramatically?
How could I solve or work around this issue?

The x y parameters definition

I'm trying to understand the mapping kernel in RenderScript.
A sample mapping kernel looks like this:
uchar4 RS_KERNEL invert(uchar4 in, uint32_t x, uint32_t y) {
    uchar4 out = in;
    out.r = 255 - in.r;
    out.g = 255 - in.g;
    out.b = 255 - in.b;
    return out;
}
However, there is not much clarity regarding what the x and y parameters refer to (whether x indexes along the width or the height of a given pixel in a bitmap).
The official documentation only says so much about x, y
A mapping kernel function or a reduction kernel accumulator function may access the coordinates of the current execution using the special arguments x, y, and z, which must be of type int or uint32_t. These arguments are optional.
This is critical information, as interchanging them when indexing can lead to out-of-bounds errors. If you have worked with this, please share your insights.
The x and y (and z, if you're using a 3D allocation) are coordinates along the allocation's width and height (and depth, for 3D): x runs from 0 to width-1 and y from 0 to height-1. This means that the in parameter of your kernel function corresponds to the data in your allocation at the point (x, y).
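A quick way to convince yourself of the mapping (a sketch of mine, not from the documentation; gIn is an assumed global set from the host side to the same allocation the kernel is launched over) is to query the allocation's dimensions inside the kernel and guard any neighbour access with them:
rs_allocation gIn; // assumed: bound from the host to the input allocation

uchar4 RS_KERNEL shift_right(uchar4 in, uint32_t x, uint32_t y) {
    // x indexes along the width (columns), y along the height (rows);
    // rsAllocationGetDimY(gIn) would likewise return the height
    uint32_t w = rsAllocationGetDimX(gIn);
    if (x + 1 < w) {
        return rsGetElementAt_uchar4(gIn, x + 1, y); // pixel one column to the right
    }
    return in; // at the last column, pass the pixel through
}
Swapping x and y in the rsGetElementAt_uchar4 call is exactly the kind of mix-up that produces out-of-bounds errors whenever the image is not square.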

OpenGL ES write depth data to color

I'm trying to implement DepthBuffer-like functionality using OpenGL ES on Android.
In other words, I'm trying to get the 3D point on the surface that is rendered at point [x, y] on the user's device. To do that, I need to be able to read the distance of the fragment at that given point.
Answer in different circumstances:
When using normal (desktop) OpenGL, you could achieve this by creating a FrameBuffer and then attaching either a RenderBuffer or a Texture with a depth component to it.
Both of those approaches use glReadPixels with the GL_DEPTH_COMPONENT format to retrieve the data from the buffer/texture. Unfortunately, OpenGL ES only supports GL_ALPHA, GL_RGB, and GL_RGBA as read-back formats, so there's really no way to reach the framebuffer's depth data directly.
The only viable approach I can think of (and the one I have found suggested on the internet) is to create a separate shader just for depth rendering. That shader should write the gl_FragCoord.z value (the distance value that we want to read) to gl_FragColor. However:
The actual Question:
When I write the gl_FragCoord.z value with gl_FragColor = vec4(vec3(gl_FragCoord.z), 1.0); and later use glReadPixels to read back the RGB values, the values read back don't match the input.
What I have tried:
I realize that only 24 bits (8 bits each for r, g, b) represent the depth data, so I tried shifting the returned value by 8 to get 32 bits, but it didn't seem to work. I also tried shifting the distance when applying it to red, green and blue, but that didn't work as expected either. I have been trying to figure out what's wrong by observing the bits; results are at the bottom.
fragmentShader.glsl (candidate #3):
void main() {
    highp float distance = 1.0; // currently just 1.0, to test the results with different values
    lowp float red = distance / exp2(16.0);
    lowp float green = distance / exp2(8.0);
    lowp float blue = distance / exp2(0.0);
    gl_FragColor = vec4(red, green, blue, 1.0);
}
Method used to read the values back (glReadPixels):
private float getDepth(int x, int y) {
    FloatBuffer buffer = GeneralSettings.getFloatBuffer(1); // creates a FloatBuffer with capacity for 1 float
    terrainDepthBuffer.bindFrameBuffer(); // bind the framebuffer before reading back
    GLES20.glReadPixels(x, y, 1, 1, GLES20.GL_RGB, GLES20.GL_UNSIGNED_BYTE, buffer); // read from the bound framebuffer
    GeneralSettings.checkGlError("glReadPixels"); // make sure there are no GL-related errors
    terrainDepthBuffer.unbindCurrentFrameBuffer(); // remember to unbind the buffer after reading/writing
    System.out.println(buffer.get(0)); // print the value
    return buffer.get(0);
}
Observations in bits using the shader & method above:
Value | Shader input | ReadPixels output
1.0f | 111111100000000000000000000000 | 111111110000000100000000
0.0f | 0 | 0
0.5f | 111111000000000000000000000000 | 100000000000000100000000
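For comparison, a widely used fract-based packing scheme (not what the code above does; the constants 255 and 65025 = 255*255 are the usual choice) stores successively finer base-255 "digits" of the depth d in the three channels:

enc1 = fract(d),  enc2 = fract(255 * d),  enc3 = fract(65025 * d)
enc1 = enc1 - enc2 / 255,  enc2 = enc2 - enc3 / 255
d ≈ enc1 + enc2 / 255 + enc3 / 65025   (decode after read-back)

Each channel then carries information that survives 8-bit quantization. Dividing d by exp2(16.0) and exp2(8.0) instead, as in candidate #3, yields values far below what an 8-bit channel (or a lowp float, whose precision is only about 2^-8) can represent, which is consistent with the red and green bytes coming back as 0 or 1 in the observations above.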

Changing colors from specific parts of a bitmap image

I am writing an Android application that must paint certain parts of a loaded bitmap image according to received events.
I need to paint (or change the current color of) a single part of the bitmap image, without changing the rest of it.
Let's say I have a car, which is divided by many parts: door, windows, wheels, etc.
Each time an event (received from the network) arrives, I need to change the color of that particular part with the color specified by the event data.
What would be the best technique to achieve that?
I first thought of FloodFill, as suggested in many threads on SO, but given that the messages arrive quite fast (several per second) I fear it would drag performance down, as it seems to be a very CPU-intensive algorithm.
I also thought about having multiple segments of the same image, each colored with a different color and show the right one at the right time, but the car has at least 10 different parts and each one could be painted with 4-6 colors, so I would end up with dozens of images and that would be impractical to handle, not to mention the waste of memory.
So, is there any other approach?
The fastest way to do it is with a shader. You'll need to use OpenGL ES 2 for that (some Androids only support ES 1). You'll need a temporary bitmap the same size as the image you want to change; set it as the target. In the shader, retrieve a pixel from the sampler which is bound to the image you want to change. If it's within a small tolerance of the colour you want to change, set gl_FragColor to the new colour; otherwise just set gl_FragColor to the colour you retrieved from the sampler. You'll need to pass the desired colour and the new colour into the shader as vec4s with al_set_shader_float_vector.
The fastest way to do this is to keep 2 bitmaps and swap between them as the "main one" that you're using each time a colour changes.
If you can't use a shader, then you'll have to lock the bitmap and replace the colour. Use al_lock_bitmap to lock it, then you can use al_get_pixel and al_put_pixel to change colours. Then al_unlock_bitmap when you're done. You can also avoid using al_get_pixel/al_put_pixel and access the memory manually which will be faster. If you lock the bitmap with the format ALLEGRO_PIXEL_FORMAT_ABGR_8888_LE then the memory is laid out like so:
int w = al_get_bitmap_width(bitmap);
int h = al_get_bitmap_height(bitmap);
for (int y = 0; y < h; y++) {
    unsigned char *p = (unsigned char *)locked_region->data + locked_region->pitch * y;
    for (int x = 0; x < w; x++) {
        unsigned char r = p[0];
        unsigned char g = p[1];
        unsigned char b = p[2];
        unsigned char a = p[3];
        /* change r, g, b, a here if they match */
        p[0] = r;
        p[1] = g;
        p[2] = b;
        p[3] = a;
        p += 4;
    }
}
It's recommended that you lock the image in the format it was created in. That means picking an easy one like the one I mentioned, or else the inner part of the loop gets more complicated. The ABGR_8888 part of the pixel format describes the layout of the data: ABGR gives the order of the components. If you were to read a pixel into a single storage unit (an int in this case, but it works the same with a short) then the bit pattern would be AAAAAAAABBBBBBBBGGGGGGGGRRRRRRRR. However, when you're reading a byte at a time, most machines are little endian, so the small end comes first. That's why in my sample code p[0] is red. The 8888 part tells how many bits per component.
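A tiny stand-alone C sketch (mine, purely to illustrate the byte order) makes that concrete:
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* ABGR_8888 opaque red as one 32-bit unit: A=0xFF, B=0x00, G=0x00, R=0xFF */
    uint32_t abgr = 0xFF0000FFu;
    unsigned char *p = (unsigned char *)&abgr;
    /* On a little-endian machine this prints R=ff G=00 B=00 A=ff:
       the low-order R byte comes first in memory, which is why p[0] is red above */
    printf("R=%02x G=%02x B=%02x A=%02x\n", p[0], p[1], p[2], p[3]);
    return 0;
}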

Allocation.copyTo(Bitmap) corrupting pixel values

I'm new to RenderScript and am running into issues with my first script. As far as I can see (from debugging statements I've inserted) my code works fine, but the computed values are getting mangled when they are copied back to the Bitmap by the Allocation.copyTo(Bitmap) method.
I was getting weird colours out, so eventually stripped down my script to this sample which shows the problem:
void root(const uchar4 *v_in, uchar4 *v_out, const void *usrData, uint32_t x, uint32_t y) {
    *v_out = rsPackColorTo8888(1.f, 0.f, 0.f, 1.f);
    if (x == 0 && y == 0) {
        rsDebug("v_out ", v_out->x, v_out->y, v_out->z, v_out->w);
    }
}
Here we are just writing out an opaque red pixel. The debug line seems to print the right value (255 0 0 255) and indeed I get a red pixel in the bitmap.
However if I change the alpha on the red pixel slightly:
*v_out = rsPackColorTo8888(1.f, 0.f, 0.f, 0.998f);
The debug prints (255 0 0 254) which still seems correct, but the final pixel value ends up being (0 0 0 254) ie. black.
Obviously I suspected it was a premultiplied-alpha issue, but my understanding is that the Allocation routines that copy to and from Bitmaps are supposed to handle that for you. At least that's what Chet Haase suggests in this blog post: https://plus.google.com/u/0/+ChetHaase/posts/ef6Deey6xKA.
Also none of the example compute scripts out there seem to mention any issues with pre-multiplied alpha. My script was based on the HelloComputer example from the SDK.
If I am making a mistake, I would love an RS guru to point it out for me.
It's a shame that after 2+ years the documentation for Renderscript is still so poor.
PS. The Bitmaps I'm using are ARGB_8888, and I am building and targeting SDK 18 (Android 4.3).
The example works fine because it does not modify alpha.
If you are going to modify alpha and then use the Allocation as a normal bitmap you should return (r*a, g*a, b*a, a).
However, if you were sending the Allocation to a GL surface which is not pre-multiplied, your code would work as-is.
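In code, the suggestion amounts to something like this (a hypothetical sketch, not taken from the answer), applied just before packing:
void root(const uchar4 *v_in, uchar4 *v_out, const void *usrData, uint32_t x, uint32_t y) {
    float r = 1.f, g = 0.f, b = 0.f, a = 0.998f;
    // ARGB_8888 Bitmaps store premultiplied alpha, so scale each colour
    // channel by alpha before packing when copying back to a Bitmap
    *v_out = rsPackColorTo8888(r * a, g * a, b * a, a);
}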
