I would like to create something similar to this question: Can I convert an image into a grid of dots?, but I cannot find an answer that fits my problem. The basic idea is to load a picture from the phone and apply a grid-of-dots effect to it. I would appreciate any suggestions.
As others may suggest, your problem can also be solved with a fragment shader in the OpenGL Shading Language (GLSL), but GLSL can require painful setup.
Here is my solution using Android RenderScript (a lot like GLSL, but designed specifically for Android, and not widely used). First, set up the Renderscript > HelloCompute sample from the official Android SDK samples. Next, replace mono.rs with the following:
#pragma version(1)
#pragma rs java_package_name(com.android.example.hellocompute)

rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;

static int mImageWidth;
const uchar4 *gPixels;

const float4 kBlack = {0.0f, 0.0f, 0.0f, 1.0f};

// Each circle has two radii for anti-aliasing reasons.
const static uint32_t radius = 15;
const static uint32_t smallerRadius = 13;

// Used so that we have smooth circle edges.
static float smooth_step(float start_threshold, float end_threshold, float value) {
    if (value < start_threshold) {
        return 0;
    }
    if (value > end_threshold) {
        return 1;
    }
    value = (value - start_threshold)/(end_threshold - start_threshold);
    // As defined at http://en.wikipedia.org/wiki/Smoothstep
    return value*value*(3 - 2*value);
}

void root(const uchar4 *v_in, uchar4 *v_out, uint32_t u_x, uint32_t u_y) {
    int32_t diameter = radius * 2;

    // Compute the distance from the center of the circle.
    int32_t x = u_x % diameter - radius;
    int32_t y = u_y % diameter - radius;
    float dist = hypot((float)x, (float)y);

    // Compute the center of the circle.
    uint32_t center_x = u_x / diameter * diameter + radius;
    uint32_t center_y = u_y / diameter * diameter + radius;
    float4 centerColor = rsUnpackColor8888(gPixels[center_x + center_y*mImageWidth]);

    float amount = smooth_step(smallerRadius, radius, dist);
    *v_out = rsPackColorTo8888(mix(centerColor, kBlack, amount));
}

void filter() {
    mImageWidth = rsAllocationGetDimX(gIn);
    rsForEach(gScript, gIn, gOut); // You may need a fourth parameter, depending on your target SDK.
}
Inside HelloCompute.java, replace createScript() with the following:
private void createScript() {
    mRS = RenderScript.create(this);

    mInAllocation = Allocation.createFromBitmap(mRS, mBitmapIn,
                                                Allocation.MipmapControl.MIPMAP_NONE,
                                                Allocation.USAGE_SCRIPT);
    mOutAllocation = Allocation.createTyped(mRS, mInAllocation.getType());

    mScript = new ScriptC_mono(mRS, getResources(), R.raw.mono);

    mScript.bind_gPixels(mInAllocation);

    mScript.set_gIn(mInAllocation);
    mScript.set_gOut(mOutAllocation);
    mScript.set_gScript(mScript);

    mScript.invoke_filter();
    mOutAllocation.copyTo(mBitmapOut);
}
The end result is the picture rendered as a grid of colored dots on a black background.
ALTERNATIVE
If you don't care about each dot being a solid color, there is a very easy way to do this: you need a BitmapDrawable for the picture and a BitmapDrawable for the overlay tile (let's call it overlayTile). On overlayTile, call
overlayTile.setTileModeX(Shader.TileMode.REPEAT);
overlayTile.setTileModeY(Shader.TileMode.REPEAT);
Next, combine the two Drawables into a single Drawable using a LayerDrawable, as in the sketch below. You can use the resulting LayerDrawable as the src of an ImageView if you wish, or you can convert the Drawable to a Bitmap and save it to disk.
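A minimal sketch of that wiring, assuming both images already exist as drawable resources (the names photo and dot_tile and the imageView variable are placeholders, not from the original answer):

BitmapDrawable picture = (BitmapDrawable) getResources().getDrawable(R.drawable.photo);
BitmapDrawable overlayTile = (BitmapDrawable) getResources().getDrawable(R.drawable.dot_tile);
overlayTile.setTileModeX(Shader.TileMode.REPEAT);
overlayTile.setTileModeY(Shader.TileMode.REPEAT);

// Stack the tiled dot overlay on top of the picture.
LayerDrawable combined = new LayerDrawable(new Drawable[] { picture, overlayTile });
imageView.setImageDrawable(combined);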
I think studying OpenGL might help with what you want to achieve.
You may want to go through the basics of Displaying Graphics with OpenGL ES.
Hope that helps. :)
Related
Any advice on optimizing the following code? The code first grayscales, inverts, and then thresholds the image (that code is not included because it is trivial). It then sums the elements of each row and each column (all elements are either 1 or 0). It then finds the indices of the row and column with the highest sums.
The code is supposed to find the centroid of the image, and it works, but I want to make it faster.
I'm developing for API 23, so a reduction kernel cannot be used.
Java snippet:
private int[] sumValueY = new int[640];
private int[] sumValueX = new int[480];

rows_indices_alloc = Allocation.createSized(rs, Element.I32(rs), height, Allocation.USAGE_SCRIPT);
col_indices_alloc = Allocation.createSized(rs, Element.I32(rs), width, Allocation.USAGE_SCRIPT);

public RenderscriptProcessor(RenderScript rs, int width, int height)
{
    mScript.set_gIn(mIntermAllocation);

    mScript.forEach_detectX(rows_indices_alloc);
    mScript.forEach_detectY(col_indices_alloc);

    rows_indices_alloc.copyTo(sumValueX);
    col_indices_alloc.copyTo(sumValueY);
}
Renderscript.rs snippet:
#pragma version(1)
#pragma rs java_package_name(org.gearvrf.renderscript)
#include "rs_debug.rsh"
#pragma rs_fp_relaxed

const int mImageWidth = 640;
const int mImageHeight = 480;

int32_t maxsX = -1;
int32_t maxIndexX;
int32_t maxsY = -1;
int32_t maxIndexY;

rs_allocation gIn;

void detectX(int32_t v_in, int32_t x, int32_t y) {
    int32_t sum = 0;
    for (int i = 0; i < mImageWidth; i++) {
        float4 f4 = rsUnpackColor8888(rsGetElementAt_uchar4(gIn, i, x));
        sum += (int)f4.r;
    }
    if (sum > maxsX) {
        maxsX = sum;
        maxIndexX = x;
    }
}

void detectY(int32_t v_in, int32_t x, int32_t y) {
    int32_t sum = 0;
    for (int i = 0; i < mImageHeight; i++) {
        float4 f4 = rsUnpackColor8888(rsGetElementAt_uchar4(gIn, x, i));
        sum += (int)f4.r;
    }
    if (sum > maxsY) {
        maxsY = sum;
        maxIndexY = x;
    }
}
Any help would be appreciated
float4 f4 = rsUnpackColor8888(rsGetElementAt_uchar4(gIn, x, i));
sum += (int)f4.r;
This converts from int to float and then back to int again. I think you can simplify by just doing this:
sum += rsGetElementAt_uchar4(gIn, x, i).r;
I don't know exactly how your previous stages work because you haven't posted them, but you should try generating packed values to read here. So either put your grayscale channels in .rgba, or use a single-channel format and then use rsAllocationVLoadX_uchar4 to fetch four values at once, as sketched below.
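For example, with a single-channel (U8) version of gIn, the inner loop of detectX could load four pixels per memory access (a hedged sketch; rsAllocationVLoadX_uchar4 needs a recent API level):

for (int i = 0; i < mImageWidth; i += 4) {
    // Load four consecutive single-channel pixels in one access.
    uchar4 quad = rsAllocationVLoadX_uchar4(gIn, i, x);
    sum += quad.x + quad.y + quad.z + quad.w;
}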
Also, try combining previous stages with this one: if you don't need the intermediate results of those calculations, it may be cheaper to do the memory load once and then do those transformations in registers.
You might also play with how many values each thread operates on. You could try having each kernel invocation process width/2, width/4, or width/8 elements and see how they perform. This gives GPUs more threads to play with, especially on lower-resolution images, with the trade-off of more reduction steps.
You also have a multiple-writers race condition on the maxsX/maxsY and maxIndexX/maxIndexY variables: all those writes need to use atomics if you care about getting exactly the right answer. I think maybe you posted the wrong code, because you never store to the *_indices_alloc allocations, yet you copy from them at the end. So, actually, you should store all the sums into those allocations and then use either a single-threaded function or a kernel with atomics to get the absolute max and its index; a sketch of that approach follows.
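A minimal sketch of that last suggestion, staying close to the question's code. The rows_sums allocation is an invented name for this example (an Element.I32 allocation of length mImageHeight that Java would bind with set_rows_sums), and findMaxX would be run afterwards with invoke_findMaxX():

rs_allocation rows_sums;  // invented for this sketch: one int32 slot per row

// Kernel: write each row's sum into its own slot instead of racing on globals.
void detectX(int32_t v_in, int32_t x, int32_t y) {
    int32_t sum = 0;
    for (int i = 0; i < mImageWidth; i++) {
        sum += rsGetElementAt_uchar4(gIn, i, x).r;
    }
    rsSetElementAt_int(rows_sums, sum, x);
}

// Single-threaded invokable: scan the per-row sums for the maximum.
void findMaxX() {
    maxsX = -1;
    for (int x = 0; x < mImageHeight; x++) {
        int32_t s = rsGetElementAt_int(rows_sums, x);
        if (s > maxsX) {
            maxsX = s;
            maxIndexX = x;
        }
    }
}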
I'm working on a picture-based app and I'm blocked on an issue with RenderScript.
My purpose is pretty simple in theory: I want to remove the white background from the images loaded by the user, to show them on another image I've set as the background. More specifically, what I want to do is simulate the effect of printing a user-uploaded graphic on a paper canvas (also a picture) with a realistic effect.
I cannot assume the user is able to upload nice PNGs with alpha channels, and one of the requirements is to operate on JPGs.
I've been trying to solve this with RenderScript, with something like the following, which sets alpha to 0 for any pixel whose R, G, and B are all equal to or greater than 240:
#pragma version(1)
#pragma rs java_package_name(mypackagename)

rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;

const static float th = 239.f/256.f;

void root(const uchar4 *v_in, uchar4 *v_out, const void *usrData, uint32_t x, uint32_t y) {
    float4 f4 = rsUnpackColor8888(*v_in);
    if (f4.r > th && f4.g > th && f4.b > th) {
        f4.a = 0;
    }
    *v_out = rsPackColorTo8888(f4);
}

void filter() {
    rsForEach(gScript, gIn, gOut);
}
but the results are not satisfactory, for two main reasons:
if a photo has a whitish gradient that is not part of the background, the script causes an ugly noise effect
images with shadows close to the edges get a noise effect near the edges
I understand that jumping from alpha 0 to alpha 1 is too abrupt, and I've tried different solutions involving linearly increasing the alpha as the sum of the R, G, B components decreases, but I still get noisy pixels and blocks.
With a plain white or regular background (e.g. a snapshot of the Google home page) it works perfectly, but with photos it's very far from anything acceptable.
I think that if I were able to process one "line" or one "block" of pixels instead of a single one, it would be easier to detect flat backgrounds and avoid hitting gradients, but I don't know enough about RenderScript to do that.
Can anyone point me in the right direction?
PS
I can't use PorterDuff and multiply, because the background and the foreground have different dimensions, and moreover I need to be able to drag the uploaded image around the background canvas once the effect is applied. If I multiplied the image with a region of the background, moving the resulting image around would drag a section of the background along with it.
If I understand correctly, you want to determine whether the current pixel is part of a white background based on a line/block of neighboring pixels.
You can try using rsGetElementAt. For example, to process a line in your original code:
#pragma version(1)
#pragma rs java_package_name(mypackagename)

rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;

const static float th = 239.f/256.f;

void root(const uchar4 *v_in, uchar4 *v_out, const void *usrData, uint32_t x, uint32_t y) {
    float4 f4 = rsUnpackColor8888(*v_in);
    uint32_t width = rsAllocationGetDimX(gIn);

    // E.g.: Processing a line from x to x+5.
    bool isBackground = true;
    for (uint32_t i = 0; i <= 5 && x + i < width; i++) {
        uchar4 nPixel_u4 = rsGetElementAt_uchar4(gIn, x + i, y);
        float4 nPixel_f4 = rsUnpackColor8888(nPixel_u4);
        if (nPixel_f4.r <= th || nPixel_f4.g <= th || nPixel_f4.b <= th) {
            isBackground = false;
            break;
        }
    }
    if (isBackground) {
        f4.a = 0.0f;
    }
    // Always write the output pixel, transparent or not.
    *v_out = rsPackColorTo8888(f4);
}

void filter() {
    rsForEach(gScript, gIn, gOut);
}
This is just a naive example of how you can use rsGetElementAt to get the data from a given position in a global Allocation. There is a corresponding rsSetElementAt for saving data to a global Allocation; a hypothetical one-liner, writing a packed pixel back at (x, y), would be:
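rsSetElementAt_uchar4(gOut, rsPackColorTo8888(f4), x, y);

I am hoping it helps your project.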
I managed to write a kernel that transforms an input Bitmap into a float[] of Sobel gradients (two separate kernels, for SobelX and SobelY). I did this by assigning the input Bitmap as a global variable, launching the kernel over the output allocation, and referencing the neighbors of the input Bitmap via rsGetElementAt. Since I actually want to calculate both the magnitude (hypot(Sx, Sy)) and the direction (atan2(Sy, Sx)), it would be nice to do the whole thing in one kernel pass. If I only had to calculate the magnitude array, this could be done with the same structure (one input Bitmap, one output float[]). Now I wonder whether it is possible to just add an additional allocation for the direction output (also a float[]). I tried this with the rs function rsSetElementAt() as follows:
#pragma version(1)
#pragma rs java_package_name(com.example.xxx)
#pragma rs_fp_relaxed

rs_allocation gIn, direction;
int32_t width;
int32_t height;

// Sobel: magnitude and direction.
float __attribute__((kernel)) sobel_XY(uint32_t x, uint32_t y) {
    float outX = 0, outY = 0;
    if (x > 0 && y > 0 && x < (width - 1) && y < (height - 1)) {
        uchar4 c11 = rsGetElementAt_uchar4(gIn, x-1, y-1);
        uchar4 c12 = rsGetElementAt_uchar4(gIn, x-1, y);
        uchar4 c13 = rsGetElementAt_uchar4(gIn, x-1, y+1);
        uchar4 c21 = rsGetElementAt_uchar4(gIn, x, y-1);
        uchar4 c23 = rsGetElementAt_uchar4(gIn, x, y+1);
        uchar4 c31 = rsGetElementAt_uchar4(gIn, x+1, y-1);
        uchar4 c32 = rsGetElementAt_uchar4(gIn, x+1, y);
        uchar4 c33 = rsGetElementAt_uchar4(gIn, x+1, y+1);

        float4 f11 = rsUnpackColor8888(c11);
        float4 f12 = rsUnpackColor8888(c12);
        float4 f13 = rsUnpackColor8888(c13);
        float4 f21 = rsUnpackColor8888(c21);
        float4 f23 = rsUnpackColor8888(c23);
        float4 f31 = rsUnpackColor8888(c31);
        float4 f32 = rsUnpackColor8888(c32);
        float4 f33 = rsUnpackColor8888(c33);

        outX = f11.r - f31.r + 2*(f12.r - f32.r) + f13.r - f33.r;
        outY = f11.r - f13.r + 2*(f21.r - f23.r) + f31.r - f33.r;

        float d = atan2(outY, outX);
        rsSetElementAt_float(direction, d, x, y);
        return hypot(outX, outY);
    }
    // Border pixels: no gradient.
    return 0.0f;
}
And the corresponding Java code:
ScriptC_sobel script;
script = new ScriptC_sobel(rs);
script.set_gIn(Allocation.createFromBitmap(rs, bmpGray));

Type.Builder TypeOut = new Type.Builder(rs, Element.F32(rs));
TypeOut.setX(width).setY(height);
Allocation outAllocation = Allocation.createTyped(rs, TypeOut.create());

// the following 3 lines shall reflect the allocation to the Direction output
Type.Builder TypeDir = new Type.Builder(rs, Element.F32(rs));
TypeDir.setX(width).setY(height);
Allocation dirAllocation = Allocation.createTyped(rs, TypeDir.create());

script.forEach_sobel_XY(outAllocation);
outAllocation.copyTo(gm);
dirAllocation.copyTo(gd);
Unfortunately this does not work. I am not sure whether the problem is with the structural logic of the rs kernel, or whether it is because I cannot use a second Type.Builder assignment within the Java code (because the kernel is already tied to the magnitude output allocation). Any help is highly appreciated.
PS: I see that there is no link between the second Type.Builder assignment and the "direction" allocation in rs, but how can this be achieved?
The outAllocation is passed as a parameter to the kernel, but the existence and location of dirAllocation also have to be communicated to the RenderScript side. Do this just before starting the script:
script.set_direction(dirAllocation);
Also, read about memory allocation in RenderScript.
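Putting it together with the question's code, the launch sequence would then look something like this sketch (gm and gd are the caller's float arrays from the question; note that the width and height globals the kernel reads also need to be set):

script.set_gIn(Allocation.createFromBitmap(rs, bmpGray));
script.set_direction(dirAllocation);  // make the second output visible to the script
script.set_width(width);
script.set_height(height);

script.forEach_sobel_XY(outAllocation);

outAllocation.copyTo(gm);
dirAllocation.copyTo(gd);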
I've found that good documentation is lacking for RenderScript. As far as I know, forEach in RS executes root() for each individual item in the allocation.
I am trying to make an image-processing library for RenderScript. As a starting point, I reached this great answer. But the problem is that the blur operation works on each pixel, and each pixel requires another loop (of blur-width length) of calculation. Even running on multiple cores, it is still a bit too slow.
I am trying to modify it to allow a (two-pass) box filter, but that requires working on a single row or column instead of a cell. So, is there any way to ask forEach to send an array to root()?
rsForEach can only operate upon Allocations.
If you want the rsForEach function to call root() for each of the image rows, you have to pass in an Allocation sized to the number of rows and then work out which row you should be operating on inside root() (similarly for operating on each column). RenderScript will then divide up the work across the available resources (more than one row being processed at the same time on multi-core devices).
One way you could do that is by passing in an Allocation that gives the offsets (within the image data array) of the image rows. The v_in argument inside root() will then be the row offset. Since the Allocation that rsForEach operates upon is not the image data, you cannot write the image out using the v_out argument; you must bind the output image separately.
Here is some RenderScript that shows this:
#pragma version(1)
#pragma rs java_package_name(com.android.example.hellocompute)

rs_allocation gIn;
rs_allocation gOut;
rs_script gScript;

int mImageWidth;
const uchar4 *gInPixels;
uchar4 *gOutPixels;

void init() {
}

static const int kBlurWidth = 20;

//
// This is called per row. The row offset is passed in as v_in; you could also
// use the x argument and multiply it by the image width.
//
void root(const int32_t *v_in, int32_t *v_out, const void *usrData, uint32_t x, uint32_t y) {
    float3 blur[kBlurWidth];
    float3 cur_colour = {0.0f, 0.0f, 0.0f};

    for (int i = 0; i < kBlurWidth; i++) {
        float3 init_colour = {0.0f, 0.0f, 0.0f};
        blur[i] = init_colour;
    }

    int32_t row_index = *v_in;
    int blur_index = 0;

    for (int i = 0; i < mImageWidth; i++) {
        float4 pixel_colour = rsUnpackColor8888(gInPixels[i + row_index]);

        // Sliding-window sum: drop the oldest sample, add the newest.
        cur_colour -= blur[blur_index];
        blur[blur_index] = pixel_colour.rgb;
        cur_colour += blur[blur_index];

        blur_index += 1;
        if (blur_index >= kBlurWidth) {
            blur_index = 0;
        }

        gOutPixels[i + row_index] = rsPackColorTo8888(cur_colour/(float)kBlurWidth);
        //gOutPixels[i + row_index] = rsPackColorTo8888(pixel_colour);
    }
}

void filter() {
    rsDebug("Number of rows:", rsAllocationGetDimX(gIn));
    rsForEach(gScript, gIn, gOut, NULL);
}
This would be set up using the following Java:
mBlurRowScript = new ScriptC_blur_row(mRS, getResources(), R.raw.blur_row);

int row_width = mBitmapIn.getWidth();

//
// Create an allocation that indexes each row.
//
int num_rows = mBitmapIn.getHeight();
int[] row_indices = new int[num_rows];
for (int i = 0; i < num_rows; i++) {
    row_indices[i] = i * row_width;
}
Allocation row_indices_alloc = Allocation.createSized(mRS, Element.I32(mRS), num_rows, Allocation.USAGE_SCRIPT);
row_indices_alloc.copyFrom(row_indices);

//
// The image data has to be bound to the pointers within the RenderScript
// so it can be accessed from the root() function.
//
mBlurRowScript.bind_gInPixels(mInAllocation);
mBlurRowScript.bind_gOutPixels(mOutAllocation);

// Pass in the image width.
mBlurRowScript.set_mImageWidth(row_width);

//
// Pass in the row indices Allocation as the input. It is also passed in as
// the output, though the output is not used.
//
mBlurRowScript.set_gIn(row_indices_alloc);
mBlurRowScript.set_gOut(row_indices_alloc);
mBlurRowScript.set_gScript(mBlurRowScript);

mBlurRowScript.invoke_filter();
I'm making a simple fractal-viewing app for Android, just for fun. I'm also using it as an opportunity to learn OpenGL, since I've never worked with it before. Using the Android port of the NeHe tutorials as a starting point, my approach is to have one class (FractalModel) that does all the math to create the fractal, and FractalView, which does all the rendering.
The difficulty I'm having is in getting the rendering to work. Since I'm essentially plotting a graph of points of different colors, where each point should correspond to one pixel, I thought I'd handle this by rendering 1x1 rectangles over the entire screen, using the dimensions to calculate the offsets so that there's a 1:1 correspondence between the rectangles and the physical pixels. Since the color of each pixel will be calculated independently, I can reuse the same rendering code to render different parts of the fractal (I want to add panning and zooming later on).
Here is the view class I wrote:
public class FractalView extends GLSurfaceView implements Renderer {
    private float[] mVertices;
    private FloatBuffer[][] mVBuffer;
    private ByteBuffer[][] mBuffer;
    private int mScreenWidth;
    private int mScreenHeight;
    private float mXOffset;
    private float mYOffset;
    private int mNumPixels;

    // References to current vertex coordinates.
    private float xTL;
    private float yTL;
    private float xBL;
    private float yBL;
    private float xBR;
    private float yBR;
    private float xTR;
    private float yTR;

    public FractalView(Context context, int w, int h){
        super(context);
        setEGLContextClientVersion(1);
        mScreenWidth = w;
        mScreenHeight = h;
        mNumPixels = mScreenWidth * mScreenHeight;
        mXOffset = (float)1.0/mScreenWidth;
        mYOffset = (float)1.0/mScreenHeight;
        mVertices = new float[12];
        mVBuffer = new FloatBuffer[mScreenHeight][mScreenWidth];
        mBuffer = new ByteBuffer[mScreenHeight][mScreenWidth];
    }

    public void onDrawFrame(GL10 gl){
        int i,j;
        gl.glClear(GL10.GL_COLOR_BUFFER_BIT | GL10.GL_DEPTH_BUFFER_BIT);
        gl.glLoadIdentity();
        mapVertices();
        gl.glColor4f(0.0f, 1.0f, 0.0f, .5f);
        for(i = 0; i < mScreenHeight; i++){
            for(j = 0; j < mScreenWidth; j++){
                gl.glFrontFace(GL10.GL_CW);
                gl.glVertexPointer(3, GL10.GL_FLOAT, 0, mVBuffer[i][j]);
                gl.glEnableClientState(GL10.GL_VERTEX_ARRAY);
                gl.glDrawArrays(GL10.GL_TRIANGLE_STRIP, 0, mVertices.length / 3);
                gl.glDisableClientState(GL10.GL_VERTEX_ARRAY);
            }
        }
    }

    public void onSurfaceChanged(GL10 gl, int w, int h){
        if(h == 0) {    // Prevent a divide by zero
            h = 1;      // by making height equal one.
        }

        gl.glViewport(0, 0, w, h);              // Reset the current viewport.
        gl.glMatrixMode(GL10.GL_PROJECTION);    // Select the projection matrix.
        gl.glLoadIdentity();                    // Reset the projection matrix.

        // Calculate the aspect ratio of the window.
        GLU.gluPerspective(gl, 45.0f, (float)w / (float)h, 0.1f, 100.0f);

        gl.glMatrixMode(GL10.GL_MODELVIEW);     // Select the modelview matrix.
        gl.glLoadIdentity();
    }

    public void onSurfaceCreated(GL10 gl, EGLConfig config){
        gl.glShadeModel(GL10.GL_SMOOTH);            // Enable smooth shading.
        gl.glClearColor(0.0f, 0.0f, 0.0f, 0.5f);    // Black background.
        gl.glClearDepthf(1.0f);                     // Depth buffer setup.
        gl.glEnable(GL10.GL_DEPTH_TEST);            // Enable depth testing.
        gl.glDepthFunc(GL10.GL_LEQUAL);             // The type of depth testing to do.

        // Really nice perspective calculations.
        gl.glHint(GL10.GL_PERSPECTIVE_CORRECTION_HINT, GL10.GL_NICEST);
    }

    private void mapVertices(){
        int i,j;
        xTL = -1;
        yTL = 1;
        xTR = -1 + mXOffset;
        yTR = 1;
        xBL = -1;
        yBL = 1 - mYOffset;
        xBR = -1 + mXOffset;
        yBR = 1 - mYOffset;

        for(i = 0; i < mScreenHeight; i++){
            for (j = 0; j < mScreenWidth; j++){
                // Assign coords to the vertex array.
                mVertices[0] = xBL;
                mVertices[1] = yBL;
                mVertices[2] = 0f;
                mVertices[3] = xBR;
                mVertices[4] = yBR;
                mVertices[5] = 0f;
                mVertices[6] = xTL;
                mVertices[7] = yTL;
                mVertices[8] = 0f;
                mVertices[9] = xTR;
                mVertices[10] = yTR;
                mVertices[11] = 0f;

                // Allocate a direct buffer for this quad.
                mBuffer[i][j] = ByteBuffer.allocateDirect(mVertices.length * 4);
                mBuffer[i][j].order(ByteOrder.nativeOrder());
                mVBuffer[i][j] = mBuffer[i][j].asFloatBuffer();
                mVBuffer[i][j].put(mVertices);
                mVBuffer[i][j].position(0);

                // Transform right.
                transformRight();
            }
            // Transform down.
            transformDown();

            // Reset x.
            xTL = -1;
            xTR = -1 + mXOffset;
            xBL = -1;
            xBR = -1 + mXOffset;
        }
    }

    // Transform all the coordinates one "pixel" to the right.
    private void transformRight(){
        xTL = xTL + mXOffset;   // TL
        xBL = xBL + mXOffset;   // BL
        xBR = xBR + mXOffset;   // BR
        xTR = xTR + mXOffset;   // TR
    }

    // Transform all the coordinates one pixel down.
    private void transformDown(){
        yTL = yTL - mYOffset;
        yBL = yBL - mYOffset;
        yBR = yBR - mYOffset;
        yTR = yTR - mYOffset;
    }
}
Basically I'm trying to do it the same way as this (the square in lesson 2) but with far more objects. I'm assuming 1 and -1 roughly correspond to the screen edges (I know this isn't totally true, but I don't really understand how to use projection matrices and want to keep this as simple as possible unless there's a good resource I can learn from), but I understand that OpenGL's coordinates are separate from real screen coordinates. When I run my code I just get a black screen (it should be green), but LogCat shows the garbage collector working away, so I know something is happening. I'm not sure whether it's a bug caused by my not doing something right, or whether it's just REALLY slow. In either case, what should I do differently? I feel like I may be going about this all wrong. I've looked around, and most of the tutorials and examples are based on the link above.
Edit: I know I could go about this by generating a texture that fills the entire screen and just drawing that, though the link I read which mentioned it said it would be slower, since you're not supposed to redraw a texture every frame. That said, I only really need to redraw the texture when the perspective changes, so I could write my code to take that into account. The main difficulty I'm having currently is drawing the bitmap and getting it to display correctly.
I would imagine that the blank screen is due to the fact that you are swapping buffers so many times, and also the fact that you are generating all your vertex buffers every frame. Thousands of buffer swaps AND thousands of buffer creations in a single frame would be INCREDIBLY slow.
One thing to mention is that Android devices have limited memory, so the garbage collector working away is probably an indication that your buffer creation code is eating up a lot of the available memory and the device is trying to free up some for the creation of new buffers.
I would suggest creating a texture that you fill with your pixel data each frame and then render to a single square that fills the screen. This will increase your speed by a huge amount, and also make your program more flexible.
Edit:
Look at the tutorial here: http://www.nullterminator.net/gltexture.html to get an idea of how to create textures and load them. You will basically need to fill BYTE* data with your own data.
If you are changing the data dynamically, you will need to update the texture data. Use the information here: http://www.opengl.org/wiki/Texture, in the section about texture image modification.
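On Android with GL10, a minimal sketch of the suggested approach might look like this (fractalBitmap is a hypothetical Bitmap holding the computed fractal colors; gl comes from the Renderer callbacks):

// One-time setup (e.g. in onSurfaceCreated): create and configure the texture.
int[] tex = new int[1];
gl.glGenTextures(1, tex, 0);
gl.glBindTexture(GL10.GL_TEXTURE_2D, tex[0]);
gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MIN_FILTER, GL10.GL_NEAREST);
gl.glTexParameterf(GL10.GL_TEXTURE_2D, GL10.GL_TEXTURE_MAG_FILTER, GL10.GL_NEAREST);
GLUtils.texImage2D(GL10.GL_TEXTURE_2D, 0, fractalBitmap, 0);

// Whenever the fractal data changes (e.g. after a pan or zoom), re-upload in place.
gl.glBindTexture(GL10.GL_TEXTURE_2D, tex[0]);
GLUtils.texSubImage2D(GL10.GL_TEXTURE_2D, 0, 0, 0, fractalBitmap);

// Then enable texturing and draw one screen-sized quad instead of width*height quads.
gl.glEnable(GL10.GL_TEXTURE_2D);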