Android and OpenGL "how display faces" - android

I want to display, for example, an *.obj file.
Normally, in OpenGL, I use these instructions:
But in Android OpenGL I don't have these functions...
I have DrawElements(...);
but when I want to draw face 34/54/3 (vertex/texcoord/normal indices into the arrays)
it uses one index for all three arrays, i.e. 34/34/34...
So how can I draw an *.obj file?
I searched the web and found this topic: So... I am writing a model editor in C# for my game, and I wrote something like this as a test:
public void display2()
{
    double[] vertexBuff = new double[faces.Count * 3 * 3];
    double[] normalBuff = new double[faces.Count * 3 * 3];
    double[] texcorBuff = new double[faces.Count * 3 * 2];
    // Write positions into the flat buffers (these declarations are missing from the post and assumed here).
    int i_3 = 0;
    int i_2 = 0;
    foreach (face f in faces)
    {
        for (int i = 0; i < f.vector.Length; i++)
        {
            vertexBuff[i_3] = mesh[f.vector[i]].X;
            vertexBuff[i_3 + 1] = mesh[f.vector[i]].Y;
            vertexBuff[i_3 + 2] = mesh[f.vector[i]].Z;
            normalBuff[i_3] = normal[f.normal[i]].X;
            normalBuff[i_3 + 1] = normal[f.normal[i]].Y;
            normalBuff[i_3 + 2] = normal[f.normal[i]].Z;
            texcorBuff[i_2] = texture[f.texCord[i]].X;
            texcorBuff[i_2 + 1] = texture[f.texCord[i]].Y;
            i_3 += 3;
            i_2 += 2;
        }
    }
    GL.VertexPointer<double>(3, VertexPointerType.Double, 0, vertexBuff);
    GL.TexCoordPointer<double>(2, TexCoordPointerType.Double, 0, texcorBuff);
    GL.NormalPointer<double>(NormalPointerType.Double, 0, normalBuff);
    GL.DrawArrays(BeginMode.Triangles, 0, faces.Count * 3);
}
and it works, but I think this could be optimized further.
I don't want to convert my model data into array buffers,
because that takes too much memory. Any suggestions?

I'm not an Android programmer, but I assume it uses OpenGL ES, in which these functions are deprecated (and, by the way, missing).
Tutorials explaining the good solution are drowned among a bunch of others that show how to draw triangles with the glVertex3f functions (because that gives easy and fast results, but it is totally pointless). I find it tragic, since NOBODY should use those things.
glBegin/glEnd, glVertex3f, glTexCoord2f and similar functions are now deprecated for performance's sake (they are "slow" because we have to limit the number of calls to the graphics library). I won't expand much on that, since you can search for it if interested.
Instead, make use of vertex and index buffers. I'm sorry that I have no "perfect" link to recommend, but you should easily find what you need on Google :)
However, I dug up some code from an ancient C# project:
Note: the OpenTK binding changes function names, but they remain very close to the OpenGL ones; for example, glVertex3f becomes GL.Vertex3.
The Vertex definition
A simple struct to store your custom vertex's information (position, normal (if needed), color...)
[System.Runtime.InteropServices.StructLayout(System.Runtime.InteropServices.LayoutKind.Sequential, Pack = 1)]
public struct Vertex
{
    public Core.Math.Vector3 Position;
    public Core.Math.Vector3 Normal;
    public Core.Math.Vector2 UV;
    public uint Coloring;

    public Vertex(float x, float y, float z)
    {
        this.Position = new Core.Math.Vector3(x, y, z);
        this.Normal = new Core.Math.Vector3(0, 0, 0);
        this.UV = new Core.Math.Vector2(0, 0);
        System.Drawing.Color color = System.Drawing.Color.Gray;
        // Pack the color with alpha in the top byte and red in the lowest byte.
        this.Coloring = (uint)color.A << 24 | (uint)color.B << 16 | (uint)color.G << 8 | (uint)color.R;
    }
}
The Vertex Buffer class
It's a wrapper class around an OpenGL buffer object to handle our vertex format.
public class VertexBuffer
{
    public uint Id;
    public int Stride;
    public int Count;

    public VertexBuffer(Graphics.Objects.Vertex[] vertices)
    {
        int size;
        // We create an OpenGL buffer object
        GL.GenBuffers(1, out this.Id); // note: out is like passing an object by reference in C#
        this.Stride = OpenTK.BlittableValueType.StrideOf(vertices); // size in bytes of the vertex type (Vector3 size*2 + Vector2 size + uint size)
        this.Count = vertices.Length;
        // Fill the buffer with our vertices data
        GL.BindBuffer(BufferTarget.ArrayBuffer, this.Id);
        GL.BufferData(BufferTarget.ArrayBuffer, (System.IntPtr)(vertices.Length * this.Stride), vertices, BufferUsageHint.StaticDraw);
        GL.GetBufferParameter(BufferTarget.ArrayBuffer, BufferParameterName.BufferSize, out size);
        if (vertices.Length * this.Stride != size)
            throw new System.ApplicationException("Vertex data not uploaded correctly");
    }
}
The Indices Buffer class
Very similar to the vertex buffer, it stores vertex indices of each face of your model.
public class IndexBuffer
{
    public uint Id;
    public int Count;

    public IndexBuffer(uint[] indices)
    {
        int size;
        this.Count = indices.Length;
        GL.GenBuffers(1, out this.Id);
        GL.BindBuffer(BufferTarget.ElementArrayBuffer, this.Id);
        // The usage hint was cut off in the post; StaticDraw is assumed, matching the vertex buffer above.
        GL.BufferData(BufferTarget.ElementArrayBuffer, (System.IntPtr)(indices.Length * sizeof(uint)), indices,
            BufferUsageHint.StaticDraw);
        GL.GetBufferParameter(BufferTarget.ElementArrayBuffer, BufferParameterName.BufferSize, out size);
        if (indices.Length * sizeof(uint) != size)
            throw new System.ApplicationException("Indices data not uploaded correctly");
    }
}
Drawing buffers
Then, to render a triangle, create one vertex buffer to store the vertices' positions and one index buffer containing the vertex indices [0, 1, 2] (pay attention to the counter-clockwise winding rule, which applies equally to the glVertex3f method).
When done, just call this function with the buffers. Note that you can use multiple sets of indices with a single vertex buffer to render only some faces each time.
void DrawBuffer(VertexBuffer vBuffer, IndexBuffer iBuffer)
{
    // 1) Ensure that the VertexArray client state is enabled.
    // 2) Bind the vertex and element (=indices) buffer handles.
    GL.BindBuffer(BufferTarget.ArrayBuffer, vBuffer.Id);
    GL.BindBuffer(BufferTarget.ElementArrayBuffer, iBuffer.Id);
    // 3) Set up the data pointers (vertex, normal, color) according to your vertex format.
    GL.VertexPointer(3, VertexPointerType.Float, vBuffer.Stride, new System.IntPtr(0));
    GL.NormalPointer(NormalPointerType.Float, vBuffer.Stride, new System.IntPtr(Vector3.SizeInBytes));
    GL.TexCoordPointer(2, TexCoordPointerType.Float, vBuffer.Stride, new System.IntPtr(Vector3.SizeInBytes * 2));
    // The color follows Position, Normal and UV in the struct above, so its offset is two Vector3 plus one Vector2.
    GL.ColorPointer(4, ColorPointerType.UnsignedByte, vBuffer.Stride, new System.IntPtr(Vector3.SizeInBytes * 2 + Vector2.SizeInBytes));
    // 4) Call DrawElements. (Note: the last parameter is an offset into the element buffer and will usually be IntPtr.Zero).
    GL.DrawElements(BeginMode.Triangles, iBuffer.Count, DrawElementsType.UnsignedInt, System.IntPtr.Zero);
    // Disable client state
}
I hope this can help ;)
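A note on the "34/54/3" problem from the question: DrawElements takes a single index per vertex, so each distinct vertex/texcoord/normal triple has to become its own vertex. Below is a minimal sketch of that remapping in Java (since the target is Android). The class name ObjIndexer, the Mesh holder and the interleaved layout are my own assumptions, not part of any library, and the method expects zero-based indices (OBJ files store them one-based). It produces 16-bit indices because core OpenGL ES 1.x only accepts byte and short element types in DrawElements.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class ObjIndexer {
    /** Hypothetical holder: interleaved x,y,z,nx,ny,nz,u,v per vertex, plus one index per face corner. */
    public static final class Mesh {
        public final float[] vertices;
        public final short[] indices;

        Mesh(float[] vertices, short[] indices) {
            this.vertices = vertices;
            this.indices = indices;
        }
    }

    private static final int FLOATS_PER_VERTEX = 8;

    /**
     * pos holds 3 floats per position, uv 2 per texcoord, nrm 3 per normal.
     * Each entry of corners is {positionIndex, texcoordIndex, normalIndex}, zero-based.
     */
    public static Mesh build(float[] pos, float[] uv, float[] nrm, int[][] corners) {
        Map<String, Integer> seen = new HashMap<>();
        float[] packed = new float[corners.length * FLOATS_PER_VERTEX];
        short[] indices = new short[corners.length];
        for (int c = 0; c < corners.length; c++) {
            int p = corners[c][0];
            int t = corners[c][1];
            int n = corners[c][2];
            String key = p + "/" + t + "/" + n;
            Integer index = seen.get(key);
            if (index == null) {
                // First time this triple appears: emit a new combined vertex.
                index = seen.size();
                if (index > 0xFFFF) {
                    throw new IllegalStateException("too many vertices for 16-bit indices");
                }
                seen.put(key, index);
                int o = index * FLOATS_PER_VERTEX;
                packed[o] = pos[3 * p];
                packed[o + 1] = pos[3 * p + 1];
                packed[o + 2] = pos[3 * p + 2];
                packed[o + 3] = nrm[3 * n];
                packed[o + 4] = nrm[3 * n + 1];
                packed[o + 5] = nrm[3 * n + 2];
                packed[o + 6] = uv[2 * t];
                packed[o + 7] = uv[2 * t + 1];
            }
            indices[c] = index.shortValue();
        }
        return new Mesh(Arrays.copyOf(packed, seen.size() * FLOATS_PER_VERTEX), indices);
    }
}
```

Corners that share the same triple reuse one vertex, so the memory cost stays well below the fully expanded DrawArrays version whenever faces share corners.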

See this tutorial on glVertex arrays


I am unable to print this array. The application stops working after this line

I have to print this FloatBuffer as an array. There is a function for that in the documentation, but it is not working. I can't understand what I am doing wrong.
I have tried using floatBuffer.toString(), but it does not print the array that the documentation (ARCore) describes, so the results are not usable.
Camera camera = frame.getCamera();
CameraIntrinsics cameraIntrinsics=camera.getImageIntrinsics();
float[] focal=cameraIntrinsics.getFocalLength();
int [] getDiminsions=cameraIntrinsics.getImageDimensions();
Log.e("Dimensions ", Arrays.toString(getDiminsions));
PointCloud pointCloud=frame.acquirePointCloud();
FloatBuffer floatBuffer=pointCloud.getPoints();
FloatBuffer readonly=floatBuffer.asReadOnlyBuffer();
//final boolean res=readonly.hasArray();
final float[] points=floatBuffer.array();
//what should I do
As per the documentation (ARCore), every point in the floatBuffer has 4 values: x, y, z coordinates and a confidence value.
Depending on the implementation of FloatBuffer, the array() method may not be available if the buffer is not backed by an array. You may not need the array if all you are going to do is iterate through the values.
FloatBuffer floatBuffer = pointCloud.getPoints();
// Point cloud data is 4 floats per feature, {x,y,z,confidence}
for (int i = 0; i < floatBuffer.limit() / 4; i++) {
    // feature point
    float x = floatBuffer.get(i * 4);
    float y = floatBuffer.get(i * 4 + 1);
    float z = floatBuffer.get(i * 4 + 2);
    float confidence = floatBuffer.get(i * 4 + 3);
    // Do something with the point cloud feature....
}
But if you do need to use an array, you'll need to call hasArray() and, if it returns false, allocate an array and copy the data.
FloatBuffer floatBuffer = pointCloud.getPoints().asReadOnlyBuffer();
float[] points;
if (floatBuffer.hasArray()) {
    // Access the array backing the FloatBuffer
    points = floatBuffer.array();
} else {
    // allocate array and copy.
    points = new float[floatBuffer.limit()];
    floatBuffer.rewind();
    floatBuffer.get(points);
}
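For completeness, here is a small self-contained sketch of the copy path using only java.nio, so it can be tried outside ARCore. The class and method names (FloatBufferCopy, toArray) are made up for illustration. It copies through duplicate(), which shares the content but has its own position, so the caller's buffer is left untouched.

```java
import java.nio.FloatBuffer;

public class FloatBufferCopy {
    /** Copies everything between index 0 and the limit into a new array. */
    public static float[] toArray(FloatBuffer source) {
        // duplicate() shares the data but has an independent position and limit.
        FloatBuffer view = source.duplicate();
        view.rewind();
        float[] out = new float[view.remaining()];
        view.get(out);
        return out;
    }
}
```

This works for read-only and direct buffers alike, which are the cases where array() throws.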

Unity native OpenGL texture displayed four times

I'm currently facing a problem I simply don't understand.
I employ ARCore for an inside-out tracking task. Since I need to do some additional image processing, I use Unity's capability to load a native C++ plugin. At the very end of each frame I pass the image in YUV_420_888 format as a raw byte array to my native plugin.
A texture handle is created right at the beginning of the components initialization:
private void CreateTextureAndPassToPlugin()
{
    Texture2D tex = new Texture2D(640, 480, TextureFormat.RGBA32, false);
    tex.filterMode = FilterMode.Point;
    debug_screen_.GetComponent<Renderer>().material.mainTexture = tex;
    // Pass texture pointer to the plugin
    SetTextureFromUnity(tex.GetNativeTexturePtr(), tex.width, tex.height);
}
Since I only need the grayscale image, I basically ignore the UV part of the image and only use the Y (luminance) values, as shown in the following:
uchar *p_out;
int channels = 4;
for (int r = 0; r < image_matrix->rows; r++) {
    p_out = image_matrix->ptr<uchar>(r);
    for (int c = 0; c < image_matrix->cols * channels; c++) {
        unsigned int idx = r * y_row_stride + c;
        p_out[c] = static_cast<uchar>(image_data[idx]);
        p_out[c + 1] = static_cast<uchar>(image_data[idx]);
        p_out[c + 2] = static_cast<uchar>(image_data[idx]);
        p_out[c + 3] = static_cast<uchar>(255);
    }
}
Then, each frame, the image data is put into a GL texture:
GLuint gltex = (GLuint)(size_t)(g_TextureHandle);
glBindTexture(GL_TEXTURE_2D, gltex);
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 640, 480, GL_RGBA, GL_UNSIGNED_BYTE,;
I know that I use way too much memory by creating and passing the texture as RGBA, but since GL_R8 is not supported by OpenGL ES3 and GL_ALPHA always leads to internal OpenGL errors, I just pass the greyscale value to each color component.
However in the end the texture is rendered as can be seen in the following image:
At first I thought, that the reason for this may lie in the other channels having the same values, however setting all other channels than the first one to any value does not have any impact.
Am I missing something OpenGL texture creation wise?
YUV_420_888 is a multiplane texture, where the luminance plane only contains a single channel per pixel.
for (int c = 0; c < image_matrix->cols * channels; c++) {
unsigned int idx = r * y_row_stride + c;
Your loop bounds assume c counts in multiples of 4 channels, which is right for the output surface, but you then also use it when computing the input surface index. The input plane you are using contains only one channel, so idx is wrong.
In general you are also overwriting the same memory multiple times: the loop increments c by one each iteration, but you then write to c, c+1, c+2 and c+3, so you overwrite three of the values you wrote last time.
Shorter answer: your OpenGL ES code is fine, but I think you're filling the texture with bad data.
Untested, but I think you need:
for (int c = 0; c < image_matrix->cols * channels; c += channels) {
unsigned int idx = (r * y_row_stride) + (c / channels);
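The same indexing can be written out as a small standalone method, which makes it easy to check on a tiny image. This is only a sketch in Java rather than the plugin's C++, and the names (LumaToRgba, expand) are hypothetical. The point is that the input index uses the Y row stride and one byte per pixel, while the output index uses four bytes per pixel.

```java
public class LumaToRgba {
    /**
     * Expands a single-channel Y plane into tightly packed RGBA.
     * yRowStride is the number of bytes per input row, which may exceed width.
     */
    public static byte[] expand(byte[] yPlane, int width, int height, int yRowStride) {
        final int channels = 4;
        byte[] rgba = new byte[width * height * channels];
        for (int r = 0; r < height; r++) {
            for (int c = 0; c < width; c++) {
                // Input: one byte per pixel, rows separated by the stride.
                byte luma = yPlane[r * yRowStride + c];
                // Output: four bytes per pixel, no padding.
                int o = (r * width + c) * channels;
                rgba[o] = luma;
                rgba[o + 1] = luma;
                rgba[o + 2] = luma;
                rgba[o + 3] = (byte) 255;
            }
        }
        return rgba;
    }
}
```

Looping over pixels instead of bytes avoids both problems at once: each output pixel is written exactly once and the input is never read past its row.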

Superpowered - export to file with Mixer and Decoder problem (different sampleRates and samplesPerFrame)

I am trying to mix the user's voice with music and save it to a file.
I created 2 decoders, one for voice and one for music, and put them into the mixer's inputs. I decode each frame and save it to a file using FILE/createWAV/fwrite.
Everything works perfectly when my song is a .wav and has the same sampleRate and samplesPerFrame as the recorded voice (48000/1024).
However, when I use an .mp3 file with different parameters (44100/1152), the final file is incorrect: it is stretched or has crackling sounds. I think it is because we get a different samplesDecoded for each decoder, and when it is put into the mixer or saved to the file, the difference between these sample counts is lost.
As far as I understand, when we call voiceDecoder->decode(buffer, &samplesDecoded), it moves samplePosition by samplesDecoded.
What I tried was to use the minimum value from both decoders. However, according to the sentence above, on every loop iteration the song will lose 1152 - 1024 = 128 samples, so I also tried to seek songDecoder to the same position as voiceDecoder: songDecoder->seek(voiceDecoder->samplePosition, true), but that led to a totally incorrect file.
To summarize: how should I handle mixer/offline processing with 2 decoders when each of them has a different sampleRate and samplesPerFrame?
void AudioProcessor::startProcessing() {
    SuperpoweredStereoMixer *mixer = new SuperpoweredStereoMixer();
    float *mixerInputs_[] = {0,0,0,0};
    float *mixerOutputs_[] = {0,0};
    float inputLevels_[]= {0.5f, 0.5f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f};
    float outputLevels_[] = { 1.0f, 1.0f };
    SuperpoweredDecoder *voiceDecoder = new SuperpoweredDecoder();
    SuperpoweredDecoder *songDecoder = new SuperpoweredDecoder();
    if (voiceDecoder->open(voiceInputPath, false) || songDecoder->open(songInputPath, false, songOffset, songLength)) {
        delete voiceDecoder;
        delete songDecoder;
        delete mixer;
        callJavaVoidMethodWithBoolParam(jvm, jObject, processingFinishedMethodId, false);
        return;
    }
    FILE *fd = createWAV(outputPath, songDecoder->samplerate, 2);
    if (!fd) {
        delete voiceDecoder;
        delete songDecoder;
        delete mixer;
        callJavaVoidMethodWithBoolParam(jvm, jObject, processingFinishedMethodId, false);
        return;
    }
    // Create a buffer for the 16-bit integer samples coming from the decoder.
    short int *voiceIntBuffer = (short int *)malloc(voiceDecoder->samplesPerFrame * 4 * sizeof(short int) + 32768);
    short int *songIntBuffer = (short int *)malloc(songDecoder->samplesPerFrame * 4 * sizeof(short int) + 32768);
    short int *outputIntBuffer = (short int *)malloc(voiceDecoder->samplesPerFrame * 4 * sizeof(short int) + 32768);
    // Create a buffer for the 32-bit floating point samples required by the effect.
    float *voiceFloatBuffer = (float *)malloc(voiceDecoder->samplesPerFrame * 4 * sizeof(float) + 32768);
    float *songFloatBuffer = (float *)malloc(songDecoder->samplesPerFrame * 4 * sizeof(float) + 32768);
    float *outputFloatBuffer = (float *)malloc(voiceDecoder->samplesPerFrame * 4 * sizeof(float) + 32768);
    bool isError = false;
    // Processing.
    while (true) {
        if (isCanceled) {
            isError = true;
            break;
        }
        // Decode one frame. samplesDecoded will be overwritten with the actual decoded number of samples.
        unsigned int voiceSamplesDecoded = voiceDecoder->samplesPerFrame;
        if (voiceDecoder->decode(voiceIntBuffer, &voiceSamplesDecoded) == SUPERPOWEREDDECODER_ERROR) {
            break;
        }
        if (voiceSamplesDecoded < 1) {
            break;
        }
        // Decode one frame. samplesDecoded will be overwritten with the actual decoded number of samples.
        unsigned int songSamplesDecoded = songDecoder->samplesPerFrame;
        if (songDecoder->decode(songIntBuffer, &songSamplesDecoded) == SUPERPOWEREDDECODER_ERROR) {
            break;
        }
        if (songSamplesDecoded < 1) {
            break;
        }
        unsigned int samplesDecoded = static_cast<unsigned int>(fmin(voiceSamplesDecoded, songSamplesDecoded));
        // Convert the decoded PCM samples from 16-bit integer to 32-bit floating point.
        SuperpoweredShortIntToFloat(voiceIntBuffer, voiceFloatBuffer, samplesDecoded);
        SuperpoweredShortIntToFloat(songIntBuffer, songFloatBuffer, samplesDecoded);
        // setup mixer inputs
        mixerInputs_[0] = voiceFloatBuffer;
        mixerInputs_[1] = songFloatBuffer;
        mixerInputs_[2] = NULL;
        mixerInputs_[3] = NULL;
        // setup mixer outputs, might have two separate outputs (L/R) if second not null
        mixerOutputs_[0] = outputFloatBuffer;
        mixerOutputs_[1] = NULL;
        mixer->process(mixerInputs_, mixerOutputs_, inputLevels_, outputLevels_, NULL, NULL, samplesDecoded);
        // Convert the PCM samples from 32-bit floating point to 16-bit integer.
        SuperpoweredFloatToShortInt(outputFloatBuffer, outputIntBuffer, samplesDecoded);
        // Write the audio to disk.
        fwrite(outputIntBuffer, 1, samplesDecoded * 4, fd);
        // songDecoder->seek(voiceDecoder->samplePosition, true);
    }
    // Cleanup.
    delete voiceDecoder;
    delete songDecoder;
    delete mixer;
}
Thanks in advance!
You need to match the sample rates using the SuperpoweredResampler class. You'll also need some circular buffer for both inputs, because the available number of samples will not match in many cases.
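To illustrate the buffering half of this advice, here is a rough sketch of the idea in plain Java (not Superpowered code). SampleFifo is a hypothetical class, and for brevity it is a simple shifting queue standing in for a real circular buffer. It assumes both streams have already been resampled to the same rate, and it counts individual float values, so for interleaved stereo every frame contributes two values. Each decoder pushes whatever it produced, and the mix step consumes only what both sides have available, leaving the remainder queued for the next round instead of dropping it.

```java
import java.util.Arrays;

public class SampleFifo {
    private float[] data = new float[4096];
    private int size = 0;

    /** Appends count values from src, growing the storage if needed. */
    public void push(float[] src, int count) {
        if (size + count > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, size + count));
        }
        System.arraycopy(src, 0, data, size, count);
        size += count;
    }

    public int available() {
        return size;
    }

    /** Removes the oldest count values into dst and shifts the rest forward. */
    public void pop(float[] dst, int count) {
        System.arraycopy(data, 0, dst, 0, count);
        System.arraycopy(data, count, data, 0, size - count);
        size -= count;
    }

    /** Mixes as many values as both queues can supply; returns how many were written to out. */
    public static int mix(SampleFifo a, SampleFifo b, float[] out) {
        int n = Math.min(out.length, Math.min(a.available(), b.available()));
        float[] fromA = new float[n];
        float[] fromB = new float[n];
        a.pop(fromA, n);
        b.pop(fromB, n);
        for (int i = 0; i < n; i++) {
            out[i] = 0.5f * fromA[i] + 0.5f * fromB[i];
        }
        return n;
    }
}
```

With frames of 1024 and 1152 values, the first mix consumes 1024 from each side and keeps the extra 128 song values for the next iteration, which is exactly what the min-based loop in the question throws away.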
OK, so I managed to get it to work. I did what @Gabor proposed, but it was not fully working. What I was missing was the channel count: I had to include it in my buffer/shift operations, and now it's fine!

TensorFlow Lite and Android Things - Locating the detected Object and store them in RectF Objects?

I have an Android Tablet where I installed the TensorFlow-Lite DetectorActivity in the examples that were available. It works well on an Android Tablet. However, when I tried to deploy it on a RaspberryPi 3 Model B that ran Android Things, it didn't run. There seemed to be an issue with configuring the camera properly in terms of enabling a live camera preview and running an analysis.
My original goal is to make an object detection app run on Android Things. It is also essential to draw the bounding rectangle on the detected objects.
I was looking for an example of an Android App that used TensorFlow-Lite and ran on Android Things. I quickly found this example from that uses Image Classification to dispense candy. I ran it on my RaspberryPi Board, and it ran. It gives the results, the name of the object, the confidence level, as well as the ID. I was okay to build upon this sample code. Instead of a live camera feed, I could just make the app take a photo, analyze, and give the result. After which, it takes another photo and the cycle continues.
However, it did not specify the location in terms of a RectF Object.
What I tried to do was to adapt the recognizeFunction in the TFLite Android Example, it's in the TFLiteObjectDetectionAPIModel class. I adapted it to the doIdentification function of the Candy Dispenser Android App. My function now looks like this:
// outputLocations: array of shape [Batchsize, NUM_DETECTIONS,4]
// contains the location of detected boxes
private float[][][] outputLocations;
// outputClasses: array of shape [Batchsize, NUM_DETECTIONS]
// contains the classes of detected boxes
private float[][] outputClasses;
// outputScores: array of shape [Batchsize, NUM_DETECTIONS]
// contains the scores of detected boxes
private float[][] outputScores;
// numDetections: array of shape [Batchsize]
// contains the number of detected boxes
private float[] numDetections;
private static final int NUM_DETECTIONS = 10;
private static final float IMAGE_MEAN = 128.0f;
private static final float IMAGE_STD = 128.0f;
private void doIdentification(Bitmap image) {
Log.e(TAG, "doing identification!");
int numBytesPerChannel;
Log.e(TAG, "model is quantized");
numBytesPerChannel = 1; // Quantized
} else {
Log.e(TAG, "model is NOT quantized");
numBytesPerChannel = 4; // Floating point
ByteBuffer imgData = ByteBuffer.allocateDirect(1 * TF_INPUT_IMAGE_HEIGHT * TF_INPUT_IMAGE_HEIGHT
* 3 * numBytesPerChannel);
// Preprocess the image data from 0-255 int to normalized float based
// on the provided parameters.
image.getPixels(intValues, 0, image.getWidth(), 0, 0, image.getWidth(), image.getHeight());
for (int i = 0; i < TF_INPUT_IMAGE_HEIGHT; ++i) {
for (int j = 0; j < TF_INPUT_IMAGE_HEIGHT; ++j) {
int pixelValue = intValues[i * TF_INPUT_IMAGE_HEIGHT + j];
imgData.put((byte) ((pixelValue >> 16) & 0xFF));
imgData.put((byte) ((pixelValue >> 8) & 0xFF));
imgData.put((byte) (pixelValue & 0xFF));
} else {
imgData.putFloat((((pixelValue >> 16) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
imgData.putFloat((((pixelValue >> 8) & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
imgData.putFloat(((pixelValue & 0xFF) - IMAGE_MEAN) / IMAGE_STD);
Trace.endSection(); // preprocessBitmap
// Allocate space for the inference results
byte[][] confidencePerLabel = new byte[1][mLabels.size()];
//for box detections
// Copy the input data into TensorFlow.
outputLocations = new float[1][NUM_DETECTIONS][4];
outputClasses = new float[1][NUM_DETECTIONS];
outputScores = new float[1][NUM_DETECTIONS];
numDetections = new float[1];
Object[] inputArray = {imgData};
Map<Integer, Object> outputMap = new HashMap<>();
outputMap.put(0, outputLocations);
outputMap.put(1, outputClasses);
outputMap.put(2, outputScores);
outputMap.put(3, numDetections);
// Read image data into buffer formatted for the TensorFlow model
TensorFlowHelper.convertBitmapToByteBuffer(image, intValues, imgData);
// Run inference on the network with the image bytes in imgData as input,
// storing results on the confidencePerLabel array.
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
// TODO - we try and fetch our rectF's here
final ArrayList<Recognition> recognitions = new ArrayList<>(NUM_DETECTIONS);
for (int i = 0; i < NUM_DETECTIONS; ++i) {
final RectF detection =
new RectF(
outputLocations[0][i][1] * TF_OD_API_INPUT_SIZE,
outputLocations[0][i][0] * TF_OD_API_INPUT_SIZE,
outputLocations[0][i][3] * TF_OD_API_INPUT_SIZE,
outputLocations[0][i][2] * TF_OD_API_INPUT_SIZE);
// SSD Mobilenet V1 Model assumes class 0 is background class
// in label file and class labels start from 1 to number_of_classes+1,
// while outputClasses correspond to class index from 0 to number_of_classes
int labelOffset = 1;
Log.e(TAG, "adding the following to our results: ");
Log.e(TAG, "recognition id: " + i);
Log.e(TAG, "recognition label: " + mLabels.get((int) outputClasses[0][i] + labelOffset));
Log.e(TAG, "recognition confidence: " + outputScores[0][i]);
new Recognition(
"" + i,
mLabels.get((int) outputClasses[0][i] + labelOffset),
Trace.endSection(); // "recognizeImage"
// TODO -- This is the old working code
// Get the results with the highest confidence and map them to their labels
Collection<Recognition> results = TensorFlowHelper.getBestResults(confidencePerLabel, mLabels);
Log.e(TAG, "results count is = " + results.size());
// Report the results with the highest confidence
I set the Quantized constant to true, and ran the code. However, I was greeted by the following error:
java.lang.IllegalArgumentException: Cannot convert between a TensorFlowLite tensor with type UINT8 and a Java object of type [[[F (which is compatible with the TensorFlowLite type FLOAT32).
and the line responsible is:
mTensorFlowLite.runForMultipleInputsOutputs(inputArray, outputMap);
I tried changing to to just, but that resulted in a different error.
Has anyone implemented Object Detection (drawing a RectF on the detected objects) on Android Things?
The issue here is likely that you are mixing models. There is a difference between image classification and object detection models. Classification simply reports the confidence of a certain type of object within the image, and detection adds on identifying the object's location. The candy dispenser sample you are starting from uses an image classification model (mobilenet_quant_v1_224.tflite), whereas the TFLite sample you mentioned runs an object detection model (mobilenet_ssd.tflite).
I would recommend starting from the sample that does object detection and working through the camera issues, rather than solving the problem the other way around. The candy dispenser sample (as well as the official image classifier sample) provides a good reference for getting the camera on the RPi3 to capture an image and converting it for use with the model.
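Once a detection model is actually loaded, the box decoding from the loop in the question comes down to the following. This is a sketch in plain Java that returns a float array instead of an Android RectF so it can run anywhere. BoxDecoder and toPixelRect are made-up names, and the [top, left, bottom, right] order of the normalized box is taken from how the question's code indexes outputLocations, so treat it as an assumption about the model.

```java
public class BoxDecoder {
    /**
     * Converts one normalized box {top, left, bottom, right} (values in 0..1)
     * into pixel coordinates ordered {left, top, right, bottom},
     * which is the argument order of the RectF constructor.
     */
    public static float[] toPixelRect(float[] box, int inputSize) {
        return new float[] {
            box[1] * inputSize, // left
            box[0] * inputSize, // top
            box[3] * inputSize, // right
            box[2] * inputSize  // bottom
        };
    }
}
```

A classification model has no such output at all, which is why the float[1][NUM_DETECTIONS][4] array could not be matched to the single UINT8 output tensor of the quantized classifier.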

YUV_420_888 interpretation on Samsung Galaxy S7 (Camera2)

I wrote a conversion from YUV_420_888 to Bitmap, considering the following logic (as I understand it):
To summarize the approach: the kernel’s coordinates x and y are congruent both with the x and y of the non-padded part of the Y-Plane (2d-allocation) and the x and y of the output-Bitmap. The U- and V-Planes, however, have a different structure than the Y-Plane, because they use 1 byte for coverage of 4 pixels, and, in addition, may have a PixelStride that is more than one, in addition they might also have a padding that can be different from that of the Y-Plane. Therefore, in order to access the U’s and V’s efficiently by the kernel I put them into 1-d allocations and created an index “uvIndex” that gives the position of the corresponding U- and V within that 1-d allocation, for given (x,y) coordinates in the (non-padded) Y-plane (and, so, the output Bitmap).
In order to keep the rs-Kernel lean, I excluded the padding area in the yPlane by capping the x-range via LaunchOptions (this reflects the RowStride of the y-plane which thus can be ignored WITHIN the kernel). So we just need to consider the uvPixelStride and uvRowStride within the uvIndex, i.e. the index used in order to access to the u- and v-values.
This is my code:
Renderscript Kernel, named
#pragma version(1)
#pragma rs java_package_name(com.xxxyyy.testcamera2);
#pragma rs_fp_relaxed
int32_t width;
int32_t height;
uint picWidth, uvPixelStride, uvRowStride ;
rs_allocation ypsIn,uIn,vIn;
// The LaunchOptions ensure that the Kernel does not enter the padding zone of Y, so yRowStride can be ignored WITHIN the Kernel.
uchar4 __attribute__((kernel)) doConvert(uint32_t x, uint32_t y) {
    // index for accessing the uIn's and vIn's
    uint uvIndex= uvPixelStride * (x/2) + uvRowStride*(y/2);
    // get the y,u,v values
    uchar yps= rsGetElementAt_uchar(ypsIn, x, y);
    uchar u= rsGetElementAt_uchar(uIn, uvIndex);
    uchar v= rsGetElementAt_uchar(vIn, uvIndex);
    // calc argb
    int4 argb;
    argb.r = yps + v * 1436 / 1024 - 179;
    argb.g = yps -u * 46549 / 131072 + 44 -v * 93604 / 131072 + 91;
    argb.b = yps +u * 1814 / 1024 - 227;
    argb.a = 255;
    uchar4 out = convert_uchar4(clamp(argb, 0, 255));
    return out;
}
Java side:
private Bitmap YUV_420_888_toRGB(Image image, int width, int height){
// Get the three image planes
Image.Plane[] planes = image.getPlanes();
ByteBuffer buffer = planes[0].getBuffer();
byte[] y = new byte[buffer.remaining()];
buffer = planes[1].getBuffer();
byte[] u = new byte[buffer.remaining()];
buffer = planes[2].getBuffer();
byte[] v = new byte[buffer.remaining()];
// get the relevant RowStrides and PixelStrides
// (we know from documentation that PixelStride is 1 for y)
int yRowStride= planes[0].getRowStride();
int uvRowStride= planes[1].getRowStride(); // we know from documentation that RowStride is the same for u and v.
int uvPixelStride= planes[1].getPixelStride(); // we know from documentation that PixelStride is the same for u and v.
// rs creation just for demo. Create rs just once in onCreate and use it again.
RenderScript rs = RenderScript.create(this);
//RenderScript rs =;
ScriptC_yuv420888 mYuv420=new ScriptC_yuv420888 (rs);
// Y,U,V are defined as global allocations, the out-Allocation is the Bitmap.
// Note also that uAlloc and vAlloc are 1-dimensional while yAlloc is 2-dimensional.
Type.Builder typeUcharY = new Type.Builder(rs, Element.U8(rs));
//using safe height
typeUcharY.setX(yRowStride).setY(y.length / yRowStride);
Allocation yAlloc = Allocation.createTyped(rs, typeUcharY.create());
Type.Builder typeUcharUV = new Type.Builder(rs, Element.U8(rs));
// note that the size of the u's and v's are as follows:
// ( (width/2)*PixelStride + padding ) * (height/2)
// = (RowStride ) * (height/2)
// but I noted that on the S7 it is 1 less...
Allocation uAlloc = Allocation.createTyped(rs, typeUcharUV.create());
Allocation vAlloc = Allocation.createTyped(rs, typeUcharUV.create());
// handover parameters
mYuv420.set_uvRowStride (uvRowStride);
mYuv420.set_uvPixelStride (uvPixelStride);
Bitmap outBitmap = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888);
Allocation outAlloc = Allocation.createFromBitmap(rs, outBitmap, Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
Script.LaunchOptions lo = new Script.LaunchOptions();
lo.setX(0, width); // by this we ignore the y’s padding zone, i.e. the right side of x between width and yRowStride
//using safe height
lo.setY(0, y.length / yRowStride);
return outBitmap;
Testing on a Nexus 7 (API 22), this returns nice color Bitmaps. This device, however, has trivial pixel strides (=1) and no padding (i.e. rowstride=width). Testing on the brand-new Samsung S7 (API 23), I get pictures whose colors are not correct, except for the green ones. But the picture does not show a general bias towards green; it just seems that non-green colors are not reproduced correctly. Note that the S7 applies a u/v pixel stride of 2, and no padding.
Since the most crucial code line is in the rs code, namely the access to the u/v planes, uint uvIndex= (...), I think the problem could be there, probably an incorrect handling of pixel strides. Does anyone see the solution? Thanks.
UPDATE: I checked everything, and I am pretty sure that the code regarding the access of y,u,v is correct. So the problem must be with the u and v values themselves. Non-green colors have a purple tilt, and looking at the u,v values, they seem to be in a rather narrow range of about 110-150. Is it really possible that we need to cope with device-specific YUV -> RGB conversions...?! Did I miss anything?
UPDATE 2: I have corrected the code and it works now, thanks to Eddy's feedback.
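One way to separate "wrong color math" from "wrong addressing" is to run the kernel's per-pixel arithmetic on the CPU with known inputs. The sketch below copies the integer formula from the kernel into plain Java. YuvToRgb and toArgb are hypothetical names, and the result is packed as 0xAARRGGBB purely for convenience.

```java
public class YuvToRgb {
    private static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    /** Same integer arithmetic as the kernel; y, u and v are expected in 0..255. */
    public static int toArgb(int y, int u, int v) {
        int r = clamp(y + v * 1436 / 1024 - 179);
        int g = clamp(y - u * 46549 / 131072 + 44 - v * 93604 / 131072 + 91);
        int b = clamp(y + u * 1814 / 1024 - 227);
        return 0xFF000000 | (r << 16) | (g << 8) | b;
    }
}
```

Neutral chroma (u = v = 128) comes out as a near-gray pixel, so u and v values clustering around 110-150 are expected for mostly desaturated scenes and do not by themselves point to a device-specific conversion.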
Look at
floor((float) uvPixelStride*(x)/2)
which calculates your U,V row offset (uv_row_offset) from the Y x-coordinate.
if uvPixelStride = 2, then as x increases:
x = 0, uv_row_offset = 0
x = 1, uv_row_offset = 1
x = 2, uv_row_offset = 2
x = 3, uv_row_offset = 3
and this is incorrect. There's no valid U/V pixel value at uv_row_offset = 1 or 3, since uvPixelStride = 2.
You want
uvPixelStride * floor(x/2)
(or, if you trust yourself to remember the critical round-down behavior of integer division, then
uvPixelStride * (x/2)
should be enough).
With that, your mapping becomes:
x = 0, uv_row_offset = 0
x = 1, uv_row_offset = 0
x = 2, uv_row_offset = 2
x = 3, uv_row_offset = 2
See if that fixes the color errors. In practice, the incorrect addressing here would mean every other color sample would be from the wrong color plane, since it's likely that the underlying YUV data is semiplanar (so the U plane starts at V plane + 1 byte, with the two planes interleaved)
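The corrected mapping is easy to verify in isolation. Here is a tiny Java sketch of the full index, including the row term from the question's kernel. The class name UvIndex is made up, and the strides in the usage note are just example values.

```java
public class UvIndex {
    /**
     * Offset of the chroma sample covering luma pixel (x, y).
     * Integer division rounds down, so two neighbouring pixels share one sample.
     */
    public static int uvIndex(int x, int y, int uvPixelStride, int uvRowStride) {
        return uvPixelStride * (x / 2) + uvRowStride * (y / 2);
    }
}
```

With uvPixelStride = 2 and uvRowStride = 640, x = 0..3 on the first row gives 0, 0, 2, 2, matching the table above, and the pixel at (3, 3) maps to 642.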
For people who encounter the error "Array too small for allocation type":
use buffer.capacity() instead of buffer.remaining(),
and if you have already performed some operations on the image, you'll need to call the rewind() method on the buffer.
Furthermore, for anyone else getting "Array too small for allocation type":
I fixed it by changing yAlloc.copyFrom(y); to yAlloc.copy1DRangeFrom(0, y.length, y);
Posting a full solution to convert YUV->BGR (it can be adapted for other formats too) and also rotate the image to upright using RenderScript. An Allocation is used as input and a byte array is used as output. It was tested on Android 8+, including Samsung devices.
import android.graphics.ImageFormat;
import android.os.Handler;
import android.os.HandlerThread;
import android.renderscript.Allocation;
import android.renderscript.Element;
import android.renderscript.RenderScript;
import android.renderscript.Type;
import android.util.Size;
import android.view.Surface;
import java.util.function.Supplier;

/**
 * RenderScript-based process to convert YUV_420_888 to BGR_888 and rotate the image upright.
 */
public class ImageProcessor {
    protected final String TAG = this.getClass().getSimpleName();
    private Allocation mInputAllocation;
    private Allocation mOutAllocLand;
    private Allocation mOutAllocPort;
    private Handler mProcessingHandler;
    private ScriptC_yuv_bgr mConvertScript;
    private byte[] frameBGR;
    public ProcessingTask mTask;
    private ImageListener listener;
    private Supplier<Integer> rotation;

    public ImageProcessor(RenderScript rs, Size dimensions, ImageListener listener, Supplier<Integer> rotation) {
        this.listener = listener;
        this.rotation = rotation;
        int w = dimensions.getWidth();
        int h = dimensions.getHeight();
        Type.Builder yuvTypeBuilder = new Type.Builder(rs, Element.YUV(rs));
        yuvTypeBuilder.setX(w);
        yuvTypeBuilder.setY(h);
        yuvTypeBuilder.setYuvFormat(ImageFormat.YUV_420_888);
        mInputAllocation = Allocation.createTyped(rs, yuvTypeBuilder.create(),
                Allocation.USAGE_IO_INPUT | Allocation.USAGE_SCRIPT);
        // Keep two output allocations to handle the different image rotations
        mOutAllocLand = createOutBGRAlloc(rs, w, h);
        mOutAllocPort = createOutBGRAlloc(rs, h, w);
        frameBGR = new byte[w * h * 3];
        HandlerThread processingThread = new HandlerThread(this.getClass().getSimpleName());
        processingThread.start(); // getLooper() needs a started thread
        mProcessingHandler = new Handler(processingThread.getLooper());
        mConvertScript = new ScriptC_yuv_bgr(rs);
        mConvertScript.set_inWidth(w);
        mConvertScript.set_inHeight(h);
        mTask = new ProcessingTask(mInputAllocation);
    }

    private Allocation createOutBGRAlloc(RenderScript rs, int width, int height) {
        // Stored as Vec4; it is impossible to store as Vec3, the buffer size will be for Vec4 anyway.
        // RGB_888 is used as an alternative for BGR_888; it could be just a U8_3 type.
        Type.Builder rgbTypeBuilderPort = new Type.Builder(rs, Element.RGB_888(rs));
        rgbTypeBuilderPort.setX(width);
        rgbTypeBuilderPort.setY(height);
        Allocation allocation = Allocation.createTyped(
                rs, rgbTypeBuilderPort.create(), Allocation.USAGE_SCRIPT);
        // Use auto-padding to be able to copy to a w*h*3 byte array
        allocation.setAutoPadding(true);
        return allocation;
    }

    public Surface getInputSurface() {
        return mInputAllocation.getSurface();
    }

    /**
     * Simple class to keep track of incoming frame count,
     * and to process the newest one in the processing thread
     */
    class ProcessingTask implements Runnable, Allocation.OnBufferAvailableListener {
        private int mPendingFrames = 0;
        private Allocation mInputAllocation;

        public ProcessingTask(Allocation input) {
            mInputAllocation = input;
            mInputAllocation.setOnBufferAvailableListener(this);
        }

        @Override
        public void onBufferAvailable(Allocation a) {
            synchronized (this) {
                mPendingFrames++;
                mProcessingHandler.post(this);
            }
        }

        @Override
        public void run() {
            // Find out how many frames have arrived
            int pendingFrames;
            synchronized (this) {
                pendingFrames = mPendingFrames;
                mPendingFrames = 0;
                // Discard extra messages in case processing is slower than frame rate
                mProcessingHandler.removeCallbacks(this);
            }
            // Get to newest input
            for (int i = 0; i < pendingFrames; i++) {
                mInputAllocation.ioReceive();
            }
            int rot = rotation.get();
            Allocation allocOut = rot == 90 || rot == 270 ? mOutAllocPort : mOutAllocLand;
            // Run processing
            // An input allocation isn't really used; the kernel reads from the global frame param
            mConvertScript.set_currentYUVFrame(mInputAllocation);
            mConvertScript.set_rotation(rot);
            mConvertScript.forEach_yuv_bgr(allocOut);
            // Save to byte array, BGR 24bit
            allocOut.copyTo(frameBGR);
            int w = allocOut.getType().getX();
            int h = allocOut.getType().getY();
            if (listener != null) {
                listener.onImageAvailable(frameBGR, w, h);
            }
        }
    }

    public interface ImageListener {
        /**
         * Called when there is an available image; the image is in upright position.
         *
         * @param bgr BGR 24bit bytes
         * @param width image width
         * @param height image height
         */
        void onImageAvailable(byte[] bgr, int width, int height);
    }
}
#pragma version(1)
#pragma rs java_package_name(
#pragma rs_fp_relaxed
// Script converts YUV to BGR (uchar3)
// Current YUV frame to read pixels from
rs_allocation currentYUVFrame;
// Input image rotation: 0, 90, 180, 270 clockwise
uint32_t rotation;
uint32_t inWidth;
uint32_t inHeight;

// Returns the uchar3 BGR value that will be set at x,y in the output allocation
uchar3 __attribute__((kernel)) yuv_bgr(uint32_t x, uint32_t y) {
    // Read in pixel values from latest frame - YUV color space
    uchar3 inPixel;
    uint32_t xRot = x;
    uint32_t yRot = y;
    // Do not rotate if 0
    if (rotation == 90) {
        // rotate 270 clockwise
        xRot = y;
        yRot = inHeight - 1 - x;
    } else if (rotation == 180) {
        xRot = inWidth - 1 - x;
        yRot = inHeight - 1 - y;
    } else if (rotation == 270) {
        // rotate 90 clockwise
        xRot = inWidth - 1 - y;
        yRot = x;
    }
    inPixel.r = rsGetElementAtYuv_uchar_Y(currentYUVFrame, xRot, yRot);
    inPixel.g = rsGetElementAtYuv_uchar_U(currentYUVFrame, xRot, yRot);
    inPixel.b = rsGetElementAtYuv_uchar_V(currentYUVFrame, xRot, yRot);
    // Convert YUV to RGB, JFIF transform with fixed-point math
    // R = Y + 1.402 * (V - 128)
    // G = Y - 0.34414 * (U - 128) - 0.71414 * (V - 128)
    // B = Y + 1.772 * (U - 128)
    int3 bgr;
    // Compute the red value and assign it to b
    bgr.b = inPixel.r +
            inPixel.b * 1436 / 1024 - 179;
    bgr.g = inPixel.r -
            inPixel.g * 46549 / 131072 + 44 -
            inPixel.b * 93604 / 131072 + 91;
    // Compute the blue value and assign it to r
    bgr.r = inPixel.r +
            inPixel.g * 1814 / 1024 - 227;
    // Write out
    return convert_uchar3(clamp(bgr, 0, 255));
}
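If you want to sanity-check the kernel's output on the Java side, the two things it does per pixel (pick a rotated source coordinate, then apply the JFIF transform) can be sketched in plain Java. This is only a reference sketch: it uses floating-point math with the coefficients from the kernel comments rather than the kernel's fixed-point constants, so results can differ by a rounding step, and the names YuvPixel, toRgb and sourceCoord are made up for the example.

```java
final class YuvPixel {
    /** Clamps a value to the 0..255 range of an 8-bit channel. */
    static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    /** JFIF YUV -> RGB for one pixel (floating point); returns {r, g, b}. */
    static int[] toRgb(int y, int u, int v) {
        int r = (int) Math.round(y + 1.402 * (v - 128));
        int g = (int) Math.round(y - 0.34414 * (u - 128) - 0.71414 * (v - 128));
        int b = (int) Math.round(y + 1.772 * (u - 128));
        return new int[] { clamp(r), clamp(g), clamp(b) };
    }

    /**
     * Source coordinate {xRot, yRot} that is read for output pixel (x, y),
     * using the same mapping as the kernel for rotation 0, 90, 180 or 270.
     */
    static int[] sourceCoord(int x, int y, int rotation, int inWidth, int inHeight) {
        switch (rotation) {
            case 90:
                return new int[] { y, inHeight - 1 - x };
            case 180:
                return new int[] { inWidth - 1 - x, inHeight - 1 - y };
            case 270:
                return new int[] { inWidth - 1 - y, x };
            default:
                return new int[] { x, y };
        }
    }
}
```

To get the BGR byte order the kernel produces, write the three values in the order b, g, r.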
On a Samsung Galaxy Tab 5 (tablet), Android version 5.1.1 (API 22), with alleged YUV_420_888 format, the following RenderScript math works well and produces correct colors:
uchar yValue = rsGetElementAt_uchar(gCurrentFrame, x + y * yRowStride);
uchar vValue = rsGetElementAt_uchar(gCurrentFrame, ( (x/2) + (y/4) * yRowStride ) + (xSize * ySize) );
uchar uValue = rsGetElementAt_uchar(gCurrentFrame, ( (x/2) + (y/4) * yRowStride ) + (xSize * ySize) + (xSize * ySize) / 4);
I do not understand why the vertical coordinate (i.e., y) is scaled by a factor of four instead of two, but it works well. I also needed to avoid using rsGetElementAtYuv_uchar_Y|U|V. I believe the associated allocation stride value is set to zero instead of something proper. Using rsGetElementAt_uchar() is a reasonable workaround.
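One possible explanation for the factor of four, as a sketch only: I am assuming here that the buffer is tightly packed planar data with the V plane before the U plane, that width and height are even, and that yRowStride equals the width. None of that is confirmed for this device. Under those assumptions a chroma row is only width/2 bytes long, so stepping y/2 chroma rows moves (y/2) * (width/2) bytes, which equals (y/4) * width whenever y/2 is even. The class and method names below are made up for the example.

```java
final class Yv12Index {
    /** Byte offset of the luma sample for pixel (x, y). */
    static int yOffset(int x, int y, int width) {
        return y * width + x;
    }

    /** Byte offset of the V sample for pixel (x, y); V plane follows the Y plane. */
    static int vOffset(int x, int y, int width, int height) {
        return width * height + (y / 2) * (width / 2) + (x / 2);
    }

    /** Byte offset of the U sample for pixel (x, y); U plane follows the V plane. */
    static int uOffset(int x, int y, int width, int height) {
        return width * height + (width * height) / 4 + (y / 2) * (width / 2) + (x / 2);
    }

    /** The V offset as written in the post, with yRowStride taken to be width. */
    static int vOffsetFromPost(int x, int y, int width, int height) {
        return (x / 2) + (y / 4) * width + width * height;
    }
}
```

With these assumptions the post's expression lands on the right chroma sample for image rows 0, 1, 4, 5, 8, 9 and so on, and on the chroma row just above for the rows in between, which would be hard to notice by eye.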
On a Samsung Galaxy S5 (smartphone), Android version 5.0 (API 21), with alleged YUV_420_888 format, I cannot recover the U and V values; they come through as all zeros. This results in a green-looking image. Luminance is OK, but the image is vertically flipped.
This code requires the use of the RenderScript compatibility library (*).
In order to get the compatibility library to work with Android API 23, I updated to gradle-plugin 2.1.0 and Build-Tools 23.0.3 as per Miao Wang's answer at How to create Renderscript scripts on Android Studio, and make them run?
If you follow his answer and the error "Gradle version 2.10 is required" appears, do NOT change
classpath ''
Instead, update the distributionUrl field of the Project\gradle\wrapper\ file to
and change File > Settings > Build, Execution, Deployment > Build Tools > Gradle > Gradle to "Use default gradle wrapper", as per the question "Gradle Version 2.10 is required." Error.
Re: RSIllegalArgumentException
In my case the cause was that buffer.remaining() was not a multiple of the stride:
the last row was shorter than the stride (i.e. it only went up to where the actual data ended).
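A small Java sketch of that situation, using only java.nio: the buffer holds rowStride * (height - 1) + width bytes, so it is shorter than rowStride * height, and copying it into a full-size array pads the last row with zeros. It assumes a pixel stride of 1, and the names StridePadding, minimumSize and padToFullRows are made up for the example.

```java
import java.nio.ByteBuffer;

final class StridePadding {
    /** Smallest buffer size that still holds every pixel when the last row is not padded. */
    static int minimumSize(int width, int height, int rowStride) {
        return rowStride * (height - 1) + width;
    }

    /** Copies the plane into an array of exactly rowStride * height bytes, zero-padding the tail. */
    static byte[] padToFullRows(ByteBuffer plane, int rowStride, int height) {
        ByteBuffer view = plane.duplicate(); // leave the caller's position alone
        view.rewind();
        byte[] out = new byte[rowStride * height];
        view.get(out, 0, Math.min(out.length, view.remaining()));
        return out;
    }
}
```

For example, with width 6, height 3 and a row stride of 8, the buffer holds 22 bytes, which is not a multiple of 8, while the padded array has 24.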
An FYI in case someone else gets this, as I was also getting "Array too small for allocation type" when trying out the code. In my case it turned out that when allocating the buffer for Y I had to rewind the buffer, because its position had been left at the end and so no data was being copied. Calling buffer.rewind(); before allocating the new byte array makes it work fine now.

