I am using the “FaceNet” model converted with “TensorFlow Lite” into a quantized model. It was done following the instructions on the page https://medium.com/analytics-vidhya/facenet-on-mobile-cb6aebe38505.
This is the info on the input and the output buffer of the quantized model.
INPUTS:
[{'index': 451, 'shape': array([ 1, 160, 160, 3], dtype=int32), 'quantization': (0.0078125, 128L), 'name': 'input', 'dtype': <type 'numpy.uint8'>}]
OUTPUTS:
[{'index': 450, 'shape': array([ 1, 512], dtype=int32), 'quantization': (0.0235294122248888, 0L), 'name': 'embeddings', 'dtype': <type 'numpy.uint8'>}]
I cannot work out how to fill the input buffer properly.
I have already used the full “FaceNet” model, which takes float values, and it worked as expected. So I know what the float input values should look like for the full model, and I guess only one more step is needed: converting each float value into the corresponding byte value and feeding those bytes into the TensorFlow Lite model.
This is what I did with the full “FaceNet” model.
//extract all the pixels of the image (of the face area of 160 x 160)
bitmap.getPixels(intValues, 0, inputWidth, 0, 0, inputWidth, inputHeight);
//copy the value of each channel of each pixel into an array
for (int i = 0; i < intValues.length; ++i) {
int p = intValues[i];
shortValues[i * 3 + 2] = (short) (p & 0xFF);
shortValues[i * 3 + 1] = (short) ((p >> 8) & 0xFF);
shortValues[i * 3 + 0] = (short) ((p >> 16) & 0xFF);
}
//calculate the mean value of all the pixels of the image
double sum = 0f;
for (short shortValue : shortValues) {
sum += shortValue;
}
double mean = sum / shortValues.length;
sum = 0f;
for (short shortValue : shortValues) {
sum += Math.pow(shortValue - mean, 2);
}
//calculate the standard deviation of all the pixels of the image
double std = Math.sqrt(sum / shortValues.length);
double std_adj = Math.max(std, 1.0/ Math.sqrt(shortValues.length));
//FINALLY fill the input buffer for the tensorflow
//calculate a float value for each pixel
for (short shortValue : shortValues) {
inputFloatBuffer.put((float) ((shortValue - mean) * (1 / std_adj)));
}
}
Now that I have float values, how do I convert them into byte values for TensorFlow Lite?
I tried every possible combination with the values “0.0078125” (1/128) and “128” (mentioned at the top of the post), but nothing gave meaningful results.
For example:
int int_value = ((short)(float_value * 128)) + 128;
I used scaling to squeeze float values first into the range of [-1,1], but that did not help either.
Does anybody have an idea?
I don't know how the model was quantized, but it is really worth trying to put the int pixel values you got from the bitmap (in the range [0, 255]) directly into the buffer. Make sure to use a ByteBuffer rather than a FloatBuffer as well.
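For reference, the `quantization` tuple printed for each tensor is (scale, zero point), and the affine mapping between a real value and its byte is real = scale * (q - zero_point). The following is a minimal sketch of that mapping in plain Java. The class and method names are made up for illustration and are not part of the TensorFlow Lite API, and rounding to nearest is an assumption.

```java
// Hypothetical helper, not part of the TensorFlow Lite API.
final class QuantizationSketch {
    // Values reported for the 'input' tensor in the question.
    static final float INPUT_SCALE = 0.0078125f; // 1/128
    static final int INPUT_ZERO_POINT = 128;

    // real -> quantized: q = round(real / scale) + zeroPoint, clamped to the uint8 range.
    static int quantize(float real, float scale, int zeroPoint) {
        int q = Math.round(real / scale) + zeroPoint;
        return Math.max(0, Math.min(255, q));
    }

    // quantized -> real: real = scale * (q - zeroPoint).
    static float dequantize(int q, float scale, int zeroPoint) {
        return scale * (q - zeroPoint);
    }

    public static void main(String[] args) {
        // Show which real value each raw pixel byte stands for.
        for (int p : new int[] {0, 128, 255}) {
            System.out.println(p + " -> " + dequantize(p, INPUT_SCALE, INPUT_ZERO_POINT));
        }
    }
}
```

With this scale and zero point, a raw pixel value p in [0, 255] stands for (p - 128) / 128, so the representable range is roughly [-1, 1). That would be consistent with feeding the pixel bytes directly, provided the model was converted for inputs normalized that way rather than for per-image standardized values. That is an assumption about the conversion, not something the tensor info confirms.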
Based on the discussion I had at "Camera2 api Imageformat.yuv_420_888 results on rotated image", I wanted to know how to adjust the lookup done via the rsGetElementAt_uchar methods so that the YUV data is rotated by 90 degrees.
I also have a project like the HdrViewfinder provided by Google. The problem is that the output is in landscape, because the output surface used as the target surface is connected to the YUV allocation, which does not care whether the device is in landscape or portrait mode. I want to adjust the code so that the output is in portrait mode.
Therefore, I took a custom YUVToRGBA RenderScript, but I do not know what to change to rotate the output.
Can somebody help me adjust the following custom YUVToRGBA script so that it rotates the output by 90 degrees for portrait mode?
// Needed directive for RS to work
#pragma version(1)
// The java_package_name directive needs to use your Activity's package path
#pragma rs java_package_name(net.hydex11.cameracaptureexample)
rs_allocation inputAllocation;
int wIn, hIn;
int numTotalPixels;
// Function to invoke before applying conversion
void setInputImageSize(int _w, int _h)
{
wIn = _w;
hIn = _h;
numTotalPixels = wIn * hIn;
}
// Kernel that converts a YUV element to a RGBA one
uchar4 __attribute__((kernel)) convert(uint32_t x, uint32_t y)
{
// YUV 4:2:0 planar image, with 8 bit Y samples, followed by
// interleaved V/U plane with 8bit 2x2 subsampled chroma samples
int baseIdx = x + y * wIn;
int baseUYIndex = numTotalPixels + (y >> 1) * wIn + (x & 0xfffffe);
uchar _y = rsGetElementAt_uchar(inputAllocation, baseIdx);
uchar _u = rsGetElementAt_uchar(inputAllocation, baseUYIndex);
uchar _v = rsGetElementAt_uchar(inputAllocation, baseUYIndex + 1);
_y = _y < 16 ? 16 : _y;
short Y = ((short)_y) - 16;
short U = ((short)_u) - 128;
short V = ((short)_v) - 128;
uchar4 out;
out.r = (uchar) clamp((float)(
(Y * 298 + V * 409 + 128) >> 8), 0.f, 255.f);
out.g = (uchar) clamp((float)(
(Y * 298 - U * 100 - V * 208 + 128) >> 8), 0.f, 255.f);
out.b = (uchar) clamp((float)(
(Y * 298 + U * 516 + 128) >> 8), 0.f, 255.f);
out.a = 255;
return out;
}
I have found that custom script at https://bitbucket.org/cmaster11/rsbookexamples/src/tip/CameraCaptureExample/app/src/main/rs/customYUVToRGBAConverter.fs .
Here someone has put the Java code to rotate YUV data. But I want to do it in Renderscript since that is faster.
Any help would be great.
I'm assuming you want the output to be in RGBA, as in your conversion script. You should be able to use an approach like that used in this answer; that is, simply modify the x and y coordinates as the first step in the convert kernel:
//Rotate 90 deg counter-clockwise during the conversion (the source's top-right corner ends up at the output's top-left)
uchar4 __attribute__((kernel)) convert(uint32_t inX, uint32_t inY)
{
uint32_t x = wIn - 1 - inY;
uint32_t y = inX;
//...rest of the function
Note the changes to the parameter names.
This presumes you have set up the output dimensions correctly (see linked answer). A 270 degree rotation can be accomplished in a similar way.
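To see concretely what such a coordinate swap does, here is a small plain-Java sketch (not RenderScript) that applies the same index mapping to a tiny row-major array. The class and method names are hypothetical. The first method uses the mapping from the kernel above, the second uses the opposite direction.

```java
// Plain-Java illustration of the coordinate swap; names are made up for this sketch.
final class RotateSketch {
    // Mapping from the kernel above: x = wIn - 1 - inY, y = inX.
    // The output is hIn pixels wide and wIn pixels tall.
    static int[] rotateWithKernelMapping(int[] src, int wIn, int hIn) {
        int[] out = new int[src.length];
        for (int inY = 0; inY < wIn; inY++) {
            for (int inX = 0; inX < hIn; inX++) {
                int x = wIn - 1 - inY;
                int y = inX;
                out[inX + inY * hIn] = src[x + y * wIn];
            }
        }
        return out;
    }

    // Opposite direction: x = inY, y = hIn - 1 - inX.
    static int[] rotateOpposite(int[] src, int wIn, int hIn) {
        int[] out = new int[src.length];
        for (int inY = 0; inY < wIn; inY++) {
            for (int inX = 0; inX < hIn; inX++) {
                int x = inY;
                int y = hIn - 1 - inX;
                out[inX + inY * hIn] = src[x + y * wIn];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 4, 5, 6}; // 3 wide, 2 tall
        System.out.println(java.util.Arrays.toString(rotateWithKernelMapping(src, 3, 2)));
        System.out.println(java.util.Arrays.toString(rotateOpposite(src, 3, 2)));
    }
}
```

For the 3 x 2 grid with rows 1 2 3 and 4 5 6, the first mapping gives rows 3 6, 2 5, 1 4 (the source's top-right corner moves to the output's top-left), and the second gives rows 4 1, 5 2, 6 3. Which of the two produces an upright preview depends on the sensor orientation, so it is worth checking both on the device.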
I'm trying to use Android's RenderScript to render a semi-transparent circle behind an image, but things go very wrong when returning a value from the RenderScript kernel.
This is my kernel:
#pragma version(1)
#pragma rs java_package_name(be.abyx.aurora)
// We don't need very high precision floating points
#pragma rs_fp_relaxed
// Center position of the circle
int centerX = 0;
int centerY = 0;
// Radius of the circle
int radius = 0;
// Destination colour of the background can be set here.
float destinationR;
float destinationG;
float destinationB;
float destinationA;
static int square(int input) {
return input * input;
}
uchar4 RS_KERNEL circleRender(uchar4 in, uint32_t x, uint32_t y) {
//Convert input uchar4 to float4
float4 f4 = rsUnpackColor8888(in);
// Check if the current coordinates fall inside the circle
if (square(x - centerX) + square(y - centerY) < square(radius)) {
// Check if current position is transparent, we then need to add the background!)
if (f4.a == 0) {
uchar4 temp = rsPackColorTo8888(0.686f, 0.686f, 0.686f, 0.561f);
return temp;
}
}
return rsPackColorTo8888(f4);
}
Now, the rsPackColorTo8888() function takes 4 floats with a value between 0.0 and 1.0. The resulting ARGB-color is then found by calculating 255 times each float value. So the given floats correspond to the color R = 0.686 * 255 = 175, G = 0.686 * 255 = 175, B = 0.686 * 255 = 175 and A = 0.561 * 255 = 143.
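The arithmetic in that paragraph can be reproduced with a few lines of plain Java. This is only a sketch of the float-to-byte scaling. The clamping and the round-to-nearest step are assumptions about how RenderScript behaves, and the class name is made up.

```java
// Hypothetical stand-in for the float -> byte scaling described above; not RenderScript code.
final class PackColorSketch {
    // Scale a [0, 1] channel to [0, 255], rounding to nearest (assumed behaviour).
    static int toByte(float channel) {
        float clamped = Math.max(0f, Math.min(1f, channel));
        return (int) (clamped * 255f + 0.5f);
    }

    // Returns the four channels in R, G, B, A order, like the components of a uchar4.
    static int[] pack(float r, float g, float b, float a) {
        return new int[] {toByte(r), toByte(g), toByte(b), toByte(a)};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(pack(0.686f, 0.686f, 0.686f, 0.561f)));
    }
}
```

For the values used in the kernel this yields 175 for each colour channel and 143 for alpha, matching the numbers quoted above.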
The rsPackColorTo8888() function itself works correctly, but when the resulting uchar4 value is returned from the kernel, something really weird happens. The R, G and B values change to Red * Alpha = 56, Green * Alpha = 56 and Blue * Alpha = 56 respectively, where Alpha is 0.561. This means that no value of R, G or B can ever be larger than A = 0.561 * 255.
Setting the output manually instead of using rsPackColorTo8888() yields exactly the same behavior. The following code produces the same result, which in turn proves that rsPackColorTo8888() is not the problem:
if (square(x - centerX) + square(y - centerY) < square(radius)) {
// Check if current position is transparent, we then need to add the background!)
if (f4.a == 0) {
uchar4 temp;
temp[0] = 175;
temp[1] = 175;
temp[2] = 175;
temp[3] = 143;
return temp;
}
}
This is the Java-code from which the script is called:
@Override
public Bitmap renderParallel(Bitmap input, int backgroundColour, int padding) {
ResizeUtility resizeUtility = new ResizeUtility();
// We want to end up with a square Bitmap with some padding applied to it, so we use
// the length of the largest dimension (width or height) as the width of our square.
int dimension = resizeUtility.getLargestDimension(input.getWidth(), input.getHeight()) + 2 * padding;
Bitmap output = resizeUtility.createSquareBitmapWithPadding(input, padding);
output.setHasAlpha(true);
RenderScript rs = RenderScript.create(this.context);
Allocation inputAlloc = Allocation.createFromBitmap(rs, output);
Type t = inputAlloc.getType();
Allocation outputAlloc = Allocation.createTyped(rs, t);
ScriptC_circle_render circleRenderer = new ScriptC_circle_render(rs);
circleRenderer.set_centerX(dimension / 2);
circleRenderer.set_centerY(dimension / 2);
circleRenderer.set_radius(dimension / 2);
circleRenderer.set_destinationA(((float) Color.alpha(backgroundColour)) / 255.0f);
circleRenderer.set_destinationR(((float) Color.red(backgroundColour)) / 255.0f);
circleRenderer.set_destinationG(((float) Color.green(backgroundColour)) / 255.0f);
circleRenderer.set_destinationB(((float) Color.blue(backgroundColour)) / 255.0f);
circleRenderer.forEach_circleRender(inputAlloc, outputAlloc);
outputAlloc.copyTo(output);
inputAlloc.destroy();
outputAlloc.destroy();
circleRenderer.destroy();
rs.destroy();
return output;
}
When alpha is set to 255 (or 1.0 as a float), the returned color-values (inside my application's Java-code) are correct.
Am I doing something wrong, or is this really a bug somewhere in the RenderScript-implementation?
Note: I've checked and verified this behavior on a Oneplus 3T (Android 7.1.1), a Nexus 5 (Android 7.1.2), Android-emulator version 7.1.2 and 6.0
Instead of passing the values as separate floats:
uchar4 temp = rsPackColorTo8888(0.686f, 0.686f, 0.686f, 0.561f);
try creating a float4 and passing that:
float4 newFloat4 = { 0.686, 0.686, 0.686, 0.561 };
uchar4 temp = rsPackColorTo8888(newFloat4);
I am trying to capture the image data in the onFrameAvailable method from a Google Tango device, using the Leibniz release. The header file says the buffer contains HAL_PIXEL_FORMAT_YV12 pixel data, the release notes say it contains YUV420SP, and the documentation says the pixels are in RGBA8888 format, so I am a little confused. On top of that, I don't get a real image, just a lot of magenta and green. Right now I am trying to convert from YUV to RGB similar to this one. I guess there is something wrong with the stride, too. Here is the code of the onFrameAvailable method:
int size = (int)(buffer->width * buffer->height);
for (int i = 0; i < buffer->height; ++i)
{
for (int j = 0; j < buffer->width; ++j)
{
float y = buffer->data[i * buffer->stride + j];
float v = buffer->data[(i / 2) * (buffer->stride / 2) + (j / 2) + size];
float u = buffer->data[(i / 2) * (buffer->stride / 2) + (j / 2) + size + (size / 4)];
const float Umax = 0.436f;
const float Vmax = 0.615f;
y = y / 255.0f;
u = (u / 255.0f - 0.5f) ;
v = (v / 255.0f - 0.5f) ;
TangoData::GetInstance().color_buffer[3*(i*width+j)]=y;
TangoData::GetInstance().color_buffer[3*(i*width+j)+1]=u;
TangoData::GetInstance().color_buffer[3*(i*width+j)+2]=v;
}
}
I am doing the yuv to rgb conversion in the fragment shader.
Has anyone ever obtained an RGB image for the Google Tango Leibniz release? Or had someone similar problems when converting from YUV to RGB?
YUV420SP (aka NV21) is correct for the time being. An explanation is here. In this format you have a width x height array where each element is a Y byte, followed by a width/2 x height/2 array where each element is a V byte and a U byte. Your code is implementing YV12, which has separate arrays for V and U instead of interleaving them in one array.
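The difference between the two layouts comes down to how the chroma index is computed. Below is a minimal Java sketch of both. It assumes even dimensions and a row stride equal to the width (no padding), which may not hold for a real camera buffer, and the class and method names are made up for illustration.

```java
// Index arithmetic only; assumes stride == width and even width/height.
final class Yuv420IndexSketch {
    // Both layouts start with a full-resolution Y plane.
    static int yIndex(int col, int row, int width) {
        return row * width + col;
    }

    // NV21: after Y comes one plane of interleaved V,U byte pairs, one pair per 2x2 block.
    static int nv21V(int col, int row, int width, int height) {
        return width * height + (row / 2) * width + (col & ~1);
    }

    static int nv21U(int col, int row, int width, int height) {
        return nv21V(col, row, width, height) + 1;
    }

    // YV12: after Y comes a whole V plane, then a whole U plane.
    static int yv12V(int col, int row, int width, int height) {
        return width * height + (row / 2) * (width / 2) + (col / 2);
    }

    static int yv12U(int col, int row, int width, int height) {
        return yv12V(col, row, width, height) + (width * height) / 4;
    }
}
```

For a 4 x 2 image and the pixel at column 3, row 1, the NV21 indices are V = 10 and U = 11, while the YV12 indices are V = 9 and U = 11. Reading an NV21 buffer with the YV12 arithmetic therefore picks up the wrong chroma bytes, which is one plausible source of magenta and green output.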
You mention that you are doing YUV to RGB conversion in a fragment shader. If all you want to do with the camera images is draw then you can use TangoService_connectTextureId() and TangoService_updateTexture() instead of TangoService_connectOnFrameAvailable(). This approach delivers the camera image to you already in an OpenGL texture that gives your fragment shader RGB values without bothering with the pixel format details. You will need to bind to GL_TEXTURE_EXTERNAL_OES (instead of GL_TEXTURE_2D), and your fragment shader would look something like this:
#extension GL_OES_EGL_image_external : require
precision mediump float;
varying vec4 v_t;
uniform samplerExternalOES colorTexture;
void main() {
gl_FragColor = texture2D(colorTexture, v_t.xy);
}
If you really do want to pass YUV data to a fragment shader for some reason, you can do so without preprocessing it into floats. In fact, you don't need to unpack it at all - for NV21 just define a 1-byte texture for Y and a 2-byte texture for VU, and load the data as-is. Your fragment shader will use the same texture coordinates for both.
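As a rough sketch of "loading the data as-is", the NV21 buffer can be split into two views without copying or converting anything. The Java below only shows the slicing. The class name is hypothetical, it assumes stride == width, and the upload itself is left out (in OpenGL ES 2.0, one common choice would be a single-channel texture for Y and a two-channel texture for VU).

```java
// Splits an NV21 byte array into two zero-copy views; names are made up for this sketch.
final class Nv21PlanesSketch {
    // Y plane: width * height bytes, one byte per texel.
    static java.nio.ByteBuffer yPlane(byte[] nv21, int width, int height) {
        return java.nio.ByteBuffer.wrap(nv21, 0, width * height).slice();
    }

    // VU plane: (width / 2) * (height / 2) texels, two bytes (V, then U) per texel.
    static java.nio.ByteBuffer vuPlane(byte[] nv21, int width, int height) {
        int ySize = width * height;
        return java.nio.ByteBuffer.wrap(nv21, ySize, ySize / 2).slice();
    }

    public static void main(String[] args) {
        byte[] frame = new byte[4 * 2 * 3 / 2]; // 4 x 2 image
        System.out.println(yPlane(frame, 4, 2).remaining());  // size of the Y view in bytes
        System.out.println(vuPlane(frame, 4, 2).remaining()); // size of the VU view in bytes
    }
}
```

Both views share the backing array, so the fragment shader can sample them with the same texture coordinates, as described above.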
By the way, in case someone else also has problems capturing the image data on the Leibniz release: one of the developers told me that there is a bug concerning the camera and that it should be fixed with the Nash release.
The bug caused my buffer to be null, but when I used the Nash update I got data again. However, right now the problem is that the data I am using doesn't make sense. I guess/hope the cause is that the tablet didn't get the OTA update yet (there can be a gap between the actual release date and the OTA software update).
Try the following code:
//C#
public bool YV12ToPhoto(byte[] data, int width, int height, out Texture2D photo)
{
photo = new Texture2D(width, height);
int uv_buffer_offset = width * height;
for (int i = 0; i < height; i++)
{
for (int j = 0; j < width; j++)
{
int x_index = j;
if (j % 2 != 0)
{
x_index = j - 1;
}
// Get the YUV color for this pixel.
int yValue = data[(i * width) + j];
int uValue = data[uv_buffer_offset + ((i / 2) * width) + x_index + 1];
int vValue = data[uv_buffer_offset + ((i / 2) * width) + x_index];
// Convert the YUV value to RGB.
float r = yValue + (1.370705f * (vValue - 128));
float g = yValue - (0.689001f * (vValue - 128)) - (0.337633f * (uValue - 128));
float b = yValue + (1.732446f * (uValue - 128));
Color co = new Color();
co.b = b < 0 ? 0 : (b > 255 ? 1 : b / 255.0f);
co.g = g < 0 ? 0 : (g > 255 ? 1 : g / 255.0f);
co.r = r < 0 ? 0 : (r > 255 ? 1 : r / 255.0f);
co.a = 1.0f;
photo.SetPixel(width - j - 1, height - i - 1, co);
}
}
return true;
}
I have succeeded.
In my app I want to edit images (brightness, contrast, etc.). I found a tutorial and am trying the following to change the contrast:
public static Bitmap createContrast(Bitmap src, double value) {
// image size
int width = src.getWidth();
int height = src.getHeight();
// create output bitmap
Bitmap bmOut = Bitmap.createBitmap(width, height, src.getConfig());
// color information
int A, R, G, B;
int pixel;
// get contrast value
double contrast = Math.pow((100 + value) / 100, 2);
// scan through all pixels
for(int x = 0; x < width; ++x) {
for(int y = 0; y < height; ++y) {
// get pixel color
pixel = src.getPixel(x, y);
A = Color.alpha(pixel);
// apply filter contrast for every channel R, G, B
R = Color.red(pixel);
R = (int)(((((R / 255.0) - 0.5) * contrast) + 0.5) * 255.0);
if(R < 0) { R = 0; }
else if(R > 255) { R = 255; }
G = Color.red(pixel);
G = (int)(((((G / 255.0) - 0.5) * contrast) + 0.5) * 255.0);
if(G < 0) { G = 0; }
else if(G > 255) { G = 255; }
B = Color.red(pixel);
B = (int)(((((B / 255.0) - 0.5) * contrast) + 0.5) * 255.0);
if(B < 0) { B = 0; }
else if(B > 255) { B = 255; }
// set new pixel color to output bitmap
bmOut.setPixel(x, y, Color.argb(A, R, G, B));
}
}
// return final image
return bmOut;
}
I call it like this:
ImageView image = (ImageView)(findViewById(R.id.image));
//image.setImageBitmap(createContrast(bitmap));
But I don't see any effect on the image. Can you please help me find where I am going wrong?
I saw the EffectFactory from API 14. Is there something similar, or a tutorial, that can be used for image processing on older versions?
There are three basic problems with this approach. The first two are coding issues. First, you always call Color.red; Color.green and Color.blue are nowhere to be found in your code. Second, the calculation is too repetitive. You assume the colors are in the range [0, 255], so it is much faster to build an array of 256 entries holding the contrast result for each i in [0, 255].
The third issue is more problematic. Why did you expect this algorithm to improve contrast? The results are meaningless for RGB; you might get something better in a different color system. Here are the results you should expect, with your parameter value at 0, 10, 20, and 30:
And here is some sample Python code to perform the operation:
import sys
from PIL import Image
img = Image.open(sys.argv[1])
width, height = img.size
cvalue = float(sys.argv[2]) # Your parameter "value".
contrast = ((100 + cvalue) / 100) ** 2
def apply_contrast(c):
    c = (((c / 255.) - 0.5) * contrast + 0.5) * 255.0
    return min(255, max(0, int(c)))
# Build the lookup table.
ltu = []
for i in range(256):
    ltu.append(apply_contrast(i))
# The following "point" method applies a function to each
# value in the image. It considers the image as a flat sequence
# of values.
img = img.point(lambda x: ltu[x])
img.save(sys.argv[3])
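Since the question is about Android, the same lookup-table idea can be sketched in plain Java, working on packed ARGB ints so that it does not depend on any Android class. The class and method names are hypothetical. Unlike the Python version, which truncates, this sketch rounds to nearest. It also reads the green and blue channels separately, which addresses the first coding issue mentioned above.

```java
// Lookup-table contrast on packed ARGB ints; names are made up for this sketch.
final class ContrastLutSketch {
    // Precompute the contrast result for every possible channel value.
    static int[] buildLut(double value) {
        double contrast = Math.pow((100 + value) / 100, 2);
        int[] lut = new int[256];
        for (int i = 0; i < 256; i++) {
            double c = (((i / 255.0) - 0.5) * contrast + 0.5) * 255.0;
            lut[i] = (int) Math.max(0, Math.min(255, Math.round(c)));
        }
        return lut;
    }

    // Apply the table to R, G and B; alpha is passed through unchanged.
    static int apply(int argb, int[] lut) {
        int a = (argb >>> 24) & 0xFF;
        int r = lut[(argb >> 16) & 0xFF];
        int g = lut[(argb >> 8) & 0xFF];
        int b = lut[argb & 0xFF];
        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        int[] lut = buildLut(30);
        System.out.println(Integer.toHexString(apply(0xFF406080, lut)));
    }
}
```

On Android, one way to use this would be to read the pixels into an int array with Bitmap.getPixels, run apply over each element, and write them back with Bitmap.setPixels, which also avoids the per-pixel getPixel/setPixel calls.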
I have done some googling around and couldn't find enough information about this format. It is the default format for the camera preview. Can anyone suggest good sources of information about it and how to extract data from a photo/preview image in that format? To be more specific, I need the black and white image extracted.
EDIT: Seems like that format is also called YCbCr 420 Semi Planar
I developed the following code to convert the NV21 to RGB, and it is working.
/**
* Converts YUV420 NV21 to RGB8888
*
* @param data byte array in YUV420 NV21 format.
* @param width width in pixels
* @param height height in pixels
* @return an RGB8888 pixel int array, where each int is one ARGB pixel.
*/
public static int[] convertYUV420_NV21toRGB8888(byte [] data, int width, int height) {
int size = width*height;
int offset = size;
int[] pixels = new int[size];
int u, v, y1, y2, y3, y4;
// i walks over the Y samples and the final pixels
// k walks over the U and V samples
for(int i=0, k=0; i < size; i+=2, k+=2) {
y1 = data[i ]&0xff;
y2 = data[i+1]&0xff;
y3 = data[width+i ]&0xff;
y4 = data[width+i+1]&0xff;
v = data[offset+k ]&0xff; // NV21 stores V first,
u = data[offset+k+1]&0xff; // then U
u = u-128;
v = v-128;
pixels[i ] = convertYUVtoRGB(y1, u, v);
pixels[i+1] = convertYUVtoRGB(y2, u, v);
pixels[width+i ] = convertYUVtoRGB(y3, u, v);
pixels[width+i+1] = convertYUVtoRGB(y4, u, v);
if (i!=0 && (i+2)%width==0)
i+=width;
}
return pixels;
}
private static int convertYUVtoRGB(int y, int u, int v) {
int r,g,b;
r = y + (int)(1.402f*v);
g = y - (int)(0.344f*u +0.714f*v);
b = y + (int)(1.772f*u);
r = r>255? 255 : r<0 ? 0 : r;
g = g>255? 255 : g<0 ? 0 : g;
b = b>255? 255 : b<0 ? 0 : b;
return 0xff000000 | (r<<16) | (g<<8) | b; // ARGB: red in bits 16-23, blue in bits 0-7
}
This image helps to understand.
If you just want a grayscale image, it is easier. You can discard all the U and V info and take just the Y info. The code could look like this:
/**
* Converts YUV420 NV21 to Y888 (RGB8888). The grayscale image still holds 3 bytes on the pixel.
*
* @param pixels output array that receives the converted grayscale pixels
* @param data byte array in YUV420 NV21 format.
* @param width width in pixels
* @param height height in pixels
*/
public static void applyGrayScale(int [] pixels, byte [] data, int width, int height) {
int p;
int size = width*height;
for(int i = 0; i < size; i++) {
p = data[i] & 0xFF;
pixels[i] = 0xff000000 | p<<16 | p<<8 | p;
}
}
To create your Bitmap just:
Bitmap bm = Bitmap.createBitmap(pixels, width, height, Bitmap.Config.ARGB_8888);
Where pixels is your int [] array.
NV21 is basically YUV420, but instead of the planar format where Y, U and V each have an independent plane, NV21 has one plane for luma and a second plane for interleaved chroma. The format looks like this:
YYYYYYYYYYYYYYYYYYYYYYYYYYYYY
YYYYYYYYYYYYYYYYYYYYYYYYYYYYY
.
.
.
.
VUVUVUVUVUVUVUVUVUVUVUVUVUVUVU
VUVUVUVUVUVUVUVUVUVUVUVUVUVUVU
.
.
.
.
.
I also had lots of headache because of this preview format.
The best I could find are these:
http://www.fourcc.org/yuv.php#NV21
http://v4l2spec.bytesex.org/spec/r5470.htm
It seems that the Y component is the first width*height bytes in the array you get.
Some more informational links:
http://msdn.microsoft.com/en-us/library/ms867704.aspx#yuvformats_yuvsampling
http://msdn.microsoft.com/en-us/library/ff538197(v=vs.85).aspx
Hope this helps.
The data is in YUV420 format.
If you are only interested in the monochrome channel, i.e. "black and white", then this is the first width x height bytes of the data buffer you already have.
The Y channel is the first image plane. It is exactly the grey/intensity/luminosity etc. channel.
Here's code to just extract the greyscale image data:
private int[] decodeGreyscale(byte[] nv21, int width, int height) {
int pixelCount = width * height;
int[] out = new int[pixelCount];
for (int i = 0; i < pixelCount; ++i) {
int luminance = nv21[i] & 0xFF;
out[i] = Color.argb(0xFF, luminance, luminance, luminance);
}
return out;
}
When you only need a grayscale camera preview, you could use a very simple RenderScript:
#pragma version(1)
#pragma rs java_package_name(com.example.name)
#pragma rs_fp_relaxed
rs_allocation gIn; // Allocation filled with camera preview data (byte[])
int previewwidth; // camera preview width (int)
// the parallel executed kernel
void root(uchar4 *v_out, uint32_t x,uint32_t y){
uchar c = rsGetElementAt_uchar(gIn,x+y*previewwidth);
*v_out = (uchar4){c,c,c,255};
}
Note: This is not faster than ScriptIntrinsicYuvToRGB (followed by a ScriptIntrinsicColorMatrix to do the RGBA -> gray conversion), but it runs on API 11+ (whereas the intrinsics need API 17+).