I'm having a lot of trouble converting unique integers (index numbers) into unique float colours that are interpretable by an RGB565 OpenGL surface. When I assign unique colours, more often than not they are drawn as slightly different values due to loss of precision, so when I read the colour with glReadPixels and try to convert it back into a float for comparison, they are not equal.
I posted a similar question here, OpenGL ES 2.0 solid colour & colour value precision issue, but failed to implement the answer I was given. Can anyone give me specifics (code and an explanation) for this?
If you only need 605 unique values, then 10 bits of precision (up to 1024 values) should be enough.
RGB565 has 16 bits of precision, so you can use the 6 extra bits as a form of error correction: space the stored values out so that if a value is nudged slightly by rounding or dithering or whatever, you can still snap it back to the closest valid value.
So, assign 3 of your 10 bits to R, 4 to G and 3 to B.
For example, red and blue have a range of 0-31, but you only need 8 possible values (3 bits), so you only store the values 2, 6, 10, 14, 18, 22, 26, 30. When scaled up to 8 bits, these values become 16, 48, 80, 112, 144, 176, 208, 240. Then, when you reconstruct the index, any value in the range 0-31 is interpreted as a 0, 32-63 as a 1, 64-95 as a 2 and so on (this can be done with a simple bit-shift). That way, small errors of a few units in either direction won't matter.
void assignID(int regionnumber)
{
    int UBR = 31; // upper boundary for blue and red (5 bits)
    int UG = 63;  // upper boundary for green (6 bits)
    // split regionnumber into 3/4/3 bits:
    int R = (regionnumber >> 7) & 7;
    int G = (regionnumber >> 3) & 15;
    int B = regionnumber & 7;
    // space out the values by multiplying by 4 and adding 2:
    R = R * 4 + 2;
    G = G * 4 + 2;
    B = B * 4 + 2;
    // combine into an RGB565 value if you need it:
    int RGB565 = (R << 11) | (G << 5) | B;
    // assign the colours
    regions[regionnumber].mColorID[0] = ((float)R) / UBR;
    regions[regionnumber].mColorID[1] = ((float)G) / UG; // note: G is divided by its own range (UG), not UBR
    regions[regionnumber].mColorID[2] = ((float)B) / UBR;
}
Then at the other end, when you read a value from the screen, convert the RGB values back to integers with 3, 4 and 3 bits each and reconstruct the region:
int R = (b[0] & 0xFF) >> 5; // keep the top 3 bits of the 8-bit red value
int G = (b[1] & 0xFF) >> 4; // keep the top 4 bits of the 8-bit green value
int B = (b[2] & 0xFF) >> 5; // keep the top 3 bits of the 8-bit blue value
int regionnumber = (R << 7) | (G << 3) | B;
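If it helps to see both directions together, here is a minimal self-contained sketch of the same packing scheme as a round trip; the method names and the assumption that you read 8-bit channel values back with glReadPixels are mine, not from the answer above:

static float[] encodeIndex(int index) {
    // split the 10-bit index into 3/4/3 bits and space them out within the 5/6/5-bit channels
    int r = ((index >> 7) & 7) * 4 + 2;   // 0..7  -> 2, 6, ..., 30
    int g = ((index >> 3) & 15) * 4 + 2;  // 0..15 -> 2, 6, ..., 62
    int b = (index & 7) * 4 + 2;          // 0..7  -> 2, 6, ..., 30
    return new float[]{ r / 31f, g / 63f, b / 31f };
}

static int decodeIndex(int r8, int g8, int b8) {
    // r8/g8/b8 are the 0..255 channel values read back from the framebuffer;
    // keeping only the top 3/4/3 bits absorbs any small rounding error
    return ((r8 >> 5) << 7) | ((g8 >> 4) << 3) | (b8 >> 5);
}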
I'm trying to build a classification model with Keras and deploy it to my Android phone. I use the code from this website to deploy my own converted model, a .pb file, to my Android phone. I load an image from my phone and everything works fine, but the prediction result is totally different from the result I get on my PC.
The procedure for testing on my PC is:
load the image with cv2, and convert to np.float32
use the keras resnet50 'preprocess_input' python function to preprocess the image
expand the image dimension for batching (batch size is 1)
forward the image to model and get the result
Relevant code:
img = cv2.imread('./my_test_image.jpg')
x = preprocess_input(img.astype(np.float32))
x = np.expand_dims(x, axis=0)
net = load_model('./my_model.h5')
prediction_result = net.predict(x)
And I noticed that the image preprocessing part on Android is different from the method I used in Keras, which uses mode caffe (convert the images from RGB to BGR, then zero-center each color channel with respect to the ImageNet dataset). It seems that the original code is for mode tf (which scales pixels to between -1 and 1).
So I modified the following code in 'preprocessBitmap' to what I think it should be, and used a 3-channel RGB image with pixel value [127,127,127] to test it. The code predicted the same result as the .h5 model did. But when I load an image to classify, the prediction result is different from the .h5 model's.
Does anyone have any idea? Thank you very much.
I have tried the following:
Loading a 3-channel RGB image on my phone with pixel value [127,127,127] and using the modified code below gives a prediction result that matches the prediction from the .h5 model on my PC.
Testing the converted .pb model on my PC using the tensorflow gfile module with an image gives a correct prediction result (compared to the .h5 model), so I think the converted .pb file does not have any problem.
Entire section of preprocessBitmap
// code of the 'preprocessBitmap' section in TensorflowImageClassifier.java
TraceCompat.beginSection("preprocessBitmap");
// Preprocess the image data from 0-255 int to normalized float based
// on the provided parameters.
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    // this is ARGB format, so mask the least significant 8 bits to get blue, the next 8 bits to get green and the next 8 bits to get red. Since we have an opaque image, alpha can be ignored.
    final int val = intValues[i];
    // original
    /*
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
    */
    // what I think it should be to do the same thing in mode caffe when using keras
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - 123.68f);
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - 116.779f);
    floatValues[i * 3 + 2] = ((val & 0xFF) - 103.939f);
}
TraceCompat.endSection();
This question is old, but remains the top Google result for preprocess_input for ResNet50 on Android. I could not find an answer for implementing preprocess_input for Java/Android, so I came up with the following based on the original python/keras code:
/*
Preprocesses an RGB bitmap in accordance with keras/imagenet.
Port of https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L169
with data_format='channels_last', mode='caffe'
Converts the image from RGB to BGR, then zero-centers each color channel with respect to the ImageNet dataset, without scaling.
Returns a 3D float array.
*/
static float[][][] imagenet_preprocess_input_caffe( Bitmap bitmap ) {
    // https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L210
    final float[] imagenet_means_caffe = new float[]{103.939f, 116.779f, 123.68f};
    float[][][] result = new float[bitmap.getHeight()][bitmap.getWidth()][3]; // assuming rgb
    for (int y = 0; y < bitmap.getHeight(); y++) {
        for (int x = 0; x < bitmap.getWidth(); x++) {
            final int px = bitmap.getPixel(x, y);
            // rgb-->bgr, then subtract means. no scaling
            result[y][x][0] = (Color.blue(px) - imagenet_means_caffe[0]);
            result[y][x][1] = (Color.green(px) - imagenet_means_caffe[1]);
            result[y][x][2] = (Color.red(px) - imagenet_means_caffe[2]);
        }
    }
    return result;
}
Usage with a tensorflow-lite input tensor of shape (1,224,224,3):
Bitmap bitmap = <your bitmap of size 224x224x3>;
float[][][][] imgValues = new float[1][bitmap.getHeight()][bitmap.getWidth()][3];
imgValues[0]=imagenet_preprocess_input_caffe(bitmap);
... <prep tfInput, tfOutput> ...
tfLite.run(tfInput, tfOutput);
I'm making an Android app, in which the user takes two images and the first is "subtracted" from the second on a pixel-by-pixel basis.
Essentially, the two Bitmaps are converted to 2D int arrays, and the image subtraction is performed using the following method:
private int[][] pixelmapDifference(int[][] subtrahend, int[][] minuend) {
    int[][] diff = new int[subtrahend.length][subtrahend[0].length];
    for (int x = 0; x < diff.length; x++) {
        for (int y = 0; y < diff[0].length; y++) {
            diff[x][y] = minuend[x][y] - subtrahend[x][y];
        }
    }
    return diff;
}
The resultant 2D array is then converted to a Bitmap. This is what the 3 images look like (first, second, and difference).
How do I account for this? I'd like to just get the difference between the two, in this case just the water.
You are always subtracting the second image from the first. What happens when the second one is brighter? The value returned is below zero. I'm not one hundred percent sure what will happen, but the documentation says that the color is
int color = (A & 0xff) << 24 | (R & 0xff) << 16 | (G & 0xff) << 8 | (B & 0xff);
So when you subtract a lighter pixel from a darker one, in some situations the outcome is these strange dots.
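A minimal sketch of one way around this, splitting each packed pixel into its channels and taking the absolute difference per channel before repacking with full alpha (the method name is mine, not from the question):

private int[][] pixelmapAbsDifference(int[][] subtrahend, int[][] minuend) {
    int[][] diff = new int[subtrahend.length][subtrahend[0].length];
    for (int x = 0; x < diff.length; x++) {
        for (int y = 0; y < diff[0].length; y++) {
            int a = minuend[x][y];
            int b = subtrahend[x][y];
            // unpack each 8-bit channel, take the absolute difference, then repack
            int r = Math.abs(((a >> 16) & 0xFF) - ((b >> 16) & 0xFF));
            int g = Math.abs(((a >> 8) & 0xFF) - ((b >> 8) & 0xFF));
            int bl = Math.abs((a & 0xFF) - (b & 0xFF));
            diff[x][y] = 0xFF << 24 | r << 16 | g << 8 | bl;
        }
    }
    return diff;
}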
I am trying to run a Tensorflow model on my Android application, but the same trained model gives different results (wrong inference) compared to when it is run on Python on desktop.
The model is a simple sequential CNN to recognize characters, much like this number plate recognition network, minus the windowing, as my model has the characters already cropped into place.
I have:
Model saved in protobuf (.pb) file - modeled and trained in Keras on Python/Linux + GPU
The inference was tested on a different computer on pure Tensorflow, to make sure Keras was not the culprit. Here, the results were as expected.
Tensorflow 1.3.0 is being used on Python and Android. Installed from PIP on Python and jcenter on Android.
The results on Android do not resemble the expected outcome.
The input is a 129*45 RGB image, so a 129*45*3 array, and the output is a 4*36 array (representing 4 characters from 0-9 and a-z).
I used this code to save the Keras model as a .pb file.
Python code, this works as expected:
test_image = [ndimage.imread("test_image.png", mode="RGB").astype(float)/255]
imTensor = np.asarray(test_image)
def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph

graph = load_graph("model.pb")

with tf.Session(graph=graph) as sess:
    input_operation = graph.get_operation_by_name("import/conv2d_1_input")
    output_operation = graph.get_operation_by_name("import/output_node0")
    results = sess.run(output_operation.outputs[0],
                       {input_operation.outputs[0]: imTensor})
Android code, based on this example; this gives seemingly random results:
Bitmap bitmap;
try {
    InputStream stream = getAssets().open("test_image.png");
    bitmap = BitmapFactory.decodeStream(stream);
} catch (IOException e) {
    e.printStackTrace();
}

inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "model.pb");

int[] intValues = new int[129 * 45];
float[] floatValues = new float[129 * 45 * 3];

String outputName = "output_node0";
String[] outputNodes = new String[]{outputName};
float[] outputs = new float[4 * 36];

bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;
}

inferenceInterface.feed("conv2d_1_input", floatValues, 1, 45, 129, 3);
inferenceInterface.run(outputNodes, false);
inferenceInterface.fetch(outputName, outputs);
Any help is greatly appreciated!
One problem is in these lines:
floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
floatValues[i * 3 + 2] = (val & 0xFF) / 255;
where the RGB values are divided by an integer, so the integer division yields 0 for every value below 255.
Moreover, the division, even if executed with 255.0 and yielding a float between 0 and 1.0, may pose a problem, because the values aren't distributed in the projection space (0..1) the way they were in nature. To explain: a value of 255 in the sensor domain (the R value, for example) means that the natural value of the measured signal fell somewhere in the "255" bucket, which is a whole range of energies/intensities/etc. Mapping this value to 1.0 will most likely cut half of its range, as subsequent calculations could saturate at a maximum multiplier of 1.0, which really is only the midpoint of a +/- 1/256 bucket. So maybe the transformation would more correctly be a mapping to the midpoints of a 256-bucket division of the 0..1 range:
((val & 0xff) / 256.0) + (0.5/256.0)
but this is just a guess from my side.
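For the integer-division problem itself, a minimal sketch of the corrected loop is below: dividing by the float literal 255.0f keeps the arithmetic in floating point (whether you also add the midpoint offset above is a separate choice):

for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    // dividing by 255.0f forces floating-point division, so each channel ends up in [0, 1]
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255.0f;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255.0f;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255.0f;
}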
I'm using VideoSurfaceView to render filtered video. I'm doing it by changing the fragment shader according to my needs. Now I would like to save/render the video after the changes to a file of the same format (e.g. mp4 - h264), but I couldn't find out how to do it.
PS - saving a texture as a bitmap and the bitmap to a file is easy, but I couldn't find how to do it with videos.
Any experts here?
As you already found out and said in the comments, OpenGL can't export multiple frames as a video.
Though if you simply want to filter/process each frame of a video, then you don't need OpenGL at all, and you don't need a fragment shader; you can simply loop through all the pixels yourself.
Now let's say that you process your video one frame at a time and each frame is a BufferedImage. You can of course use whatever you want or whatever you are given, as long as you have the option to get and set pixels.
I'm simply supplying you with a way of calculating and applying a filter, you will have to do the decoding and encoding of the video file yourself.
But back to the BufferedImage: first we want to get all the pixels in our BufferedImage, which we do using the following.
BufferedImage bi = ...; // Here you would get a frame from the video
int width = bi.getWidth();
int height = bi.getHeight();
int[] pixels = ((DataBufferInt) bi.getRaster().getDataBuffer()).getData();
Be aware that depending on the type of image and whether the image contains transparency, the DataBuffer might be a DataBufferInt, a DataBufferByte, etc. You can read about the different DataBuffers in the Oracle docs.
Now, simply by looping through the pixels of the image, we can apply any kind of effect or filter.
Let's say we want to create a grayscale effect, also called a black-and-white effect; you would do that as follows.
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        final int index = x + y * width;
        final int pixel = pixels[index];

        final int alpha = (pixel >> 24) & 0xFF;
        final int red = (pixel >> 16) & 0xFF;
        final int green = (pixel >> 8) & 0xFF;
        final int blue = pixel & 0xFF;

        final int gray = (red + green + blue) / 3;

        pixels[index] = alpha << 24 | gray << 16 | gray << 8 | gray;
    }
}
Now you can simply save the image again, or do anything else you would like with it. You can also keep drawing the BufferedImage, because writing to the pixel array provided by the BufferedImage changes the BufferedImage as well.
Important: if you want to perform a blur effect, then after you calculate each pixel, store it into another array, because a blur needs the surrounding pixels. If you replace the old values while you are still calculating all the pixels, some pixels will use already-calculated values instead of the actual ones (see the sketch below).
The above code of course works for still images as well.
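As a concrete illustration of that two-array approach, here is a minimal sketch of a simple 3x3 box blur, assuming the same width, height and pixels variables as above (alpha is set to opaque and borders are handled by skipping out-of-range neighbours):

int[] out = new int[pixels.length]; // separate output array, so we only ever read original values
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        int rSum = 0, gSum = 0, bSum = 0, count = 0;
        // average the 3x3 neighbourhood around (x, y)
        for (int dy = -1; dy <= 1; dy++) {
            for (int dx = -1; dx <= 1; dx++) {
                int nx = x + dx, ny = y + dy;
                if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
                int p = pixels[nx + ny * width];
                rSum += (p >> 16) & 0xFF;
                gSum += (p >> 8) & 0xFF;
                bSum += p & 0xFF;
                count++;
            }
        }
        out[x + y * width] = 0xFF << 24 | (rSum / count) << 16 | (gSum / count) << 8 | (bSum / count);
    }
}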
Extra
If you want to get the RGBA values that are stored in a single int, then you can do the following.
int pixel = 0xFFFF8040; // This is a random testing value
int alpha = (pixel >> 24) & 0xFF; // Would equal 255 using the testing value
int red = (pixel >> 16) & 0xFF; // ... 255 ...
int green = (pixel >> 8) & 0xFF; // ... 128 ...
int blue = pixel & 0xFF; // ... 64 ...
Then, if you have the RGBA values and want to combine them into a single int, you can do the following.
int alpha = 255;
int red = 255;
int green = 128;
int blue = 64;
int pixel = alpha << 24 | red << 16 | green << 8 | blue;
If you only have the RGB values, then you just do either red << 16 | green << 8 | blue, or 255 << 24 | red << 16 | green << 8 | blue.
I am trying to use one of these algorithms to convert a RGB image to grayscale:
The lightness method averages the most prominent and least prominent colors:
(max(R, G, B) + min(R, G, B)) / 2.
The average method simply averages the values: (R + G + B) / 3.
The formula for luminosity is 0.21 R + 0.71 G + 0.07 B.
But I get very weird results! I know there are other ways to achieve this, but is it possible to do it this way?
Here is the code:
for (int i = 0; i < eWidth * eHeight; i++) {
    int R = (pixels[i] >> 16); // bitwise shifting
    int G = (pixels[i] >> 8);
    int B = pixels[i];

    int gray = (R + G + B) / 3;

    pixels[i] = (gray << 16) | (gray << 8) | gray;
}
You need to strip off the bits that aren't part of the component you're getting, especially if there's any sign extension going on in the shifts.
int R = (q[i] >> 16) & 0xff ; //bitwise shifting
int G = (q[i] >> 8) & 0xff ;
int B = q[i] & 0xff ;
What you made looks all right to me.
I once did this in Java in much the same way:
getting the average of the 0-255 RGB color values to get grayscale, and it looks a lot like yours.
public int getGray(int row, int col) throws Exception
{
    checkInImage(row, col);
    int[] rgb = this.getRGB(row, col);
    return (rgb[0] + rgb[1] + rgb[2]) / 3;
}
I understand you are not asking how to code this, but for an algorithm?
There is no "correct" algorithm as per http://www.dfanning.com/ip_tips/color2gray.html
They use
Y = 0.3*R + 0.59*G + 0.11*B
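A minimal sketch of that weighted (luminosity) conversion applied to the loop from the question, assuming the same pixels, eWidth and eHeight variables and masking each channel first:

for (int i = 0; i < eWidth * eHeight; i++) {
    int R = (pixels[i] >> 16) & 0xFF;
    int G = (pixels[i] >> 8) & 0xFF;
    int B = pixels[i] & 0xFF;
    // weighted luminosity instead of a plain average
    int gray = (int) (0.3f * R + 0.59f * G + 0.11f * B);
    pixels[i] = (gray << 16) | (gray << 8) | gray;
}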
You can certainly modify each pixel in Java, but that's very inefficient. If you have the option, I would use a ColorMatrix. See the Android documentation for details: http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/graphics/ColorMatrixSample.html
You could set the matrix's saturation to 0 to make it grayscale.
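A minimal sketch of that approach, drawing the source bitmap through a ColorMatrixColorFilter with saturation set to 0 (the helper method name is just for illustration):

static Bitmap toGrayscale(Bitmap src) {
    Bitmap result = Bitmap.createBitmap(src.getWidth(), src.getHeight(), Bitmap.Config.ARGB_8888);
    Canvas canvas = new Canvas(result);
    ColorMatrix cm = new ColorMatrix();
    cm.setSaturation(0); // remove all colour, leaving only luminance
    Paint paint = new Paint();
    paint.setColorFilter(new ColorMatrixColorFilter(cm));
    canvas.drawBitmap(src, 0, 0, paint);
    return result;
}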
If you really want to do it pixel by pixel in Java, you can do it the way you did, but you'll need to mask out each element first, i.e. apply & 0xff to it.