Same Tensorflow model giving different results on Android and Python - android

I am trying to run a Tensorflow model on my Android application, but the same trained model gives different results (wrong inference) compared to when it is run on Python on desktop.
The model is a simple sequential CNN to recognize characters, much like this number plate recognition network, minus the windowing, as my model has the characters already cropped into place.
I have:
Model saved in protobuf (.pb) file - modeled and trained in Keras on Python/Linux + GPU
The inference was tested on a different computer on pure Tensorflow, to make sure Keras was not the culprit. Here, the results were as expected.
Tensorflow 1.3.0 is being used on Python and Android. Installed from PIP on Python and jcenter on Android.
The results on Android do not resemble the expected outcome.
The input is a 129*45 RGB image, so a 129*45*3 array, and the output is a 4*36 array (representing 4 characters from 0-9 and a-z).
I used this code to save the Keras model as a .pb file.
Python code, this works as expected:
import numpy as np
import tensorflow as tf
from scipy import ndimage
# Load the image as RGB and scale pixel values to [0, 1]
test_image = [ndimage.imread("test_image.png", mode="RGB").astype(float)/255]
imTensor = np.asarray(test_image)
def load_graph(model_file):
    graph = tf.Graph()
    graph_def = tf.GraphDef()
    with open(model_file, "rb") as f:
        graph_def.ParseFromString(f.read())
    with graph.as_default():
        tf.import_graph_def(graph_def)
    return graph
graph = load_graph("model.pb")
with tf.Session(graph=graph) as sess:
    input_operation = graph.get_operation_by_name("import/conv2d_1_input")
    output_operation = graph.get_operation_by_name("import/output_node0")
    results = sess.run(output_operation.outputs[0],
                       {input_operation.outputs[0]: imTensor})
Android code, based on this example; this gives seemingly random results:
Bitmap bitmap;
try {
    InputStream stream = getAssets().open("test_image.png");
    bitmap = BitmapFactory.decodeStream(stream);
} catch (IOException e) {
    e.printStackTrace();
}
inferenceInterface = new TensorFlowInferenceInterface(context.getAssets(), "model.pb");
int[] intValues = new int[129*45];
float[] floatValues = new float[129*45*3];
String outputName = "output_node0";
String[] outputNodes = new String[]{outputName};
float[] outputs = new float[4*36];
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
    floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
    floatValues[i * 3 + 2] = (val & 0xFF) / 255;
}
inferenceInterface.feed("conv2d_1_input", floatValues, 1, 45, 129, 3);
inferenceInterface.run(outputNodes, false);
inferenceInterface.fetch(outputName, outputs);
Any help is greatly appreciated!

One problem is in these lines:
floatValues[i * 3 + 0] = ((val >> 16) & 0xFF) / 255;
floatValues[i * 3 + 1] = ((val >> 8) & 0xFF) / 255;
floatValues[i * 3 + 2] = (val & 0xFF) / 255;
Here the channel values are divided by the integer 255, so Java performs integer division and the result is an integer: 0 for every channel value except 255, which gives 1. Dividing by 255.0f yields a float between 0 and 1.0 instead.
Moreover, the division, even when executed with 255.0, may pose a problem, as the values aren't distributed in the projection space (0..1) the way they were in nature. To explain: a value of 255 in the sensor domain (the R value, for example) means that the natural value of the measured signal fell somewhere in the "255" bucket, which covers a whole range of energies or intensities. Mapping this value to 1.0 will most likely cut off half of that range, because subsequent calculations could saturate at a maximum multiplier of 1.0, which is really only the midpoint of a +- 1/256 bucket. So the transformation might more correctly map to the midpoints of a 256-bucket division of the 0..1 range:
((val & 0xff) / 256.0) + (0.5/256.0)
but this is just a guess on my part.
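The difference between the three mappings discussed above can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical and not part of the question's code.

```java
// Hypothetical helper class, for illustration only.
class PixelScaling {
    // Variant from the question: int / int truncates toward zero,
    // so every channel value below 255 becomes 0.
    static float scaleIntDivision(int channel) {
        return channel / 255;
    }

    // Float division: 0..255 maps linearly onto 0.0..1.0,
    // like astype(float)/255 in the Python code.
    static float scaleFloatDivision(int channel) {
        return channel / 255.0f;
    }

    // Bucket-midpoint mapping from the formula above.
    static float scaleBucketMidpoint(int channel) {
        return (channel / 256.0f) + (0.5f / 256.0f);
    }
}
```

Since the Python side divides by 255, the float-division variant is the one that reproduces the desktop preprocessing. The midpoint variant would presumably only make sense if the model had also been trained with it.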

Related

Pretrained Keras model is returning the same result in Android

I created an image classifier in Keras and later saved the model in .pb format to use it on Android.
In the Python code it classifies images properly, but on Android the output is always the same, whatever image I give as input.
This is how I trained my model:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
# Initialising the CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
classifier.fit_generator(training_set,
steps_per_epoch = 8000,
epochs = 25,
validation_data = test_set,
validation_steps = 2000)
classifier.summary()
classifier.save('saved_model.h5')
Later I converted that Keras model (saved_model.h5) to a TensorFlow model using this.
This is how I convert my bitmap to a float array:
public static float[] getPixels(Bitmap bitmap) {
    final int IMAGE_SIZE = 64;
    int[] intValues = new int[IMAGE_SIZE * IMAGE_SIZE];
    float[] floatValues = new float[IMAGE_SIZE * IMAGE_SIZE * 3];
    if (bitmap.getWidth() != IMAGE_SIZE || bitmap.getHeight() != IMAGE_SIZE) {
        // rescale the bitmap if needed
        bitmap = ThumbnailUtils.extractThumbnail(bitmap, IMAGE_SIZE, IMAGE_SIZE);
    }
    bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
    for (int i = 0; i < intValues.length; ++i) {
        final int val = intValues[i];
        // bitwise shifting - without our image is shaped [1, 64, 64, 1] but we need [1, 168, 168, 3]
        floatValues[i * 3 + 2] = Color.red(val) / 255.0f;
        floatValues[i * 3 + 1] = Color.green(val) / 255.0f;
        floatValues[i * 3] = Color.blue(val) / 255.0f;
    }
    return floatValues;
}
Later, I tried to classify an image using TensorFlow on Android, as follows.
TensorFlowInferenceInterface tensorFlowInferenceInterface;
tensorFlowInferenceInterface = new TensorFlowInferenceInterface(getAssets(),"model.pb");
float[] output = new float[2];
tensorFlowInferenceInterface.feed("conv2d_11_input",
getPixels(bitmap), 1,64,64,3);
tensorFlowInferenceInterface.run(new String[]{"dense_12/Sigmoid"});
tensorFlowInferenceInterface.fetch("dense_12/Sigmoid",output);
Whatever image I give, the output is always [1,0].
Is there anything I have missed?
The color components returned by Color.red(int), Color.green(int) and Color.blue(int) are integers in the range [0, 255] (see doc). The same holds when reading images with Keras's ImageDataGenerator. However, as I stated in the comments, the prediction phase needs the same preprocessing steps as the training phase. You scale the image pixels by 1./255 in training (rescale = 1./255 in ImageDataGenerator), so the same scaling must also be applied at prediction time:
floatValues[i * 3 + 2] = Color.red(val) / 255.0f;
floatValues[i * 3 + 1] = Color.green(val) / 255.0f;
floatValues[i * 3] = Color.blue(val) / 255.0f;
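One more thing that may be worth checking is channel order. The getPixels code in the question writes blue first and red last, while Keras's image loader produces RGB by default. The sketch below packs ARGB pixels in RGB order with the same 1/255 rescale. It is written in plain Java so it runs without Android classes; the class and method names are hypothetical, and whether channel order contributes to this particular problem is an assumption, not something the thread confirms.

```java
// Hypothetical helper, for illustration only.
class KerasRescale {
    // Unpacks ARGB ints into a flat float array laid out as R, G, B per pixel,
    // scaled by 1/255 to mirror ImageDataGenerator(rescale = 1./255).
    static float[] toRgbFloats(int[] argbPixels) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int val = argbPixels[i];
            out[i * 3]     = ((val >> 16) & 0xFF) / 255.0f; // red
            out[i * 3 + 1] = ((val >> 8) & 0xFF) / 255.0f;  // green
            out[i * 3 + 2] = (val & 0xFF) / 255.0f;         // blue
        }
        return out;
    }
}
```

On Android, the input array would come from bitmap.getPixels(...) as in the question, and the result would be passed to feed(...) with shape 1, 64, 64, 3.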

How to fix the image preprocessing difference between tensorflow and android studio?

I'm trying to build a classification model with Keras and deploy it to my Android phone. I use the code from this website to deploy my own converted model, a .pb file, to the phone. I load an image from my phone and everything works, but the prediction is totally different from the result I get on my PC.
The testing procedure on my PC is:
load the image with cv2, and convert to np.float32
use the keras resnet50 'preprocess_input' python function to preprocess the image
expand the image dimension for batching (batch size is 1)
forward the image to model and get the result
Relevant code:
img = cv2.imread('./my_test_image.jpg')
x = preprocess_input(img.astype(np.float32))
x = np.expand_dims(x, axis=0)
net = load_model('./my_model.h5')
prediction_result = net.predict(x)
I noticed that the image preprocessing on Android differs from the method I used in Keras, which is mode caffe (convert the images from RGB to BGR, then zero-center each color channel with respect to the ImageNet dataset). The original code seems to be written for mode tf (which scales pixels to between -1 and 1).
So I modified the following 'preprocessBitmap' code to what I think it should be, and tested it with a 3-channel RGB image whose pixels are all [127,127,127]. The code predicted the same result as the .h5 model did. But when I load a real image to classify, the prediction differs from the .h5 model's.
Does anyone have any idea? Thank you very much.
I have tried the following:
Loading a 3-channel RGB image with pixel value [127,127,127] on my phone and using the modified code below gives the same prediction as the .h5 model on my PC.
Testing the converted .pb model on my PC with an image, using the TensorFlow gfile module, gives a correct prediction (compared to the .h5 model). So I think the converted .pb file is fine.
Entire section of preprocessBitmap
// code of 'preprocessBitmap' section in TensorflowImageClassifier.java
TraceCompat.beginSection("preprocessBitmap");
// Preprocess the image data from 0-255 int to normalized float based
// on the provided parameters.
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
for (int i = 0; i < intValues.length; ++i) {
    // this is a ARGB format, so we need to mask the least significant 8 bits to get blue, and next 8 bits to get green and next 8 bits to get red. Since we have an opaque image, alpha can be ignored.
    final int val = intValues[i];
    // original
    /*
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - imageMean) / imageStd;
    floatValues[i * 3 + 2] = ((val & 0xFF) - imageMean) / imageStd;
    */
    // what I think it should be to do the same thing in mode caffe when using keras
    floatValues[i * 3 + 0] = (((val >> 16) & 0xFF) - (float)123.68);
    floatValues[i * 3 + 1] = (((val >> 8) & 0xFF) - (float)116.779);
    floatValues[i * 3 + 2] = (((val & 0xFF)) - (float)103.939);
}
TraceCompat.endSection();
This question is old, but remains the top Google result for preprocess_input for ResNet50 on Android. I could not find an answer for implementing preprocess_input for Java/Android, so I came up with the following based on the original python/keras code:
/*
    Preprocesses RGB bitmap IAW keras/imagenet
    Port of https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L169
    with data_format='channels_last', mode='caffe'
    Convert the images from RGB to BGR, then zero-center each color channel with respect to the ImageNet dataset, without scaling.
    Returns 3D float array
*/
static float[][][] imagenet_preprocess_input_caffe( Bitmap bitmap ) {
    // https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/applications/imagenet_utils.py#L210
    final float[] imagenet_means_caffe = new float[]{103.939f, 116.779f, 123.68f};
    float[][][] result = new float[bitmap.getHeight()][bitmap.getWidth()][3]; // assuming rgb
    for (int y = 0; y < bitmap.getHeight(); y++) {
        for (int x = 0; x < bitmap.getWidth(); x++) {
            final int px = bitmap.getPixel(x, y);
            // rgb-->bgr, then subtract means. no scaling
            result[y][x][0] = (Color.blue(px) - imagenet_means_caffe[0] );
            result[y][x][1] = (Color.green(px) - imagenet_means_caffe[1] );
            result[y][x][2] = (Color.red(px) - imagenet_means_caffe[2] );
        }
    }
    return result;
}
Usage with a 4D tensorflow-lite input with shape (1,224,224,3):
Bitmap bitmap = <your bitmap of size 224x224x3>;
float[][][][] imgValues = new float[1][bitmap.getHeight()][bitmap.getWidth()][3];
imgValues[0]=imagenet_preprocess_input_caffe(bitmap);
... <prep tfInput, tfOutput> ...
tfLite.run(tfInput, tfOutput);
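For the flat float[] layout that the question feeds to the model, the same caffe-mode steps (BGR order, mean subtraction, no scaling) could look roughly like this. It is a plain-Java sketch that takes the packed ARGB ints from bitmap.getPixels(...); the class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
class CaffePreprocess {
    // ImageNet channel means in B, G, R order, as used by Keras mode 'caffe'.
    static final float[] IMAGENET_MEANS_BGR = {103.939f, 116.779f, 123.68f};

    // Converts packed ARGB pixels to a flat B, G, R float array with the
    // ImageNet means subtracted and no further scaling.
    static float[] preprocess(int[] argbPixels) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int val = argbPixels[i];
            out[i * 3]     = (val & 0xFF) - IMAGENET_MEANS_BGR[0];         // blue
            out[i * 3 + 1] = ((val >> 8) & 0xFF) - IMAGENET_MEANS_BGR[1];  // green
            out[i * 3 + 2] = ((val >> 16) & 0xFF) - IMAGENET_MEANS_BGR[2]; // red
        }
        return out;
    }
}
```

Compared with the modified loop in the question, the only difference is the channel order: blue goes to index 0 and red to index 2, each paired with its own mean.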

How to pass Image Bitmap to Tensorflow Mobile model?

I have created a model in Keras using transfer learning.
I used the InceptionV3 model as the base and made the first 52 layers non-trainable.
Then I added my custom layers on top of it.
I trained it and saved it to an hdf5 file.
The model predicted A.png correctly on my laptop.
I converted it to a pb file using a GitHub tool.
Then I predicted A.png with the pb model on my laptop. It was correct.
Then I moved the pb file to the Android assets folder.
I added the TensorFlow Mobile dependency (not TensorFlow Lite).
I also added the same A.png to the assets.
I loaded the model and passed A.png to the pb model.
The output was the wrong class.
I tried other images. It always points to the same wrong class.
The output never changes.
So I think the hdf5 and pb models are correct, but there is some mistake in my code that passes A.png to the pb model. Please help me!
resized_image -> Bitmap of A.png
inferenceInterface -> tensorflow model interface
INPUT_NODE -> name of input node
OUTPUT_NODE -> name of output node
1,128,128,3 -> image is 128x128 and 3 channels
imageValuesFloat = normalizeBitmap(resized_image,128,127.5f,1.0f);
inferenceInterface.feed(INPUT_NODE,imageValuesFloat,1,128,128,3);
inferenceInterface.run(OUTPUT_NODES);
//declare array to hold results obtained from model
float[] result = new float[OUTPUT_SIZE];
//copy the output into the result array
inferenceInterface.fetch(OUTPUT_NODE,result);
Here is the normalizeBitmap function
public float[] normalizeBitmap(Bitmap source,int size,float mean,float std){
    float[] output = new float[size * size * 3];
    int[] intValues = new int[source.getHeight() * source.getWidth()];
    source.getPixels(intValues, 0, source.getWidth(), 0, 0, source.getWidth(), source.getHeight());
    for (int i = 0; i < intValues.length; ++i) {
        final int val = intValues[i];
        output[i * 3] = (((val >> 16) & 0xFF) - mean)/std;
        output[i * 3 + 1] = (((val >> 8) & 0xFF) - mean)/std;
        output[i * 3 + 2] = ((val & 0xFF) - mean)/std;
    }
    return output;
}
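One thing to compare against the training code is the normalization range. The call above uses mean 127.5 and std 1.0, which produces values between -127.5 and 127.5. If the model was trained with the Keras InceptionV3 preprocess_input function (an assumption, since the question does not show the training preprocessing), the inputs were scaled to [-1, 1], which corresponds to a std of 127.5. A plain-Java sketch of the same normalization, with hypothetical names, shows the difference:

```java
// Hypothetical helper, for illustration only.
class BitmapNormalizer {
    // Normalizes packed ARGB pixels to (channel - mean) / std in R, G, B order.
    static float[] normalize(int[] argbPixels, float mean, float std) {
        float[] out = new float[argbPixels.length * 3];
        for (int i = 0; i < argbPixels.length; i++) {
            int val = argbPixels[i];
            out[i * 3]     = (((val >> 16) & 0xFF) - mean) / std;
            out[i * 3 + 1] = (((val >> 8) & 0xFF) - mean) / std;
            out[i * 3 + 2] = ((val & 0xFF) - mean) / std;
        }
        return out;
    }
}
```

With std 127.5, a white pixel maps to 1.0 and a black pixel to -1.0; with std 1.0 they map to 127.5 and -127.5. Inputs that far outside the training range could plausibly saturate the network and give the same class for every image.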

glReadPixels without loss of precision (Android)

I'm having a lot of trouble converting unique integers (index numbers) into unique float colours that are interpretable by an RGB565 OpenGL surface. When I assign unique colours, more often than not they are drawn as slightly different values due to loss of precision, so when I read the colour with glReadPixels and try to convert it back into a float for comparison, they are not equal.
I posted a similar question here OpenGL ES 2.0 solid colour & colour value precision issue but failed to implement the answer I was given, can anyone give me specifics (code and explanation) for this?
If you only need 605 unique values, then 10 bits of precision (up to 1024 values) should be enough.
RGB565 has 16 bits of precision, so you can use the 6 extra bits of precision as a form of error-correction by spacing the values out so that if there is a small adjustment of the values through rounding or dithering or whatever, you can set it to the closest valid value.
So, assign 3 of your 10 bits to R, 4 to G and 3 to B.
For example red and blue have a range of 0-31, but you only need 8 possible values (3 bits), so you only store the values 2, 6, 10, 14, 18, 22, 26, 30. When scaled up to 8 bits, these values will be 16, 48, 80, 112, 144, 176, 208, 240. Then when you reconstruct the index, any value in the range of 0-31 is interpreted as a 0, 32-63 is a 1, 64-95 is a 2 and so on (this can be done with a simple bit-shift). That way small errors of +/- a small amount won't matter.
void assignID(int regionnumber)
{
    int UBR=31; //Upper boundary for blue and red
    int UG=63; //Upper boundary for green
    // split regionnumber into 3/4/3 bits:
    int R = (regionnumber >> 7) & 7;
    int G = (regionnumber >> 3) & 15;
    int B = regionnumber & 7;
    // space out the values by multiplying by 4 and adding 2:
    R = R * 4 + 2;
    G = G * 4 + 2;
    B = B * 4 + 2;
    // combine into an RGB565 value if you need it:
    int RGB565 = (R << 11) | (G << 5) | B;
    // assign the colors
    regions[regionnumber].mColorID[0] = ((float)R)/UBR;
    regions[regionnumber].mColorID[1] = ((float)G)/UG; // careful, there was a bug here
    regions[regionnumber].mColorID[2] = ((float)B)/UBR;
}
Then at the other end, when you read a value from the screen, convert the RGB values back to integers with 3, 4 and 3 bits each and reconstruct the region:
int R = (b[0] & 0xFF) >> 5;
int G = (b[1] & 0xFF) >> 4;
int B = (b[2] & 0xFF) >> 5;
int regionnumber = (R << 7) | (G << 3) | B;
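The round trip can be checked off-device with a small Java sketch. The names are hypothetical, and the 5/6-bit to 8-bit expansion below uses bit replication, which is an assumption about how the values come back from glReadPixels; a real driver may round slightly differently, which is exactly what the spacing is meant to absorb.

```java
// Hypothetical helper, for illustration only.
class RegionColorId {
    // Splits a 10-bit region number into spaced-out 5/6/5-bit channel values.
    static int[] encode(int region) {
        int r = ((region >> 7) & 7) * 4 + 2;  // 3 bits -> 0..31
        int g = ((region >> 3) & 15) * 4 + 2; // 4 bits -> 0..63
        int b = (region & 7) * 4 + 2;         // 3 bits -> 0..31
        return new int[]{r, g, b};
    }

    // Assumed expansion of a 5-bit channel to 8 bits (bit replication).
    static int expand5(int v) { return (v << 3) | (v >> 2); }

    // Assumed expansion of a 6-bit channel to 8 bits (bit replication).
    static int expand6(int v) { return (v << 2) | (v >> 4); }

    // Rebuilds the region number from 8-bit channel values read back from the screen.
    static int decode(int r8, int g8, int b8) {
        int r = (r8 & 0xFF) >> 5;
        int g = (g8 & 0xFF) >> 4;
        int b = (b8 & 0xFF) >> 5;
        return (r << 7) | (g << 3) | b;
    }
}
```

Under these assumptions every region number from 0 to 1023 survives the round trip, even when each 8-bit channel is off by a few units.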

How to apply, converting image from colored to grayscale algorithm to Android?

I am trying to use one of these algorithms to convert a RGB image to grayscale:
The lightness method averages the most prominent and least prominent colors:
(max(R, G, B) + min(R, G, B)) / 2.
The average method simply averages the values: (R + G + B) / 3.
The formula for luminosity is 0.21 R + 0.71 G + 0.07 B.
But I get very weird results! I know there are other ways to achieve this, but is it possible to do it this way?
Here is the code:
for(int i = 0 ; i < eWidth*eHeight;i++){
    int R = (pixels[i] >> 16) ; //bitwise shifting
    int G = (pixels[i] >> 8) ;
    int B = pixels[i] ;
    int gray = (R + G + B )/ 3 ;
    pixels[i] = (gray << 16) | (gray << 8) | gray ;
}
You need to strip off the bits that aren't part of the component you're getting, especially if there's any sign extension going on in the shifts.
int R = (q[i] >> 16) & 0xff ; //bitwise shifting
int G = (q[i] >> 8) & 0xff ;
int B = q[i] & 0xff ;
What you made looks all right to me.
I once did this in Java, in much the same way.
I took the average of the 0-255 RGB color values to get grayscale, and it looks a lot like yours.
public int getGray(int row, int col) throws Exception
{
    checkInImage(row,col);
    int[] rgb = this.getRGB(row,col);
    return (int) (rgb[0]+rgb[1]+rgb[2])/3;
}
I understand you are not asking how to code this, but for an algorithm?
There is no "correct" algorithm as per http://www.dfanning.com/ip_tips/color2gray.html
They use
Y = 0.3*R + 0.59*G + 0.11*B
You can certainly modify each pixel in Java, but that's very inefficient. If you have the option, I would use a ColorMatrix. See the Android documentation for details: http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/graphics/ColorMatrixSample.html
You could set the matrix' saturation to 0 to make it grayscale.
If you really want to do it in Java, you can do it the way you did it, but you'll need to mask out each element first, i.e. apply & 0xff to it.
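Putting the masking advice together, a per-pixel conversion might look like the sketch below. The names are hypothetical. It keeps the alpha byte, which the loop in the question drops, and offers both the plain average and the weighted formula Y = 0.3*R + 0.59*G + 0.11*B quoted above.

```java
// Hypothetical helper, for illustration only.
class Grayscale {
    // Average method: (R + G + B) / 3 on masked 0..255 channel values.
    static int averageGray(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        return pack(argb, (r + g + b) / 3);
    }

    // Weighted method using Y = 0.3*R + 0.59*G + 0.11*B.
    static int weightedGray(int argb) {
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        int gray = (int) Math.min(255, Math.round(0.3 * r + 0.59 * g + 0.11 * b));
        return pack(argb, gray);
    }

    // Rebuilds an ARGB pixel with the original alpha and gray in all three channels.
    private static int pack(int argb, int gray) {
        return (argb & 0xFF000000) | (gray << 16) | (gray << 8) | gray;
    }

    // Converts a whole pixel array in place using the average method.
    static void toGrayscale(int[] pixels) {
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = averageGray(pixels[i]);
        }
    }
}
```

Without the mask, an opaque pixel such as 0xFF102030 shifted right by 16 gives -240 instead of 16, because the alpha byte is sign-extended into the result. That is the likely source of the weird colors.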
