Squeeze and unsqueeze a tensor in Android (Java)

I am trying to port my TorchScript model over from Python to Android (Java). Currently, I've run into a problem trying to squeeze/unsqueeze my input and output tensors in Android. In Python, here's how I did it:
tensor = torch.Tensor(image_n.transpose(2, 0, 1).astype('float32')).unsqueeze(0)
tensor = tensor.to(device)
And the output tensor:
with torch.no_grad():
    prob = model.forward(tensor)
    prediction = prob.squeeze().numpy().astype('uint8')
In Android, I managed to set up my input and output tensors following the PyTorch tutorial, as follows:
final Tensor inputTensor = TensorImageUtils.bitmapToFloat32Tensor(mBitmap,
        TensorImageUtils.TORCHVISION_NORM_MEAN_RGB, TensorImageUtils.TORCHVISION_NORM_STD_RGB);
final float[] inputs = inputTensor.getDataAsFloatArray();
and the output tensor:
Map<String, IValue> outTensors = mModule.forward(IValue.from(inputTensor)).toDictStringKey();
The problem is that without squeeze and unsqueeze, although the code runs, the dimensions are wrong and I don't get the correct output.
Does anyone know if there is actually a squeeze/unsqueeze function for PyTorch on Android?
Edit: just to add on, my input tensor has a size of (3, 224, 416) (an RGB image), and my output tensor has a size of (1, 1, 224, 416) (a grayscale image).
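As far as I can tell, the Java Tensor class doesn't expose squeeze()/unsqueeze() directly, but Tensor.fromBlob takes an explicit shape, so the same effect can be had by rebuilding the tensor. A minimal sketch, assuming a float CHW input of size (3, 224, 416) and a single-tensor output (adapt the last step if your model returns a dict, as in the code above):

import org.pytorch.IValue;
import org.pytorch.Module;
import org.pytorch.Tensor;

// Hypothetical sketch: emulate unsqueeze(0)/squeeze() by rebuilding the tensor
// with an explicit shape via Tensor.fromBlob.
static float[] runWithBatchDim(Module module, Tensor chwTensor) {
    // "unsqueeze(0)": same data, new shape (1, 3, 224, 416)
    float[] chw = chwTensor.getDataAsFloatArray();
    Tensor batched = Tensor.fromBlob(chw, new long[]{1, 3, 224, 416});

    // forward pass
    Tensor out = module.forward(IValue.from(batched)).toTensor();

    // "squeeze()": getDataAsFloatArray() is already flat, so the (1, 1, 224, 416)
    // output can be read directly as a 224 x 416 grayscale buffer
    return out.getDataAsFloatArray();
}

Note also that, if I recall correctly, TensorImageUtils.bitmapToFloat32Tensor already returns a (1, 3, H, W) tensor with the batch dimension in place, so the input side may not need an unsqueeze at all, and the squeeze on the output is effectively just a matter of how you index the flat array.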

Related

How to read output of Keras Handwriting Recognition Model in TF Lite Android?

I'm trying to implement handwriting text recognition in my Android app. I found TensorFlow to be a doable solution, so I've tried to create a .tflite model from the Handwriting Recognition Model from Keras.
The tutorial states that it is fully compatible with TF Lite.
I managed to create the .tflite model and then, in Android, initialize the Interpreter with the model. I then ran the Interpreter with a ByteBuffer of a bitmap, and the output has shape [1,32,81], which is an array of floats. As far as I know, the output should just be a String: the prediction text for the given input. How can I get/decode the output into the String I need?
I had a few problems converting the model to .tflite, but I managed to do it using certain flags as follows:
converter = tf.lite.TFLiteConverter.from_keras_model(prediction_model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter._experimental_lower_tensor_list_ops = False
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tf_lite_model = converter.convert()
open('textRecognitionModel.tflite', 'wb').write(tf_lite_model)
According to the TF Lite docs, you have to use the following dependencies:
implementation 'org.tensorflow:tensorflow-lite:0.0.0-nightly-SNAPSHOT'
// This dependency adds the necessary TF op support.
implementation 'org.tensorflow:tensorflow-lite-select-tf-ops:0.0.0-nightly-SNAPSHOT'
After finally creating a .tflite model file, I added it to the assets directory of my Android app and tried importing it. However, it would crash with no error message, apparently a memory failure. I updated the libraries to the latest version:
"org.tensorflow:tensorflow-lite:2.11.0"
"org.tensorflow:tensorflow-lite-select-tf-ops:2.11.0"
And loaded my model into a ByteBuffer as follows (I'm not sure if I'm doing it right regarding the native-order logic):
// fileName is the name of the model file in the assets dir
val inputStream = assetManager.open(fileName)
val output = ByteArrayOutputStream()
inputStream.copyTo(output, 1024)
val file = output.toByteArray()
val bb = ByteBuffer.allocateDirect(file.size)
bb.order(ByteOrder.nativeOrder())
bb.put(file)
return bb
And the initialization of the Interpreter API is finally working.
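For what it's worth, the byte-copy above works, but the TF Lite sample apps usually memory-map the model straight from assets instead, which sidesteps the manual native-order handling (the file just has to be stored uncompressed, e.g. via aaptOptions { noCompress "tflite" } in Gradle). A rough sketch:

import android.content.Context;
import android.content.res.AssetFileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Sketch: memory-map a .tflite model from the assets directory.
static MappedByteBuffer loadModelFile(Context context, String fileName) throws IOException {
    AssetFileDescriptor fd = context.getAssets().openFd(fileName);
    try (FileInputStream stream = new FileInputStream(fd.getFileDescriptor())) {
        FileChannel channel = stream.getChannel();
        return channel.map(FileChannel.MapMode.READ_ONLY,
                fd.getStartOffset(), fd.getDeclaredLength());
    }
}

If I'm not mistaken, FileUtil.loadMappedFile(context, "textRecognitionModel.tflite") from org.tensorflow:tensorflow-lite-support does the same thing in one call.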
I then run the interpreter on a ByteBuffer of a Bitmap. So I'm expecting that the model will read the input and give the prediction text (a String) as output. However, the output has shape [1,32,81], so I created an array to read the output and ran the Interpreter with it:
val output = Array(1) {
    Array(32) {
        FloatArray(81)
    }
}
// byteBuffer: ByteBuffer of bitmap
interpreter.run(byteBuffer, output)
And the output is an array of floats, which I don't understand. Shouldn't it just be a String? I've attached a screenshot of the output array.
Can someone please help me??
I would highly appreciate any tips or solutions :)
Before converting prediction_model to the tflite format, you need to add a custom decoding layer at the end and then convert the combined model.
prediction_model = keras.models.Model(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output
)  # This line is present in the handwriting_recognition notebook.

def CTCDecoder():
    def decode_batch_predictions(pred):
        input_len = np.ones(pred.shape[0]) * pred.shape[1]
        # Use greedy search. For complex tasks, you can use beam search
        results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][:, :max_length]
        # Iterate over the results and get back the text
        output_text = []
        for res in results:
            res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
            output_text.append(res)
        return output_text

    return tf.keras.layers.Lambda(decode_batch_predictions, name='decode')

decoded_pred_model = keras.models.Model(prediction_model.input, outputs=CTCDecoder()(prediction_model.output))
Now you can convert decoded_pred_model to the tflite format and use it. CTCDecoder is the custom layer added on top of prediction_model.output to decode the [1,32,81] predictions into text.
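If you would rather keep the model as it is and decode the [1,32,81] output on the device instead, a greedy CTC decode can also be done by hand: take the argmax over the 81 classes at each of the 32 time steps, collapse consecutive repeats, and drop the blank token. A rough sketch; the alphabet string and the blank index here are placeholders and must match the num_to_char vocabulary used in training:

// Hypothetical greedy CTC decode of a [1][32][81] output.
// alphabet and blankIndex are placeholders; use the character set from training.
static String greedyCtcDecode(float[][][] output, String alphabet, int blankIndex) {
    StringBuilder text = new StringBuilder();
    int previous = -1;
    for (float[] timeStep : output[0]) {            // 32 time steps
        int best = 0;
        for (int c = 1; c < timeStep.length; c++) { // 81 classes
            if (timeStep[c] > timeStep[best]) best = c;
        }
        // collapse repeats and skip the blank token
        if (best != previous && best != blankIndex && best < alphabet.length()) {
            text.append(alphabet.charAt(best));
        }
        previous = best;
    }
    return text.toString();
}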

Tensorflow Model output weights have different values

I am developing an Android application which requires an ML model integration. For it, I am using TensorFlow Lite for deployment. I am using a custom-model-based Siamese network, and the output shape is [1 128]. When I run inference with the TF Lite model in Python on Google Colab, the output [1 128] numbers are different from the ones produced on my Android device. The input image is the same for both inferences, and so are the input and output shapes, but I am still getting different output vectors on my Android phone and from the Python TFLite model. I am using Firebase Machine Learning.
Android Code
val interpreter = Interpreter(model)
val imageBitmap = Bitmap.createScaledBitmap(
    BitmapFactory.decodeFileDescriptor(contentResolver.openFileDescriptor(fileUri, "r")?.fileDescriptor),
    256, 256, true)
val inputImage = ByteBuffer.allocateDirect(256 * 256 * 3 * 4).order(ByteOrder.nativeOrder())
for (ycord in 0 until 256) {
    for (xcord in 0 until 256) {
        val pixel = imageBitmap.getPixel(xcord, ycord)
        inputImage.putFloat(Color.red(pixel) / 1.0f)
        inputImage.putFloat(Color.green(pixel) / 1.0f)
        inputImage.putFloat(Color.blue(pixel) / 1.0f)
    }
}
imageBitmap.recycle()
val modelOutput = ByteBuffer.allocateDirect(outputSize).order(ByteOrder.nativeOrder())
interpreter.run(inputImage, modelOutput)
modelOutput.rewind()
val probs = modelOutput.asFloatBuffer()
success(ImageProcessResult.Success(probs))
Kindly help me. I need it soon. Any help is appreciated.
You are resizing the bitmap to [256,256] in the Android platform.
Even the slightest change in the input vector would change the output vector. When you resize the bitmaps, you change the input vector. However, if the model is general enough, the final result, which would be the argmax of the output vector (in classification), would be the same.
In the case of Siamese, I believe it won't affect the final result (similarity score) in a meaningful way if the model is not overfitted.
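One more thing worth double-checking besides the resize: whatever pixel scaling or normalization the Colab pipeline applies has to be reproduced exactly on the device, since the code above feeds raw 0-255 values. A hedged sketch with a single, explicit normalization step (the /255f is only an example and has to match whatever the Python preprocessing actually does):

import android.graphics.Bitmap;
import android.graphics.Color;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: fill the input buffer with one explicit normalization step.
// The 1/255f scale is only an example; it must mirror the Python-side preprocessing.
static ByteBuffer bitmapToInput(Bitmap bitmap) {
    ByteBuffer input = ByteBuffer.allocateDirect(256 * 256 * 3 * 4)
            .order(ByteOrder.nativeOrder());
    for (int y = 0; y < 256; y++) {
        for (int x = 0; x < 256; x++) {
            int pixel = bitmap.getPixel(x, y);
            input.putFloat(Color.red(pixel) / 255f);
            input.putFloat(Color.green(pixel) / 255f);
            input.putFloat(Color.blue(pixel) / 255f);
        }
    }
    input.rewind();
    return input;
}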

How can I input an N-dimensional input into a TensorFlow Lite model on Android?

I created a Tensorflow model which takes a single 700x700 48-dimension "image" as an input (input shape is {1, 700, 700, 48}).
To do so, I used NumPy's numpy.concatenate([array_of_images], -1), where array_of_images is an array of 16 700x700 JPEG images.
I converted the model to Tensorflow Lite and I'm running it on Android.
No conversion errors or anything - all ops are valid and supported.
My question is - where in Android (or how) can I create an N-dimensional object (or container) and use it as an input to the model?
I think you have 16 RGB images.
On Android, you load your bitmaps into image tensors like this:
val bitmap1: Bitmap = ... // load the bitmap from wherever you get it
val tImage1 = TensorImage(DataType.FLOAT32)
tImage1.load(bitmap1)
for each image, then:
val inputs = arrayOf(tImage1.buffer, tImage2.buffer, ..., tImage16.buffer)
interpreter.runForMultipleInputsOutputs(inputs, outputs) // outputs is a Map<Int, Any> keyed by output index
I'm not sure, but this can give you an idea.
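Since the question describes one {1, 700, 700, 48} input (the 16 RGB images concatenated along the channel axis) rather than 16 separate inputs, another option is to pack everything into a single direct ByteBuffer yourself and pass that to Interpreter.run. A sketch, assuming float32 input and NHWC (channels-last) layout, and keeping in mind the buffer is roughly 90 MB at this size:

import android.graphics.Bitmap;
import android.graphics.Color;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: pack 16 RGB bitmaps (700x700) into one {1, 700, 700, 48} float buffer,
// channels last, mirroring numpy.concatenate([...], -1).
static ByteBuffer packImages(Bitmap[] images) {            // images.length == 16
    ByteBuffer buffer = ByteBuffer.allocateDirect(700 * 700 * 48 * 4)
            .order(ByteOrder.nativeOrder());
    for (int y = 0; y < 700; y++) {
        for (int x = 0; x < 700; x++) {
            for (Bitmap image : images) {                  // 16 * 3 = 48 channels per pixel
                int pixel = image.getPixel(x, y);
                buffer.putFloat(Color.red(pixel));         // add your own normalization here
                buffer.putFloat(Color.green(pixel));
                buffer.putFloat(Color.blue(pixel));
            }
        }
    }
    buffer.rewind();
    return buffer;
}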

How to draw bounding boxes around classified objects using tensorflow lite?

I would like to know if it is possible to draw bounding boxes using TensorFlow Lite. I have been able to draw them using tensorflow-android in version 1.12, but I have no example of drawing bounding boxes in TensorFlow Lite.
In the code below you can see how I get the outputLocations with tensorflow-android 1.12, which is working.
inferenceInterface.run(outputNames, logStats);
LOGGER.d("End Section run " + System.currentTimeMillis());
Trace.endSection();
// Copy the output Tensor back into the output array.
Trace.beginSection("fetch");
LOGGER.d("Begin Section fetch " + System.currentTimeMillis());
outputLocations = new float[MAX_RESULTS * 4];
outputScores = new float[MAX_RESULTS];
outputClasses = new float[MAX_RESULTS];
outputNumDetections = new float[1];
inferenceInterface.fetch(outputNames[0], outputLocations);
It would be great if you could tell me how to get outputLocations using runInference() from tensorflow-lite instead.
If you use object detection models such as the following: http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_quantized_300x300_coco14_sync_2018_07_18.tar.gz
The output tensors already contain the output locations, scores, classes, etc.
You can follow an example similar to the Android Java sample app:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/android/app/src/main/java/org/tensorflow/demo/TFLiteObjectDetectionAPIModel.java
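With an SSD model like the one above, the TF Lite counterpart of the inferenceInterface.fetch() calls is runForMultipleInputsOutputs with one pre-allocated array per output tensor. A sketch along the lines of the linked TFLiteObjectDetectionAPIModel; the output order (locations, classes, scores, count) is what that sample assumes, so verify it against your own model:

import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import org.tensorflow.lite.Interpreter;

// Sketch: read boxes, classes, scores and detection count from an SSD TFLite model.
static float[][][] detect(Interpreter interpreter, ByteBuffer inputImage, int maxResults) {
    float[][][] outputLocations = new float[1][maxResults][4]; // [ymin, xmin, ymax, xmax], normalized 0..1
    float[][] outputClasses = new float[1][maxResults];
    float[][] outputScores = new float[1][maxResults];
    float[] numDetections = new float[1];

    Object[] inputs = {inputImage};
    Map<Integer, Object> outputs = new HashMap<>();
    outputs.put(0, outputLocations);
    outputs.put(1, outputClasses);
    outputs.put(2, outputScores);
    outputs.put(3, numDetections);

    interpreter.runForMultipleInputsOutputs(inputs, outputs);
    return outputLocations; // scale each box to the view size before drawing
}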

Image pre-processing parameters for tensorflow models

I have a basic question about how to determine the image pre-processing parameters, like IMAGE_MEAN and IMAGE_STD, for various TensorFlow pre-trained models. The Android sample applications for TensorFlow provide these parameters for a certain inception_v3 model in ClassifierActivity.java (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/ClassifierActivity.java), as shown below:
"If you want to use a model that's been produced from the TensorFlow for Poets codelab, you'll need to set IMAGE_SIZE = 299, IMAGE_MEAN = 128, IMAGE_STD = 128"
How do I determine these parameters for other TF models?
Also, while converting a TF model to a Core ML model to be used on iOS, there are additional image pre-processing parameters that need to be specified (like red_bias, green_bias, blue_bias and image_scale), as shown in the code segment below. The parameters below are for the inception_v1_2016.pb model. If I want to use another pre-trained model, like ResNet50, MobileNet, etc., how do I determine these parameters?
tf_converter.convert(tf_model_path = 'inception_v1_2016_08_28_frozen.pb',
mlmodel_path = 'InceptionV1.mlmodel',
output_feature_names = ['InceptionV1/Logits/Predictions/Softmax:0'],
image_input_names = 'input:0',
class_labels = 'imagenet_slim_labels.txt',
red_bias = -1,
green_bias = -1,
blue_bias = -1,
image_scale = 2.0/255.0
)
Any help will be greatly appreciated
Unfortunately, the preprocessing requirements of various ImageNet models are still under-documented. ResNet and VGG models both use the same preprocessing parameters. You can find the biases for each of the color channels here:
https://github.com/fchollet/deep-learning-models/blob/master/imagenet_utils.py#L11
The preprocessing for Inception_V3, MobileNet, and other models can be found in the individual model files of this repo: https://github.com/fchollet/deep-learning-models
When converting to Core ML you always need to specify preprocessing biases on a per channel basis. So in the case of a VGG-type preprocessing, you can just copy each channel's biases directly from the code linked to above. It's super important to note that the biases are applied (added) BEFORE scaling. You can read more about setting the proper values here: http://machinethink.net/blog/help-core-ml-gives-wrong-output/
The conversion code you posted looks good for MobileNet or Inception_V3 models, but would not work for VGG or ResNet. For those you'd need:
tf_converter.convert(...
red_bias=-123.68,
green_bias=-116.78,
blue_bias=-103.94
)
No scaling is required.
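Regarding the Android side of the question: IMAGE_MEAN and IMAGE_STD are simply applied per channel when the bitmap pixels are converted to floats, so the values to use are whatever normalization the model was trained with (128/128 maps 0..255 roughly to -1..1, which matches the TensorFlow-for-Poets values quoted above). A sketch of how ClassifierActivity-style code applies them:

// Sketch: how IMAGE_MEAN / IMAGE_STD are applied when feeding the model.
// For a TensorFlow-for-Poets model: imageSize = 299, imageMean = 128, imageStd = 128.
static float[] preprocess(int[] argbPixels, float imageMean, float imageStd) {
    float[] floatValues = new float[argbPixels.length * 3];
    for (int i = 0; i < argbPixels.length; i++) {
        int pixel = argbPixels[i];
        floatValues[i * 3 + 0] = (((pixel >> 16) & 0xFF) - imageMean) / imageStd; // R
        floatValues[i * 3 + 1] = (((pixel >> 8) & 0xFF) - imageMean) / imageStd;  // G
        floatValues[i * 3 + 2] = ((pixel & 0xFF) - imageMean) / imageStd;         // B
    }
    return floatValues;
}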
