How do I serialize an NLP classification PyTorch model - android

I am attempting to use a new NLP model within the PyTorch android demo app Demo App Git however I am struggling to serialize the model so that it works with Android.
The demonstration given by PyTorch is as follows for a Resnet model:
model = torchvision.models.resnet18(pretrained=True)
model.eval()
example = torch.rand(1, 3, 224, 224)
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("app/src/main/assets/model.pt")
However I am not sure what to use for the 'example' input with my NLP model.
The model that I am using from a fastai tutorial and the python is linked here: model
Here is the Python used to create my model (using the Fastai library). It is the same as in the model link above, but in a simplified form.
from fastai.text import *
path = untar_data('http://files.fast.ai/data/examples/imdb_sample')
path.ls()
#: [PosixPath('/storage/imdb_sample/texts.csv')]
data_lm = TextDataBunch.from_csv(path, 'texts.csv')
data = (TextList.from_csv(path, 'texts.csv', cols='text')
.split_from_df(col=2)
.label_from_df(cols=0)
.databunch())
bs=48
path = untar_data('https://s3.amazonaws.com/fast-ai-nlp/imdb')
data_lm = (TextList.from_folder(path)
.filter_by_folder(include=['train', 'test', 'unsup'])
.split_by_rand_pct(0.1)
.label_for_lm()
.databunch(bs=bs))
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
learn.fit_one_cycle(1, 1e-2, moms=(0.8,0.7))
learn.unfreeze()
learn.fit_one_cycle(10, 1e-3, moms=(0.8,0.7))
learn.save_encoder('fine_tuned_enc')
path = untar_data('https://s3.amazonaws.com/fast-ai-nlp/imdb')
data_clas = (TextList.from_folder(path, vocab=data_lm.vocab)
.split_by_folder(valid='test')
.label_from_folder(classes=['neg', 'pos'])
.databunch(bs=bs))
learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
learn.load_encoder('fine_tuned_enc')
learn.fit_one_cycle(1, 2e-2, moms=(0.8,0.7))
learn.freeze_to(-2)
learn.fit_one_cycle(1, slice(1e-2/(2.6**4),1e-2), moms=(0.8,0.7))
learn.freeze_to(-3)
learn.fit_one_cycle(1, slice(5e-3/(2.6**4),5e-3), moms=(0.8,0.7))
learn.unfreeze()
learn.fit_one_cycle(2, slice(1e-3/(2.6**4),1e-3), moms=(0.8,0.7))

I worked out how to do this after a while. The issue was that the Fastai model wasn't tracing correctly no matter what shape of input I was using.
In the end, I used another text classification model and got it to work. I wrote a tutorial about how I did it, in case it can help anyone else.
NLP PyTorch Tracing Tutorial
Begin by opening a new Jupyter Python Notebook using your preferred cloud machine provider (I use Paperspace).
Next, copy and run the code in the PyTorch Text Classification tutorial. But replace the line…
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
With…
device = torch.device("cpu")
NOTE: It caused issues tracing when the device was set to CUDA so I forced it on to the CPU. (this will slow training, but inference on the mobile will run at the same speed as it is cpu anyway)
Lastly, run the code below to correctly trace the model to allow it to be run on Android:
data = DataLoader(test_dataset, batch_size=1, collate_fn=generate_batch)
for text, offsets, cls in data:
text, offsets, cls = text.to(device), offsets.to(device), cls.to(device)
example = text, offsets
traced_script_module = torch.jit.trace(model, example)
traced_script_module.save("model.pt")
In addition, if you would like a CSV copy of the vocab list for use on Android when you are making predictions, run the following code afterwards:
import pandas as pd
vocab = train_dataset.get_vocab()
df = pd.DataFrame.from_dict(vocab.stoi, orient='index', columns=['token'])
df[:30]
df.to_csv('out.csv')
This model should work fine on Android using the PyTorch API.

Related

TFlite with delegate gpu is give wrong result

I'm trying to inference my tflite model on c++ code at an embedded device.
So, I type simple tflite inference code which is using GPU.
And I cross-compile in my PC and run at the embedded device which is running android.
However, (1) If I use delegate gpu option, c++ codes give random results.
(2) It has given same input, but the results changed every time.
(3) When I turn off the gpu option, it gives me a correct results.
When I test my tflite model in python, it gives the correct output.
So I think the model file has no problem.
Should I re-build TensorFlow-lite because I use a prebuilt .so file? or Should I change the GPU option?
I don't know what should I check more.
Please Help!
Here is my C++ code
// Load model
std::unique_ptr<tflite::FlatBufferModel> model =
tflite::FlatBufferModel::BuildFromFile(model_file.c_str());
// Build the interpreter
tflite::ops::builtin::BuiltinOpResolver resolver;
std::unique_ptr<tflite::Interpreter> interpreter;
tflite::InterpreterBuilder(*model, resolver)(&interpreter);
// set delegate option
bool use_gpu = true;
if(use_gpu)
{
TfLiteDelegate* delegate;
auto options = TfLiteGpuDelegateOptionsV2Default();
options.inference_preference = TFLITE_GPU_INFERENCE_PREFERENCE_FAST_SINGLE_ANSWER;
options.inference_priority1 = TFLITE_GPU_INFERENCE_PRIORITY_AUTO;
delegate = TfLiteGpuDelegateV2Create(&options);
interpreter->ModifyGraphWithDelegate(delegate);
}
interpreter->AllocateTensors();
// set input
float* input = interpreter->typed_input_tensor<float>(0);
for(int i=0; i<width*height*channel; i++)
*(input+i) = 1;
TfLiteTensor* output_tensor = nullptr;
// Inference
interpreter->Invoke();
// Check output
output_tensor = interpreter->tensor(interpreter->outputs()[0]);
printf("Result : %f\n",output_tensor->data.f[0]);
//float* output = interpreter->typed_output_tensor<float>(0);
//printf("output : %f\n",*(output));
Luckily, I found the answer.
This was caused by the mismatch of the NDK version. (The prebuilt so file I used is built another version)
After unifying the NDK version to 21, it was tested again, and it worked normally.
In my case same application using gpu on S10 lite was not working but on S22,S21 Ultra 5G was working.
So you can consider running on different chipset configuration on different devices with same application.

How to restore Tensorflow model from .pb file in python?

I have an tensorflow .pb file which I would like to load into python DNN, restore the graph and get the predictions. I am doing this to test out whether the .pb file created can make the predictions similar to the normal Saver.save() model.
My basic problem is am getting a very different value of predictions when I make them on Android using the above mentioned .pb file
My .pb file creation code:
frozen_graph = tf.graph_util.convert_variables_to_constants(
session,
session.graph_def,
['outputLayer/Softmax']
)
with open('frozen_model.pb', 'wb') as f:
f.write(frozen_graph.SerializeToString())
So I have two major concerns:
How can I load the above mentioned .pb file to python Tensorflow model ?
Why am I getting completely different values of prediction in python and android ?
The following code will read the model and print out the names of the nodes in the graph.
import tensorflow as tf
from tensorflow.python.platform import gfile
GRAPH_PB_PATH = './frozen_model.pb'
with tf.Session() as sess:
print("load graph")
with gfile.FastGFile(GRAPH_PB_PATH,'rb') as f:
graph_def = tf.GraphDef()
graph_def.ParseFromString(f.read())
sess.graph.as_default()
tf.import_graph_def(graph_def, name='')
graph_nodes=[n for n in graph_def.node]
names = []
for t in graph_nodes:
names.append(t.name)
print(names)
You are freezing the graph properly that is why you are getting different results basically weights are not getting stored in your model. You can use the freeze_graph.py (link) for getting a correctly stored graph.
Here is the updated code for tensorflow 2.
import tensorflow as tf
GRAPH_PB_PATH = './frozen_model.pb'
with tf.compat.v1.Session() as sess:
print("load graph")
with tf.io.gfile.GFile(GRAPH_PB_PATH,'rb') as f:
graph_def = tf.compat.v1.GraphDef()
graph_def.ParseFromString(f.read())
sess.graph.as_default()
tf.import_graph_def(graph_def, name='')
graph_nodes=[n for n in graph_def.node]
names = []
for t in graph_nodes:
names.append(t.name)
print(names)

Titanium Hyperloop access to android.os.SystemProperties

I have been trying a ( i hope) simple bit of Android hyperloop code directly within a titanium project (using SDK 7.0.1.GA and hyperloop 3).
var sysProp = require('android.os.SystemProperties');
var serialNumber = sysProp.get("sys.serialnumber", "none");
But when the app is run it reports
Requested module not found:android.os.SystemProperties
I think this maybe due to the fact that when compiling the app (using the cli) it reports
hyperloop:generateSources: Skipping Hyperloop wrapper generation, no usage found ...
I have similar code in a jar and if I use this then it does work, so I am wondering why the hyperloop generation is not being triggered, as I assume that is the issue.
Sorry should have explained better.
This is the jar source that I use, the extraction of the serial number was just an example (I need access to other info manufacturer specific data as well), I wanted to see if I could replicate the JAR functionality using just hyperloop rather that including the JAR file. Guess if it's not broke don't fix it, but was curious to see if it could be done.
So with the feedback from #miga and a bit of trial and error, I have come up with a solution that works really well and will do the method reflection that is required. My new Hyperloop function is
function getData(data){
var result = false;
var Class = require("java.lang.Class");
var String = require("java.lang.String");
var c = Class.forName("android.os.SystemProperties");
var get = c.getMethod("get", String.class, String.class);
result = get.invoke(c, data, "Error");
return result;
}
Where data is a string of the system property I want.
I am using it to extract and match a serial number from a Samsung device that is a System Property call "ril.serialnumber" or "sys.serialnumber". Now I can use the above function to do what I was using the JAR file for. Just thought I'd share in case anyone else needed something similar.
It is because android.os.SystemProperties is not class you can import. Check the android documentation at https://developer.android.com/reference/android/os/package-summary.html
You could use
var build = require('android.os.Build');
console.log(build.SERIAL);
to access the serial number.

Keras deep learning model to android

I'm developing a real-time object classification app for android. First I created a deep learning model using "keras" and I already have trained model saved as "model.h5" file. I would like to know how can I use that model in android for image classification.
You cant export Keras directly to Android but you have to save the model
Configure Tensorflow as your Keras backend.
Save model wights using model.save(filepath) (you already done this)
Then load it with one of the following solutions:
Solution 1: Import model in Tensflow
1- Build Tensorflow model
Build tensorflow model from keras model use this code (link updated)
2- Build Android app and call Tensorflow. check this tutorial and this official demo from google to learn how to do it.
Solution 2: Import model in java
1- deeplearning4j a java library allow to import keras model: tutorial link
2- Use deeplearning4j in Android: it is easy since you are in java world. check this tutorial
First you need to export the Keras model to a Tensorflow model :
def export_model_for_mobile(model_name, input_node_names, output_node_name):
tf.train.write_graph(K.get_session().graph_def, 'out', \
model_name + '_graph.pbtxt')
tf.train.Saver().save(K.get_session(), 'out/' + model_name + '.chkp')
freeze_graph.freeze_graph('out/' + model_name + '_graph.pbtxt', None, \
False, 'out/' + model_name + '.chkp', output_node_name, \
"save/restore_all", "save/Const:0", \
'out/frozen_' + model_name + '.pb', True, "")
input_graph_def = tf.GraphDef()
with tf.gfile.Open('out/frozen_' + model_name + '.pb', "rb") as f:
input_graph_def.ParseFromString(f.read())
output_graph_def = optimize_for_inference_lib.optimize_for_inference(
input_graph_def, input_node_names, [output_node_name],
tf.float32.as_datatype_enum)
with tf.gfile.FastGFile('out/tensorflow_lite_' + model_name + '.pb', "wb") as f:
f.write(output_graph_def.SerializeToString())
You just need to know the input_nodes_names and output_node_names of your graph. This will create a new folder with several files. Among them, one starts with tensorflow_lite_. This is the file you shall move to your Android device.
Then import Tensorflow library on Android and use TensorFlowInferenceInterface to run your model.
implementation 'org.tensorflow:tensorflow-android:1.5.0'
You can check my simple XOR example on Github :
https://github.com/OmarAflak/Keras-Android-XOR
If you want optimize way to do classification then i will suggest you to run inference of your model using armnn android libraries.
You have to follow few steps.
1. Install and setup arm nn libraries in ubuntu. You can take help from below url
https://github.com/ARM-software/armnn/blob/branches/armnn_19_08/BuildGuideAndroidNDK.md
Just import your model and do inference. You can take help from below url
https://developer.arm.com/solutions/machine-learning-on-arm/developer-material/how-to-guides/deploying-a-tensorflow-mnist-model-on-arm-nn/deploying-a-tensorflow-mnist-model-on-arm-nn-single-page
After compilation you will get binary which will take input and give you output
You can run that binary inside any andriod appication
It is optimize way.

Given a tensor flow model graph, how to find the input node and output node names

I use custom model for classification in Tensor flow Camera Demo.
I generated a .pb file (serialized protobuf file) and I could display the huge graph it contains.
To convert this graph to a optimized graph, as given in [https://www.oreilly.com/learning/tensorflow-on-android], the following procedure could be used:
$ bazel-bin/tensorflow/python/tools/optimize_for_inference \
--input=tf_files/retrained_graph.pb \
--output=tensorflow/examples/android/assets/retrained_graph.pb
--input_names=Mul \
--output_names=final_result
Here how to find the input_names and output_names from the graph display.
When I dont use proper names, I get device crash:
E/TensorFlowInferenceInterface(16821): Failed to run TensorFlow inference
with inputs:[AvgPool], outputs:[predictions]
E/AndroidRuntime(16821): FATAL EXCEPTION: inference
E/AndroidRuntime(16821): java.lang.IllegalArgumentException: Incompatible
shapes: [1,224,224,3] vs. [32,1,1,2048]
E/AndroidRuntime(16821): [[Node: dropout/dropout/mul = Mul[T=DT_FLOAT,
_device="/job:localhost/replica:0/task:0/cpu:0"](dropout/dropout/div,
dropout/dropout/Floor)]]
Try this:
run python
>>> import tensorflow as tf
>>> gf = tf.GraphDef()
>>> gf.ParseFromString(open('/your/path/to/graphname.pb','rb').read())
and then
>>> [n.name + '=>' + n.op for n in gf.node if n.op in ( 'Softmax','Placeholder')]
Then, you can get result similar to this:
['Mul=>Placeholder', 'final_result=>Softmax']
But I'm not sure it's the problem of node names regarding the error messages.
I guess you provided wrong arguements when loading the graph file or your generated graph file is something wrong?
Check this part:
E/AndroidRuntime(16821): java.lang.IllegalArgumentException: Incompatible
shapes: [1,224,224,3] vs. [32,1,1,2048]
UPDATE:
Sorry,
if you're using (re)trained graph , then try this:
[n.name + '=>' + n.op for n in gf.node if n.op in ( 'Softmax','Mul')]
It seems that (re)trained graph saves input/output op name as "Mul" and "Softmax", while optimized and/or quantized graph saves them as "Placeholder" and "Softmax".
BTW, using retrained graph in mobile environment is not recommended according to Peter Warden's post: https://petewarden.com/2016/09/27/tensorflow-for-mobile-poets/ . It's better to use quantized or memmapped graph due to performance and file size issue, I couldn't find out how to load memmapped graph in android though...:(
(no problem loading optimized / quantized graph in android)
Recently I came across this option directly from tensorflow:
bazel build tensorflow/tools/graph_transforms:summarize_graph
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph
--in_graph=custom_graph_name.pb
I wrote a simple script to analyze the dependency relations in a computational graph (usually a DAG, directly acyclic graph). It's so obvious that the inputs are the nodes that lack a input. However, outputs can be defined as any nodes in a graph because, in the weirdest but still valid case, outputs can be inputs while the other nodes are all dummy. I still define the output operations as nodes without output in the code. You could neglect it at your willing.
import tensorflow as tf
def load_graph(frozen_graph_filename):
with tf.io.gfile.GFile(frozen_graph_filename, "rb") as f:
graph_def = tf.compat.v1.GraphDef()
graph_def.ParseFromString(f.read())
with tf.Graph().as_default() as graph:
tf.import_graph_def(graph_def)
return graph
def analyze_inputs_outputs(graph):
ops = graph.get_operations()
outputs_set = set(ops)
inputs = []
for op in ops:
if len(op.inputs) == 0 and op.type != 'Const':
inputs.append(op)
else:
for input_tensor in op.inputs:
if input_tensor.op in outputs_set:
outputs_set.remove(input_tensor.op)
outputs = list(outputs_set)
return (inputs, outputs)

Categories

Resources