I'm trying to use the model from this tutorial in an Android app. I wanted to modify DetectorActivity.java and TensorFlowMultiBoxDetector.java found here, but it seems I'm missing some parameters, such as imageMean, imageStd, inputName, outputLocationsName and outputScoresName.
From what I understand, inputName is the name of the model's input, and the two output names identify the position and score outputs, but what do imageMean and imageStd stand for?
I don't need to use the model with a camera; I just need to detect objects on bitmaps.
Your understanding of the input/output names is correct: they are TensorFlow node names that receive the input and hold the outputs after inference. imageMean and imageStd are used to scale the RGB values of the image to a mean of 0 and a standard deviation of 1. See the 8 lines starting from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowMultiBoxDetector.java#L208
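As a rough illustration of that scaling (a sketch only; the actual imageMean and imageStd values depend on the model, and 128.0 here is an assumption, so check the demo's defaults in the linked file):

```python
# Sketch of the per-channel scaling done before feeding the image to the
# network: each channel value v becomes (v - imageMean) / imageStd.
image_mean = 128.0  # assumed value; check the demo's actual defaults
image_std = 128.0   # assumed value

pixel_channels = [0, 128, 255]  # example RGB byte values
normalized = [(v - image_mean) / image_std for v in pixel_channels]
# 0 maps to -1.0, 128 to 0.0, 255 to just under 1.0
```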
The TensorFlow Android demo app you are referring to has since been updated and now supports MobileNets. Check it out on GitHub: commit 53aabd5cb0ffcc1fd33cbd00eb468dd8d8353df2.
I have written an Android application to analyze screenshots and extract data for further analysis.
Unfortunately, characters sometimes get extracted twice.
For example, the value 15.134.567 is detected as one line with the elements 15.134 and 4.567.
When I combine the line to get the full value, I get 151344567, which is one digit too long because the 4 has been detected twice.
This happens quite randomly; maybe 1 out of 30 values is detected incorrectly.
Is there any improvement possible?
e.g. a predefined scan area, adjusted limits, conversion to a greyscale image, or an expected data format such as ##.###.###?
Thanks a lot for your help!
[example image for OCR][1]
[1]: https://i.stack.imgur.com/PbtSx.jpg
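One cheap safeguard along the lines of the "expected data format" idea above is to validate each extracted value against the ##.###.### pattern and flag anything that fails for a re-scan. A minimal sketch (pattern and function name are my own, not from any OCR library):

```python
import re

# Expected data format ##.###.### as a regular expression:
# two digits, a dot, three digits, a dot, three digits.
PATTERN = re.compile(r"\d{2}\.\d{3}\.\d{3}")

def is_valid(value: str) -> bool:
    """Return True if the OCR result matches the expected format exactly."""
    return PATTERN.fullmatch(value) is not None
```

A doubly-detected digit makes the value one character too long, so it fails the check and can be re-scanned or discarded.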
I am new to NLP, and I am using a different pretrained model than Wav2Vec2.
I am now playing with create_wav2vec2.py provided by PyTorch.
https://github.com/pytorch/android-demo-app/blob/master/SpeechRecognition/create_wav2vec2.py
I load the pretrained model from Hugging Face, but during the sanity check the transcribed text is wrong.
The place I changed in the code, from:
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
to:
model1 = Wav2Vec2ForCTC.from_pretrained("patrickvonplaten/wav2vec2-base-timit-demo-colab")
The correct result would be:
Result: I HAD THAT CURIOSITY BESIDE ME AT THIS MOMENT
But I got:
Result: J <pad></s>DJ<pad>F</s>DJF<pad>JBJSN JKJCJ JFJO<pad>YLJCJ L<pad>HL<pad> F<pad>F</s> JC<pad>JHKJHLRFJ<pad>
Could somebody advise what is wrong here?
Your problem is the alphabet variable in https://github.com/pytorch/android-demo-app/blob/master/SpeechRecognition/create_wav2vec2.py. You should replace it with the vocabulary from https://huggingface.co/patrickvonplaten/wav2vec2-base-timit-demo-colab/blob/main/vocab.json, using only the keys of the dict as a list.
For the <pad> token, you have to specify in the tokenizer/processor loading that you want it as <pad>.
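A minimal sketch of building the alphabet list from the keys of vocab.json (the inline JSON below is a tiny stand-in for the real file, which contains the model's full character set; sorting by index keeps the positions aligned with the model's output logits):

```python
import json

# Tiny inline stand-in for the vocab.json downloaded from the
# Hugging Face repo; the real file has the full character set.
vocab_json = '{"<pad>": 0, "<s>": 1, "</s>": 2, "a": 3, "b": 4}'
vocab = json.loads(vocab_json)

# Use only the keys of the dict, ordered by their index values:
alphabet = [token for token, idx in sorted(vocab.items(), key=lambda kv: kv[1])]
```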
I am quite new to this platform, so please be kind if my question is stupid. Currently I am trying to integrate a deep learning model using SNPE to detect human pose. The architecture of the model is as follows:
Input -> CNN layers -> split into two different sets of CNN layers -> 2 different output layers
So basically my network starts from an input and generates two different outputs (output1 and output2), but when I try to execute the network in SNPE, it seems to only have information about the output2 layer. Does anyone have an idea about this situation, and is it possible for me to get the output of output1? Thank you all in advance!
I assume you have successfully converted the model to DLC and are trying to run the network with the snpe-net-run tool. To get multiple outputs while running snpe-net-run, you need to specify the output layers (in addition to the input) in the file given to the --input_list argument.
Let's assume outputlayer1 and outputlayer2 are the names of the 2 output layers and ~/test/example_input.raw is the path of the input; then the input-list file format is as follows:
#outputlayer1 outputlayer2
~/test/example_input.raw
In the first line, # is followed by the output layer names, separated by whitespace. The next line contains the path to the input (single-input case). You can also add multiple input files, one line per iteration. If there is more than one input per iteration, whitespace should be used as the delimiter.
The general format for the input-list file is as follows:
#<output_name>[<space><output_name>]
<input_layer_name>:=<input_layer_path>[<space><input_layer_name>:=<input_layer_path>]
…
You can refer to snpe-net-run documentation for more information.
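The input-list file can also be generated programmatically; a minimal sketch using the same hypothetical layer names and input path as above:

```python
# Write an input-list file for snpe-net-run: the first line names the
# desired output layers after '#', and each following line gives the
# input(s) for one iteration.
lines = [
    "#outputlayer1 outputlayer2",
    "~/test/example_input.raw",
]
with open("input_list.txt", "w") as f:
    f.write("\n".join(lines) + "\n")
```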
This is my code for a TensorFlow Lite model imported into Android Studio:
[screenshot of the code]
And this is the output when I run the app:
[screenshot of the output]
I don't understand it. How can I get the model output?
Update:
The output is a float array of 6 elements, but what I want is the index of the largest element. I tried this code:
[screenshot of the attempted code]
Is it right? I'm getting the same output, 1, on every prediction.
Looks like your TFLite model generates a feature vector as a float array, which represents the characteristics of the image data.
See also https://brilliant.org/wiki/feature-vector/
The feature vector is usually considered intermediate data, and it often needs to be fed into an additional model for image classification or other tasks.
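That said, if the six floats turn out to be class scores rather than an intermediate feature vector, the index the questioner is after is simply the argmax. A minimal sketch (function and variable names are my own):

```python
def argmax(scores):
    """Return the index of the largest element in the score array."""
    return max(range(len(scores)), key=lambda i: scores[i])

# e.g. for a 6-element output array, the predicted class index:
prediction = argmax([0.05, 0.10, 0.60, 0.05, 0.15, 0.05])
```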
I have completed training a simple linear regression model in a Jupyter notebook using TensorFlow, and I am able to save and restore the saved variables like so:
Now I'm trying to use the model on an android application.
Following the tutorial here, I am able to get to the stage where I import the TensorFlow library like so:
Now I'm at the point where I want to give the model input data and get an output value (refer to the application flow below). However, they use a .pb file (I have no clue what this is) in their application. In the 4 files:
that I got from saving my model, I do not have a .pb file, which left me dumbfounded.
What the application does:
Predicts the SoC with a pre-trained tensorflow model using user's input value of height.
The linear regression equation used is: y = Wx + b
y - SoC
W - weight
x - height
b - bias
All variables are float values.
Android application flow:
User inputs height value in textbox, and presses "Predict" button.
Application uses the weight, bias & height values of the saved model to predict SoC.
Application displays predicted SoC in textview.
So my question is: how do I import and use my model in an Android application using Android Studio 2.3.1?
Here are my ipynb and csv data files.
I may have misunderstood the question but:
Given that the model is pre-trained, the weight and bias are not going to change, so you can simply take the W and b values calculated in the Jupyter notebook and hard-code them in a simple expression:
<soc> = -56.0719*<height> + 98.3029
There is no need to import a TensorFlow model for this.
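A minimal sketch of that hard-coded prediction, using the W and b values from the expression above (variable and function names are my own):

```python
# Hard-coded parameters from the trained linear regression y = Wx + b,
# taken from the expression in the answer above.
W = -56.0719  # weight
b = 98.3029   # bias

def predict_soc(height):
    """Predict SoC from a height value with the pre-trained linear model."""
    return W * height + b
```

The same one-liner translates directly into the Android app's button handler, which is why no TensorFlow import is needed for a single linear model.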
UPDATE
To ensure the question is answered: the *.pb file comes from freezing the checkpoint file together with the graph; refer to the second code panel in the linked tutorial for how to do this.
For an explanation of what freezing is, refer here.