My requirement is to scan a fixed object. After recognizing it, I want to highlight the object and display its corresponding pre-fed parameters, such as height, width, circumference, etc.
I want to do all of this offline, using the camera only.
Please let me know if there is any solution or suggestion for this.
I have looked at the CraftAR SDK. It seems to work for my requirement of recognizing the object, but it uses its own server for storing images, which I don't want; I want the reference image to be stored in the app itself.
Try using the TensorFlow Object Detection API. Link: TensorFlow Object Detection API
You can then customize your overall app behaviour to meet your requirements, for example showing a pop-up with the details of the detected object once you receive a callback indicating that the TensorFlow Object Detection API has detected it successfully. You can also customize the detection UI, i.e. how the app graphically highlights the detected object.
Details such as how it works offline and the resulting overall APK size can be better understood from the links given below:
1] Step by Step TensorFlow Object Detection API Tutorial — Part 1: Selecting a Model
2] How to train your own Object Detector with TensorFlow’s Object Detector API
As an overview: to detect objects offline, you should limit yourself to your own set of data/objects, which also keeps your APK size down (since you have mentioned that you have a fixed object to detect, that works in your favour). You then train that set of objects, either locally or on the cloud, using an SSD-MobileNet model. The training produces a retrained_graph.pb file (this goes into the assets folder of your Android project), which is the final artifact that detects and classifies camera frames in real time and lets you display the results (or object details) for the data/objects you provided, without any sort of internet connection. For instance, the TF Detect demo can track objects (from 80 categories) in the camera preview in real time.
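As a rough illustration of how such a frozen graph is queried, here is a minimal Python sketch. It assumes a TensorFlow 1.x environment and the standard tensor names exported by the TensorFlow Object Detection API ('image_tensor', 'detection_boxes', and so on); your retrained graph may use different names, and on Android the equivalent calls go through the TensorFlow Android inference interface rather than a Python session:

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x

# Load the frozen graph produced by training/exporting the model.
graph_def = tf.GraphDef()
with tf.gfile.GFile('retrained_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    # 'frame' stands in for a camera frame as an HxWx3 uint8 array.
    frame = np.zeros((300, 300, 3), dtype=np.uint8)
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={'image_tensor:0': frame[np.newaxis, ...]})
    # Keep only confident detections; this is where you would show
    # your pre-fed parameters (height, width, circumference, ...).
    for box, score in zip(boxes[0], scores[0]):
        if score > 0.6:
            print('detected object at', box, 'score', score)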
For further reference follow these links:
1] Google Inception Model
2] Tensorflow Object Detection API Models
3] Speed/Accuracy Trade-offs for Modern Convolutional Object Detectors
You can also optimize (or compress) retrained_graph.pb into optimized_graph.pb, since the graph is the main thing that increases your APK size. A while ago, when I tried detecting 5 different objects (using TF Classify), each object's folder contained about 650 photographs and the overall size of the 5 folders together was about 230 MB, yet my retrained_graph.pb was only 5.5 MB (and it can be optimized further into optimized_graph.pb, reducing its size even more).
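As a rough sketch of that optimization step (assuming TensorFlow 1.x; the 'input' and 'final_result' node names come from the TensorFlow for Poets retraining flow and may differ for your graph):

import tensorflow as tf  # assumes TensorFlow 1.x
from tensorflow.python.tools import optimize_for_inference_lib

# Load the frozen retrained graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile('retrained_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Strip training-only nodes so the graph is smaller and faster on mobile.
optimized = optimize_for_inference_lib.optimize_for_inference(
    graph_def,
    ['input'],         # input node name (model dependent)
    ['final_result'],  # output node name (model dependent)
    tf.float32.as_datatype_enum)

with tf.gfile.GFile('optimized_graph.pb', 'wb') as f:
    f.write(optimized.SerializeToString())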
To start learning this from the beginner's level, I would suggest you go through these codelab links once and understand how these two projects work, as I did myself.
1] TensorFlow For Poets
2] TensorFlow For Poets 2: Optimize for Mobile
Wishing you good luck.
The link below to the TensorFlow GitHub repository (master) includes almost everything:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android
Related
I've tried several ways to create an Android app that classifies (with TensorFlow) whether the image in the camera view IS a document of my type OR NOT.
I decided to create my own custom model for this classification task.
First I tried the CustomVision helper. I created a 'model.pb' file and trained it on 100 correct images of my document and 50 images of the document with mistakes (I know that's a very small number of images, but that's all I have at the moment). As output I got 'model.pb' and 'labels' ('correct' and 'invalid') files. I put them into the Android example (custom vision sample) and it works very poorly: the app always says that whatever I see on the camera screen (people, desks, windows, nature...) is my CORRECT document label. Only sometimes, if I catch a document with wrong stamps on the camera screen, do I get the INVALID label.
So I decided to use a more complex model and simply re-train it.
I used this tutorial to get the model and train it (TensorFlow for Poets codelab). But the situation is the same: everything in the camera view is detected as 'correct', and only sometimes (when I point the camera at the document at a wrong angle or at a partial document) as 'invalid'.
SO MY QUESTION IS:
What am I doing wrong conceptually? Maybe I am training the models in the wrong way? Or can TensorFlow models not be used for the goal of detecting documents on screen?
I have developed a model for classification which uses pre-trained GloVe embeddings (trained on Wikipedia) as the lookup table for the words in a sentence. As the size of this embedding is large (around 2.5 GB), I am passing it as a placeholder rather than a variable.
EMBEDDING_MATRIX = tf.placeholder(tf.float32, [None, embedding_dimension], name='EMBEDDING_MATRIX')
While training, I am passing that value in the feed dict:
embeddings = np.load('glove_wiki.npy')
feed_dict = {EMBEDDING_MATRIX: embeddings}
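For context, here is roughly how the placeholder is consumed as a lookup table (the word_ids tensor, vocabulary size, and dimensions below are simplified and illustrative, not the exact code from my model):

import numpy as np
import tensorflow as tf  # assumes TensorFlow 1.x

embedding_dimension = 300  # illustrative
EMBEDDING_MATRIX = tf.placeholder(
    tf.float32, [None, embedding_dimension], name='EMBEDDING_MATRIX')

# Integer word ids for a batch of sentences (batch_size x sentence_length).
word_ids = tf.placeholder(tf.int32, [None, None], name='word_ids')

# Look up the embedding vector for every word id.
word_vectors = tf.nn.embedding_lookup(EMBEDDING_MATRIX, word_ids)

with tf.Session() as sess:
    embeddings = np.random.rand(10000, embedding_dimension).astype(np.float32)
    vectors = sess.run(word_vectors, feed_dict={
        EMBEDDING_MATRIX: embeddings,
        word_ids: [[1, 5, 42]],
    })
    print(vectors.shape)  # (1, 3, 300)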
Now I have created a .pb version of the model using TensorFlow graph freezing. Things work fine on a computer.
Now I want to integrate this model with TensorFlow Lite, and I know a .pb file is the way to go for that. But how will I use these huge GloVe embeddings on a mobile device? That is not possible, right? So I was wondering if there is any solution for this. I believe most of the latest models are more advanced and also use word embeddings as a lookup table. Rather than providing this model as an API in the cloud, how can I integrate it into a mobile device? Google Smart Reply also uses an LSTM model, I guess. Any suggestions will be appreciated.
I am new to Android and ARToolkit. I have to develop an Android application that can augment and render 3D models built from CT scan images in DICOM format on a detected marker. I am using the ARToolkit SDK for this purpose, but I don't know how to proceed with the DICOM files and render the 3D model on the marker. Could someone please suggest an approach? Any sort of help will be highly appreciated.
Thanks
I recommend the following process:
Figure out a tool for segmentation. This is the process whereby you build a 3D model of a subset of the data depending on density; for example, you would build a model of the ribs from a chest CT. You should do this outside of Android first and figure out how to move it over later. You can use tools like ITK and VTK to learn how to do this stage.
If you want to avoid the ITK/VTK learning curve, use GDCM (Grassroots DICOM) to learn how to load a DICOM series. With this approach you can have a 3D array of density values in your app in a few hours (a short sketch after these steps shows the idea). At that point you can forget about DICOM and just work with the numbers. You still have the segmentation problem, though.
You can look at the NIH app ImageVis3D, which has source code, and see what their approach is.
Once you have a segmented dataset, conversion to a standard format is not too hard and you will be on your way.
What is the 'detected marker' you refer to? If you have a marker in the image set to aid segmentation, you can work on detecting it from the 3D dataset you get back from loading the DICOM data.
Once you have the processes worked out, you can then see how to apply it all to Android.
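To make the DICOM-loading step above concrete, here is a minimal Python sketch using SimpleITK (which wraps ITK and uses GDCM for DICOM I/O); the directory path and the density threshold are illustrative, and the equivalent logic would later be ported to your Android pipeline:

import numpy as np
import SimpleITK as sitk

# Read a DICOM series (one CT study) from a folder into a single 3D volume.
reader = sitk.ImageSeriesReader()
dicom_files = reader.GetGDCMSeriesFileNames('/path/to/dicom/series')  # illustrative path
reader.SetFileNames(dicom_files)
volume = reader.Execute()

# Work with the raw density values as a numpy array (z, y, x).
voxels = sitk.GetArrayFromImage(volume)
print(voxels.shape, voxels.dtype)

# Very crude "segmentation" by density: keep voxels above a Hounsfield
# threshold (roughly bone). Real segmentation is more involved than this.
bone_mask = voxels > 300
print('bone voxels:', int(bone_mask.sum()))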
It seems a little old, but it's recommended for a start: Android OpenGL .OBJ file loader
I was also wondering about building a CustomView to address your needs, since in a CustomView you can display anything.
I use ffmpeg to play a video stream on a SurfaceView in my Android project. Now I would like to implement the following feature:
1) Select one object by drawing a red rectangle on the SurfaceView.
2) Send x, y, width, height of the selected object and the original video frame to OpenCV.
3) Then OpenCV returns the new x and y of the object by processing the new video frame.
Has anybody done this before? It would be very nice of you to give me some suggestions, or to tell me where I can download demo source code. Thank you so much.
For part (1), try searching Google a little more. It won't be hard to find a tutorial that uses touch input, a tutorial to draw a rectangle, and a tutorial to draw over the SurfaceView. Part (2) is done just by how you set up and define your variables - there isn't a specific mechanism or function that "sends" the data over.
Part (3) is the part that isn't obvious, so that's the part I'll focus on. As with most problems in computer vision, you can solve object tracking in many ways. In no particular order, what comes to mind includes:
Optical Flow - Python OpenCV examples are here
Lucas-Kanade - the algorithm compares extracted features frame-by-frame. [The features are Shi-Tomasi by default, but can also be BRIEF, ORB, SIFT/SURF, or any others.] This runs quickly enough if the number of features is reasonable [within an order of magnitude of 100]. A minimal sketch follows this list.
Dense [Farneback] - the algorithm compares consecutive frames and produces a dense vector field of motion direction and magnitude.
Direct Image Registration - if the motion between frames is small [about 5-15% of the camera's field of view], there are functions that can map the previous image to the current image quickly and efficiently. However, this feature is not in the vanilla OpenCV package - you need to download and compile the contrib modules, and use the Android NDK. If you're a beginner with Java and C++, I don't recommend using it.
Template Matching [example] - useful and cheap if the object and camera orientations do not change much.
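To make the Lucas-Kanade option concrete, here is a minimal Python OpenCV sketch that tracks features inside a user-selected rectangle from frame to frame; the video source, rectangle coordinates, and parameter values are placeholders and reasonable defaults rather than anything tuned:

import cv2
import numpy as np

cap = cv2.VideoCapture('video.mp4')  # placeholder source; could be a camera index
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Rectangle selected by the user (x, y, width, height) - placeholder values.
x, y, w, h = 100, 100, 80, 120
mask = np.zeros_like(prev_gray)
mask[y:y + h, x:x + w] = 255

# Shi-Tomasi corners restricted to the selected region.
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100, qualityLevel=0.3,
                             minDistance=7, mask=mask)

while True:
    ok, frame = cap.read()
    if not ok or p0 is None:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Lucas-Kanade optical flow: where did each corner move to?
    p1, st, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                           winSize=(21, 21), maxLevel=3)
    good = p1[st.flatten() == 1]
    if len(good) == 0:
        break

    # New object position: bounding box of the tracked points.
    nx, ny, nw, nh = cv2.boundingRect(good.reshape(-1, 1, 2).astype(np.float32))
    print('object roughly at', nx, ny, nw, nh)

    prev_gray, p0 = gray, good.reshape(-1, 1, 2)

cap.release()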
I'm working on a novelty detection project (in Android) using background subtraction. I would like to do simple classification after an object is detected (e.g. Person, Dog, Car, Motorbike). I have read some literature on how to do this using e.g. Bag of Words or a Cascade classifier, but I'm still not sure how to approach the task. Essentially, I want to store the novelty as an image and then label it into some sort of class (like Person, Dog, etc.), or even match it against a specific person, for example. The number of classes doesn't have to be too large - even 2 or 3. So my questions are:
1) What's the best/simplest approach to do this? Cascade/Bag of Words/other methods?
2) How can I transport this classifier to my Android app, so that after I store the novelty as an image, I can just pass it to the classifier, which then applies a label? I know Cascade classifiers have associated xml files, but is this the only way?
Sorry if the questions seem trivial, but I've had a difficult time finding literature online.
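For reference, the xml-based cascade workflow I mentioned in question 2) is the kind of thing I have in mind; the cascade file, image path, and parameters below are purely illustrative:

import cv2

# Load a trained cascade from its xml file (path is illustrative).
cascade = cv2.CascadeClassifier('person_cascade.xml')

# 'novelty.png' stands for the stored novelty image from background subtraction.
image = cv2.imread('novelty.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# detectMultiScale returns bounding boxes for regions matching the cascade.
matches = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
label = 'Person' if len(matches) > 0 else 'Unknown'
print(label, matches)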