I'm trying to detect the text on the digital LED display of a device like the one below and need help doing that. I tried the text detection sample that Google provides with Firebase ML Kit, but it does not perform well on-device (as opposed to in the cloud).
Please help me improve the accuracy of the on-device model.
I'm looking for suggestions on doing this the right way with ML Kit, or for alternatives that are easier than this, such as OpenCV.
How to optimise the accuracy of the on-device model?
Unfortunately, you cannot optimize the accuracy of the out-of-box API models in ML Kit. We will update the model to better recognize these images.
Are there any other alternatives?
Unless there is a library that performs at the level you are satisfied with, you will have to train your own model. You could potentially start with an open source model and do transfer learning for your own use case without too much data. You can look at Tensorflow-for-poets-2 codelab for a quick tutorial on how to do this. Another option is to look at TF Hub for reusing existing models easily.
Then to deploy your model onto the device for inference, please take a look at using your own custom model in ML Kit.
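As for the OpenCV-style alternative raised in the question: if the display happens to be a seven-segment one (an assumption, since the original image is not shown here), a trained model may not be needed at all. A minimal sketch in plain Python, assuming each digit has already been cropped and thresholded into a grid of 0/1 pixels (for example with OpenCV). The region layout, the `min_lit` threshold, and all names below are illustrative choices, not part of ML Kit or OpenCV.

```python
# Segment order used in this sketch (an arbitrary convention):
# (top, top-left, top-right, middle, bottom-left, bottom-right, bottom)
DIGITS = {
    (1, 1, 1, 0, 1, 1, 1): 0,
    (0, 0, 1, 0, 0, 1, 0): 1,
    (1, 0, 1, 1, 1, 0, 1): 2,
    (1, 0, 1, 1, 0, 1, 1): 3,
    (0, 1, 1, 1, 0, 1, 0): 4,
    (1, 1, 0, 1, 0, 1, 1): 5,
    (1, 1, 0, 1, 1, 1, 1): 6,
    (1, 0, 1, 0, 0, 1, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}

# Each region is (row_start, row_end, col_start, col_end). Rows are given in
# fourteenths of the crop height, columns in quarters of the crop width.
# These proportions are assumed and would need tuning for a real display.
REGIONS = [
    (0, 2, 1, 3),    # top
    (2, 6, 0, 1),    # top-left
    (2, 6, 3, 4),    # top-right
    (6, 8, 1, 3),    # middle
    (8, 12, 0, 1),   # bottom-left
    (8, 12, 3, 4),   # bottom-right
    (12, 14, 1, 3),  # bottom
]


def segment_states(grid, min_lit=0.5):
    """Return a 7-tuple of 0/1 flags, one per segment region of the crop."""
    height, width = len(grid), len(grid[0])
    states = []
    for r0, r1, c0, c1 in REGIONS:
        rows = range(height * r0 // 14, height * r1 // 14)
        cols = range(width * c0 // 4, width * c1 // 4)
        cells = [grid[r][c] for r in rows for c in cols]
        lit_fraction = sum(cells) / len(cells) if cells else 0.0
        states.append(1 if lit_fraction > min_lit else 0)
    return tuple(states)


def read_digit(grid):
    """Map a binarized digit crop to 0-9, or None if the pattern is unknown."""
    return DIGITS.get(segment_states(grid))


def grid_from_art(lines):
    """Helper for the demo: '#' is a lit pixel, anything else is dark."""
    return [[1 if ch == "#" else 0 for ch in line] for line in lines]


TWO = grid_from_art([
    ".##.",
    "...#",
    "...#",
    ".##.",
    "#...",
    "#...",
    ".##.",
])
print(read_digit(TWO))
```

The hard part in practice is the preprocessing (locating the display, correcting perspective, thresholding, splitting digits), which this sketch leaves out. If the display is not seven-segment, the custom-model route described above is the more realistic option.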
Related
I've been working on an Android app to use the CameraX API and I would like to create an Image Analyzer using MLKit. My main objective is to detect a certain type of document of particular size, one detection label for the front and one for the back. So I would like to train a TensorFlow Lite model that can meet my particular needs. I've already seen lots of great documentation that describes just how I can use my own model locally on the phone for my Analysis use case so that isn't an issue.
The problem is that I'm a mobile dev and I don't know much about actually training a model. I have a ton of training data, as I've gone through some great tutorials like this one, and I have a lot of images (thousands) of the documents to be classified, organized into their respective directories as described. I also followed along with this video, and it is pretty apparent that using the TensorFlow Image Classifier and retraining an already trained model is a hugely beneficial approach, but I haven't seen much information on exactly why they chose that TensorFlow Hub Lite model. What makes one better than the others, and what should I look for to pick the best one for retraining and fitting my particular needs (detecting a particular document)?
Is this the right approach? Any advice is appreciated, thanks!
I need to detect the face of an animal, say a cat, and match it to a stored photo from a database in an Android app.
Is this possible with the Firebase face detection APIs, or should I use TensorFlow Lite's object detection?
I do not think that you can detect an animal's face through the Firebase Face Detection API; I believe it is only suitable for human faces, since it is built to detect features such as facial expressions. However, if you want to detect whether there are cats in the frame of your camera, your best bet would be to use the Object Detection API. Additionally, if you train your model well, it will be able to detect whether a cat's face is present in the frame. Once you have your model, it is rather straightforward to integrate it into your app using Firebase's ML Kit. For reference, you can take a look at their quick start example.
The ML Kit Image Labeling API should get you what you are looking for. It's available for on-device or cloud processing. The on-device API can distinguish between 400+ labels, including identifying whether there's a cat or a dog in an image. If you are looking for something more specific you can build your own custom model using AutoML Vision Edge in ML Kit, or using TensorFlow Lite directly.
I want to know the feasibility of an Android app that I am going to build for my college project.
The app I am trying to build takes class attendance through voice recognition or face detection.
For this, I suppose I first need to collect a data set for all the students in the class and then train a model on it.
So, is it feasible to build such an app, and how should I approach it?
I am new to TensorFlow and ML. I also searched for this on the internet but could not find anything, so please help me out. Your help is appreciated.
You will have to train and use a custom model for this.
ML Kit offers face detection but does not offer face recognition or voice recognition at the moment. So you will have to collect data and train a model yourself. You can look at the quickstart samples for iOS and Android on GitHub and learn about using mobile-optimized custom models in your app.
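One common way to structure the recognition step, sketched here under assumptions: a custom model turns each detected face into a fixed-length embedding vector, and attendance is marked by comparing that vector against one enrolled reference vector per student. The embeddings, the student names, and the 0.8 threshold below are all made up for illustration; a real model would produce much longer vectors and the threshold would need tuning on your own data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(embedding, enrolled, threshold=0.8):
    """Return the best-matching enrolled name, or None if nobody reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical 3-dimensional embeddings, one per enrolled student.
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.9],
}

print(identify([0.8, 0.2, 0.1], enrolled))  # close to alice's reference
print(identify([0.5, 0.5, 0.5], enrolled))  # not close enough to anyone
```

Returning `None` for weak matches matters for attendance: it is usually better to ask a student to retry than to mark the wrong person present.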
I'm doing a project with the Vuforia AR SDK (for Android), using the front camera for ImageTarget recognition. The problem is that when I lay the book flat on the desk, Vuforia cannot track it (see the picture).
I know that in Android I can use a method like android.graphic.Camera.rotateX to modify the view; can I do that in Vuforia? Or is there some other way to recognize a book lying flat on the desk more quickly?
You can use their Extended Tracking feature. It should help you with tracking, but it does not improve recognition in conditions like a book lying flat.
Note that I'm very new to machine learning.
I was thinking about a real-world example of using machine learning and neural networks in an application, and I want to try it with a mobile application that can handle image recognition with the front camera after taking a picture of something (a cat, for example).
I really need advice on tools to use to rapidly build a prototype of this application with a Python backend that I will call via REST.
Thanks in advance.
If you are new to machine learning algorithms, I suggest that you use an API from Google or Microsoft to get familiar with the flow and how it works. Once you understand what the inputs and outputs are, you can try to replace the API with your own neural net, train it properly, and collect results.
Machine learning is not an easy concept, and if you start big, there is a good chance that you'll get discouraged before you finish building it. The API will give you a functional prototype very quickly and thus help you stay motivated to pursue it further.
But to answer your question more directly, TensorFlow by Google is probably the most sophisticated tool for machine learning in general right now.
There is an excellent deep learning course with TensorFlow made by Google on Udacity.
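To make the "start with an API, swap in your own net later" advice concrete, here is a rough sketch of a Python backend callable via REST, using only the standard library. Everything in it is an assumption for illustration: `classify_image` is a placeholder that returns a fixed made-up label, and the port and JSON shape are arbitrary. The idea is that the mobile app only ever talks to the endpoint, so the classifier behind it can change from a cloud vision API call to your own trained model without touching the client.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def classify_image(image_bytes):
    """Placeholder classifier (hypothetical): replace the body with a call to a
    hosted vision API or to your own model. Returns a fixed, made-up label."""
    if not image_bytes:
        return []
    return [{"label": "cat", "score": 0.5}]


def build_response(image_bytes):
    """Turn raw uploaded bytes into an (HTTP status, JSON body) pair."""
    labels = classify_image(image_bytes)
    status = 200 if labels else 400
    body = json.dumps({"labels": labels}).encode("utf-8")
    return status, body


class ClassifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The app sends the image as the raw request body.
        length = int(self.headers.get("Content-Length", 0))
        status, body = build_response(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the prototype server until interrupted (not called automatically)."""
    HTTPServer(("127.0.0.1", port), ClassifyHandler).serve_forever()
```

Calling `serve()` starts the server locally; the app would then POST the captured image bytes to it and read the `labels` list from the JSON reply. For anything beyond a prototype, a proper web framework and input validation would be the next step.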
You can follow PyImageSearch. It has a lot of material on image processing, such as face recognition and license plate recognition systems. It also uses neural networks.
Use an image recognition API, like Google Vision.
It is easy and fast to put into an application, and a lot more effective if you do not have experience and resources in ML.
I have done something similar for our company website. It is based on Caffe, though.
You can go through the source code here
However, it is a segmentation demo. You need to modify it a little.