I've trained a TensorFlow model which takes several seconds per action on my RTX 2080 (in addition to 20-30 seconds to initialise the model).
I've been looking into turning this into an iOS/Android app running on TensorFlow Lite. Apart from the technical challenge of converting the model to a TensorFlow Lite model and everything else,
I am wondering about the feasibility of running this on phone hardware at all: even on a reasonably modern phone with a built-in GPU, would this still likely be too slow for practical purposes?
Can anyone who has built an iOS/Android app with TensorFlow Lite, where the phone is responsible for the computation, comment on performance and other practical considerations?
The only other option, having requests served by my own server(s) on AWS for example, would turn into a major expense if the app saw significant use.
I have created an image recognition app in Swift using TensorFlow Lite and did not find any performance issues with it. Prediction took anywhere from 3 to 5 seconds consistently, which I personally think is not that bad. So I would suggest going ahead with your app using a TF model. Thanks!
Related
I was checking the feasibility of running a number of computer vision models on Android devices simultaneously, but could not find any resources on it.
I have two computer vision models: one classifies images into about 20 classes, and the other, which I want to integrate, is an image depth-map model. Both of them will run in real time. So I want to know: will they be able to run on Android devices limited to 1 GB of RAM?
My own experience is that this is highly dependent on the hardware as well as on the neural network architectures themselves and the optimization methods applied to them, so there is no exact answer to your question.
However, I can tell you that 10 FPS will be very hard to achieve, especially with 1 GB of RAM on what I assume is an old device. I have a TFLite YOLOv5s model, quantized to INT8, to which I pass 320x320 images; running on an old Samsung tablet from 2015 (3 GB of RAM), it achieves an average inference time of 0.47 seconds. And that is only one model.
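If you want to reproduce this kind of measurement for your own model, here is a minimal sketch using the TFLite Python interpreter. The model file name is a placeholder, and desktop timings won't match a tablet's, but the Android bindings wrap the same interpreter:

```python
import time
import numpy as np
import tensorflow as tf

# Load a quantized TFLite model (placeholder file name).
interpreter = tf.lite.Interpreter(model_path="yolov5s-int8.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()

# A dummy input matching the model's expected shape and dtype.
dummy = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

times = []
for _ in range(20):
    interpreter.set_tensor(input_details[0]["index"], dummy)
    start = time.perf_counter()
    interpreter.invoke()
    times.append(time.perf_counter() - start)

print(f"mean inference time: {sum(times) / len(times):.3f} s")
```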
I'm working on an Android app that basically detects and recognizes text (OCR) in pictures taken by users.
I'm using OpenCV v4.3 and Tesseract v4, and because most of the OpenCV docs are in C++ and Python, I try things out in Python before implementing them in Java on Android.
So for EAST it takes precisely 1.6 seconds to execute in Python, but in the Android app it takes a whole lot longer (I haven't measured it yet).
I have been thinking of using either multithreading or an AsyncTask to process the bounding boxes in parallel (execution time is 1 second in Python), but since I'm new to mobile app development and computer vision, I wanted to do some research and testing first and take advice from the Stack Overflow community.
Thanks.
Code used in Python : https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/
Code used in Java : https://gist.github.com/berak/788da80d1dd5bade3f878210f45d6742
After a long search I finally discovered TFLite, and then thought about converting the EAST detector model to the TFLite format. It might be a little slower, but it's progress; refer to: Converting EAST to TFLite
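In case it helps anyone else, converting a frozen EAST .pb to TFLite looks roughly like the sketch below. The tensor names are the ones commonly reported for the pyimagesearch frozen graph, but verify them against your own model before relying on this:

```python
import tensorflow as tf

# Convert the frozen EAST graph to TFLite. The input/output tensor names
# below are assumptions based on the commonly distributed .pb file.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_east_text_detection.pb",
    input_arrays=["input_images"],
    output_arrays=["feature_fusion/Conv_7/Sigmoid",   # score map
                   "feature_fusion/concat_3"],        # geometry map
    input_shapes={"input_images": [1, 320, 320, 3]},
)
tflite_model = converter.convert()

with open("east.tflite", "wb") as f:
    f.write(tflite_model)
```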
Now I'm beginning to consider training my own model for text detection; I'll keep this thread updated for others in the future.
I want to run a neural network on mobile. Currently, I am exploring the MXNet (http://mxnet.io) framework for deploying it (only for inference). As I am concerned about execution-time performance on mobile, I want to know whether it runs on the GPU of mobile phones (Android/iOS). The documentation mentions that it can use multiple CPUs as well as GPUs for training, but it is still not clear whether it can use the GPU of a mobile phone for inference. It mentions a dependency on BLAS, which suggests it uses the CPU on mobile. Could anyone please tell me if I can use a mobile GPU with MXNet for inference? If not, what are my other options?
UPDATE: The Neural Networks API is now available on Android devices starting from API level 27 (Oreo 8.1). The API provides a lower-level facility that a higher-level machine learning framework (e.g. TensorFlow, Caffe) can use to build models. It is a C-language API that can be accessed through the Android Native Development Kit (NDK).
NNAPI gives hardware vendors a Service Provider Interface (SPI) to provide drivers for computational hardware such as Graphics Processing Units (GPUs) and Digital Signal Processors (DSPs). As a result, the NNAPI provides an abstraction for high performance computation. There is a CPU fallback in case no hardware acceleration drivers are present.
For those wanting to implement a machine learning model on Android, the framework of choice is now TensorFlow Lite. TensorFlow Lite for Android can run on top of the NNAPI, so TensorFlow models will get hardware acceleration when it is available. TensorFlow Lite also has other optimizations to squeeze more performance out of the mobile platform.
The process goes as follows (a minimal code sketch follows the list):
Develop and train your model in Keras (using the TensorFlow backend)
Or use a pretrained model
Save a "frozen" model in TensorFlow protobuf format
Use the TensorFlow Optimizing Converter to convert the protobuf into the "pre-parsed" tflite model format
See the TensorFlow Lite Developer Guide
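The TensorFlow Optimizing Converter (TOCO) command line has since been folded into the tf.lite.TFLiteConverter API, so a minimal sketch of the same train-then-convert pipeline now looks like this (with a toy model standing in for yours):

```python
import tensorflow as tf

# 1. Develop and train a model in Keras (toy architecture for illustration).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(train_images, train_labels)  # train on a desktop GPU

# 2. Convert to the TFLite flatbuffer format, with optional
#    post-training quantization for smaller, faster mobile models.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# 3. Ship the .tflite file inside the app's assets.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```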
I went through an exercise of creating a neural net application for Android using Deeplearning4j. Because Deeplearning4j is based on Java, I thought it would be a good match for Android. Based on my experience with this, I can answer some of your questions.
To answer your most basic question:
Could anyone please tell me if I can use a mobile GPU with MXNet for inference?
The answer is: No. The explanation for this follows.
It mentions a dependency on BLAS, which suggests it uses the CPU on mobile.
BLAS (Basic Linear Algebra Subprograms) is at the heart of AI computation. Because of the sheer amount of number-crunching involved in these complex models, the math routines must be optimized as much as possible. The computational firepower of GPUs makes them ideal processors for AI models.
It appears that MXNet can use ATLAS (libblas), OpenBLAS, and MKL. These are CPU-based libraries.
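You can check which BLAS backend your own NumPy build links against, and get a feel for why the backend matters, with a quick sketch like this (NumPy dispatches matrix multiplication to whatever BLAS it was compiled with):

```python
import time
import numpy as np

# Prints the BLAS/LAPACK backend NumPy was built against
# (e.g. OpenBLAS or MKL).
np.show_config()

# Time a single-precision GEMM, the core routine behind dense layers.
a = np.random.rand(2048, 2048).astype(np.float32)
b = np.random.rand(2048, 2048).astype(np.float32)

start = time.perf_counter()
c = a @ b
print(f"2048x2048 matmul took {time.perf_counter() - start:.3f} s")
```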
Currently the main (and, as far as I know, only) option for running BLAS on a GPU is cuBLAS, developed specifically for NVIDIA (CUDA) GPUs. Apparently MXNet can use cuBLAS in addition to the CPU libraries.
The GPU in many mobile devices is a lower-power chip designed for ARM architectures, and it doesn't yet have a dedicated BLAS library.
what are my other options?
Just go with the CPU. Since it's the training that's extremely compute-intensive, using the CPU for inference isn't the show-stopper you think it is. In OpenBLAS, the routines are written in assembly and hand-optimized for each CPU it can run on. This includes ARM.
Do the recognition on the server. After working on another demo application, which sent an image to a server that performed the recognition and returned results to the device, I think this approach has some benefits for the user, such as better overall response time and not having to find space to install a 100MB(!) application.
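The client side of that approach stays tiny. A hedged sketch in Python (the endpoint URL and response format are hypothetical placeholders, not a real service):

```python
import requests

# Upload an image and let the server run the model.
with open("photo.jpg", "rb") as f:
    response = requests.post(
        "https://example.com/api/recognize",  # hypothetical endpoint
        files={"image": ("photo.jpg", f, "image/jpeg")},
        timeout=10,
    )
response.raise_for_status()
print(response.json())  # e.g. {"label": "cat", "confidence": 0.93}
```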
Since you also tagged iOS, using a C++-based framework like MXNet is probably the best choice if you are trying to go cross-platform. But their method of creating one giant .cpp file and providing a simple native interface might not give you enough flexibility on Android. Deeplearning4j has solved that problem pretty elegantly by using JavaCPP, which takes the complexity out of JNI.
Right now I am trying to learn TensorFlow, but I am not sure I understand it correctly, i.e. whether TensorFlow works for what I want to do.
I have an Android app which collects data from the device and trains a model using Weka, then stores this model.
Instead of Weka, I want to use TensorFlow.
As far as I understand, I have to train the model beforehand.
Can't I train a model in the Android app using TensorFlow?
In theory you can train a model on the device. However, it generally requires huge amounts of processing power (and/or a GPU), memory (RAM) and disk space to train a model. Nobody recommends attempting to do this on a mobile device, due to the hardware and battery life constraints.
If you were doing only a limited amount of training, you might be able to do it on the device. You could also consider only training the model when the phone is plugged into a power cable and is otherwise idle (in this case you might have problems if Doze mode kicks in).
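To give a sense of what a "limited amount of training" could look like, here is a hedged Keras sketch that freezes a pretrained feature extractor and trains only the final layer, which keeps memory and compute needs modest. The shapes and class count are illustrative, and this is desktop Python, not code tested on a device:

```python
import tensorflow as tf

# Freeze a small pretrained feature extractor; only the last layer learns.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, pooling="avg")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5, activation="softmax"),  # the only trained layer
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# A handful of samples standing in for data collected on the device.
x = tf.random.uniform((32, 96, 96, 3))
y = tf.random.uniform((32,), maxval=5, dtype=tf.int32)
model.fit(x, y, epochs=3, batch_size=8)
```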
The other problem is that almost all the tutorials and code labs assume you are training the model on a powerful computer, then embedding that trained model in the application (e.g. here are some blog posts I wrote). If you do find any good examples of training a model on an Android device please share them in the comments!
I think you can run the TensorFlow APK first (it is 106 MB):
https://ci.tensorflow.org/view/Nightly/job/nightly-matrix-android/TF_BUILD_CONTAINER_TYPE=ANDROID,TF_BUILD_IS_OPT=OPT,TF_BUILD_IS_PIP=NO_PIP,TF_BUILD_PYTHON_VERSION=PYTHON2,label=android-slave/
I think if we know how TensorFlow works, we can offload the training job to a remote service such as AWS or something similar; our Android phone would just send the data and receive the result. Right?
I'm writing an Android app that involves some form of pattern recognition to count the number of similar objects in an image. The app would be designed to work with a specific type of object and would not involve machine learning.
Is on-device computation and processing feasible for such a scenario, or would it be better to send the image over to a remote server?
If the computation can be handled by the device, would a first-generation device running version 2.2 with a 528 MHz CPU and 288 MB of RAM be able to return an output within a reasonable amount of time?
It completely depends on your algorithm. There's no universal pattern recognition/image processing algorithm, even for your somewhat specific case of counting similar objects.
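That said, if the objects contrast well with the background, something as simple as thresholding plus connected components may be enough, and that kind of classical pipeline is cheap even on old hardware. A minimal non-ML sketch (the file name is a placeholder):

```python
import cv2

# Count similar objects by segmenting them from the background.
image = cv2.imread("objects.jpg", cv2.IMREAD_GRAYSCALE)
blurred = cv2.GaussianBlur(image, (5, 5), 0)

# Otsu's method picks the threshold automatically; INV assumes
# dark objects on a light background.
_, binary = cv2.threshold(blurred, 0, 255,
                          cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

num_labels, _ = cv2.connectedComponents(binary)
print(f"objects found: {num_labels - 1}")  # label 0 is the background
```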