I've used local Android face detection on an Android device, but it seems quite slow and I'm not so sure on the reliability. I've also used OpenCV's face detection but only on PC, as opposed to an Android device. For Android, I'm guessing I'll have to use JavaCV (or OpenCV4Android?).
Do you know what the speed differences are between Android API's facial detection and OpenCV's facial detection? I'm sure OpenCV/JavaCV is both more efficient/faster and more accurate, but cannot confirm.
Thanks!
Suggestion: If you are looking for face detection, I suggest you use platform specific APIs like FaceDetector rather than OpenCV Java wrapper. This is since those API's would be hardware accelerated(GPU) unlike OpenCV face detection which till version 3.0 relied on CPU only.
The speed difference you perceive between desktop and mobile device should be for the difference in device hardware ( like CPU ) and not because of different libraries wrappers like JavaCV/OpenCV4Android. OpenCV is written is in C/C++. All processing intensive code is still in C/C++ and the Java libraries are just wrappers over JNI.
OpenCV4Android - OpenCV.org maintained Android Java wrapper. Recommended.
OpenCV Java - OpenCV.org maintained auto generated desktop Java wrapper.
JavaCV - Popular Java wrapper maintained by independent developer(s). Not Android specific. This library might get out of sync with OpenCV newer versions.
Related
I want to run a Neural Network on mobile. Currently, I am exploring Mxnet (http://mxnet.io) framework for deploying it (only for Inference). As I am concerned about the execution time performance on mobile, I want to know if it runs on the GPU of mobile phones (Android/iOS). The documentation mentions that it can use multiple CPUs as well as GPUs for training, but it is still not clear if it can GPU of mobile phone for inference on mobile. It mentions about dependency on BLAS, because of which it seems it uses CPU on mobile. Could anyone please tell me if I can use mobile GPU with mxnet for inference? If not, what are my other options?
UPDATE: The Neural Networks API is now available on Android devices starting from API 27 (Oreo 8.1). The API provides a lower-level facility that a higher-level machine learning framework (e.g. Tensorflow, Caffe) can use to build models. It is a C-language API that can be accessed through the Android Native Development Kit (NDK).
NNAPI gives hardware vendors a Service Provider Interface (SPI) to provide drivers for computational hardware such as Graphics Processing Units (GPUs) and Digital Signal Processors (DSPs). As a result, the NNAPI provides an abstraction for high performance computation. There is a CPU fallback in case no hardware acceleration drivers are present.
For those wanting to implement a machine learning model on Android, the framework of choice is now Tensorflow Lite. Tensorflow Lite for Android is implemented on top of the NNAPI, so Tensorflow models will get hardware acceleration when available. Tensorflow Lite has other optimizations to squeeze more performance out of the mobile platform.
The process goes as follows:
Develop and train your model on Keras (using Tensorflow backend)
Or use a pretrained model
Save a "frozen" model in Tensorflow protobuf format
Use the Tensorflow Optimizing Converter to convert the protobuf into a "pre-parsed" tflite model format
See the Tensorflow Lite Developer Guide
I went through an exercise of creating a neural net application for Android using Deeplearning4j. Because Deeplearning4j is based on Java, I thought it would be a good match for Android. Based on my experience with this, I can answer some of your questions.
To answer your most basic question:
Could anyone please tell me if I can use mobile GPU with mxnet for inference?
The answer is: No. The explanation for this follows.
It mentions about dependency on BLAS, because of which it seems it uses CPU on mobile.
BLAS (Basic Linear Algebraic Subprograms) is at the heart of AI computation. Because of the sheer amount of number-crunching involved in these complex models the math routines must be optimized as much as possible. The computational firepower of GPUs make them ideal processors for AI models.
It appears that MXNet can use Atlas (libblas), OpenBLAS, and MKL. These are CPU-based libraries.
Currently the main (and — as far as I know — only) option for running BLAS on a GPU is CuBLAS, developed specifically for NVIDIA (CUDA) GPUs. Apparently MXNet can use CuBLAS in addition to the CPU libraries.
The GPU in many mobile devices is a lower-power chip that works with ARM architectures which doesn't have a dedicated BLAS library yet.
what are my other options?
Just go with the CPU. Since it's the training that's extremely compute-intensive, using the CPU for inference isn't the show-stopper you think it is. In OpenBLAS, the routines are written in assembly and hand-optimized for each CPU it can run on. This includes ARM.
Do the recognition on the server. After working on another demo application which sent an image to the server that performed the recognition and returned results to the device, I think this approach has some benefits for the user such as better overall response time and not having to find space to install a 100MB(!) application.
Since you also tagged iOS, using a C++-based framework like MXNet is probably the best choice if you are trying to go cross-platform. But their method of creating one giant .cpp file and providing a simple native interface might not give you enough flexibility for Android. Deeplearning4j has solved that problem pretty elegantly by using JavaCPP which takes the complexity out of JNI.
I'm about to start working on an augmented reality project that will involve using the EPSON Moverio AR glasses or similar. Assuming we go for those glasses, they run on Android 4.0.4 - API 15. The app will involve near realtime analysis of video (frames from the glasses camera) for feature detection / tracking with markers and overlaying 3d objects in the 'real world' based on the markers.
So far then, on the technical side, it looks like we'll be dealing with:
API 15
OpenCV
OpenGLES2
Considering the above, I'm wondering if it's worth doing it all thru the NDK, using a single NativeActivity with the android_native_app_glue code. When I say worth it I mean performance wise.
Sure, doing it all on the C/C++ side has for instance the advantage that the code could then potentially be ported with minimal modification to run on other environments. But OpenCV does have Java bindings for Android and GL can also be used to a certain extent from Java. So I'm just wondering if performance-wise it's worth it or it would be about the same as, say, using a GLSurfaceView.
I work in augmented reality. The vast majority of applications I've seen have been native. Google recommends avoiding native application unless the gains are absolutely necessary. I think AR is one of the relatively few cases where it is necessary. The benefits I'm aware of are:
Native camera access will allow you to get a higher capture framerate. Passing the data to the Java layer considerably slows this down. Even OpenCV's non-native capture can be slower in Java because OpenCV primary maintains the data in native objects. Your capture framerate is a limiting factor on how fast you can update the pose information for your objects. Beware though, OpenCV's native camera will not work on devices running Qualcomm's MSM optimized fork of android - this includes many snapdragon devices.
Every call to an OpenGL method in Java not only has a cost related to dropping into native, and they also perform quite a few additional checks. Look through GLES20.cpp which contains the native implementation of the GLES20 class's methods. You'll see that you could bypass quite a lot of logic by using native calls. This is fine in most mobile application, but 3D rendering often gets a significant benefit from bypassing those checks and the JNI overhead. This is even more important in AR because your will already be swamping the system with CV processing.
You will very likely want your detection related code in native. OpenCV have samples if you want to see the difference between native and Java detection performance. The former will use fewer resources and be more consistent. Using a native application means that you can call your native functions without paying the cost of passing large amount of data from Java to native.
Native sensor access is more efficient and the rate says far more consistent in native thanks to the lack of garbage collection and JNI. This is relevant if you will be using IMU data interesting ways.
You may be able to build an non-native application that has most of it's code in native and runs well despite being Java based, but it is considerably more difficult.
I am using IPP for my PC application and Accelerate Framework for iOS app. Now I have to develop something for Android and I need some form of acceleration to boost the performance.
I used project ne10 but the result is not good enough. So is there anything out there I can use that is like Accelerate Framework equivalent for Android?
Thanks,
Kelvin
Take a look at renderscript
http://developer.android.com/guide/topics/renderscript/index.html
http://developer.android.com/guide/topics/renderscript/compute.html
http://android-developers.blogspot.com/2013/01/evolution-of-renderscript-performance.html
They call it Renderscript but its really OpenCL for Android
Renderscript is an Android computation engine that lets you use CPU/GPU native hardware acceleration in order to boost applications, for example in image processing and computer vision algorithms.
Is there a similar thing in iOS and Windows Phone 7/8?
The RenderScript compatibility library is designed to compile for most any posix system. It would be very easy to get it running on other platforms.
I can't speak for Windows Phone but on iOS it would be vImage running on the Accelerate Framework. Just like Renderscript, it is dynamically optimized for the CPU on the target platform.
vImage optimizes image processing by using the CPU’s vector processor.
If a vector processor is not available, vImage uses the next best
available option. This framework allows you to reap the benefits of
vector processors without the need to write vectorized code.
https://developer.apple.com/library/mac/documentation/performance/Conceptual/vImage/Introduction/Introduction.html
I can't speak for Windows Phone but on iOS it would be Apple Metal, its language specification being almost same as renderscript c99.
For iOS it is the newly introduced swift I guess.
Maybe it is worth to try it out, but I'm not an iOS developer so I can't say anything about its performance, but the demos on the WWDC looked promising. Also instead of Renderscript Swift seemes to be designed for graphics, the Renderscript soppurt for graphics has been deprecated and Renderscript turned more into a general computation framework (which of course can be used as a backend for graphic calculations).
https://developer.apple.com/library/prerelease/ios/documentation/swift/conceptual/swift_programming_language/TheBasics.html
I am now programming on Android and I wonder whether we can use GPGPU for Android now? I once heard that Renderscript can potentially execute on GPGPU in the future. But I wonder whether it is possible for us to programming on GPGPU now? And if it is possible for me to program on the Android GPGPU, where can I find some tutorials or sample programs? Thank you for your help and suggestions.
Up till now I know that the OpenGL ES library was now accelerated use GPU, but I want to use the GPU for computing. What I want to do is to accelerate computing so that I hope to use some libraries of APIs such as OpenCL.
2021-April Update
Google has announced deprecation of the RenderScript API in favor of Vulkan with Android 12.
The option for manufacturers to include the Vulkan API was made available in Android 7.0 Compatibility Definition Document - 3.3.1.1. Graphic Libraries.
Original Answer
Actually Renderscript Compute doesn't use the GPU at this time, but is designed for it
From Romain Guy who works on the Android platform:
Renderscript Compute is currently CPU bound but with the for_each construct it will take advantage of multiple cores immediately
Renderscript Compute was designed to run on the GPU and/or the CPU
Renderscript Compute avoids having to write JNI code and gives you architecture independent, high performance results
Renderscript Compute can, as of Android 4.1, benefit from SIMD optimizations (NEON on ARM)
https://groups.google.com/d/msg/android-developers/m194NFf_ZqA/Whq4qWisv5MJ
yes , it is possible .
you can use either renderscript or opengGL ES 2.0 .
renderscript is available on android 3.0 and above , and openGL ES 2.0 is available on about 95% of the devices.
As of Android 4.2, Renderscript can involve GPU in computations (in certain cases).
More information here: http://android-developers.blogspot.com/2013/01/evolution-of-renderscript-performance.html
As I understand, ScriptIntrinsic subclasses are well-optimized to run on GPU on compatible hardware (for example, Nexus10 with Mali T604). Documentation:
http://developer.android.com/reference/android/renderscript/ScriptIntrinsic.html
Of course you can decide to use OpenCL, but Renderscript is guaranteed (by Google, being a part of Android itself) to be running even on hardware which doesn't support GPGPU computation and will use any other available acceleration means supported by hardware it is running on.
There are several options: You can use OpenGL ES 2.0, which is supported by almost all devices but has limited functionality for GPGPU. You can use OpenGL ES 3.0, with which you can do much more in terms of GPU processing. Or you can use RenderScript, but this is platform-specific and furthermore does not give you any influence on whether your algorithms run on the GPU or the CPU. A summary about this topic can be found in this master's thesis: Parallel Computing for Digital Signal Processing on Mobile Device GPUs.
You should also check out ogles_gpgpu, which allows GPGPU via OpenGL ES 2.0 on Android and iOS.