I have checked a similar question to this one. Here is my situation:
I want to record a video with the Android camera.
After that, I want to use a library to remove the background, which is a chroma-key (green-screen) background.
At first I thought I should use the Android NDK in order to escape the SDK heap limitation and use all of the available memory.
The video is short, only a few seconds, so maybe the SDK can handle it.
I would prefer an SDK implementation with android:largeHeap="true" set, to avoid problems with mismatched .so file architectures.
Any library suggestions for the SDK or the NDK, please?
IMO you should prefer an NDK-based solution, since video processing is a CPU-intensive operation and Java code won't give you better performance. Moreover, the most popular and reliable media-processing libraries are usually written in C or C++.
I'd recommend you take a look at FFmpeg. It offers rich capabilities for dealing with multimedia. The chromakey filter can help you remove a green background (or whatever color you want). Then you can use another video as the new background, if needed. See the blend filter docs as well.
Filters are a nice and powerful concept. They can be used either via the ffmpeg command-line tool or via the libavfilter API. For the former, you should find an ffmpeg binary compiled for Android and run it with the traditional Runtime.exec(). For the latter, you need to write native code that creates the proper filter graph and performs the processing; this code must be linked against the FFmpeg libraries.
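To make the Runtime.exec() route concrete, here is a minimal sketch in Java. It assumes an ffmpeg binary has been bundled with the app; the paths and the filter parameters are illustrative, not a tested recipe:

// Composite the recorded clip over a new background with chromakey + overlay.
// ffmpegPath and all file paths below are assumptions for this sketch; the
// caller should also handle IOException/InterruptedException.
String ffmpegPath = context.getFilesDir() + "/ffmpeg";  // hypothetical binary location
String[] cmd = {
    ffmpegPath,
    "-i", "/sdcard/recorded.mp4",    // clip shot against a green screen
    "-i", "/sdcard/background.mp4",  // replacement background
    "-filter_complex",
    "[0:v]chromakey=green:0.1:0.2[fg];[1:v][fg]overlay[out]",
    "-map", "[out]",
    "/sdcard/result.mp4"
};
Process p = Runtime.getRuntime().exec(cmd);
if (p.waitFor() != 0) {
    // ffmpeg logs to stderr; read p.getErrorStream() to see why it failed
}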
What I have: A trained recurrent neural network in Tensorflow.
What I want: A mobile application that can run this network as fast as possible (inference mode only, no training).
I believe there are multiple ways to accomplish my goal, but I would like your feedback/corrections and additions, because I have never done this before.
Tensorflow Lite. Pro: straightforward, available on Android and iOS. Con: probably not the fastest method, right?
TensorRT. Pro: very fast, and I can write custom C code to make it faster. Con: it's used for Nvidia devices, so there's no easy way to run it on Android and iOS, right?
Custom code + libraries like OpenBLAS. Pro: probably very fast, and I can link to it on Android and iOS (if I am not mistaken). Con: is there much support for recurrent neural networks? Does it really work well on Android + iOS?
Re-implement everything. I could also rewrite the whole computation in C/C++, which shouldn't be too hard for recurrent neural networks. Pro: probably the fastest method, because I can optimize everything. Con: it will take a long time, and if the network changes I have to update my code as well (although I am willing to go this way if it really is the fastest). Also, how fast can I make calls to C/C++ libraries on Android? Am I limited by the Java interfaces?
Some details about the mobile application: it will take a sound recording from the user, do some processing (like speech-to-text), and output the text. I do not want a solution that is merely "fast enough"; I want the fastest option, because this will run over very large sound files, so almost every speed improvement counts. Do you have any advice on how I should approach this problem?
Last question: if I try to hire somebody to help me out, should I look for an Android/iOS, embedded, or Tensorflow type of person?
1. TensorflowLite
Pro: it uses GPU optimizations on Android; it's fairly easy to incorporate into a Swift/Objective-C app, and very easy into Java/Android (just add one line to build.gradle); you can also transform a TF model to CoreML.
Cons: if you use the C++ library, you will have issues adding TFLite to your Android/Java app via JNI (there is no native way to build such a library without JNI); and there is no GPU support on iOS (the community is working on MPS integration, though).
Also, here is a reference to a TFLite speech-to-text demo app; it could be useful.
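For reference, inference with the TFLite Java API is roughly this small (a sketch, not a drop-in: the model path, the tensor shapes, and the exact dependency line are placeholders):

// build.gradle: implementation 'org.tensorflow:tensorflow-lite:+'   <- the "one line"
import org.tensorflow.lite.Interpreter;
import java.io.File;

File modelFile = new File(context.getFilesDir(), "model.tflite");  // hypothetical location
try (Interpreter tflite = new Interpreter(modelFile)) {
    float[][] input = new float[1][128];    // shape must match the model's input tensor
    float[][] output = new float[1][64];    // shape must match the model's output tensor
    tflite.run(input, output);              // single-input/single-output inference
}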
2. TensorRT
TensorRT uses cuDNN, which uses the CUDA library. There is CUDA for Android, but I am not sure it supports the whole functionality.
3. Custom code + Libraries
I would recommend you use the Android NNAPI (Neural Networks API) and CoreML; in case you need to go deeper, you can use the Eigen library for linear algebra. However, writing your own custom code is not beneficial in the long term: you would need to support/test/improve it, which is a huge effort, more important than performance.
4. Re-implement Everything
This option is very similar to the previous one; implementing your own RNN (LSTM) should be fine, as long as you know what you are doing, just use one of the linear algebra libraries (e.g. Eigen), as in the sketch below.
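To make the "it's just linear algebra" point concrete, here is a minimal, illustrative sketch of a single LSTM step over plain Java arrays (all names are made up for this example; with Eigen in C++, the two inner loops would become one matrix-vector product):

// One LSTM step; updates h (hidden state) and c (cell state) in place.
// W has shape [4n][x.length + n] and b has length 4n, with rows ordered as
// input gate, forget gate, cell candidate, output gate.
static double sigmoid(double v) { return 1.0 / (1.0 + Math.exp(-v)); }

static void lstmStep(double[] x, double[] h, double[] c, double[][] W, double[] b) {
    int n = h.length;
    double[] zx = new double[x.length + n];   // concat of input and previous hidden state
    System.arraycopy(x, 0, zx, 0, x.length);
    System.arraycopy(h, 0, zx, x.length, n);
    double[] gates = new double[4 * n];       // pre-activations of all four gates
    for (int r = 0; r < 4 * n; r++) {
        double acc = b[r];
        for (int k = 0; k < zx.length; k++) acc += W[r][k] * zx[k];
        gates[r] = acc;
    }
    for (int j = 0; j < n; j++) {
        double i = sigmoid(gates[j]);            // input gate
        double f = sigmoid(gates[n + j]);        // forget gate
        double g = Math.tanh(gates[2 * n + j]);  // cell candidate
        double o = sigmoid(gates[3 * n + j]);    // output gate
        c[j] = f * c[j] + i * g;                 // new cell state
        h[j] = o * Math.tanh(c[j]);              // new hidden state
    }
}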
The overall recommendation would be to:
1. try to do it server-side: use some lossy compression and server-side speech2text;
2. try using Tensorflow Lite; measure performance, find bottlenecks, and try to optimize;
3. if some parts of TFLite are too slow, reimplement them as custom operations (and make a PR to Tensorflow);
4. if the bottlenecks are at the hardware level, go back to the 1st suggestion.
Maybe you should try this library; it can run on Android and iOS devices.
https://github.com/Tencent/TNN
OK So here is my story:
I am creating an app that requires me to take a couple of images and a video and merge them together. At first I had no idea what to use; I had never heard of FFmpeg or the NDK. After around 5 days of battling the NDK, switching to Ubuntu, and going crazy with ndk-build commands, I finally got FFmpeg to compile using the dolphin-player example. Now that I can run FFmpeg on my computer and my Android device, I have no idea what to do next.
Here are the main questions I have:
To use FFmpeg, I saw that I need to use some sort of commands. First off, what are these commands, and where do I run them?
Second, are the commands all I need? By that I mean, can I just run my application normally, execute the commands somewhere in it, and it will do the rest for me? Or do I need some element in the code, for example a VideoEncoder instance or something?
Third, I saw people using the NDK to use FFmpeg. Do I have to? Or is it optional? I would like to avoid using C if possible, as I don't know it at all.
OPTIONAL: Last but not least, is this the best way of handling what I need to do in my application? If so, can someone briefly guide me on how to use FFmpeg to accomplish said task (mention commands or anything like that)?
I know it's a wall of text but every question is important to me!
Thank you very much stackoverflow community!
I see my answer may no longer be relevant to your question, but I still put it here, as I've recently gone down that very same path and I understand the pain as well as the confusion caused by this matter (setting up the NDK with the mixed Gradle plugin took me 1 day, building FFmpeg took 2 days, and then: what am I supposed to do next??).
So in short, as @Daniel has pointed out, if you just want to use FFmpeg to run commands such as compressing, cutting, or inserting keyframes, then WritingMinds' prebuilt FFmpeg Android Java library is the easiest way to get FFmpeg running in your app. The downside is that since it just runs commands, it needs an input file and an output file for each process. See my question here for further clarification.
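For orientation, usage of that wrapper library looks roughly like this (the class and method names follow that project's README, so verify them against the version you actually pull in):

// Hedged sketch of the ffmpeg-android-java command wrapper; the README also
// documents a one-time loadBinary() step before the first execute() call.
FFmpeg ffmpeg = FFmpeg.getInstance(context);
ffmpeg.execute(new String[]{"-i", inputPath, "-vf", "scale=640:-1", outputPath},
        new ExecuteBinaryResponseHandler() {
            @Override public void onSuccess(String message) { /* output file is ready */ }
            @Override public void onFailure(String message) { /* inspect the ffmpeg log */ }
        });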
If you need to do more complex tasks than this, then you have no choice but to build FFmpeg as a library and call its API. I've written down step-by-step instructions that worked for me (May 2016). You can see them here:
Building FFmpeg v3.0.2 with NDK r11c (please use Ubuntu if you don't want to rebuild the whole thing; Linux Mint failed me)
Using FFmpeg in Android Studio 2.1.1
Please don't ask me to copy the whole thing here, as it's a very long set of instructions and it's easier for me to keep one source of information up to date. I hope this can save someone's keyboard ;).
1, FFmpeg can be either an app or a set of libraries. If you use it as an app (an executable binary installed somewhere), you can type the commands in a terminal. The app alone only has limited functions and may not solve your problem. In that case you need to use FFmpeg as libraries and call the APIs from your program.
2, To my understanding, the commands alone cannot solve your problem. You need to call the FFmpeg APIs. There are a bunch of sample programs for video/image encoding/decoding. You will probably also need a container to package the outcome, and the FFmpeg libraries can do that as well.
3, The NDK is what I'd prefer, since FFmpeg is written in C/C++. There are Java wrappers for FFmpeg; if you use them, the NDK is not required. However, not all FFmpeg functions are wrapped well; you may try, and if it doesn't work out, go back to the NDK solution.
4, The simplest way is to decode all your videos/images into raw frames, combine them in the desired order, and encode them. In practice, however, this consumes too much memory. The key question then becomes: how can I do the same on the fly? It's not too hard once you reach this step.
I'm working with OpenCV 2.2 for Android under Windows, and I've run into a problem when using cvCreateVideoWriter: it always returns NULL. I'm guessing it has something to do with the FFMPEG library not being properly built. The thing is that I followed the instructions at http://opencv.willowgarage.com/wiki/Android2.2, and since FFMPEG is included as a 3rd-party library (at least I can see the source within the whole OpenCV package) I thought I didn't have to do anything extra to get it installed. I might be wrong. How do I check whether the library was correctly built (or built at all)? Do I need to make any changes to the default make files?
Any help is much appreciated.
Thanks!
There are 2 important things to consider when using cvCreateVideoWriter():
Your application needs the rights to create files and write to them. Make sure you have set up the necessary directory permissions for it to do so.
The 2nd argument of the function is the FOURCC code of the codec used to compress the frames. For instance, CV_FOURCC('P','I','M','1') is the MPEG-1 codec and CV_FOURCC('M','J','P','G') defines Motion-JPEG.
A typical call may look like this:
CvVideoWriter *writer = cvCreateVideoWriter("video.avi", CV_FOURCC('M','J','P','G'), fps, size, 0);
if (!writer)  // cvCreateVideoWriter() returns NULL on failure
{
    // handle error
}
I suggest calling cvCreateVideoWriter() with different codecs; it may be that your platform doesn't support the one you are using right now.
I don't know if the default build for Android enables the HAVE_FFMPEG flag, but you need to have ffmpeg installed, and it's best to make sure this flag is enabled when compiling OpenCV.
I have been using OpenCV for some time for Android programming, and I now see that the GIMP library is much more powerful. Where can I find a starting point to learn GIMP?
I also want to know the basic concepts behind GIMP plugins. In the past, I used the C APIs in OpenCV. How could I write such code for Android?
Also, what packages do I need to install on Windows to start using GIMP?
Although GIMP does have some standalone libraries that perform some image manipulation, most image manipulation is done either by GIMP's core program or through GIMP's plug-ins. Both approaches need the entire program installed and running (though not necessarily using a display).
I know nothing about Android programming, and I don't know how one can install ordinary native C code and call it from Android apps; if you are very familiar with that, you might have a chance in your attempt.
However, GIMP itself relies on an extensive ecosystem of libraries, including, but not limited to, glib, gtk+, cairo, pango, and gegl, and each of these in turn might have other prerequisites. Since Windows does not have a working package manager to automatically install the libraries and header files of these various dependencies, working with them natively on Windows is very hard, even though the code of each of them is multiplatform and can run on Windows and other OSes. So hard, in fact, that the people who build GIMP for Windows do so in a Linux environment, from where they cross-compile GIMP for Windows.
Making all of these libraries work on Android is probably not hard if you are using the GNU ecosystem around Android's Linux kernel, and not just the bare Android environment (I don't know enough about Android to even know if that is possible).
All in all: it will be tough for you and will demand a whole lot of research.
One of GIMP's libraries, GEGL (the Generic Graphics Library), has far fewer prerequisites and can be used as an ordinary library. I think you can probably build it with just glib and babl as prerequisites. This is the library that will replace GIMP's current core and reimplement the operations of most existing plug-ins, so it might be enough for you.
If you can get GEGL running and usable from an Android system, share that with the world; it would be, in itself, a project worthy of a Google Summer of Code. (And it would still be about an order of magnitude easier than getting GIMP's code in there to be used as a library from other applications.)
Finally, if you want just a couple of GIMP's effects, and the effect in question is implemented as a plug-in, the plug-in code is quite straightforward. So, while it would be hard to get the whole GIMP environment inside Android, copying the functions that actually perform the pixel manipulation from GIMP's source tree and converting them to work in a Java method inside your app would not be hard (see the sketch below). Just remember to comply with the license in this case: GIMP's plug-in code is under the GPLv3 (the GEGL library is only LGPL).
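As an illustration of how small such a port can be, here is a hedged Java sketch of a typical per-pixel operation (a simple desaturation), the kind of loop you would transplant from a GIMP plug-in; the method name and the ARGB packing are assumptions of this example:

// Convert an ARGB pixel buffer (as returned by Bitmap.getPixels()) to grayscale in place.
static void desaturate(int[] pixels) {
    for (int idx = 0; idx < pixels.length; idx++) {
        int p = pixels[idx];
        int a = (p >>> 24) & 0xFF;
        int r = (p >>> 16) & 0xFF;
        int g = (p >>> 8) & 0xFF;
        int b = p & 0xFF;
        int y = (r * 30 + g * 59 + b * 11) / 100;  // integer luma approximation
        pixels[idx] = (a << 24) | (y << 16) | (y << 8) | y;
    }
}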
In short: no, you can't use GIMP's "libraries" as native code from an Android app. If you can use OpenCV, you have a good chance of being able to use GEGL instead; only porting the algorithms of certain plug-ins to manipulate pixels in your app would be easier.
However, if your application can delegate image processing to an internet-based server, setting up an HTTP application that receives an image, uses GIMP to process it, and streams it back would be a simple thing to do.
(So you could not apply effects in real time, but it would allow one to, for example, take a photo, select a series of effects from menus, and send it to the server for processing.)
GIMP uses quite a bit of memory when loading brushes. If you drop all of the useless plug-ins and build it from source, you may be able to get it working, but you will have to build ALL of the linked libraries directly into the executable.
In other words, build the linked libraries directly into the code as a static build. In this manner things may function properly, unless one of those linked libraries calls another linked library.
Getting the libraries themselves to work on the OS may give other programs the opportunity to use them. On the other hand, GTK+ (the GIMP Toolkit), GIMP's interface library, is rather bloated and ugly.
If all else fails, you'll simply have to settle for a smaller program with the features you're looking for (levels, curves, the clone tool, dodge and burn, etc.). Layers are also nice, but editing a large megapixel image begins to eat up memory rather quickly, and most Android devices don't have a swap partition.
I want to use the codecs in Android from my application. For now I just want to use the H.264 codec for testing, unless the MP3 or AAC codecs provide functions for sending the audio to the device's speaker, in which case I would prefer one of those.
I have the NDK installed along with Cygwin, GNU Make, and GNU Awk. I can't figure out what I need to do from here, though. I'm downloading the entire OpenCORE tree right now, but I don't even know how to build it or how to make the Eclipse plugin aware that it needs to include the files.
An example or a tutorial would be much appreciated.
EDIT:
It looks like I can use JNI like P/Invoke, which would mean I don't have to build the OpenCORE libraries myself. However, I can't find any documentation on the names of the libraries I need to load.
I'm also confused as to how to do it. I'm looking at http://www.koushikdutta.com/2009/01/jni-in-android-and-foreword-of-why-jni.html and I don't understand what the purpose of writing a library to access a library is. Couldn't you just use something like System.loadLibrary("opencore.so")?
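(For what it's worth, the wrapper library in that article seems to exist because System.loadLibrary() only loads a .so into the process; Java can only cross into native code through functions that follow the JNI naming convention, so a thin C shim normally sits between the Java code and a library like OpenCORE. A sketch of the Java side, with purely hypothetical names:)

public class NativeCodec {
    static {
        // Loads libnativecodec.so; note that loadLibrary() takes the bare name,
        // without the "lib" prefix or the ".so" suffix.
        System.loadLibrary("nativecodec");
    }
    // Implemented in C as Java_NativeCodec_decodeFrame() and exported via JNI;
    // that C function is what would in turn call into the underlying codec library.
    public static native int decodeFrame(byte[] input, byte[] output);
}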
You cannot build OpenCORE separately; it has to be built with the whole source tree. What are you trying to achieve? If you just want to play video/audio, use a VideoView or a MediaPlayer object.
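A minimal sketch of the MediaPlayer route (the file path is a placeholder):

import android.media.MediaPlayer;

// Plays a clip through the platform's built-in codecs.
MediaPlayer player = new MediaPlayer();
player.setDataSource("/sdcard/sample.mp4");  // hypothetical H.264 clip
player.prepare();   // synchronous; prefer prepareAsync() plus a listener in real apps
player.start();
// later, when playback is finished:
player.release();   // free the codec resources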
Alternatively, build the Android source and use the headers and the static libraries from it. Be aware that this will propel you straight into the zone of unsupported APIs.