I am trying to play a synthesized sound (basically 2 sine waves and some noise) using the AudioTrack class. It doesn't seem to be any different than the SourceDataLine in javax.sound.sampled, BUT the synthesis is REALLY SLOW. Even for ARM standards, it's unrealistic to think that 32768 samples (16 bit, stereo, for a total of 65536) take over 1 second to render on a Nexus 4 (measured with System.nanotime(), write to AudioTrack excluded).
The synthesis part is almost identical to this http://audioprograming.wordpress.com/2012/10/18/a-simple-synth-in-android-step-by-step-guide-using-the-java-sdk/, the only difference is that I play stereo sound (I can't reduce it to mono because it's a binaural tone).
Any ideas? what can I do?
Thanks in advance
Marko's answer seems very good. But if you're still in the experimental/investigational phase of your project, you might want to consider using Pure Data, which already is implemented as a combination Android library/NDK library and which would allow you to synthesize many sounds and interact with them in a relatively simple manner.
The libpd distribution is the Android implementation of Pure Data. Some good starting references can be found at the SoundOnSound site and also at this site.
Addendum: I found a basic but functional implementation of an Android Midi Driver through this discussion link. The relevant code can be found here (github, project by billthefarmer, named mididriver).
You can view how I use it in my Android app (imSynt link leads you to Google Play), or on YouTube.
The performance of audio synthesis on ARM is actually very respectable with native code that makes good use of the NEON unit. The Dalvik's JIT compiler is never going to get close to this level of performance for floating-point intensive code.
A look at the enormous number of soft-synth apps for iOS provides ample evidence of what should be possible on ARM devices with similar levels of performance.
However, the performance you are reporting is several orders of magnitude short of what I would expect. You might consider the following:
Double precision float-point arithmetic is particularly expensive on ARM Cortex A-x NEON units, where as single precision is very fast and highly parallelizable. Math.sin() returns a double, so is unnecessarily precise, and liable to be slow. The 24-mantissa provided by single precision floating point value is substantially larger than the 16-bit int used by the audio subsystem.
You could precompute sin(x) and then perform a table-lookup in your render loop.
There is a previous post on SO concerning Math.sin(x) on android suggesting degrading performance as x becomes large, as it's likely to in this case over time.
For a more advanced table-based synthesiser, you might consider using a DDS Oscillator.
Ultimately, you might consider using native code for synthesis, with the NDK.
You should be able render multiple oscillators with filters and envelopes and still have CPU time left over. Check your inner loops to make sure that there are no system calls.
Are you on a very old phone? You did not mention the hardware or OS version.
You might want to try using JSyn. It is a free modular Java synthesizer that runs on any Java platform including desktops, Raspberry Pi and Android.
https://github.com/philburk/jsyn
Have you tried profiling your code? It sounds like something else is possibly causing your slow down, profiling would help to highlight the cause.
Mike
Related
I'm thinking of starting a android project, which records audio signals and does some processing to denoise. My quesion is, as many (nearly all) denoising algorithms involve FFT, is it possible for me to do a real-time program? By real-time I mean the program do recording and processing at the same time, so I could save my time when I finish recording.
I have made a sample project, which applies fourier transformation to the audio signal and implement a simple algorithm called sub-spectrum. But I found that it is difficult to implement this algorithm in real time, which means after I press the 'stop' button, it takes me a while to do the processing and save the file (I'm also wondering how do these commercial recorder programs record sound and at the same time save it). I know that my FFT may not be the fastest, but I'd like to know whether I could achieve 'real-time', if I fully optimized it or use the fastest FFT code? Thanks a lot!
It sounds like you are talking about broadband denoising. So I'll address my question to that. There are other kinds of denoising, from simple filtering to adaptive filtering to dynamic range expanding and probably others.
I don't think anyone can answer this question with a simple yes or no. You will have to try it and see what can be done.
First off, there are a variety of FFT implementations, including FFTW, of varying speed you could try. Some are faster than others, but at the end of the day they are all going to deliver comparable results.
This is one place where native C/C++ will outperform Java/Dalvik code because it can truly take advantage of vector code. For that to work, you'll probably need to write some assembler, or find some code that is already android optimized. I'm not aware of an android optimized FFT, but I'm sure it exists.
The real performance win will come from how you structure your overall denoising algorithm. All denoising I'm familiar with is extremely processor intensive and probably won't work on a phone in real-time, although it might on a tablet. That's just a(n educated) guess, though.
I just added some computationally expensive code to an Android game I am developing. The code in question is a collection of collision detection routines that get called very often (every iteration of the game-loop) and are doing a large amount of computation. I feel my collision detection implementation is fairly well developed, and as reasonably fast as I can make it in Java.
I've been using Traceview to profile the code, and this new piece of collision detection code has somewhat unsurprisingly doubled the duration of my game logic. That's obviously a concern since for certain devices, this performance hit could take my game from a playable to an unplayable state.
I have been considering different ways to optimize this code, and I am wondering if by moving the code into C++ and accessing it with the JNI, if I will get some noticeable performance savings?
The above question is my main concern and my reason for asking. I've determined that the two following reasons would be other positive results from using the JNI. However, it is not enough to persuade me to port my code to C++.
This would make the code cleaner. Since most of the collision detection is some sort of vector math, it is much cleaner to be able to use overloaded operators rather than using some more verbose vector classes in Java.
Memory management would be simpler. Simpler you say? Well, this is a game so the garbage collector running is not welcome because the GC could end up ruining the performance of your game if it constantly has to interrupt to clean up. In C I don't have to worry about the garbage collector, so I can avoid all the ugly things I do in Java with temporary static variables and just rely on the good old stack memory of C++
Long-winded as this question may be, I think I covered all my points. Given this information, would it be worth porting my code from Java to C++ and accessing it with the JNI (for reasons of improving performance)? Also, is there a way to measure or estimate a potential performance gain?
EDIT:
So I did it. Results? Well from TraceView's perspective, it was a 6x increase in speed of my collision detection routine.
It wasn't easy getting there though. Besides having to do the JNI dance, I also had to make some optimizations that I did not expect. Mainly, using a directly allocated float buffer to pass data from Java to native. My initial attempt just used a float array to hold the data in question because the conversion from Java to C++ was more natural, but that was realllly reallllly slow. The direct buffer completely side-stepped performance issues with array copying between java and native, and left me with a 6x bump.
Also, instead of rolling my own vector class, I just used the Eigen math library. I'm not sure how much of an affect this has had on performance, but at the least, it saved me the time of dev'ing my own (less efficient) vector class.
Another lesson learned is that excessive logging is bad for performance (jic that isn't obvious).
Not really a direct answer to your question, but the following links might be of use to you:
Android Developers, JNI Tips.
Android Developers, Designing for Performance
In the second link the following is written:
Native code isn't necessarily more efficient than Java. For one thing,
there's a cost associated with the Java-native transition, and the JIT
can't optimize across these boundaries. If you're allocating native
resources (memory on the native heap, file descriptors, or whatever),
it can be significantly more difficult to arrange timely collection of
these resources. You also need to compile your code for each
architecture you wish to run on (rather than rely on it having a JIT).
You may even have to compile multiple versions for what you consider
the same architecture: native code compiled for the ARM processor in
the G1 can't take full advantage of the ARM in the Nexus One, and code
compiled for the ARM in the Nexus One won't run on the ARM in the G1.
Native code is primarily useful when you have an existing native
codebase that you want to port to Android, not for "speeding up" parts
of a Java app.
If you are still at a fairly early stage of game development, you can consider using a Game Engine which provides a good collision detection mechanism, like Libgdx which does a fairly good job of box2d collision detection.
I want to transplant a 3D program written in OpenGL on windows platform to Android, but I wonder if it can run smoothly on general Android platforms, so i want to estimate how much hardware resource is sufficient for it to run smoothly. It is some kind like the hardware requirements for a software or 3d game that a company will recommend the users. I don't know how can i get a hardware requirements of my program when transplant to Android.
i used gdebugger and it gave me some information but i don't think that is enough for me. Anyone here have some idea or solution? Many thanks in advance!
If your program is simple enough, you could write up some estimates about texture fill rate, which is a pretty basic (and old) metric of rendering performance. Nearly every 3D chip comes with a theoretical fill rate, so you can get the theoretical numbers of both your desktop system and some Android phones.
The texture memory footprint is another thing that you can estimate, especially using gdebugger. Once again, these numbers are known for most chips.
This is a quick way to produce some numbers, obviously without any real life performance guarantees.
The best way would be to test it on an actual device, and get an idea of what hardware works well. You could distribute a beta app and get some feedback too.
Depends on feature set that you use. For example, if you use FBO, the device will have to support framebuffer extension. If you use MSAA, smooth line, the device will have support corresponding extensions.
After listing down your requirements, you can use glGet to check for the device suppport
http://www.opengl.org/sdk/docs/man/xhtml/glGet.xml
The question does not mean that I'm interested if ffmpeg code can be used on Andoid. I know that it can. I'm just asking if somebody has the real performance progress with that stuff.
I've created the question after several weeks of experiments with the stuff and I've had enough.
I do not want to write to branches where people even do not say what kind of video they decode (resolution, codec) and talk only about some mystical FPS. I just don't understand what they want to do. Also I'm not going to develop application only for my phone or for Android 2.2++ phones that have some extended OpenGL features. I have quite popular phone HTC Desire so if the application does not work on it, so what's next?
Well, what do I have?
FFMpeg source from the latest HEAD branch. Actually I could not buld it with NDK5 so I decided to use stolen one.
Bambuser's build script (bash) with appropriate ffmpeg source ([web]: http://bambuser.com/r/opensource/ffmpeg-4f7d2fe-android-2011-03-07.tar.gz).
It builds well after some corrections by using NDK5.
Rockplayer's gelded ffmpeg source code with huge Android.mk in the capacity of build script ([web]: http://www.rockplayer.com/download/rockplayer_ffmpeg_git_20100418.zip).
It builds by NDK3 and NDK5 after some corrections. Rockplayer is probably the most cool media player for Android and I supposed that I would have some perks using it's build.
I had suitable video for a project (is not big and is not small): 600x360 H.264.
Both libraries we got from clauses 2 and 3 provide us possibility to get frames from video (frame-by-frame, seek etc.). I did not try to get an audio track because I did not need one for the project. I'm not publishing my source here because I think that's traditional and it's easy to find.
Well, what's the results with video?
HTC Desire, Android 2.2
600x360, H.264
decoding and rendering are in different threads
Bambuser (NDK5 buld for armv5te, RGBA8888): 33 ms/frame average.
Rockplayer (NDK3 build for neon, RGB565): 27 ms/frame average.
It's not bad for the first look, but just think that these are results only to decode frames.
If somebody has much better results with decoding time, let me know.
The most hard thing for a video is rendering. If we have bitmap 600x360 we should scale one somehow before painting because different phones have different screen sizes and we can not expect that our video will be the same size as screen.
What options do we have to rescale a frame to fit it to screen?
I was able to check (the same phone and video source) those cases:
sws_scale() C function in Bambuser's build: 70 ms/frame. Unacceptable.
Stupid bitmap rescaling in Android (Bitmap.createScaledBitmap): 65 ms/frame. Unacceptable.
OpenGL rendering in ortho projection on textured quad. In this case I did not need to scale frame. I just needed to prepare texture 1024x512 (in my case it was RGBA8888) containig frame pixels and than load it in GPU (gl.glTexImage2D). Result: ~220 ms/frame to render. Unacceptable. I did not expect that glTexImage2D just sucked on Snapdragon CPU.
That's all. I know that there is some way to use fragment shader to convert YUV pixels using GPU, but we will have the same glTexImage2D and 200 ms just to texture loading.
But this is not the end. ...my only friend the end... :) It is not a hopeless condition.
Trying to use RockPlayer you definitely will wonder how they do that damn frame scaling so fast. I suppose that they have really good experience in ARM achitecture. They most probably use avcodec_decode_video2 and than img_convert (as I did in RP version), but then they use some tricks (depends of ARM version) for scaling.
Maybe they also have some "magic" buld configuration for ffmpeg decreasing decoding time but Android.mk that they published is not THE Android.mk they use. I don't know.
So, now it looks like you can not just buld some easy JNI bridge for ffmpeg and than have real media player for Android platform. You can do this only if you have suitable video that you do not need to scale.
Any ideas?
I did compile ffmpeg on android. From this point - playing video is purely implementation dependant, so no point of measuring latencies on things whitch can be highly optimised in needed place and not using standart swscale. And yes - you can build some easy JNI bridge and use it in NDK to perform ffmpeg calls, but this would already be a player code.
In my experience, YUV to RGB conversion has always been a bottleneck. Therefore, using an OpenGL shader for this proved to give a significant boost.
I use http://writingminds.github.io/ffmpeg-android-java/ for my project. There is some workaround with complex commands but for simple commands the wrapper work very well for me.
So anybody worth their salt in the android development community knows about issue 3434 relating to low latency audio in Android. For those who don't, you can educate yourself here. http://code.google.com/p/android/issues/detail?id=3434
I'm looking for any sort of temporary workaround for my personal project. I've heard tell of exposing private interfaces to the NDK by rolling your own build of android and modifying the NDK.
All I need is a way to access the low level alsa drivers which are already packaged with the standard 2.2 build. I'd like to have the ability to send PCM directly to the audio hardware on my device. I don't care that the resulting app won't be distributable over the marketplace, and likely won't run with any other device than mine.
Anybody have any useful ideas?
-Griff
EDIT: I should mention, I know AudioTrack provides this functionality, but I'd like much lower latency -- AudioTrack sits around 300ms, I'd like somewhere around 20-30 ms.
Griff, that's just the problem, NDK wil not improve the known latency issue (that's even documented). The hardware abstraction layer in native code is currently adding to the latency, so it's not just about access to the low level drivers (btw you shouldn't rely on alsa drivers being there anyway).
Android: sound API (deterministic, low latency) covers the tradeoffs pretty well. TL;DR: NDK gives you a minor benefit because the threads can run at higher priority, but this benefit is meaningless pre-Jellybean because the entire audio system is tuned for Java.
The Galaxy Nexus running 4.1 can get fairly close to 30ms of output latency.