So anybody worth their salt in the android development community knows about issue 3434 relating to low latency audio in Android. For those who don't, you can educate yourself here. http://code.google.com/p/android/issues/detail?id=3434
I'm looking for any sort of temporary workaround for my personal project. I've heard tell of exposing private interfaces to the NDK by rolling your own build of android and modifying the NDK.
All I need is a way to access the low level alsa drivers which are already packaged with the standard 2.2 build. I'd like to have the ability to send PCM directly to the audio hardware on my device. I don't care that the resulting app won't be distributable over the marketplace, and likely won't run with any other device than mine.
Anybody have any useful ideas?
-Griff
EDIT: I should mention, I know AudioTrack provides this functionality, but I'd like much lower latency -- AudioTrack sits around 300ms, I'd like somewhere around 20-30 ms.
Griff, that's just the problem, NDK wil not improve the known latency issue (that's even documented). The hardware abstraction layer in native code is currently adding to the latency, so it's not just about access to the low level drivers (btw you shouldn't rely on alsa drivers being there anyway).
Android: sound API (deterministic, low latency) covers the tradeoffs pretty well. TL;DR: NDK gives you a minor benefit because the threads can run at higher priority, but this benefit is meaningless pre-Jellybean because the entire audio system is tuned for Java.
The Galaxy Nexus running 4.1 can get fairly close to 30ms of output latency.
Related
I am working on a video recording and sharing application for Android. The specifications of the app are as follows:-
Recording a 10 second (maximum) video from inside the app (not using the device's camera app)
No further editing on the video
Storing the video in a Firebase Cloud Storage (GCS) bucket
Downloading and playing of the said video by other users
From the research, I did on SO and others sources for this, I have found the following (please correct me if I am wrong):-
The three options and their respective features are:-
1.Ffmpeg
Capable of achieving the above goal and has extensive answers and explanations on sites like SO, however
Increases the APK size by 20-30mb (large library)
Runs the risk of not working properly on certain 64-bit devices
2.MediaRecorder
Reliable and supported by most devices
Will store files in .mp4 format (unless converted to h264)
Easier for playback (no decoding needed)
Adds the mp4 and 3gp headers
Increases latency according to this question
3.MediaCodec
Low level
Will require MediaCodec, MediaMuxer, and MediaExtractor
Output in h264 ( without using MediaMuxer for playback )
Good for video manipulations (though, not required in my use case)
Not supported by pre 4.3 (API 18) devices
More difficult to implement and code (my opinion - please correct me if I am wrong)
Unavailability of extensive information, tutorials, answers or samples (Bigflake.com being the only exception)
After spending days on this, I still can't figure out which approach suits my particular use case. Please elaborate on what I should do for my application. If there's a completely different approach, then I am open to that as well.
My biggest criteria are that the video encoding process be as efficient as possible and the video to be stored in the cloud should have the lowest possible space usage without compromising on the video quality.
Also, I'd be grateful if you could suggest the appropriate format for saving and distributing the video in Firebase Storage, and point me to tutorials or samples of your suggested approach.
Thank you in advance! And sorry for the long read.
Your overview on this topic is applicable to the point.
I'll just add my 2 cents on this topic that you might have missed as addition:
1.FFMpeg
+/-If you build your own SO then you can reduce the size down to about 2-3 MB depending on the use-case of course. Editing a 6000 lines buildscript takes time and effort though
++Supports wide range of formats (almost everything)
++Results are the same for every device
++Any resolution supported
--High energy consumption due do SW-En-/Decoding, while also making it slow. There is a plugin to support lib-stagefright, but it doesn't work on many devices (as of May 2016)
--Licensing can be problematic depending on your location and use-case. I'm not a lawyer, but we had legal consulting on this topic and it's quite complex.
2. MediaRecorder
++Easiest to implement (simplified access to mediacodec/libstagefright) Raw data gets passed to the encoder directly so no messing around there
++HW Accelerated on most devices. Makes it fast and energy saving.
++Delay only applies to live streaming
--Dependent on implementation of HW-manufacturers
--Results may vary from device to device
++No licensing problems
3.MediaCodec
+/-Most of 2.MediaRecorder applies to this as well (apart from ease of use)
++Most flexible access to HW-en-/decoding
--Hard to use for cases that were not thought of (e.g. mixing videos from different sources)
+/-Delay for streaming can be eliminated (is tricky though)
--HW-manufacturers sometimes don't implement things correctly (e.g the Samsung Galaxy S5 sometimes produces a SIG-SEV if live data from some DLSR is fed to the encoder. Works fine for a while, then all of a sudden it's SIG-SEV. This might be the dslr's fault, but the SIG-SEV is not avoidable and crashes the app, which in the end is the app developers fault ;) )
--If used without MediaMuxer you need either good understanding of media containers or rely on 3rd party libraries
The list is obviously not complete and some points might not be correct. The last time I worked with video was almost half a year ago.
As for your use-case I would recommend using MediaRecorder since it is the easiest to implement, supported on all devices, and offers a good deal of quality/size option. FFMpeg produces better results for the same storage size, but takes longer (extreme case, DSLR live footage was encoded 30 times faster), and is more energy consuming.
As far as I understand your use-case, there is no need to fiddle around with MediaCodec since you want to encode and decode only.
I suggest using VP8 or 9 since you wont run into licensing problems. Again I'm no lawyer but distributing H264 over your own server might make you a broadcasting station, so i was told.
Hope this helps you in your decision making
I am trying to play a synthesized sound (basically 2 sine waves and some noise) using the AudioTrack class. It doesn't seem to be any different than the SourceDataLine in javax.sound.sampled, BUT the synthesis is REALLY SLOW. Even for ARM standards, it's unrealistic to think that 32768 samples (16 bit, stereo, for a total of 65536) take over 1 second to render on a Nexus 4 (measured with System.nanotime(), write to AudioTrack excluded).
The synthesis part is almost identical to this http://audioprograming.wordpress.com/2012/10/18/a-simple-synth-in-android-step-by-step-guide-using-the-java-sdk/, the only difference is that I play stereo sound (I can't reduce it to mono because it's a binaural tone).
Any ideas? what can I do?
Thanks in advance
Marko's answer seems very good. But if you're still in the experimental/investigational phase of your project, you might want to consider using Pure Data, which already is implemented as a combination Android library/NDK library and which would allow you to synthesize many sounds and interact with them in a relatively simple manner.
The libpd distribution is the Android implementation of Pure Data. Some good starting references can be found at the SoundOnSound site and also at this site.
Addendum: I found a basic but functional implementation of an Android Midi Driver through this discussion link. The relevant code can be found here (github, project by billthefarmer, named mididriver).
You can view how I use it in my Android app (imSynt link leads you to Google Play), or on YouTube.
The performance of audio synthesis on ARM is actually very respectable with native code that makes good use of the NEON unit. The Dalvik's JIT compiler is never going to get close to this level of performance for floating-point intensive code.
A look at the enormous number of soft-synth apps for iOS provides ample evidence of what should be possible on ARM devices with similar levels of performance.
However, the performance you are reporting is several orders of magnitude short of what I would expect. You might consider the following:
Double precision float-point arithmetic is particularly expensive on ARM Cortex A-x NEON units, where as single precision is very fast and highly parallelizable. Math.sin() returns a double, so is unnecessarily precise, and liable to be slow. The 24-mantissa provided by single precision floating point value is substantially larger than the 16-bit int used by the audio subsystem.
You could precompute sin(x) and then perform a table-lookup in your render loop.
There is a previous post on SO concerning Math.sin(x) on android suggesting degrading performance as x becomes large, as it's likely to in this case over time.
For a more advanced table-based synthesiser, you might consider using a DDS Oscillator.
Ultimately, you might consider using native code for synthesis, with the NDK.
You should be able render multiple oscillators with filters and envelopes and still have CPU time left over. Check your inner loops to make sure that there are no system calls.
Are you on a very old phone? You did not mention the hardware or OS version.
You might want to try using JSyn. It is a free modular Java synthesizer that runs on any Java platform including desktops, Raspberry Pi and Android.
https://github.com/philburk/jsyn
Have you tried profiling your code? It sounds like something else is possibly causing your slow down, profiling would help to highlight the cause.
Mike
I'm currently attempting to minimize audio latency for a simple application:
I have a video on a PC, and I'm transmitting the video's audio through RTP to a mobile client. With a very similar buffering algorithm, I can achieve 90ms of latency on iOS, but a dreadful ±180ms on Android.
I'm guessing the difference stems from the well-known latency issues on Android.
However, after reading around for a bit, I came upon this article, which states that:
Low-latency audio is available since Android 4.1/4.2 in certain devices.
Low-latency audio can be achieved using libpd, which is Pure Data library for Android.
I have 2 questions, directly related to those 2 statements:
Where can I find more information on the new low-latency audio in Jellybean? This is all I can find but it's sorely lacking in specific information. Should the changes be transparent to me, or is there some new class/API calls I should be implementing for me to notice any changes in my application? I'm using the AudioTrack API, and I'm not even sure if it should reap benefits from this improvement or if I should be looking into some other mechanism for audio playback.
Should I look into using libpd? It seems to me like it's the only chance I have of achieving lower latencies, but since I've always thought of PD as an audio synthesis utility, is it really suited for a project that just grabs frames from a network stream and plays them back? I'm not really doing any synthesizing. Am I following the wrong trail?
As an additional note, before someone mentions OpenSL ES, this article makes it quite clear that no improvements in latency should be expected from using it:
"As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides."
For lowest latency on Android as of version 4.2.2, you should do the following, ordered from least to most obvious:
Pick a device that supports FEATURE_AUDIO_PRO if possible, or FEATURE_AUDIO_LOW_LATENCY if not. ("Low latency" is 50ms one way; pro is <20ms round trip.)
Use OpenSL. The Dalvik GC has a low amortized cost, but when it runs it takes more time than a low-latency audio thread can allow.
Process audio in a buffer queue callback. The system runs buffer queue callbacks in a thread that has more favorable scheduling than normal user-mode threads.
Make your buffer size a multiple of AudioManager.getProperty(PROPERTY_OUTPUT_FRAMES_PER_BUFFER). Otherwise your callback will occasionally get two calls per timeslice rather than one. Unless your CPU usage is really light, this will probably end up glitching. (On Android M, it is very important to use EXACTLY the system buffer size, due to a bug in the buffer handling code.)
Use the sample rate provided by AudioManager.getProperty(PROPERTY_OUTPUT_SAMPLE_RATE). Otherwise your buffers take a detour through the system resampler.
Never make a syscall or lock a synchronization object inside the buffer callback. If you must synchronize, use a lock-free structure. For best results, use a completely wait-free structure such as a single-reader single-writer ring buffer. Loads of developers get this wrong and end up with glitches that are unpredictable and hard to debug.
Use vector instructions such as NEON, SSE, or whatever the equivalent instruction set is on your target processor.
Test and measure your code. Track how long it takes to run--and remember that you need to know the worst-case performance, not the average, because the worst case is what causes the glitches. And be conservative. You already know that if it takes more time to process your audio than it does to play it, you'll never get low latency. But on Android this is even more important, because the CPU frequency fluctuates so much. You can use perhaps 60-70% of CPU for audio, but keep in mind that this will change as the device gets hotter or cooler, or as the wifi or LTE radios start and stop, and so on.
Low-latency audio is no longer a new feature for Android, but it still requires device-specific changes in the hardware, drivers, kernel, and framework to pull off. This means that there's a lot of variation in the latency you can expect from different devices, and given how many different price points Android phones sell at, there probably will always be differences. Look for FEATURE_AUDIO_PRO or FEATURE_AUDIO_LOW_LATENCY to identify devices that meet the latency criteria your app requires.
From the link at your point 1:
"Low-latency audio
Android 4.2 improves support for low-latency audio playback, starting
from the improvements made in Android 4.1 release for audio output
latency using OpenSL ES, Soundpool and tone generator APIs. These
improvements depend on hardware support — devices that offer these
low-latency audio features can advertise their support to apps through
a hardware feature constant."
Your citation in complete form:
"Performance
As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides. On the other hand, as
the Android platform and specific device implementations continue to
evolve, an OpenSL ES application can expect to benefit from any future
system performance improvements."
So, the api to comunicate with drivers and then hw is OpenSl (in the same fashion Opengl does with graphics). The earlier versions of Android have a bad design in drivers and/or hw, though. These problems were addressed and corrected with 4.1 and 4.2 versions, so if the hd have the power, you get low latency using OpenSL.
Again, from this note from the puredata library website, is evident that the library uses OpenSL itself to achieve low latency:
Low latency support for compliant devices
The latest version of Pd for
Android (as of 12/28/2012) supports low-latency audio for compliant
Android devices. When updating your copy, make sure to pull the latest
version of both pd-for-android and the libpd submodule from GitHub.
At the time of writing, Galaxy Nexus, Nexus 4, and Nexus 10 provide a
low-latency track for audio output. In order to hit the low-latency
track, an app must use OpenSL, and it must operate at the correct
sample rate and buffer size. Those parameters are device dependent
(Galaxy Nexus and Nexus 10 operate at 44100Hz, while Nexus 4 operates
at 48000Hz; the buffer size is different for each device).
As is its wont, Pd for Android papers over all those complexities as
much as possible, providing access to the new low-latency features
when available while remaining backward compatible with earlier
versions of Android. Under the hood, the audio components of Pd for
Android will use OpenSL on Android 2.3 and later, while falling back
on the old AudioTrack/AudioRecord API in Java on Android 2.2 and
earlier.
When using OpenSL ES you should fulfil the following requirements to get low latency output on Jellybean and later versions of Android:
The audio should be mono or stereo, linear PCM.
The audio sample rate should be the same same sample rate as the output's native rate (this might not actually be required on some devices, because the FastMixer is capable of resampling if the vendor configures it to do so. But in my tests I got very noticeable artifacts when upsampling from 44.1 to 48 kHz in the FastMixer).
Your BufferQueue should have at least 2 buffers. (This requirement has since been relaxed. See this commit by Glenn Kasten. I'm not sure in which Android version this first appeared, but a guess would be 4.4).
You can't use certain effects (e.g. Reverb, Bass Boost, Equalization, Virtualization, ...).
The SoundPool class will also attempt to make use of fast AudioTracks internally when possible (the same criteria as above apply, except for the BufferQueue part).
Those of you more interested in Android’s 10 Millisecond Problem ie low latency audio on Android. We at Superpowered created the Android Audio Path Latency Explainer. Please see here:
http://superpowered.com/androidaudiopathlatency/#axzz3fDHsEe56
Another database of audio latencies and buffer sizes used:
http://superpowered.com/latency/#table
Source code:
https://github.com/superpoweredSDK/SuperpoweredLatency
There is a new C++ Library Oboe which help with reducing Audio Latency. I have used it in my projects and it works good.
It has this features which help in reducing audio latency:
Automatic latency tuning
Chooses the audio API (OpenSL ES on API 16+ or AAudio on API 27+)
Application for measuring sampleRate and bufferSize: https://code.google.com/p/high-performance-audio/source/checkout and http://audiobuffersize.appspot.com/ DB of results
Please, share your experience in using software echo cancellers on Android:
Built-in (the one that appeared in v3.0, as I hear)
Speex
WebRTC
Etc.
I'm just finishing the AEC work on android, I tried speex/android-built-in-ec/webrtc-aec and webrtc-aecm(echo control on mobile), and finally choose the AECM module, and there are some tips:
speex and webrtc-aec is not good for running on mobile(for low CPU perform reason).
android built-in EC is working, but the effect is not ideal, can still heard some echos or lots of self-excitation(maybe I'm not using it right). and not all the android device at this time support built-in EC, so this situation is discarded.
webrtc-aecm module is fine, it just took 1~2ms to process a 10ms frame. and the most important is the thing called delay, you should follow the description of it in audio_processing.h strictly, if you calculate a right value of delay, everything will be OK.
EDIT
After a long long time working with WebRTC AECM(or APM), I still cannot make it work perfect on android. I think AECM need more optimazition, but Google seems no plan on it. Any way, I'll keep attention about Google WebRTC and its AECM(or AEC) performance on android.
(Updated on 6/23/2020) Please refer to my GitHub project's README, my solution above was deprecated by myself years ago. I don't want to misleading others.
There are two issues that relates to AEC on Android:
CPU. Most AEC algorithms does not perform well with low CPU.
Echo Path - many VoIP application on Android introduce echo delay that is higher than what the free algorithm can handle (efficiently).
Bottom line, I suggest that you first measure the echo delay (i.e. echo tail) in your VoIP application. If it does not exceed 16ms-64ms you can try using one of the above mentioned free solutions.
One more note, I believe Speex will not work good on mobile devices since as far as I know it does not have a fix-point version.
I came across this relatively old post which describes how impressively Nexus One's noise cancellation works and I was wondering where can I find more information about its implementation in the OS software.
In particular:
How much of it is done using software and how much of it is done in
hardware?
Which modules in the Android source code are responsible for noise
cancellation?
Can I control its behavior via Android's API? (if so, which ones)
Does it also work with the microphone in the headset that comes with
Nexus One (4-pin 3.5mm jack) or does it work with the built-in
microphone only?
I only know the answer for the Nexus One, but:
It's done in hardware.
Not sure.
Nope.
Maybe?
For the N1, it works using a second microphone in the back, and comparing the two signals. I don't know exactly how this process is done (hardware or software), but I know there isn't an API for it. Also, it probably doesn't work for the external headset, since there's no second sound source to compare the first one to (unless the headset has two mics too, but I don't think it does).
About the Nexus One:
All hardware only configuration in software.
Sound drivers and sound system but only configuration.
No API possibly some prop configuration but I haven't been able to get that to work.
No, longer reply following.
I haven't found any indication that it uses the other microphone to do noise reduction for the headset. It wouldn’t make much sense either as it would most likely just try to cancel out with the noise from your pocket.
For most other android phones and for headset on the Nexus One I'm pretty sure that there is only some sort of filter to reduce input of sound that is not speech.
I have done some research on this that I tried to get some help with on the android porting and dev lists. There is a little further info:
http://groups.google.com/group/android-porting/browse_thread/thread/fe1b92065b75c6da?pli=1
With the reservation that I haven't looked at the latest and greatest versions of android.