Android: sound API (deterministic, low latency)

Android: sound API (deterministic, low latency) - android

I'm reviewing all kinds of Android sound API and I'd like to know which one I should use.
My goal is to get low latency audio or, at least, deterministic behavior regarding delay of playback.
We've had a lot of problems and it seems that Android sound API is crap, so I'm exploring possibilities.
The problem we have is that there is significant delay between sound_out.write(sound_samples); and actual sound played from the speakers. Usually it is around 300 ms. The problem is that on all devices it's different; some don't have that problem, but most are crippled (however, CS call has zero latency). The biggest issue with this ridiculous delay is that on some devices this delay appears to be some random value (i.e. it's not always 300ms).
I'm reading about OpenSL ES and I'd like to know if anybody had experience with it, or it's the same shit but wrapped in different package?
I prefer to have native access, but I don't mind Java layer indirection as long as I can get deterministic behavior: either the delay has to be constant (for a given device), or I'd like to get access to current playback position instead of guessing it with a error range of ±300 ms...
EDIT:1.5 years later I tried multiple android phones to see how I can get best possible latency for a real time voice communication. Using specialized tools I measured the delay of waveout path. Best results were over 100 ms, most phones were in 180ms range. Anybody have ideas?

SoundPool is the lowest-latency interface on most devices, because the pool is stored in the audio process. All of the other audio paths require inter-process communication. OpenSL is the best choice if SoundPool doesn't meet your needs.
Why OpenSL? AudioTrack and OpenSL have similar latencies, with one important difference: AudioTrack buffer callbacks are serviced in Dalvik, while OpenSL callbacks are serviced in native threads. The current implementation of Dalvik isn't capable of servicing callbacks at extremely low latencies, because there is no way to suspend garbage collection during audio callbacks. This means that the minimum size for AudioTrack buffers has to be larger than the minimum size for OpenSL buffers to sustain glitch-free playback.
On most Android releases this difference between AudioTrack and OpenSL made no difference at all. But with Jellybean, Android now has a low-latency audio path. The actual latency is still device dependent, but it can be considerably lower than previously. For instance, http://code.google.com/p/music-synthesizer-for-android/ uses 384-frame buffers on the Galaxy Nexus for a total output latency of under 30ms. This requires the audio thread to service buffers approximately once every 8ms, which was not feasible on previous Android releases. It is still not feasible in a Dalvik thread.
This explanation assumes two things: first, that you are requesting the smallest possible buffers from OpenSL and doing your processing in the buffer callback rather than with a buffer queue. Second, that your device supports the low-latency path. On most current devices you will not see much difference between AudioTrack and OpenSL ES. But on devices that support Jellybean+ and low-latency audio, OpenSL ES will give you the lowest-latency path.

IIRC, OpenSL is passed through the same interface as AudioTrack, so at best it will match AudioTrack. (FWIW, I'm currently using OpenSL for "low latency" output)
The sad truth is there is no such thing as low latency audio on Android. There isn't even a proper way to flag and/or filter devices based on audio latency.
What interface you'll want to use to minimize latency is going to depend on what you are trying to do.
If you want to have an audio stream you'll be looking at either OpenSL or AudioTrack.
If you want to trigger some static oneshots you may want to use SoundPool. For static oneshots SoundPool will have low latency as the samples are preloaded to the hardware. I think it's possible to preload oneshots using OpenSL as well, but I haven't tried.

The lowest latency you can get is from SoundPool. There's a limit on how big of a sound you can play that way, but if you're under the limit (1Mb, IIRC) it's pretty low latency. Even that's probably not 40ms, though.
But it is faster than what you can get by streaming, at least in my experience.
Caveat: You may see occasional crashes in SoundPool on Samsung devices. I'm have a theory that it only happens when you access the SoundPool from multiple threads, but I haven't verified this.
EDIT: OpenSL ES apparently has extremely HIGH latency on Kindle Fire, while SoundPool is much better, but the reverse may be true on other platforms.

About the problem of a deterministic/constant latency, here you can find an interesting article:
APPROACHES FOR CONSTANT AUDIO LATENCY ON ANDROID
The core of their investigations is: Because the Audio HAL, which is one of the deeper levels of the audiopath and responsible for the timing of the audio-callback-events, is vendor-implemented the relative latencies can vary, especially in cheap hardware.
So they suggest two approaches to reduce the variance of the latency. One is to take care of the callback-timing by inserting audio in fixed intervals, the other one is to filter the callback times to estimate the time at which a constant latency callback should have occured by appliyng a smoothing filter.
With this two approaches the could significantly reduce the variance of the latency.
It should also be mentioned, that there is a new native Android-Audio-API, AAudio.
AAudio API
It's available/stable from Android Oreo 8.1 (API Level 27).
There is also a wrapper, which dynamically chooses between OpenSL ES and AAudio and is much easier to code then OpenSL ES. It's still in developer preview.
Oboe Audio Library

The best way to get low latency for native code on Android is to use Oboe.
https://github.com/google/oboe
Oboe wraps AAudio on newer devices. AAudio offers the lowest possible latency paths. If AAudio is not available then Oboe calls OpenSL ES. Oboe is much easier to use than OpenSL ES.
AAudio either calls through AudioTrack or through a new MMAP data path. AAudio makes it easier to get a FAST track because you can leave some parameters unspecified. AAudio will then choose the right parameters needed for a FAST track.

Related

Maximum number of simultaneous MediaRecorder instances on android?

I created android app that records device screen (using MediaProjection) API and video from camera at the same time. I use MediaRecorder in both cases. I need a way to find out whether device is actually capable of recording two video streams simultaneously. I assume there is some limit on number of streams that can be encoded simultaneously on given devices but I cannot find any API on android platform to query for that information.
Things I discovered so far:
Documentation for MediaRecorder.release() advises to release MediaRecorder as soon as possible as:
" Even if multiple instances of the same codec are supported, some performance degradation may be expected when unnecessary multiple instances are used at the same time."
This suggests that there's a limit on number of instances of the coded which directly limits number of MediaRecorders.
I've wrote testing code that creates MediaRecorders (configured to use MPEG4/H264) and starts them in a loop - On Nexus 5 it always fails with java.io.IOException: prepare failed when calling prepare() on 6th instance. This suggests you can have only 5 instances of MediaRecorder on Nexus5.

I'm not aware of anything you can query for this information, though it's possible something went into Lollipop that I didn't see.
There is a limit on the number of hardware codec instances that is largely dependent on hardware bandwidth. It's not a simple question of how many streams the device can handle -- some devices might be able to encode two 720p streams but not two 1080p streams.
On some devices the codec may fall back to a software implementation if it runs out of hardware resources. Things will work but will be considerably slower. (I've seen this for H.264 decoding, but I don't know if it also happens for encoding.)
I don't believe there is a minimum system requirement in CTS. It would be useful to know that all devices could, say, decode two 1080p streams and encode one 1080p simultaneously, so that a video editor could be made for all devices, but I don't know if such a thing has been added. (Some very inexpensive devices would struggle to meet that.)

I think it really depends on devices and ram capacity ... you could read the buffers for screen and cam as much as you like but only one read at a time not simultaneously I think to prevent concurrency but honestly I don't really know for sure

expected live streaming lag from an android device

How much lag is expected when an android device's camera is streamed live?
I checked Miracast and it has a lag of around 150-300 ms(usually its around 160-190 ms).
I have a bluetooth and a wifi direct app and both lag by around 400-550 ms. I was wondering if it would be possible to reproduce or come closer to Miracast's performance.
I am encoding the camera frames in H.264 and using my own custom protocol to transmit the encoded frames over a TCP connection(in case of WiFi Direct).

Video encoding is compute intensive so any help you can get from the hardware usually speeds things up. It is worth making sure you are using codecs which leverage the hardware - i.e avoid having your H.264 encoding all in software.
There is a video here which discusses how to access HW accelerated video codecs, although it is a little old, but will apply to older devices:
https://www.youtube.com/watch?v=tLH-oGfiGaA
I believe that MediaCodec now provides a Java API to the Hardware Codecs (have not tried or seen speed comparison tests myself), which makes things easier:
http://developer.android.com/about/versions/android-4.1.html#Multimedia
See the note here also about using the camera's Surface preview as a speed aid:
https://stackoverflow.com/a/17243244/334402
As an aside, 500ms does not seem that bad, and you are probably into diminishing returns so the effort to reduce may be high if you can live with your current lag.
Also, I assume you are measuring the lag (or latency) on the server side? If so you need to look at how the server is decoding and presenting also, especially if you are comparing your own player with a third party one.
It is worth looking at your jitter buffer on the receiving side, irrespective of the stream etc latency - in simple terms waiting for a larger number of packets before you start playback may cause more startup delay but it also may provide a better overall user experience. This is because the large buffer will be more tolerant of delayed packets, before going into a 'buffering' mode that users tend not to like.
Its a balance, and your needs may dictate a bias one way or the other. If you were planning a video chat like application for example, then the delay is very important as it starts to become annoying to users above 200-300 ms. If you are providing a feed from a sports event, then delay may be less important and avoiding buffering pauses may give better user perceived quality.

Android OpenSL ES crystal frequency

we have an app with mobile audio clients written in low-level OpenSL ES to achieve low-latency input from microphone. Than we are sending 10ms frames encapsulated in UDP datagram to server.
On server we are doing some post-processing which is curucially dependent on aan assumption that frames from mobile clients comes in fixed intervals (eg. 10ms per frame), so we can align them.
It seems that internal crystal frequencies on mobile phones can vary a lot and due to this, we are getting perfect alignment on the beggining but poor alignment after few minutes.
I know, that ALSA on Linux can tell you exact frequency of the crystal - so you can correct your counts based on this. Unfortunatelly I don't know how to get this information in Android.
Thx for help

The essence of the problem you face is that you have an ADC and a DAC on separate systems with different local oscillators. You're presumably timing your packets against a 3rd (and possibly 4th) CPU clock.
The correct solution to this problem is some kind of clock recovery algorithm. To do this properly you need some means of accurately timestamping (e.g. to bit accuracy) transmitted packets, and then use a PLL to drive the clock-rate of the receiver's sample clock. This is is precisely the approach that both IEEE1394 audio and MPEG2 Transport streams use.
Since probably can't do either of these things, your approach is most likely going to involve dropping or repeating samples (or even entire packets) periodically to keep your receive buffer from under- or over-flowing.
USB Audio has a similar lack of hardware support for clock recovery, and the approaches used there may be applicable to your situation.
Relying on the transmission and reception timing of network packets is a terrible idea. The jitter on delivery times is horrendous - particularly with Wifi or cellular connections. You'd be well advised to use not rely on it at all, and instead do as both IEEE1394 audio and MPEG 2 TS do, which is to decouple audio data transport from consumption using a model FIFO in which data is consumed at a constant rate and delivered to it in packets of unreliable timing.
As for ALSA, all it can do (unless if has an accurate external timing reference) is to measure the drift between the sample clock of the audio interface and the CPU's clock. This does not yield 'the exact frequency' of anything as neither oscillator is likely to be accurate, and both may drift dependent on temperature.

Android Getting distance using sound between two devices

The idea is Phone A sends a sound signal and bluetooth signal at the same time and Phone B will calculate the delay between the two signals.
In practice I am getting inconsistent results with delays from 90ms-160ms.
I tried optimizing both ends as much as possible.
On the output end:
Tone is generated once
Bluetooth and audio output each have their own thread
Bluetooth only outputs after AudioTrack.write and AudioTrack is in streaming mode so it should start outputting before the write is even completed.
On the receiving end:
Again two separate threads
System time is recorded before each AudioRecord.read
Sampling specs:
44.1khz
Reading entire buffer
Sampling 100 samples at a time using fft
Taking into account how many samples transformed since initial read()

Your method relies on basically zero latency throughout the whole pipeline, which is realistically impossible. You just can't synchronize it with that degree of accuracy. If you could get the delays down to 5-6ms, it might be possible, but you'll beat your head into your keyboard before that happens. Even then, it could only possibly be accurate to 1.5 meters or so.
Consider the lower end of the delays you're receiving. In 90ms, sound can travel slightly over 30m. That's the very end of the marketed bluetooth range, without even considering that you'll likely be in non-ideal transmitting conditions.
Here's a thread discussing low latency audio in Android. TL;DR is that it sucks, but is getting better. With the latest APIs and recent devices, you may be able to get it down to 30ms or so, assuming you run some hand-tuned audio functions. No simple AudioTrack here. Even then, that's still a good 10-meter circular error probability.
Edit:
A better approach, assuming you can synchronize the devices' clocks, would be to embed a timestamp into the audio signal, using a simple am/fm modulation or pulse train. Then you could decode it at the other end and know when it was sent. You still have to deal with the latency problem, but it simplifies the whole thing nicely. There's no need for bluetooth at all, since it isn't really a reliable clock anyway, since it can be regarded as having latency problems of its own.

This gives you a pretty good approach
http://netscale.cse.nd.edu/twiki/pub/Main/Projects/Analyze_the_frequency_and_strength_of_sound_in_Android.pdf
You have to create an 1 kHz sound with some amplitude (measure in dB) and try to measure the amplitude of the sound arrived to the other device. From the sedation you might be able to measure the distance.
As I remember: a0 = 20*log (4*pi*distance/lambda) where a0 is the sedation and lambda is given (you can count it from the 1kHz)
But in such a sensitive environment, the noise might spoil the whole thing, just an idea, how I would do if I were you.

Low-latency audio playback on Android

I'm currently attempting to minimize audio latency for a simple application:
I have a video on a PC, and I'm transmitting the video's audio through RTP to a mobile client. With a very similar buffering algorithm, I can achieve 90ms of latency on iOS, but a dreadful ±180ms on Android.
I'm guessing the difference stems from the well-known latency issues on Android.
However, after reading around for a bit, I came upon this article, which states that:
Low-latency audio is available since Android 4.1/4.2 in certain devices.
Low-latency audio can be achieved using libpd, which is Pure Data library for Android.
I have 2 questions, directly related to those 2 statements:
Where can I find more information on the new low-latency audio in Jellybean? This is all I can find but it's sorely lacking in specific information. Should the changes be transparent to me, or is there some new class/API calls I should be implementing for me to notice any changes in my application? I'm using the AudioTrack API, and I'm not even sure if it should reap benefits from this improvement or if I should be looking into some other mechanism for audio playback.
Should I look into using libpd? It seems to me like it's the only chance I have of achieving lower latencies, but since I've always thought of PD as an audio synthesis utility, is it really suited for a project that just grabs frames from a network stream and plays them back? I'm not really doing any synthesizing. Am I following the wrong trail?
As an additional note, before someone mentions OpenSL ES, this article makes it quite clear that no improvements in latency should be expected from using it:
"As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides."

For lowest latency on Android as of version 4.2.2, you should do the following, ordered from least to most obvious:
Pick a device that supports FEATURE_AUDIO_PRO if possible, or FEATURE_AUDIO_LOW_LATENCY if not. ("Low latency" is 50ms one way; pro is <20ms round trip.)
Use OpenSL. The Dalvik GC has a low amortized cost, but when it runs it takes more time than a low-latency audio thread can allow.
Process audio in a buffer queue callback. The system runs buffer queue callbacks in a thread that has more favorable scheduling than normal user-mode threads.
Make your buffer size a multiple of AudioManager.getProperty(PROPERTY_OUTPUT_FRAMES_PER_BUFFER). Otherwise your callback will occasionally get two calls per timeslice rather than one. Unless your CPU usage is really light, this will probably end up glitching. (On Android M, it is very important to use EXACTLY the system buffer size, due to a bug in the buffer handling code.)
Use the sample rate provided by AudioManager.getProperty(PROPERTY_OUTPUT_SAMPLE_RATE). Otherwise your buffers take a detour through the system resampler.
Never make a syscall or lock a synchronization object inside the buffer callback. If you must synchronize, use a lock-free structure. For best results, use a completely wait-free structure such as a single-reader single-writer ring buffer. Loads of developers get this wrong and end up with glitches that are unpredictable and hard to debug.
Use vector instructions such as NEON, SSE, or whatever the equivalent instruction set is on your target processor.
Test and measure your code. Track how long it takes to run--and remember that you need to know the worst-case performance, not the average, because the worst case is what causes the glitches. And be conservative. You already know that if it takes more time to process your audio than it does to play it, you'll never get low latency. But on Android this is even more important, because the CPU frequency fluctuates so much. You can use perhaps 60-70% of CPU for audio, but keep in mind that this will change as the device gets hotter or cooler, or as the wifi or LTE radios start and stop, and so on.
Low-latency audio is no longer a new feature for Android, but it still requires device-specific changes in the hardware, drivers, kernel, and framework to pull off. This means that there's a lot of variation in the latency you can expect from different devices, and given how many different price points Android phones sell at, there probably will always be differences. Look for FEATURE_AUDIO_PRO or FEATURE_AUDIO_LOW_LATENCY to identify devices that meet the latency criteria your app requires.

From the link at your point 1:
"Low-latency audio
Android 4.2 improves support for low-latency audio playback, starting
from the improvements made in Android 4.1 release for audio output
latency using OpenSL ES, Soundpool and tone generator APIs. These
improvements depend on hardware support — devices that offer these
low-latency audio features can advertise their support to apps through
a hardware feature constant."
Your citation in complete form:
"Performance
As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides. On the other hand, as
the Android platform and specific device implementations continue to
evolve, an OpenSL ES application can expect to benefit from any future
system performance improvements."
So, the api to comunicate with drivers and then hw is OpenSl (in the same fashion Opengl does with graphics). The earlier versions of Android have a bad design in drivers and/or hw, though. These problems were addressed and corrected with 4.1 and 4.2 versions, so if the hd have the power, you get low latency using OpenSL.
Again, from this note from the puredata library website, is evident that the library uses OpenSL itself to achieve low latency:
Low latency support for compliant devices
The latest version of Pd for
Android (as of 12/28/2012) supports low-latency audio for compliant
Android devices. When updating your copy, make sure to pull the latest
version of both pd-for-android and the libpd submodule from GitHub.
At the time of writing, Galaxy Nexus, Nexus 4, and Nexus 10 provide a
low-latency track for audio output. In order to hit the low-latency
track, an app must use OpenSL, and it must operate at the correct
sample rate and buffer size. Those parameters are device dependent
(Galaxy Nexus and Nexus 10 operate at 44100Hz, while Nexus 4 operates
at 48000Hz; the buffer size is different for each device).
As is its wont, Pd for Android papers over all those complexities as
much as possible, providing access to the new low-latency features
when available while remaining backward compatible with earlier
versions of Android. Under the hood, the audio components of Pd for
Android will use OpenSL on Android 2.3 and later, while falling back
on the old AudioTrack/AudioRecord API in Java on Android 2.2 and
earlier.

When using OpenSL ES you should fulfil the following requirements to get low latency output on Jellybean and later versions of Android:
The audio should be mono or stereo, linear PCM.
The audio sample rate should be the same same sample rate as the output's native rate (this might not actually be required on some devices, because the FastMixer is capable of resampling if the vendor configures it to do so. But in my tests I got very noticeable artifacts when upsampling from 44.1 to 48 kHz in the FastMixer).
Your BufferQueue should have at least 2 buffers. (This requirement has since been relaxed. See this commit by Glenn Kasten. I'm not sure in which Android version this first appeared, but a guess would be 4.4).
You can't use certain effects (e.g. Reverb, Bass Boost, Equalization, Virtualization, ...).
The SoundPool class will also attempt to make use of fast AudioTracks internally when possible (the same criteria as above apply, except for the BufferQueue part).

Those of you more interested in Android’s 10 Millisecond Problem ie low latency audio on Android. We at Superpowered created the Android Audio Path Latency Explainer. Please see here:
http://superpowered.com/androidaudiopathlatency/#axzz3fDHsEe56

Another database of audio latencies and buffer sizes used:
http://superpowered.com/latency/#table
Source code:
https://github.com/superpoweredSDK/SuperpoweredLatency

There is a new C++ Library Oboe which help with reducing Audio Latency. I have used it in my projects and it works good.
It has this features which help in reducing audio latency:
Automatic latency tuning
Chooses the audio API (OpenSL ES on API 16+ or AAudio on API 27+)

Application for measuring sampleRate and bufferSize: https://code.google.com/p/high-performance-audio/source/checkout and http://audiobuffersize.appspot.com/ DB of results

Develop Reference

The Android operating system is a mobile operating system that was developed by Google (GOOGL?) to be primarily used for touchscreen devices, cell phones, and tablets.