I am using AudioTrack to play different sounds in stream mode.
I would like to know if there is a way to tell when each sound begins/ends playing.
I create the audio track like this:
AudioTrack tmpAudioTrack = new AudioTrack(
STREAM_TYPE,
SAMPLE_RATE,
CHANNEL_CONFIG_TYPE,
AUDIO_FORMAT_TYPE,
getOptimalBufferSize(),
AUDIO_TRACK_MODE);
And start it in a background thread:
background_thread = new Thread(new MyRunnable(aTrack));
background_thread.start();
I write each sound like this inside the runnable:
byte[] generatedSnd = new byte[2 * beepSamples];
<code for filling the buffer with sound here>
int bytesWritten = track.write(generatedSnd, 0, generatedSnd.length);
Is it possible to use any of the AudioTrack APIs, such as setNotificationMarkerPosition, setLoopPoints, or setPositionNotificationPeriod, to accomplish this? And how do they work?
Each sound can be a different duration. I think this is key.
I don't fully understand the documentation for these APIs. Is each frame the same as a sample? How do I specify a marker for where each sound begins/ends?
Thanks,
This is what I have found out:
As far as I can tell, frames are the same as samples (at least for mono audio): the number of frames in a sound is its duration multiplied by the sample rate.
To use AudioTrack.setPositionNotificationPeriod, you pass a number of frames. If you pass 200, the callback will be called every 200 frames, recurring.
tmpAudioTrack.setPositionNotificationPeriod(duration * sampleRate);
To use .setNotificationMarkerPosition, you also pass a number of frames. However, this value is absolute, not relative like the period API. So if you want to know when a sound ends, you pass the total frame count of that sound (total sound duration * sampleRate).
tmpAudioTrack.setNotificationMarkerPosition(duration * sampleRate);
But if you are already playing, somewhere in the middle of your sound track, and want to add a marker so you get called back, say, 3 seconds from now, then you need to add the current playhead position of the audio track, like this (where duration is 3 seconds):
tmpAudioTrack.setNotificationMarkerPosition(tmpAudioTrack.getPlaybackHeadPosition() + (duration * sampleRate));
And this is how you get registered for the period and marker notifications:
tmpAudioTrack.setPlaybackPositionUpdateListener(new AudioTrack.OnPlaybackPositionUpdateListener() {
    @Override
    public void onMarkerReached(AudioTrack track) {
        // called when the playback head passes the marker set above
    }
    @Override
    public void onPeriodicNotification(AudioTrack track) {
        // called every "period" frames set via setPositionNotificationPeriod
    }
});
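Putting it together, here is a minimal sketch of writing one sound and setting a marker at the frame where it ends (SAMPLE_RATE and durationSeconds are assumed constants, not from the original code):

// Write one sound, then set a marker at the frame where that sound ends,
// so onMarkerReached() fires when playback reaches it.
int framesInSound = durationSeconds * SAMPLE_RATE;   // mono PCM: 1 frame == 1 sample
byte[] generatedSnd = new byte[2 * framesInSound];   // 16-bit PCM: 2 bytes per frame
// <fill generatedSnd with the sound>
track.write(generatedSnd, 0, generatedSnd.length);
// Marker positions are absolute frames, so offset by the current playback head.
track.setNotificationMarkerPosition(track.getPlaybackHeadPosition() + framesInSound);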
In the application I want to create, I face some technical obstacles. I have two music tracks in the application. For example, a user imports a music background as the first track. The second track is a voice recorded by the user to the rhythm of the first track, played through the device speaker (or headphones). At this point we face latency: after recording and playing back in the app, the user hears a loss of synchronisation between the tracks, which occurs because of the microphone and speaker latencies.
Firstly, I tried to detect the delay by filtering the input sound. I use Android's AudioRecord class and its read() method, which fills my short array with audio data.
I found that the initial values of this array are zeros, so I decided to cut them out before writing them to the output stream.
I consider those zeros a "warm-up" latency of the microphone. Is this approach correct? This operation gives some results, but it doesn't resolve the problem, and at this stage I'm still far away from that.
But the worse case is the delay between starting the speakers and the music actually playing. This delay I cannot filter or detect. I tried to create a calibration feature which measures the delay: I play a "beep" sound through the speakers, and when I start to play it, I also begin to measure time. Then I start recording and listen for this sound being picked up by the microphone. When I recognise the sound in the app, I stop measuring time. I repeat this process several times, and the final value is the average of those results. That is how I try to measure the latency of the device. Once I have this value, I can simply shift the second track backwards to achieve synchronisation of both recordings (I will lose some initial milliseconds of the recording, but I skip this case for now; there are some possibilities to fix it).
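A rough sketch of that calibration (all names here are made up; beepTrack is assumed to be a MODE_STATIC track already loaded with the beep, and the detection is just a naive amplitude threshold):

long measureRoundTripLatencyMs(AudioTrack beepTrack, AudioRecord recorder,
                               short threshold, long timeoutMs) {
    short[] buffer = new short[1024];
    recorder.startRecording();
    long start = SystemClock.elapsedRealtime();   // start the clock when the beep starts
    beepTrack.play();
    while (SystemClock.elapsedRealtime() - start < timeoutMs) {
        int read = recorder.read(buffer, 0, buffer.length);
        for (int i = 0; i < read; i++) {
            if (Math.abs(buffer[i]) > threshold) {        // "heard" the beep
                return SystemClock.elapsedRealtime() - start;
            }
        }
    }
    return -1;   // beep never detected within the timeout
}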
I thought that this approach would resolve the problem, but it turned out this is not as simple as I thought. I found two issues here:
1. Delay while playing two tracks simultaneously
2. Random in device audio latency.
The first: I play two tracks using the AudioTrack class, and I call play() on them like this:
val firstTrack = //creating a track
val secondTrack = //creating a track
firstTrack.play()
secondTrack.play()
This code causes a delay at the stage of playing the tracks. Now I don't even have to think about latency while recording; I cannot even play two tracks simultaneously without a delay. I tested this with an external audio file (not recorded in my app): I start the same audio file twice using the code above, and I can hear a delay. I also tried it with the MediaPlayer class, with the same results. In that case I even tried starting the tracks from the OnPreparedListener callback:
val firstTrack = // MediaPlayer
val secondTrack = // MediaPlayer
secondTrack.setOnPreparedListener {
    firstTrack.start()
    secondTrack.start()
}
And it doesn’t help.
I know there is one more class provided by Android called SoundPool. According to the documentation it can be better at playing tracks simultaneously, but I can't use it because it only supports small audio files, and that limitation rules it out for me.
How can I resolve this problem? How can I start playing two tracks precisely at the same time?
The second: audio latency is not deterministic. Sometimes it is smaller and sometimes it's huge, and it's out of my hands. So measuring device latency can help, but again, it cannot fully resolve the problem.
To sum up: is there any solution which can give me the exact latency per device (or per app session), or any other way to detect the actual delay, so I can provide the best synchronisation while playing back two tracks at the same time?
Thank you in advance!
Synchronising audio for karaoke apps is tough. The main issue you seem to be facing is variable latency in the output stream.
This is almost certainly caused by "warm up" latency: the time it takes from hitting "play" on your backing track to the first frame of audio data being rendered by the audio device (e.g. headphones). This can have large variance and is difficult to measure.
The first (and easiest) thing to try is to use MODE_STREAM when constructing your AudioTrack and prime it with bufferSizeInBytes of data prior to calling play(). This should result in lower, more consistent "warm up" latency.
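For example, a minimal sketch of that priming step (the constants chosen here are assumptions, not from the question):

int sampleRate = 44100;
int bufferSizeInBytes = AudioTrack.getMinBufferSize(
        sampleRate,
        AudioFormat.CHANNEL_OUT_MONO,
        AudioFormat.ENCODING_PCM_16BIT);

AudioTrack track = new AudioTrack(
        AudioManager.STREAM_MUSIC,
        sampleRate,
        AudioFormat.CHANNEL_OUT_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        bufferSizeInBytes,
        AudioTrack.MODE_STREAM);

// Prime the track with one full buffer of audio (or silence) before play(),
// so the first real frames are already queued when playback starts.
byte[] primer = new byte[bufferSizeInBytes];
track.write(primer, 0, primer.length);

track.play();
// ...keep writing the rest of the audio data from here on.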
A better way is to use the Android NDK to have a continuously running audio stream which is just outputting silence until the moment you hit play, then start sending audio frames immediately. The only latency you have here is the continuous output latency.
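The same idea can be sketched in Java with an AudioTrack as well (running, playRequested, framesPerChunk and fillWithNextSongChunk are assumptions): keep a streaming track alive by writing silence, and switch to real audio the instant the user hits play, so there is no warm-up gap at that moment.

short[] chunk = new short[framesPerChunk];
while (running) {
    if (playRequested) {
        fillWithNextSongChunk(chunk);              // hypothetical: next frames of the backing track
    } else {
        java.util.Arrays.fill(chunk, (short) 0);   // silence keeps the stream warm
    }
    track.write(chunk, 0, chunk.length);
}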
If you decide to go down this route I recommend taking a look at the Oboe library (full disclosure: I am one of the authors).
To answer one of your specific questions...
Is there a way to calculate the latency of the audio output stream programatically?
Yes. The easiest way to explain this is with a code sample (this is C++ for the AAudio API but the principle is the same using Java AudioTrack):
// Get the index and time that a known audio frame was presented for playing
int64_t existingFrameIndex;
int64_t existingFramePresentationTime;
AAudioStream_getTimestamp(stream, CLOCK_MONOTONIC, &existingFrameIndex, &existingFramePresentationTime);
// Get the write index for the next audio frame
int64_t writeIndex = AAudioStream_getFramesWritten(stream);
// Calculate the number of frames between our known frame and the write index
int64_t frameIndexDelta = writeIndex - existingFrameIndex;
// Calculate the time which the next frame will be presented
int64_t frameTimeDelta = (frameIndexDelta * NANOS_PER_SECOND) / sampleRate_;
int64_t nextFramePresentationTime = existingFramePresentationTime + frameTimeDelta;
// Assume that the next frame will be written into the stream at the current time
int64_t nextFrameWriteTime = get_time_nanoseconds(CLOCK_MONOTONIC);
// Calculate the latency
*latencyMillis = (double) (nextFramePresentationTime - nextFrameWriteTime) / NANOS_PER_MILLISECOND;
A caveat: This method relies on accurate timestamps being reported by the audio hardware. I know this works on Google Pixel devices but have heard reports that it isn't so accurate on other devices so YMMV.
Following donturner's answer, here's a Java version (which also falls back to other methods depending on the SDK version):
/** The audio latency has not been estimated yet */
private static long AUDIO_LATENCY_NOT_ESTIMATED = Long.MIN_VALUE+1;
/** The audio latency default value if we cannot estimate it */
private static long DEFAULT_AUDIO_LATENCY = 100L * 1000L * 1000L; // 100ms
/**
* Estimate the audio latency
*
* Not accurate at all, depends on SDK version, etc. But that's the best
* we can do.
*/
private static long estimateAudioLatency(AudioTrack track, long audioFramesWritten) {
long estimatedAudioLatency = AUDIO_LATENCY_NOT_ESTIMATED;
// First method. SDK >= 19.
if (Build.VERSION.SDK_INT >= 19 && track != null) {
AudioTimestamp audioTimestamp = new AudioTimestamp();
if (track.getTimestamp(audioTimestamp)) {
// Calculate the number of frames between our known frame and the write index
long frameIndexDelta = audioFramesWritten - audioTimestamp.framePosition;
// Calculate the time which the next frame will be presented
long frameTimeDelta = _framesToNanoSeconds(frameIndexDelta);
long nextFramePresentationTime = audioTimestamp.nanoTime + frameTimeDelta;
// Assume that the next frame will be written at the current time
long nextFrameWriteTime = System.nanoTime();
// Calculate the latency
estimatedAudioLatency = nextFramePresentationTime - nextFrameWriteTime;
}
}
// Second method. SDK >= 18.
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED && Build.VERSION.SDK_INT >= 18) {
Method getLatencyMethod;
try {
getLatencyMethod = AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
estimatedAudioLatency = (Integer) getLatencyMethod.invoke(track, (Object[]) null) * 1000000L;
} catch (Exception ignored) {}
}
// If no method has successfully given us a value, let's try a third method
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
AudioManager audioManager = (AudioManager) CRT.getInstance().getSystemService(Context.AUDIO_SERVICE);
try {
Method getOutputLatencyMethod = audioManager.getClass().getMethod("getOutputLatency", int.class);
estimatedAudioLatency = (Integer) getOutputLatencyMethod.invoke(audioManager, AudioManager.STREAM_MUSIC) * 1000000L;
} catch (Exception ignored) {}
}
// No method gave us a value. Let's use a default value. Better than nothing.
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
estimatedAudioLatency = DEFAULT_AUDIO_LATENCY;
}
return estimatedAudioLatency;
}
private static long _framesToNanoSeconds(long frames) {
return frames * 1000000000L / SAMPLE_RATE;
}
The Android MediaPlayer class is notoriously slow to begin audio playback. I experienced an issue in an app I was creating where there was a greater than one second delay before an audio clip started playing. I resolved it by switching to ExoPlayer, which resulted in playback starting within 100ms. I've also read that ffmpeg has an even faster audio startup time than ExoPlayer, but I haven't used it so I can't make any promises.
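For reference, starting playback with ExoPlayer is only a few lines (this sketch assumes the ExoPlayer 2.12+ MediaItem-based API; context and uri are whatever your app provides):

// Build a player, queue a single item, prepare and play.
SimpleExoPlayer player = new SimpleExoPlayer.Builder(context).build();
player.setMediaItem(MediaItem.fromUri(uri));
player.prepare();
player.play();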
I have a number of mp3 files that I use with Android MediaPlayer to play from certain offsets.
Using seekTo() seems to stop at the correct location, and player.getCurrentPosition() returns the correct offset, but in some cases the real position is off by as much as 200 ms. The files are about 3 minutes of recording, and the incorrect offsets seem to appear near the end of some of the files.
I see the same effect whether I try on an Android 4.0.3 device or a 4.3 emulator.
Does anybody have experience with "fine-tuning" MediaPlayer offsets? Any experience of why MediaPlayer might not work correctly with some files? They are all CBR, stereo; some have a sampling frequency of 22050 Hz, some 44100 Hz, with different bitrates.
I'm setting the offsets from another program and saving them to MP3 tags, then in case of doubt verifying manually using Audacity. Audacity agrees with my estimate of the correct offset; MediaPlayer seems to disagree.
I'm aware that I could use AudioTrack with raw sound files and have better control, but it might be impractical as there are many MP3 files, so using raw sound data would make a pretty large application or many large data files.
The code is nothing fancy:
player.seekTo(start);
player.start();
CountDownTimer timer = new CountDownTimer(length, 100) {
@Override
public void onTick(long millisUntilFinished) {
if (player!=null) setInt(R.id.nLocation, player.getCurrentPosition());
}
@Override
public void onFinish() {
if (player!=null) {
if (player.isPlaying()) {
player.pause();
}
setInt(R.id.nLocation, player.getCurrentPosition());
player.stop();
player.release();
player = null;
}
}
};
timer.start();
I did not manage to find the rule for why MediaPlayer interprets the offset (seekTo) differently for a certain group of MP3 files. For example, when creating a new MP3 file with the same parameters from Audacity+Lame (MPEG1, Layer III, 44100 Hz, 192 kb/s) it worked perfectly.
However:
this can be reproduced: rip an MP3 file using Windows Media Player with settings MP3, 192 kb/s
I found a workaround that seems to work for any recording.
The background: in order to tell MediaPlayer to play from a certain offset, I store some data in MP3 tags. I use a separate program to set up the playback (in frames): label A, start frame = 1000, length = 100 frames; label B, start frame = 1500; etc. When I need to play it back, I read the MP3 headers, determine the frame length (for example 26.12245 ms/frame) and calculate the offset (1000 frames will be 26122 ms).
The workaround is to also store the frame count and the length in ms in the MP3 tag (or pass through the file again and count the frames). Then, when starting MediaPlayer, compare MediaPlayer.getDuration() (MediaPlayer's estimate) with the duration stored in the MP3 tag, and adjust the frame size:
adjustedFrameSizeMs = realFrameSizeMs + (player.getDuration()-storedDurationMs)/storedframeCount;
In my case (for the files with incorrect offset) the adjusted frame length always was between 26.08 and 26.09 ms (instead of 26.12245).
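Applied to seeking, the idea looks roughly like this (variable names are assumptions; realFrameSizeMs, storedDurationMs and storedFrameCount come from the MP3 tag as described above):

// Rescale the per-frame length using MediaPlayer's own duration estimate,
// then convert a frame-based label into a millisecond seek position.
double adjustedFrameSizeMs = realFrameSizeMs
        + (double) (player.getDuration() - storedDurationMs) / storedFrameCount;

int startFrame = 1000;                                     // e.g. label A from the tag
int seekMs = (int) Math.round(startFrame * adjustedFrameSizeMs);
player.seekTo(seekMs);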
I tried to see whether this is because Android plays the recording quicker (so it estimates the "real time", not the time according to frame size and frame count). It seems that it really does play quicker, and even quicker than its own estimate. For example, for a recording of about 1 hour:
my estimate: 2448 s
MediaPlayer: 2444 s (4 sec difference)
Audacity: 2442 s (here we are in disagreement)
Foobar: 2448 s (another witness that agrees with my estimate :-)
MediaPlayer, real play time: 2438 s
The real play time was 6 s (0.25%) less than MediaPlayer's own estimate. Another attempt on a different sample gave the same percentage difference. However, the fact that Audacity and Foobar did not always agree with my estimates does not let me put all the blame on MediaPlayer.
I'm working on a somewhat ambitious project to achieve active noise reduction on Android with earbuds or headphones on.
My objective is to record ambient noise with the Android phone mic, invert the phase (a simple *-1 on the short values pulled from the AudioRecord?), and play that inverted waveform back through the headphones. If the latency and amplitude are close to correct, it should nullify a good amount of mechanical, structured noise in the environment.
Here's what I've got so far:
@Override
public void run()
{
Log.i("Audio", "Running Audio Thread");
AudioRecord recorder = null;
AudioTrack track = null;
short[][] buffers = new short[256][160];
int ix = 0;
/*
* Initialize buffer to hold continuously recorded audio data, start recording, and start
* playback.
*/
try
{
int N = AudioRecord.getMinBufferSize(8000,AudioFormat.CHANNEL_IN_MONO,AudioFormat.ENCODING_PCM_16BIT);
recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, 8000, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10);
//NoiseSuppressor ns = NoiseSuppressor.create(recorder.getAudioSessionId());
//ns.setEnabled(true);
track = new AudioTrack(AudioManager.STREAM_MUSIC, 8000,
AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10, AudioTrack.MODE_STREAM);
recorder.startRecording();
track.play();
/*
* Loops until something outside of this thread stops it.
* Reads the data from the recorder and writes it to the audio track for playback.
*/
while(!stopped)
{
short[] buffer = buffers[ix++ % buffers.length];
N = recorder.read(buffer, 0, buffer.length);
// Only process and write the samples that were actually read.
for (int iii = 0; iii < N; iii++) {
    //Log.i("Data","Value: "+buffer[iii]);
    buffer[iii] = (short) (buffer[iii] * -1);   // invert the phase
}
track.write(buffer, 0, N);
}
}
catch(Throwable x)
{
Log.w("Audio", "Error reading voice audio", x);
}
/*
* Frees the thread's resources after the loop completes so that it can be run again
*/
finally
{
recorder.stop();
recorder.release();
track.stop();
track.release();
}
}
I was momentarily excited to find that the Android API already has a NoiseSuppressor class (you'll see it commented out above). I tested it and found that NoiseSuppressor wasn't doing much to null out constant tones, which leads me to believe it's actually just performing a band-pass filter at non-vocal frequencies.
So, my questions:
1) The above code takes about 250-500 ms from mic recording to playback in the headphones. This latency sucks, and it would be great to reduce it. Any suggestions there would be appreciated.
2) Regardless of how tight the latency is, my understanding is that the playback waveform WILL have a phase offset from the actual ambient noise waveform. This suggests I need to perform some kind of waveform matching to calculate this offset and compensate for it. Thoughts on how that gets calculated? (A rough sketch of what I mean follows this list.)
3) When it comes to compensating for latency, what would that look like? I've got an array of shorts coming in every cycle, so what would a 30 ms or 250 ms latency look like?
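A rough sketch of what I mean by waveform matching: a naive cross-correlation between the reference signal and the recorded signal, returning the lag (in samples) at which they line up best. All names here are made up, and a real implementation would need to be far more careful (and faster, e.g. FFT-based).

int estimateOffsetSamples(short[] reference, short[] recorded, int maxLag) {
    int bestLag = 0;
    long bestScore = Long.MIN_VALUE;
    for (int lag = 0; lag <= maxLag; lag++) {
        long score = 0;
        for (int i = 0; i + lag < recorded.length && i < reference.length; i++) {
            score += (long) reference[i] * recorded[i + lag];
        }
        if (score > bestScore) {
            bestScore = score;
            bestLag = lag;
        }
    }
    // e.g. at 8000 Hz, a lag of 240 samples corresponds to a 30 ms offset
    return bestLag;
}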
I'm aware of a fundamental problem with this approach: the phone not being right next to the head is likely to introduce some error. But I'm hopeful that with either dynamic or fixed latency correction it may be possible to overcome it.
Thanks for any suggestions.
Even if you were able to do something about the latency, it's a difficult problem: you don't know the distance of the phone from the ear, that distance is not fixed (the user will move the phone), and you don't have a microphone at each ear (so you can't know what the wave will be at an ear until after it has got there, even with zero latency).
Having said that, you might be able to do something that can cancel highly periodic waveforms. All you could do, though, is allow the user to manually adjust the time delay for each ear; as you have no microphones near the ears themselves, your code has no way to know whether it is making the problem better or worse.
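A user-adjustable delay could be as simple as a small ring buffer applied to the inverted signal before it is written to the track (a sketch; maxDelaySamples and the dialled-in delaySamples value are assumptions):

// Ring-buffer delay line: every input sample is stored, and the sample written
// delaySamples ago is read back out, so the output lags by that many samples.
short[] delayLine = new short[maxDelaySamples];
int writeIdx = 0;

short delayed(short input, int delaySamples) {
    delayLine[writeIdx] = input;
    int readIdx = (writeIdx - delaySamples + delayLine.length) % delayLine.length;
    writeIdx = (writeIdx + 1) % delayLine.length;
    return delayLine[readIdx];
}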
I have created a metronome-type application with a specified swing interval of 750 milliseconds on the pendulum, playing a single audio file at the maximum swing arc and repeating the swing and the sound indefinitely. However, I am finding that the actual timing of the code's execution varies dramatically from device to device, and even varies on a single device. My intent is to swing the pendulum at a rate of 80 beats per minute and play the audio file with each "beat". I adjusted the 750 millisecond setting to accommodate the time required to play the audio file, which slightly reduced it from 750 down to about 680 ms. Testing a one-minute run of the metronome on various Android devices, I found that the timing differed dramatically between devices, even though I am defining my timing elements in milliseconds.
I am using Android SoundPool to access a .wav file to play the sound.
I found quite a few references to SoundPool timing issues and concerns, but have not yet found a viable and reliable solution that delivers consistent timing for an application like this.
The swing of the pendulum seems pretty consistent with the specified delay, so I believe the variation is due to variable timing while executing the SoundPool code that plays the audio. Is there a reliable way to play sounds on a consistent and "exact" timing interval on Android?
One way to do this is with a Handler. This allows you to start the audio clip at exactly the same time regardless of how long the clip actually takes to play. You don't need SoundPool for this, just SoundPlayer.
A Handler allows you to schedule a message to be delivered to your code some time in the future. Since playing a sound with SoundPlayer is asynchronous, you can use this simple mechanism to play a sound on a regular interval.
Here is some code to show how it might work.
handler = new Handler() {
    /* (non-Javadoc)
     * @see android.os.Handler#handleMessage(android.os.Message)
     */
    @Override
    public void handleMessage(Message msg) {
        if (msg.what == NEXT_SOUND_MSG) {
            playNextSound();
        }
    }
};
// Set up media player for sounds
player = new SoundPlayer(context);
player.start();
private void playNextSound() {
if (mRunning) {
// Play the sound
int iSoundResId = item.getSoundResId();
if (iSoundResId != -1) {
playSound(iSoundResId);
}
// schedule a message to advance to next item after duration
Message msg = handler.obtainMessage(NEXT_SOUND_MSG);
handler.sendMessageDelayed(msg, interval);
}
}
When using MediaPlayer, I noticed that whenever my phone lags, MediaPlayer glitches and then continues playing from the position in the audio where it glitched.
This is bad for my implementation, since I want the audio to be played at a specific time.
If I have a song 1000 milliseconds long, what I want is the ability to set MediaPlayer to start playing at some specific time t, and then stop exactly at time t+1000.
This means that I actually need two things:
1) Start MediaPlayer at a specific time with a very small delay.
2) Make MediaPlayer ignore the audio it glitched on and continue playing, in order to finish the song on time.
The delay of these functions is very important to me, and I need the audio to be played (approximately) exactly at the time it was supposed to be played.
Thanks!
You will possibly need to use mp.getDuration() and/or mp.getCurrentPosition(), although it's impossible to know exactly what you mean by "I need the audio to be played exactly(~) at the time it was supposed to be played."
Something like this should get you started:
int a = (mp.getCurrentPosition() + b);
Thanks for the answer Mike, but unfortunately this won't help me. Let's say that I asked MediaPlayer to start playing a song of length 3:45 at 00:00. At 01:00 I started using the phone's resources, and due to the heavy usage my phone glitched, making MediaPlayer pause for 2 seconds.
Time:
00:00-01:00 - I heard the audio at 00:00-01:00
01:00-01:02 - I heard silence because the phone glitched
01:02-03:47 - I heard the audio at 01:00-03:45 with 2 second time skew
Now, from what I understand, MediaPlayer is a bad choice for this problem domain, since it provides a high-level API. I am currently experimenting with the AudioTrack class, which should provide me with what I need:
//Creating a new audio track
AudioTrack audioTrack = new AudioTrack(...)
//Get start time
long start = System.currentTimeMillis();
// loop until finished
for (...) {
// Get time in song
long now = System.currentTimeMillis();
long nowInSong = now - start;
// get a buffer from the song at time nowInSong with a length of 1 second
byte[] b = getAudioBuffer(nowInSong);
// play 1 second of music
audioTrack.write(b, 0, b.length);
// remove any unplayed data (note: flush() only discards queued data
// while the track is paused or stopped; it is a no-op during playback)
audioTrack.flush();
}
Now if I glitch, I only glitch for 1 second, and then I correct myself by playing the right audio at the right time!
NOTE
I haven't tested this code, but it seems like the right way to do it. If it actually works I will update this post again.
P.S. Seeking in MediaPlayer is:
1. A heavy operation that will surely delay my music (every millisecond counts here)
2. Not thread safe and cannot be used from multiple threads (seek, start, etc.)