I'd like to play back audio that is synthesized at 1/50 s increments. With the asynchronous streaming interface of AudioTrack my plan is to basically do the following:
while (!done)
{
    frame = synthesize();
    audio.waitForWrite(); // XXX
    audio.write(frame, 0, frameSize, WRITE_NON_BLOCKING);
}
audio.waitForWrite(); // XXX
However, there is no waitForWrite or similar method on AudioTrack that I could use here; and if I just do a non-blocking write, the second frame will replace the first one in the middle, i.e. let's say synthesis of a 20ms frame takes 5 ms, then the first frame will play for 5ms and then get replaced by the second one after 5ms and so on, which is clearly not what I want.
On the other hand, if I use blocking writes, then I can't synthesize the next frame while the previous one is already playing.
You misunderstand streaming mode. A write doesn't take as long as the audio takes to play; it only copies your data into another buffer. In blocking mode, write waits until the entire buffer has been copied, but not until it's played. In non-blocking mode, it copies as much as it can right now and returns immediately. In neither mode do you need to wait for the audio to be played, and there is no reason to.
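To make the copy semantics concrete, here is a toy model in plain Java. It is not AudioTrack's real implementation; the class ToyStreamBuffer and its methods are invented for illustration. It only shows that a second write is queued behind the first rather than replacing it, and that a non-blocking write returns the number of samples it managed to copy.

```java
/** Toy model of a streaming-mode audio buffer (hypothetical, not the real AudioTrack). */
class ToyStreamBuffer {
    private final short[] ring;
    private int readPos = 0; // next sample the simulated device will play
    private int size = 0;    // samples currently queued

    ToyStreamBuffer(int capacitySamples) {
        ring = new short[capacitySamples];
    }

    /** Copies as many samples as fit right now and returns that count; never waits. */
    synchronized int writeNonBlocking(short[] src, int offset, int length) {
        int n = Math.min(length, ring.length - size);
        for (int i = 0; i < n; i++) {
            // New samples are appended after the ones already queued.
            ring[(readPos + size + i) % ring.length] = src[offset + i];
        }
        size += n;
        return n;
    }

    /** Simulates the device playing (and removing) up to n queued samples. */
    synchronized int drain(int n) {
        int played = Math.min(n, size);
        readPos = (readPos + played) % ring.length;
        size -= played;
        return played;
    }

    synchronized int queued() {
        return size;
    }
}
```

With a 2000-sample buffer and 960-sample frames (20 ms at 48 kHz), the first two writes are accepted in full and the third is only partly accepted, so the caller has to retry the remainder once some audio has drained.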
In the application I want to create, I face some technical obstacles. I have two music tracks in the application. For example, a user imports background music as the first track. The second track is a voice recorded by the user to the rhythm of the first track, which is played through the device speaker (or headphones). This is where we run into latency. After recording and playing back in the app, the user hears a loss of synchronisation between the tracks, which is caused by the microphone and speaker latencies.
Firstly, I try to detect the delay by filtering the input sound. I use android’s AudioRecord class, and the method read(). This method fills my short array with audio data.
I found that the initial values of this array are zeros, so I decided to cut them out before I start writing the data to the output stream.
So I treat those zeros as the "warmup" latency of the microphone. Is this approach correct? It gives some results, but it doesn't resolve the problem, and at this stage I'm still far from a solution.
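The trimming step described above could look roughly like the sketch below. The class name LeadingSilence and the threshold parameter are assumptions added for illustration; the original post only mentions cutting exact zeros, which corresponds to a threshold of 0. Note that this only finds the silent prefix of the buffer; whether that prefix equals the microphone's warm-up latency is exactly the open question.

```java
/** Hypothetical helper: finds where the silent prefix of a recorded buffer ends. */
class LeadingSilence {
    /**
     * Returns the index of the first sample whose magnitude exceeds threshold,
     * or buf.length if every sample is at or below it.
     */
    static int firstAudibleIndex(short[] buf, int threshold) {
        for (int i = 0; i < buf.length; i++) {
            // Math.abs on a short is computed as int, so Short.MIN_VALUE is safe here.
            if (Math.abs(buf[i]) > threshold) {
                return i;
            }
        }
        return buf.length;
    }
}
```

The samples from the returned index onward are the part that would be written to the output stream.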
But the worst case is the delay between starting the speakers and the music actually playing. This delay I cannot filter or detect. I tried to create a calibration feature which measures the delay. I play a "beep" sound through the speakers, and when I start to play it, I also begin to measure time. Then I start recording and listen for this sound being detected by the microphone. When I recognise the sound in the app, I stop measuring time. I repeat this process several times, and the final value is the average of those results. That is how I try to measure the latency of the device. Now that I have this value, I can simply shift the second track backwards to synchronise both recordings (I will lose some initial milliseconds of the recording, but I am ignoring that for now; there are ways to fix it).
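The averaging and shifting described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class LatencyCalibration and both method names are made up, the recording is assumed to be mono 16-bit PCM, and the beep detection itself is not shown.

```java
import java.util.Arrays;

/** Hypothetical helper for the calibration idea: average the runs, then shift the recording. */
class LatencyCalibration {
    /** Average of several round-trip measurements, in milliseconds. */
    static long averageMillis(long[] measurementsMs) {
        if (measurementsMs.length == 0) {
            throw new IllegalArgumentException("no measurements");
        }
        long sum = 0;
        for (long m : measurementsMs) {
            sum += m;
        }
        return Math.round((double) sum / measurementsMs.length);
    }

    /** Drops the first latencyMs of a mono recording so it lines up with the backing track. */
    static short[] dropLeadingLatency(short[] monoRecording, long latencyMs, int sampleRate) {
        long samplesToSkip = Math.max(0, latencyMs * sampleRate / 1000);
        int skip = (int) Math.min(monoRecording.length, samplesToSkip);
        return Arrays.copyOfRange(monoRecording, skip, monoRecording.length);
    }
}
```

For measurements of 100, 120 and 110 ms at 48 kHz, the average is 110 ms, which corresponds to 5280 samples removed from the start of the recording.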
I thought that this approach would resolve the problem, but it turned out this is not as simple as I thought. I found two issues here:
1. Delay while playing two tracks simultaneously
2. Randomness in the device's audio latency.
The first: I play two tracks using AudioTrack class and I run method play() like this:
val firstTrack = //creating a track
val secondTrack = //creating a track
firstTrack.play()
secondTrack.play()
This code causes delays when the tracks start playing. Now I don't even have to think about latency while recording; I cannot play two tracks simultaneously without delays. I tested this with an external audio file (not recorded in my app): I start the same audio file twice using the code above, and I can hear a delay. I also tried it with the MediaPlayer class, with the same results. In this case, I even tried to start the tracks when the OnPreparedListener callback is invoked:
val first = // MediaPlayer
val second = // MediaPlayer
second.setOnPreparedListener {
    first.start()
    second.start()
}
And it doesn’t help.
I know that there is one more class provided by Android, called SoundPool. According to the documentation, it is better at playing tracks simultaneously, but I can't use it because it supports only small audio files, and that limitation is not acceptable for my app.
How can I resolve this problem? How can I start playing two tracks precisely at the same time?
The second: audio latency is not deterministic. Sometimes it is small, sometimes it is huge, and it is out of my hands. So measuring device latency can help, but again, it cannot resolve the problem.
To sum up: is there any solution that can give me the exact latency per device (or per app session), or some other trigger that detects the actual delay, so that I can get the best possible synchronisation while playing back two tracks at the same time?
Thank you in advance!
Synchronising audio for karaoke apps is tough. The main issue you seem to be facing is variable latency in the output stream.
This is almost certainly caused by "warm up" latency: the time it takes from hitting "play" on your backing track to the first frame of audio data being rendered by the audio device (e.g. headphones). This can have large variance and is difficult to measure.
The first (and easiest) thing to try is to use MODE_STREAM when constructing your AudioTrack and prime it with bufferSizeInBytes of data prior to calling play (more here). This should result in lower, more consistent "warm up" latency.
A better way is to use the Android NDK to have a continuously running audio stream which is just outputting silence until the moment you hit play, then start sending audio frames immediately. The only latency you have here is the continuous output latency.
If you decide to go down this route I recommend taking a look at the Oboe library (full disclosure: I am one of the authors).
To answer one of your specific questions...
Is there a way to calculate the latency of the audio output stream programmatically?
Yes. The easiest way to explain this is with a code sample (this is C++ for the AAudio API but the principle is the same using Java AudioTrack):
// Get the index and time that a known audio frame was presented for playing
int64_t existingFrameIndex;
int64_t existingFramePresentationTime;
AAudioStream_getTimestamp(stream, CLOCK_MONOTONIC, &existingFrameIndex, &existingFramePresentationTime);
// Get the write index for the next audio frame
int64_t writeIndex = AAudioStream_getFramesWritten(stream);
// Calculate the number of frames between our known frame and the write index
int64_t frameIndexDelta = writeIndex - existingFrameIndex;
// Calculate the time which the next frame will be presented
int64_t frameTimeDelta = (frameIndexDelta * NANOS_PER_SECOND) / sampleRate_;
int64_t nextFramePresentationTime = existingFramePresentationTime + frameTimeDelta;
// Assume that the next frame will be written into the stream at the current time
int64_t nextFrameWriteTime = get_time_nanoseconds(CLOCK_MONOTONIC);
// Calculate the latency
*latencyMillis = (double) (nextFramePresentationTime - nextFrameWriteTime) / NANOS_PER_MILLISECOND;
A caveat: This method relies on accurate timestamps being reported by the audio hardware. I know this works on Google Pixel devices but have heard reports that it isn't so accurate on other devices so YMMV.
Following donturner's answer, here's a Java version (which also falls back to other methods depending on the SDK version):
// These members live inside the author's audio class; imports are listed for completeness.
import android.content.Context;
import android.media.AudioManager;
import android.media.AudioTimestamp;
import android.media.AudioTrack;
import android.os.Build;
import java.lang.reflect.Method;

/** The audio latency has not been estimated yet */
private static final long AUDIO_LATENCY_NOT_ESTIMATED = Long.MIN_VALUE + 1;
/** The audio latency default value if we cannot estimate it */
private static final long DEFAULT_AUDIO_LATENCY = 100L * 1000L * 1000L; // 100ms
/**
 * Estimate the audio latency, in nanoseconds
 *
 * Not accurate at all, depends on SDK version, etc. But that's the best
 * we can do.
 */
private static long estimateAudioLatency(AudioTrack track, long audioFramesWritten) {
    long estimatedAudioLatency = AUDIO_LATENCY_NOT_ESTIMATED;
    // First method. SDK >= 19.
    if (Build.VERSION.SDK_INT >= 19 && track != null) {
        AudioTimestamp audioTimestamp = new AudioTimestamp();
        if (track.getTimestamp(audioTimestamp)) {
            // Calculate the number of frames between our known frame and the write index
            long frameIndexDelta = audioFramesWritten - audioTimestamp.framePosition;
            // Calculate the time at which the next frame will be presented
            long frameTimeDelta = _framesToNanoSeconds(frameIndexDelta);
            long nextFramePresentationTime = audioTimestamp.nanoTime + frameTimeDelta;
            // Assume that the next frame will be written at the current time
            long nextFrameWriteTime = System.nanoTime();
            // Calculate the latency
            estimatedAudioLatency = nextFramePresentationTime - nextFrameWriteTime;
        }
    }
    // Second method. SDK >= 18. Looks up AudioTrack.getLatency() by reflection (value in ms).
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED && Build.VERSION.SDK_INT >= 18) {
        Method getLatencyMethod;
        try {
            getLatencyMethod = AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
            estimatedAudioLatency = (Integer) getLatencyMethod.invoke(track, (Object[]) null) * 1000000L;
        } catch (Exception ignored) {}
    }
    // If no method has given us a value so far, let's try a third method
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        // CRT.getInstance() comes from the original author's app (not shown here)
        AudioManager audioManager = (AudioManager) CRT.getInstance().getSystemService(Context.AUDIO_SERVICE);
        try {
            Method getOutputLatencyMethod = audioManager.getClass().getMethod("getOutputLatency", int.class);
            estimatedAudioLatency = (Integer) getOutputLatencyMethod.invoke(audioManager, AudioManager.STREAM_MUSIC) * 1000000L;
        } catch (Exception ignored) {}
    }
    // No method gave us a value. Let's use a default value. Better than nothing.
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        estimatedAudioLatency = DEFAULT_AUDIO_LATENCY;
    }
    return estimatedAudioLatency;
}
// SAMPLE_RATE is the track's sample rate in Hz, defined elsewhere in the author's class
private static long _framesToNanoSeconds(long frames) {
    return frames * 1000000000L / SAMPLE_RATE;
}
The Android MediaPlayer class is notoriously slow to begin audio playback. I experienced an issue in an app I was creating where there was a delay of more than one second before an audio clip started playing. I resolved it by switching to ExoPlayer, which resulted in playback starting within 100ms. I've also read that ffmpeg has an even faster audio startup time than ExoPlayer, but I haven't used it so I can't make any promises.
I am developing an Android video player. I use ffmpeg in native code to decode video frame. In the native code, I have a thread called decode_thread that calls avcodec_decode_video2()
int decode_thread(void *arg) {
    avcodec_decode_video2(codecCtx, pFrame, &frameFinished, pkt);
}
I have another thread called display_thread that uses aNativeWindow to display a decoded frame on a SurfaceView.
The problem is that if I let the decode_thread run continuously without a delay, the performance of avcodec_decode_video2() drops significantly. Sometimes it takes about 0.1 seconds to decode a frame. However, if I put a delay in the decode_thread, something like this:
int decode_thread(void *arg) {
    avcodec_decode_video2(codecCtx, pFrame, &frameFinished, pkt);
    usleep(20*1000);
}
the performance of avcodec_decode_video2() is really good, about 0.001 seconds. However, putting a delay in the decode_thread is not a good solution because it affects the playback. Could anyone explain the behavior of avcodec_decode_video2() and suggest a solution?
It seems impossible that the performance of a video decoding function would improve just because your thread sleeps. Most likely the video decoding thread gets preempted by another thread, and the time you measure includes the period during which your thread was not running. When you add a call to usleep, it triggers a context switch to another thread. So when your decoding thread is next scheduled, it starts with a full CPU time slice and is no longer interrupted inside avcodec_decode_video2.
What should you do? You surely want to decode packets a little ahead of when you show them. The performance of avcodec_decode_video2 certainly isn't constant, and if you try to stay just one frame ahead, you might not have enough time to decode one of the frames.
I'd create a producer-consumer queue of decoded frames with an upper limit on its size. The decoder thread is the producer: it should run until it fills up the queue, and then wait until there's room for another frame. The display thread is the consumer: it takes frames from this queue and displays them.
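A minimal sketch of such a bounded queue, written in Java rather than in the native code the question uses, and with plain int arrays standing in for decoded frames. The class BoundedFrameQueueDemo, the END marker and the method name are all made up for this illustration; in native code the same structure would be built from a mutex and condition variables.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical demo: a decoder thread feeds a display loop through a bounded queue. */
class BoundedFrameQueueDemo {
    // Sentinel object that tells the consumer there are no more frames.
    private static final int[] END = new int[0];

    /** Runs a fake decoder and display loop; returns the number of frames displayed. */
    static int decodeAndDisplay(int frameCount, int maxQueuedFrames) {
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(maxQueuedFrames);

        Thread decoder = new Thread(() -> {
            try {
                for (int i = 0; i < frameCount; i++) {
                    // put() blocks while the queue is full, so the decoder
                    // runs ahead by at most maxQueuedFrames frames.
                    queue.put(new int[] { i });
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        decoder.start();

        int displayed = 0;
        try {
            // take() blocks while the queue is empty.
            while (queue.take() != END) {
                displayed++; // a real player would render the frame here
            }
            decoder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while displaying", e);
        }
        return displayed;
    }
}
```

The queue capacity is the tuning knob: a larger value absorbs longer decode spikes at the cost of more memory held in decoded frames.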
When using MediaPlayer, I noticed that whenever my phone stalls, the MediaPlayer glitches and then continues playing from the position in the audio where it glitched.
This is bad for my implementation, since I want the audio to be played at a specific time.
If I have a song that is 1000 milliseconds long, I want to be able to set MediaPlayer to start playing at some specific time t, and then stop exactly at time t+1000.
This means that I actually need two things:
1) Start MediaPlayer at a specific time with a very small delay.
2) Make MediaPlayer skip the audio it glitched on and continue playing, so that the song finishes on time.
The delay of the functions is very important to me and I need the audio to be played exactly(~) at the time it was supposed to be played.
Thanks!
You will need to use possibly mp.getDuration(); and/or mp.getCurrentPosition(); although it's impossible to know exactly what you mean by "I need the audio to be played exactly(~) at the time it was supposed to be played."
Something like this should get you started:
int a = (mp.getCurrentPosition() + b);
Thanks for the answer, Mike, but unfortunately this won't help me. Let's say that I asked MediaPlayer to start playing a song of length 3:45 at 00:00. At 01:00 I started using the phone's resources, and due to the heavy usage my phone glitched, making MediaPlayer pause for 2 seconds.
Time:
00:00-01:00 - I heard the audio at 00:00-01:00
01:00-01:02 - I heard silence because the phone glitched
01:02-03:47 - I heard the audio at 01:00-03:45 with 2 second time skew
Now, from what I understand, MediaPlayer is a bad choice for this problem domain, since it only provides a high-level API. I am currently experimenting with the AudioTrack class, which should provide me with what I need:
// Creating a new audio track
AudioTrack audioTrack = new AudioTrack(...);
// Get start time
long start = System.currentTimeMillis();
// loop until finished
for (...) {
    // Get time in song
    long now = System.currentTimeMillis();
    long nowInSong = now - start;
    // get a buffer from the song at time nowInSong with a length of 1 second
    byte[] b = getAudioBuffer(nowInSong);
    // play 1 second of music
    audioTrack.write(b, 0, b.length);
    // remove any unplayed data
    audioTrack.flush();
}
Now if I glitch I only glitch for 1 second and then I correct myself by playing the right audio at the right time!
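The getAudioBuffer(nowInSong) helper is not shown above. A minimal sketch of what it might do, assuming the whole song has already been decoded into a raw PCM byte array held in memory; the class PcmSlicer, its method names and all parameters are hypothetical. The important detail is that the byte offset is rounded down to a whole frame, so a slice never starts in the middle of a sample.

```java
import java.util.Arrays;

/** Hypothetical helper: cuts a time range out of raw interleaved PCM held in memory. */
class PcmSlicer {
    /** Byte offset for a position in milliseconds, rounded down to a whole frame. */
    static int byteOffsetForMillis(long positionMs, int sampleRate, int channels, int bytesPerSample) {
        int frameSize = channels * bytesPerSample;
        long frameIndex = Math.max(0, positionMs) * sampleRate / 1000;
        return (int) (frameIndex * frameSize);
    }

    /** Returns the PCM bytes for [positionMs, positionMs + lengthMs), clamped to the data. */
    static byte[] slice(byte[] pcm, long positionMs, long lengthMs,
                        int sampleRate, int channels, int bytesPerSample) {
        int from = Math.min(pcm.length,
                byteOffsetForMillis(positionMs, sampleRate, channels, bytesPerSample));
        int to = Math.min(pcm.length,
                byteOffsetForMillis(positionMs + lengthMs, sampleRate, channels, bytesPerSample));
        return Arrays.copyOfRange(pcm, from, to);
    }
}
```

For 44.1 kHz 16-bit stereo, one second is 176400 bytes, so slice(pcm, nowInSong, 1000, 44100, 2, 2) returns at most that many bytes and an empty array once nowInSong is past the end of the song. The int offsets limit this sketch to songs under 2 GB of PCM.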
NOTE
I haven't tested this code but it seems like the right way to do it. If it will actually work I will update this post again.
P.S. seeking in MediaPlayer is:
1. A heavy operation that will surely delay my music (every millisecond counts here)
2. Is not thread safe and cannot be used from multiple threads (seeks, starts etc...)
I created a simple application that generates a square wave of given frequency and plays it using AudioTrack in STREAM mode (STREAM_MUSIC). Everything seems to be working fine and the sound plays okay, however when the stream is finished I get messages in the log:
W/AudioTrack( 7579): obtainBuffer() track 0x14c228 disabled, restarting ...
Even after calling the stop() function I still get these.
I believe I properly set the AudioTrack buffer size, based on minimal size required by AudioTrack (in my case 6x1024). I feed it with smaller buffers of 1024 shorts.
Is it okay that I'm getting these and should I leave it like that?
OK, I think the problem is solved. The warning is generated when the buffer is not completely filled with data in time (a buffer underrun). I have no idea what the timeout is, but if you experience this, make sure that:
You don't call the play method until you have some data in the buffer.
You can generate the data fast enough to beat the timeout.
After you are finished feeding the buffer with data, before you call stop() method, make sure that the "last" buffer was completely filled with data before timeout.
I dealt with the last issue by always waiting a little (until timeout) then sending 1 buffer full of zeroes and finally calling the stop() function.
Keep in mind that you must always send the buffer in smaller chunks, even if you have the big chunk ready. It still bothers me a bit that I'm not 100% sure if that is the right way but the errors are gone so I guess I can live with that :)
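The chunking and zero-padding described above can be sketched in plain Java. The class ChunkSplitter and its method are invented for this illustration, and writing each chunk to the AudioTrack is left out; the sketch only shows how a large buffer could be cut into equal chunks with the last one padded with silence so that it is completely filled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits samples into fixed-size chunks, zero-padding the last one. */
class ChunkSplitter {
    static List<short[]> split(short[] samples, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<short[]> chunks = new ArrayList<>();
        for (int start = 0; start < samples.length; start += chunkSize) {
            // A new short[] is zero-initialised, so any unused tail is silence.
            short[] chunk = new short[chunkSize];
            int n = Math.min(chunkSize, samples.length - start);
            System.arraycopy(samples, start, chunk, 0, n);
            chunks.add(chunk);
        }
        return chunks;
    }
}
```

With 2500 samples and the 1024-short chunks mentioned in the question, this yields three chunks, the last holding 452 real samples followed by zeros.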
I've found that even when the buffer is technically long enough, and filled with bytes, if they aren't properly formatted (audio shorts converted to a byte array) it will still throw you that error.
I was getting that warning when I instantiated the AudioTrack and called audioTrack.play(), and there was a slight delay between the play() call and the audioTrack.write(). If I called play() right before write(), the warning disappeared.
I solved it like this:
if (mAudioTrack.getPlayState() != AudioTrack.PLAYSTATE_PLAYING)
    mAudioTrack.play();
mAudioTrack.write(b, 0, sz * 2);
mAudioTrack.stop();
mAudioTrack.flush();
I am creating a program which requires me to change a setting in the program when the video reaches specific points (at 1/3 of the video's completion, 2/3's and on completion). Android has a built in callback method for completion so performing an action at that point in time is not difficult. However, I don't know how to go about checking when the video has reached 1/3 and 2/3's of completion.
With a MediaPlayer you can get the total duration of your media file in milliseconds:
myMediaPlayer.getDuration()
Then implement a thread that checks the current position every second and compares it against 1/3, 2/3 and 3/3 of the video's duration, using
myMediaPlayer.getCurrentPosition(); //***current Position in milliseconds.
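The comparison itself can be kept separate from the polling thread. Below is a small sketch in plain Java, assuming the polling code passes in the values returned by getDuration() and getCurrentPosition(); the class ProgressCheckpoints and its poll method are made up for this example. Each checkpoint is reported only once, and completion is left to the built-in completion callback.

```java
/** Hypothetical helper: reports when playback first crosses 1/3 and 2/3 of the duration. */
class ProgressCheckpoints {
    private final int durationMs;
    private int nextThird = 1; // 1 means 1/3 is pending, 2 means 2/3 is pending, 3 means done

    ProgressCheckpoints(int durationMs) {
        this.durationMs = durationMs;
    }

    /**
     * Call periodically with the current position in milliseconds.
     * Returns 1 or 2 if that checkpoint was newly reached (2 if both were
     * crossed since the last call), or 0 if nothing new happened.
     */
    int poll(int positionMs) {
        int reached = 0;
        // Compare position * 3 with duration * k to avoid rounding duration / 3.
        while (nextThird <= 2 && (long) positionMs * 3 >= (long) durationMs * nextThird) {
            reached = nextThird;
            nextThird++;
        }
        return reached;
    }
}
```

Because the check is "at or past the checkpoint" rather than "exactly at", a late or skipped poll still triggers the action.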