I am using AudioTrack class to play a stream of raw sound data:
AudioTrack audioTrack;
int sampleRate = 11025;
int channelConfigIn = AudioFormat.CHANNEL_IN_MONO;
int channelConfigOut = AudioFormat.CHANNEL_OUT_STEREO;
int audioFormat = AudioFormat.ENCODING_PCM_16BIT;
.....
int bufferSize = AudioTrack.getMinBufferSize(sampleRate,channelConfigOut,audioFormat);
audioTrack = new AudioTrack(AudioManager.STREAM_VOICE_CALL,sampleRate,channelConfigOut,audioFormat,bufferSize,AudioTrack.MODE_STREAM);
audioTrack.play();
Then on a separate thread:
while(true)
{
short [] buffer = new short[14500];
//fill buffer with sound data
long time = System.currentTimeMillis();
audioTrack.write(buffer,0,14500);
Log.i("time",(System.currentTimeMillis() - time) + "");
}
My problem is that the log always shows the write method blocking for about 0.6 seconds, which is about the same as the length of the played sound (14500 samples). Moreover, the phone is not responding during playback; the main thread can hardly do anything. Can anyone help?
You are using the blocking version of the write() method. You could use write(float[],int,int,int) (or the equivalent short[] overload) instead, passing WRITE_NON_BLOCKING as the 4th parameter, but that would only write as much data as can fit in the playback buffer. I generally prefer the approach you already have (dedicating a thread to writing). You should expect each call to block; after all, the playback buffer can only hold so much data, and to make more space, sound must be played back, which takes time. 14500 samples is ~1.3 seconds of sound at your chosen sample rate, so I'm guessing it takes you about 0.7 seconds to fill the buffer each time.
I cannot tell, based on the code presented, why your UI thread is not responding.
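For reference, here is a minimal sketch of the dedicated-writer pattern, assuming the audioTrack from your snippet plus a hypothetical fillBuffer() helper and a volatile playing flag (neither is in your code); the blocking write() then never touches the UI thread:
// Sketch only: keep the blocking write() on a background thread.
Thread playbackThread = new Thread(new Runnable() {
    @Override
    public void run() {
        short[] buffer = new short[14500];
        while (playing) {                                // 'playing' is a volatile flag you control
            fillBuffer(buffer);                          // hypothetical: produce the next chunk of sound data
            audioTrack.write(buffer, 0, buffer.length);  // blocking here is expected and harmless
        }
    }
});
playbackThread.start();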
Related
I'm currently trying to play back audio using AudioTrack. Audio is received over the network, and the application continuously reads the data and adds it to an internal buffer. A separate thread consumes that data and uses AudioTrack for playback.
Problems:
Audio playback fluctuates (it feels like the audio drops out at regular intervals), making it hard to understand.
Playback speed is too high or too low, which makes it sound unrealistic.
To avoid network latency and other factors, I made the application wait until it had read enough data, then play it back at the end.
This makes the audio play really fast. Here is a basic sample of the logic I use:
sampleRate = AudioTrack.getNativeOutputSampleRate(AudioManager.STREAM_MUSIC);
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
AudioFormat.CHANNEL_OUT_STEREO,
AudioFormat.ENCODING_PCM_16BIT,
AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT),
AudioTrack.MODE_STREAM);
audioTrack.play();
short shortBuffer[] = new short[AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT)];
while (!stopRequested){
readData(shortBuffer);
audioTrack.write(shortBuffer, 0, shortBuffer.length, AudioTrack.WRITE_BLOCKING);
}
Is it correct to say that the Android AudioTrack class doesn't have built-in functionality to control audio playback based on environmental conditions? If so, are there better libraries available with a simpler way to play back audio?
The first issue I see is the arbitrary sampling rate.
AudioTrack.getNativeOutputSampleRate returns the sampling rate used by the sound system. It may be 44100, 48000, 96000, 192000 or whatever. But it looks like you have audio data from some independent source, which produces the data at its own exact sampling rate.
Let's say the audio data from the source is sampled at 44100 samples per second. If you start playing it at 96000, it will be sped up and higher pitched.
So, use the sampling rate, along with the number of channels, sample format, etc., as given by the source, rather than relying on system defaults.
The second issue: are you sure the readData procedure will always be fast enough to fill the buffer, however small the buffer is, and return before the buffer has finished playing?
You have created the AudioTrack with AudioTrack.getMinBufferSize passed as the bufferSizeInBytes parameter.
The getMinBufferSize function returns the minimum possible buffer size that can be used with these parameters. Let's say it returned a size corresponding to a buffer 10 ms long.
That means new data must be prepared within that time interval, i.e. the interval between one write returning control and the next write being performed must be shorter than the duration of the buffer.
So, if the readData function for some reason takes longer than that interval, playback will be paused for that long and you'll hear small gaps in the playback.
The reasons readData may be delayed vary: if it reads data from a file, it may stall on IO operations; if it allocates Java objects, it may run into garbage-collector pauses; if it uses some kind of decoder or another kind of audio source that does its own buffering, it may periodically stall while refilling that buffer.
In any case, unless you're creating some kind of real-time synthesizer that must react to user input as quickly as possible, always use a reasonably large buffer size, but never less than what getMinBufferSize returns. I.e.:
sampleRate = 44100; // sampling rate of the source
int bufSize = sampleRate * 4; // 1 second length; 4 is the frame size: 2 channels * 2 bytes per sample
bufSize = Math.max(bufSize, AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT)); // not less than getMinBufferSize returns
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
AudioFormat.CHANNEL_OUT_STEREO,
AudioFormat.ENCODING_PCM_16BIT,
bufSize,
AudioTrack.MODE_STREAM);
Like user #pskink said,
Most likely your sampleRate (or any other parameter passed to the
AudioTrack constructor) is invalid.
So I would start by checking what value you are actually setting the sample rate to.
For reference, you can also set the speed of an AudioTrack by calling its setPlaybackParams method:
public void setPlaybackParams (PlaybackParams params)
If you check the AudioTrack docs, you can find the PlaybackParams docs; that class lets you set the speed and pitch of the output audio, and the resulting object is then passed to setPlaybackParams on your AudioTrack object.
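As a short sketch (API 23+; the 1.0f values shown are just the defaults and are purely illustrative):
// Sketch: adjust output speed/pitch via PlaybackParams (requires API level 23+).
PlaybackParams params = new PlaybackParams();
params.setSpeed(1.0f);   // 1.0f = normal speed; higher plays faster
params.setPitch(1.0f);   // 1.0f = normal pitch
audioTrack.setPlaybackParams(params);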
However, it is unlikely that you will need this if your only issue is the sampleRate passed to the original constructor (since we cannot see where the variable sampleRate comes from).
I am creating an AudioTrack with the following definition:
audioTrack = new AudioTrack(
AudioManager.STREAM_MUSIC,
44100,
AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_16BIT,
buffer.length * 2,
AudioTrack.MODE_STATIC);
audioTrack.write(buffer, 0, buffer.length);
audioTrack.setPositionNotificationPeriod(500);
audioTrack.setNotificationMarkerPosition(buffer.length);
progressListener = new PlaybackProgress(buffer.length);
audioTrack.setPlaybackPositionUpdateListener(progressListener);
When the audioTrack finishes, the following is called to stop the audio and reset the head position.
private void resetAudioPlayback() {
ViewGroup.LayoutParams params = playbackView.getLayoutParams();
params.width = 0;
playbackView.setLayoutParams(params);
audioTrack.stop();
audioTrack.reloadStaticData();
playImage.animate().alpha(100).setDuration(500).start();
}
The above code works perfectly fine on Android 5.1, but I am having issues on 4.4.4. audioTrack.stop() is called but the audio does not stop; since reloadStaticData rewinds the audio back to the start position, it replays the audio. On 5.1 it correctly stops, resets the buffer back to the start of the playback, and plays from the beginning when the play button is pressed.
Can someone help me figure out how to fix this issue on Android 4.4.4?
I'm not absolutely certain this will solve your problem, but consider using pause() instead of stop(). According to the documentation, stop() for MODE_STREAM will actually keep playing the remainder of the last buffer that was written. You're using MODE_STATIC, but it might be worth trying.
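A minimal sketch of that pause-based reset, assuming the same audioTrack in MODE_STATIC as in your code (the method name is purely illustrative):
// Sketch: reset a MODE_STATIC track using pause() instead of stop().
private void resetAudioPlaybackAlternative() {
    audioTrack.pause();             // halts output immediately
    audioTrack.reloadStaticData();  // rewinds the static buffer so the next play() starts from 0
}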
Also (possibly unrelated): write() returns the amount of data actually written (shorts, in your case), so you shouldn't depend on a single write filling the AudioTrack's entire buffer every time. write() should be treated like an OutputStream write in that it may not write the entire contents of the buffer it was given. It's better to write a loop, check how much was written with each call to write(), and continue writing from a new index in the buffer array until the sum of all the writes equals the length of the buffer.
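For example, a minimal sketch of such a loop, assuming a short[] buffer and a blocking AudioTrack (names are illustrative):
// Sketch: keep writing until the whole buffer has been consumed (or an error occurs).
int written = 0;
while (written < buffer.length) {
    int result = audioTrack.write(buffer, written, buffer.length - written);
    if (result < 0) {
        // negative return values are error codes (e.g. ERROR_INVALID_OPERATION); bail out
        break;
    }
    written += result;
}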
I am currently writing some code for a sample sequencer in Android, using the AudioTrack class. I have been told that the only proper way to get accurate timing is to use the timing of the AudioTrack itself. E.g. I know that if I write a buffer of X samples to an AudioTrack playing at 44100 samples per second, the time to write it will be (1/44100)*X seconds.
Then you use that info to know what samples should be written when.
I am trying to implement my first attempt using this approach. I am using only one sample and am writing it as continuous 16th notes at a tempo of 120bpm. But for some reason it is playing at a rate of 240bpm.
First I checked my code that derives the duration (in nanoseconds) of a 16th note at tempo X. It checks out.
private void setPeriod()
{
period=(int)((1/(((double)TEMPO)/60))*1000);
period=(period*1000000)/4;
Log.i("test",String.valueOf(period));
}
Then I verified my code that computes the time, in nanoseconds, for my buffer to be played at 44100 Hz, and it is correct:
long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
So now I am left thinking that the AudioTrack is playing at a rate different from 44100 Hz. Maybe 96000 Hz, which would explain the doubling of speed. But when I instantiated the AudioTrack, it was indeed set to 44100 Hz.
final int SAMPLE_RATE is set to 44100
buffSize = AudioTrack.getMinBufferSize(SAMPLE_RATE, AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_16BIT);
track = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLE_RATE,
AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_16BIT,
buffSize,
AudioTrack.MODE_STREAM);
So I am confused as to why my tempo is being doubled. I ran a debug comparing elapsed AudioTrack time to elapsed system time, and it seems that the AudioTrack is indeed playing twice as fast as it should be.
Just to make sure, this is my play loop.
public void run() {
// TODO Auto-generated method stub
int buffSize=192;
byte[] output = new byte[buffSize];
int pos1=0;//index for output array
int pos2=0;//index for sample array
long bufferTime=(1000000000/SAMPLE_RATE)*buffSize;
long elapsed=0;
int writes=0;
currTrigger=trigger[triggerPointer];
Log.i("test","period="+String.valueOf(period));
Log.i("test","bufferTime="+String.valueOf(bufferTime));
long time=System.nanoTime();
while(play)
{
//fill up the buffer
while(pos1<buffSize)
{
output[pos1]=0;
if(currTrigger&&pos2<sample.length)
{
output[pos1]=sample[pos2];
pos2++;
}
pos1++;
}
track.write(output, 0, buffSize);
elapsed=elapsed+bufferTime;
writes++;
//time passed is more than one 16th note
if(elapsed>=period)
{
Log.i("test",String.valueOf(writes));
Log.i("test","elapsed A.T.="+String.valueOf(elapsed)+" elapsed S.T.="+String.valueOf(System.nanoTime()-time));
time=System.nanoTime();
writes=0;
elapsed=0;
triggerPointer++;
if(triggerPointer==16)
triggerPointer=0;
currTrigger=trigger[triggerPointer];
pos2=0;
}
pos1=0;
}
}
}
edited : rephrased and updated due to initial erroneous assumption that system time was used to synchronize sequenced audio :)
As for the audio playing back at twice the speed, this is a bit strange, as the write() method of AudioTrack blocks until the native layer has enqueued the next buffer. Are you sure the render loop isn't being invoked from two different places (although from your example I assume you invoke the loop from within a single thread)?
However, what is certain is that there is a time synchronization issue to address: the problem herein lies with the calculation of the buffer time you use in your example:
(1000000000/SAMPLE_RATE)*buffSize;
This always evaluates to about 4353741 ns for a buffer size of 192 samples at a sample rate of 44100 Hz, and thus disregards any cues in tempo (for instance, it will be the same at 300 BPM or at 40 BPM). In your example this doesn't have any consequences for the actual syncing per se, but I'd like to point it out, as we'll return to it shortly.
Also, nanoseconds are a nicely precise unit, but overly so; milliseconds suffice for audio operations. As such, I will continue the illustration in milliseconds.
Your calculation for the period of a 16th note at 120 BPM indeed checks out at the correct value of 125 ms. The previously mentioned calculation for the period corresponding to each buffer size is 4.3537 ms. This indicates you will iterate the buffer loop 28.7112 times before the time of a single sixteenth note passes. In your example however, you check whether the "offset" for this sixteenth note has passed at the END of the buffer iteration loop (where the period for a single buffer has already been added to the elapsed time!), by using:
elapsed>=period
This already leads to drift at the first opportunity: at that moment, "elapsed" would be at 5568 samples (192 * 29 iterations), or 126.26 ms, rather than at 5512 samples (192 * 28.7112 iterations), or 125 ms. That is a difference of 56 samples (or, speaking in time, about 1.27 ms). This wouldn't of course lead to samples playing back FASTER than expected (as you stated), but it already leads to an irregularity in playback. For the second 16th note (which would occur at the 57.4224th iteration), the drift would be 11136 - 11025 = 111 samples, or 2.517 ms (more than half your buffer time!). As such, you must perform this check WITHIN the
while(pos1<buffSize)
loop, where you increment the buffer index until the size of the buffer has been reached. In other words, you will need to increase another variable by a fraction of the buffer period PER buffer sample.
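As an illustration (not a drop-in fix), moving the check into the fill loop could look roughly like this, reusing the names from your run() method; nanosPerSample and elapsedNanos are new and would be declared once, outside the while(play) loop:
// Illustration only: advance the elapsed time per sample rather than per buffer.
double nanosPerSample = 1000000000.0 / SAMPLE_RATE;
double elapsedNanos = 0;

while (pos1 < buffSize) {
    output[pos1] = 0;
    if (currTrigger && pos2 < sample.length) {
        output[pos1] = sample[pos2++];
    }
    pos1++;

    elapsedNanos += nanosPerSample;    // time advances with every sample written
    if (elapsedNanos >= period) {      // a 16th-note boundary falls inside this buffer
        elapsedNanos -= period;        // keep the remainder instead of resetting to 0
        triggerPointer = (triggerPointer + 1) % 16;
        currTrigger = trigger[triggerPointer];
        pos2 = 0;
    }
}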
I hope the drift calculation above illustrates why I initially proposed counting time in sample iterations rather than in elapsed time (the samples DO indicate time, of course, since they are merely a translation of a unit of time into an amount of samples in a buffer, but you can use those numbers as the markers, rather than adding a fixed interval to a counter as in your render loop).
First of all, some convenience math to help you with getting these values :
// calculate the amount of samples are necessary for storing the given length of time
// ( in milliSeconds ) at the given sample rate ( in Hz )
int millisecondsToSamples( int milliSeconds, int sampleRate )
{
    // multiply before dividing so the integer division doesn't truncate sampleRate / 1000
    return ( int ) (( long ) milliSeconds * sampleRate / 1000 );
}
Or use the following calculations, which are more convenient when thinking in a musical context as you mentioned in your post. They calculate the amount of samples present in a single bar of music at the given sample rate (in Hz), tempo (in BPM) and time signature (timeSigBeatUnit being the "4" and timeSigBeatAmount being the "3" in a time signature of 3/4; most sequencers limit themselves to 4/4, but I've added the calculation to explain the logic).
int samplesPerBeat = ( int ) (( sampleRate * 60 ) / tempo );
int samplesPerBar = samplesPerBeat * timeSigBeatAmount;
int samplesPerSixteenth = ( int ) ( samplesPerBeat / 4 ); // 1/4 of a beat being a 16th
etc.
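For example, plugging in the values from your post (sampleRate = 44100 Hz, tempo = 120 BPM, 4/4 time): samplesPerBeat = (44100 * 60) / 120 = 22050, samplesPerSixteenth = 22050 / 4 = 5512 (truncated from 5512.5), and samplesPerBar = 22050 * 4 = 88200, which is the same figure used for the loop length below.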
The way you then write the timed samples into the output buffer is by keeping track of the "playback position" in your buffer callback, i.e. each time you write a buffer, you increment the playback position by the length of that buffer. Returning to a musical context: if you were "looping a single bar of 4/4 time at 120 BPM", then whenever the playback position exceeds (( sampleRate * 60 ) / 120 ) * 4 = 88200 samples, you reset it to 0 to "loop" from the beginning.
So let's assume you have two "events" of audio that occur in a sequence of a single bar of 4/4 time at 120 BPM. One event is to play on the 1st beat of a bar and lasts for a quaver (1/8 of a bar) and the other is to play on the 3rd beat of a bar and lasts for another quaver. These two "events" (which you could represent in a value object) would have the following properties, for the first event:
int start = 0; // buffer position 0 is at the 1st beat/start of the bar
int length = 11025; // 1/8 of the full bar size
int end = 11025; // start + length
and the second event:
int start = 44100; // 3rd beat (or half-way through the bar)
int length = 11025;
int end = 55125; // start + length
These value objects could have two additional properties, such as "sample", which could be the buffer containing the actual audio, and "readPointer", which would hold the last sample-buffer index the sequencer read from.
Then in the buffer write loop:
int playbackPosition = 0; // at start of bar
int maximumPlaybackPosition = 88200; // i.e. a single bar of 4/4 at 120 bpm
public void run()
{
// loop through list of "audio events" / samples
for ( CustomValueObject audioEvent : audioEventList )
{
// loop through the buffer length this cycle will write
for ( int i = 0; i < bufferSize; ++i )
{
// calculate "sequence position" from playback position and current iteration
int seqPosition = playbackPosition + i;
// sequence position within start and end range of audio event ?
if ( seqPosition >= audioEvent.start && seqPosition <= audioEvent.end )
{
// YES! write its sample content into the output buffer
output[ i ] += audioEvent.sample[ audioEvent.readPointer ];
// update the sample read pointer to the next slot (but keep in bounds)
if ( ++audioEvent.readPointer == audioEvent.length )
audioEvent.readPointer = 0;
}
        }
    }
    // update playback position once per buffer cycle (not per event) and keep it
    // within the sequencer range for looping
    playbackPosition += bufferSize;
    if ( playbackPosition > maximumPlaybackPosition )
        playbackPosition -= maximumPlaybackPosition;
}
This should give you a perfectly timed approach in writing audio. There's still some magic you have to work out when you're hitting the iteration where the sequence will loop (i.e. read the remaining unprocessed buffer length from the start of the sample for seamless looping) but I hope this gives you a general idea on a working approach.
I'm working a somewhat ambitious project to get active noise-reduction achieved on Android with earbuds or headphones on.
My objective is to record ambient noise with the Android phone's mic, invert the phase (a simple *-1 on each short value pulled from the AudioRecord?), and play that inverted waveform back through the headphones. If the latency and amplitude are close to correct, it should nullify a good amount of mechanical, structured noise in the environment.
Here's what I've got so far:
#Override
public void run()
{
Log.i("Audio", "Running Audio Thread");
AudioRecord recorder = null;
AudioTrack track = null;
short[][] buffers = new short[256][160];
int ix = 0;
/*
* Initialize buffer to hold continuously recorded audio data, start recording, and start
* playback.
*/
try
{
int N = AudioRecord.getMinBufferSize(8000,AudioFormat.CHANNEL_IN_MONO,AudioFormat.ENCODING_PCM_16BIT);
recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, 8000, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10);
//NoiseSuppressor ns = NoiseSuppressor.create(recorder.getAudioSessionId());
//ns.setEnabled(true);
track = new AudioTrack(AudioManager.STREAM_MUSIC, 8000,
AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10, AudioTrack.MODE_STREAM);
recorder.startRecording();
track.play();
/*
* Loops until something outside of this thread stops it.
* Reads the data from the recorder and writes it to the audio track for playback.
*/
while(!stopped)
{
short[] buffer = buffers[ix++ % buffers.length];
N = recorder.read(buffer,0,buffer.length);
for(int iii = 0;iii<buffer.length;iii++){
//Log.i("Data","Value: "+buffer[iii]);
buffer[iii] *= -1;
}
track.write(buffer, 0, buffer.length);
}
}
catch(Throwable x)
{
Log.w("Audio", "Error reading voice audio", x);
}
/*
* Frees the thread's resources after the loop completes so that it can be run again
*/
finally
{
recorder.stop();
recorder.release();
track.stop();
track.release();
}
}
I was momentarily excited to find that the Android API already has a NoiseSuppressor (you'll see it commented out above). I tested it and found that NoiseSuppressor wasn't doing much to null out constant tones, which leads me to believe it's actually just performing a band-pass filter at non-vocal frequencies.
So, my questions:
1) The above code takes about 250-500ms from mic record through playback in headphones. This latency sucks and it would be great to reduce it. Any suggestions there would be appreciated.
2) Regardless of how tight the latency is, my understanding is that the playback waveform WILL have phase offset from the actual ambient noise waveform. This suggests I need to execute some kind of waveform matching to calculate this offset and compensate. Thoughts on how that gets calculated?
3) When it comes to compensating for latency, what would that look like? I've got an array of shorts coming in every cycle, so what would a 30ms or 250ms latency look like?
I'm aware of the fundamental problem with this approach: the phone not being located next to the head is likely to introduce some error. But I'm hopeful that with either dynamic or fixed latency correction it may be possible to overcome it.
Thanks for any suggestions.
Even if you were able to do something about the latency, it's a difficult problem: you don't know the distance of the phone from the ear, that distance is not fixed (the user will move the phone), and you don't have a microphone at each ear (so you can't know what the wave will be at an ear until after it has already arrived there, even with zero latency).
Having said that, you might be able to do something that cancels highly periodic waveforms. All you could really do, though, is allow the user to manually adjust the time delay for each ear; since you have no microphones near the ears themselves, your code has no way of knowing whether you're making the problem better or worse.
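To make the latency question concrete, here is a rough sketch of what a fixed, user-adjustable delay on the inverted signal could look like, assuming the 8000 Hz mono format from the question; delayMs is a hypothetical value the user dials in, and delayLine/delayIndex would need to persist across buffers (this is an illustration, not a working cancellation scheme):
// Sketch: delay the inverted signal by a user-chosen number of milliseconds.
int delaySamples = (delayMs * 8000) / 1000;   // e.g. 30 ms at 8000 Hz -> 240 samples (must be > 0)
short[] delayLine = new short[delaySamples];  // circular buffer of past inverted samples
int delayIndex = 0;

for (int i = 0; i < buffer.length; i++) {
    short delayed = delayLine[delayIndex];        // inverted sample from delayMs ago
    delayLine[delayIndex] = (short) (-buffer[i]); // store the inverted current sample
    buffer[i] = delayed;                          // output the delayed, inverted sample
    delayIndex = (delayIndex + 1) % delaySamples;
}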
How to calculate the elapsed time from AudioRecord in Android?
What I am trying to do is similar to this: figure 5 (second graph). To explain further, I'm recording sound in real time, then graphing the pitch and calculating the buffered data within a certain time window.
Just calculate it: you know the size of your recorded data, and you should also know the sample rate and the number of channels. So the comment posted by Michael is correct.
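As a minimal sketch of that calculation (assuming 16-bit PCM mono at 44100 Hz, and that audioRecord is your configured AudioRecord), you can accumulate the number of shorts returned by read() and divide by the sample rate:
// Sketch: track elapsed recording time from the amount of data read.
short[] buffer = new short[4096];
long totalSamples = 0;

int read = audioRecord.read(buffer, 0, buffer.length);  // inside your recording loop
if (read > 0) {
    totalSamples += read;                                // for 16-bit mono, one short == one sample
}

double elapsedSeconds = (double) totalSamples / 44100;   // samples / sampleRate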
I followed Michael's comment and successfully calculated the audio length in seconds after recording it:
int SAMPLE_RATE_IN_HZ = 44100;
int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT; // Guaranteed to be supported by devices
int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO; // Is guaranteed to work on all devices
File audioFile = yourRecordingLogicHere();
long audioLengthInSeconds = audioFile.length() / (2 * SAMPLE_RATE_IN_HZ);
I record in mono, which is why there is one *2 multiplication fewer than in Michael's comment.