I've been trying to get my application to record the sound coming from the microphone and play it back in (approximately) real time, but without success.
I'm using the AudioRecord and AudioTrack classes for recording and playback, respectively. I've tried different approaches: recording the incoming sound and writing it to a file worked fine, and playing back sound from that file afterwards with AudioTrack worked fine too. The problem is when I try to play the sound in real time, instead of reading from a file after it has been written.
Here is the code:
// variables
private int audioSource = MediaRecorder.AudioSource.MIC;
private int samplingRate = 44100; // in Hz
private int channelConfig = AudioFormat.CHANNEL_CONFIGURATION_MONO;
private int audioFormat = AudioFormat.ENCODING_PCM_16BIT;
private int bufferSize = AudioRecord.getMinBufferSize(samplingRate, channelConfig, audioFormat);
private int sampleNumBits = 16;
private int numChannels = 1;
// …
byte[] data = new byte[bufferSize]; // capture buffer used below

AudioRecord recorder = new AudioRecord(audioSource, samplingRate, channelConfig, audioFormat, bufferSize);
recorder.startRecording();
isRecording = true;

AudioTrack audioPlayer = new AudioTrack(AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_CONFIGURATION_MONO,
        AudioFormat.ENCODING_PCM_16BIT, bufferSize, AudioTrack.MODE_STREAM);

if (audioPlayer.getPlayState() != AudioTrack.PLAYSTATE_PLAYING)
    audioPlayer.play();

// capture data and write it straight to the player
int readBytes = 0, writtenBytes = 0;
do {
    readBytes = recorder.read(data, 0, bufferSize);
    if (AudioRecord.ERROR_INVALID_OPERATION != readBytes) {
        writtenBytes += audioPlayer.write(data, 0, readBytes);
    }
} while (isRecording);
A java.lang.IllegalStateException is thrown, caused by "play() called on an uninitialized AudioTrack".
However, if I change the AudioTrack initialization to use, for example, a sampling rate of 8000 Hz and an 8-bit sample format (instead of 16-bit), it no longer throws the exception and the application runs, although it produces horrible noise.
When I play from a file with AudioTrack, there is no problem initializing it: I tried 44100 Hz and 16 bits and it worked properly, producing the correct sound.
Any help?
All native Android audio is encoded. You can only play out PCM formats in real time, or use a special streaming codec, which I don't think is trivial on Android.
The point is that if you want to record and play audio simultaneously, you have to create your own audio buffer and store raw PCM audio samples in it (I'm not sure whether you're thinking "duh!" or whether this is all over your head, so I'll try to be clear without chewing your gum for you).
PCM is a digital representation of an analog signal in which your audio samples are a set of "snapshots" of the original acoustic wave. Because all kinds of clever mathematicians and engineers saw the potential in reducing the number of bits used to represent this data, they came up with all sorts of encoders. The encoded (compressed) signal is represented very differently from the raw PCM signal and has to be decoded (en-cod-er + dec-oder = codec). Unless you're using special algorithms and media streaming codecs, it's impossible to play back an encoded signal the way you're trying to, because it's not encoded sample by sample but rather frame by frame, and you need the whole frame of samples, if not the complete signal, to decode that frame.
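To make "raw PCM samples" concrete, here is a small illustrative snippet (not from the question; the 440 Hz frequency and half-amplitude are arbitrary choices) that synthesizes one second of a tone as 16-bit samples, which is exactly the kind of data AudioRecord hands you and AudioTrack expects:

int sampleRate = 44100;
short[] tone = new short[sampleRate];                 // one second, mono, 16-bit
for (int i = 0; i < tone.length; i++) {
    double t = (double) i / sampleRate;               // time of this "snapshot"
    tone[i] = (short) (Math.sin(2 * Math.PI * 440 * t) * Short.MAX_VALUE * 0.5);
}
// 'tone' is now raw PCM: each entry is one sample of the waveform's amplitude.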
The way to do it is to manually store the audio samples coming from the microphone buffer and manually feed them to the output buffer. You'll have to do some coding for that, but I believe there are some open-source apps whose source you can take a peek at (unless you're planning to sell your app later on, of course, but that's a whole different discussion).
If you're developing for Android 2.3 or later and are not too scared of programming in native code, you can try using OpenSL ES. The Android-specific features of OpenSL ES are listed here. This platform allows somewhat more flexible audio manipulation, and you might find just what you need if your app relies heavily on audio processing.
A java.lang.IllegalStateException is thrown, caused by "play() called on an uninitialized AudioTrack".
That is because the buffer size is too small. I tried bufferSize += 2048; and it worked after that.
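If it helps, here is a sketch of one way to size the buffer more defensively (my own illustration, not the answerer's code): take the larger of the record and playback minimums, plus the extra headroom suggested above. CHANNEL_IN_MONO and CHANNEL_OUT_MONO are the non-deprecated equivalents of CHANNEL_CONFIGURATION_MONO.

int recordMin = AudioRecord.getMinBufferSize(44100,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
int trackMin = AudioTrack.getMinBufferSize(44100,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
// Use the larger of the two minimums, plus the headroom this answer found to work.
int bufferSize = Math.max(recordMin, trackMin) + 2048;

AudioTrack audioPlayer = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
        bufferSize, AudioTrack.MODE_STREAM);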
I had a similar problem and I solved it by adding this permission to the manifest file:
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS"/>
Make sure that your data variable is large enough for the samplingRate.
For example, if you use a samplingRate of 44100, your data byte array's length should be 44101 or more.
I've done a lot of research and could not find much info about this. I want to record audio samples using Android's AudioRecord, but I don't know how to do it in 32-bit format. I'm also using AudioTrack to "monitor" the mic and listen to the audio while recording, but if I set AudioRecord and AudioTrack to ENCODING_PCM_FLOAT the audio is literally silent; I can't hear anything.
So the question is: how do I record and listen at the same time with AudioFormat.ENCODING_PCM_FLOAT and 32-bit samples, so I can get the best quality audio?
This is a quick look at how I'm initializing AudioRecord using the AXET Audio Library:
final AudioRecord recorder = Sound.createAudioRecorder(context, sampleRate, ss, 0);
AudioManager audioManager = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);
final int maxJitter = AudioTrack.getMinBufferSize(sampleRate,AudioFormat.CHANNEL_OUT_MONO,AudioFormat.ENCODING_PCM_16BIT);
audioTrack = new AudioTrack(AudioManager.MODE_IN_COMMUNICATION,sampleRate,AudioFormat.CHANNEL_OUT_MONO,
AudioFormat.ENCODING_PCM_FLOAT,maxJitter, AudioTrack.MODE_STREAM);
audioManager.setSpeakerphoneOn(false);
audioManager.setMode(AudioManager.MODE_IN_COMMUNICATION);
Inside the Thread I just write the buffer to the audioTrack:
audioTrack.write(buffer,0,bufferSize);
My buffer is a short[] array.
Finally, I start the thread and play the track. As I said, with PCM_16BIT it works great, but with PCM_FLOAT it doesn't:
thread.start();
audioTrack.play();
So if anybody has experience recording in 32-bit, I would appreciate knowing how it's done. Thanks.
Did you change the short array to a float array when using ENCODING_PCM_FLOAT? When using 32-bit float encoding there is also a different method for reading data from AudioRecord:
read(float[] audioData, int offsetInFloats, int sizeInFloats, int readMode)
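For illustration, a minimal sketch of what a float-based monitor loop could look like. This is not the asker's code: sampleRate and the monitoring flag are placeholders, float recording with AudioRecord needs API 23, and float AudioTrack output needs API 21.

int minRec = AudioRecord.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_FLOAT);
int minPlay = AudioTrack.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_FLOAT);

AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
        sampleRate, AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_FLOAT, minRec);
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
        sampleRate, AudioFormat.CHANNEL_OUT_MONO,
        AudioFormat.ENCODING_PCM_FLOAT, minPlay, AudioTrack.MODE_STREAM);

float[] buffer = new float[minRec / 4];   // 4 bytes per float sample (mono)
recorder.startRecording();
track.play();
while (monitoring) {                      // 'monitoring' is a flag you control
    int read = recorder.read(buffer, 0, buffer.length, AudioRecord.READ_BLOCKING);
    if (read > 0) {
        track.write(buffer, 0, read, AudioTrack.WRITE_BLOCKING);
    }
}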
I'm currently trying to play back audio using AudioTrack. The audio is received over the network, and the application continuously reads data and adds it to an internal buffer. A separate thread consumes that data and uses AudioTrack for playback.
Problems:
Audio playback fluctuates (it feels like audio drops at regular intervals), making it unclear.
The playback speed is too high or too low, which makes it sound unrealistic.
To avoid network latency and other factors, I made the application wait until it has read enough data and then play it back at the end.
This makes the audio play really fast. Here is a basic sample of the logic I use:
sampleRate = AudioTrack.getNativeOutputSampleRate(AudioManager.STREAM_MUSIC);
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
AudioFormat.CHANNEL_OUT_STEREO,
AudioFormat.ENCODING_PCM_16BIT,
AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT),
AudioTrack.MODE_STREAM);
audioTrack.play();
short shortBuffer[] = new short[AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT)];
while (!stopRequested) {
    readData(shortBuffer);
    audioTrack.write(shortBuffer, 0, shortBuffer.length, AudioTrack.WRITE_BLOCKING);
}
Is it correct to say that the Android AudioTrack class doesn't have built-in functionality to control audio playback based on environment conditions? If so, are there better libraries available with a simpler way to handle audio playback?
The first issue I see is an arbitrary sampling rate.
AudioTrack.getNativeOutputSampleRate returns the sampling rate used by the sound system. It may be 44100, 48000, 96000, 192000 or whatever. But it looks like you have audio data from an independent source, which produces that data at one exact sampling rate.
Let's say the audio from the source is sampled at 44100 samples per second. If you start playing it at 96000, it will be sped up and higher pitched.
So use the sampling rate, along with the number of channels, sample format, etc., as given by the source, rather than relying on system defaults.
The second issue: are you sure the readData procedure will always be fast enough to fill the buffer, however small the buffer is, and return before the previously written data has finished playing?
You have created the AudioTrack with AudioTrack.getMinBufferSize passed as the bufferSizeInBytes parameter.
The getMinBufferSize function returns the minimum possible buffer size that can be used with these parameters. Let's say it returned a size corresponding to 10 ms of audio (for example, at 44100 Hz, stereo, 16-bit, one second is 44100 × 2 channels × 2 bytes = 176400 bytes, so a 10 ms buffer is 1764 bytes).
That means the new data has to be prepared within that time interval; i.e. the time between one write call returning control and the next write being performed must be less than the duration the buffer holds.
So if the readData function is for some reason delayed longer than that interval, playback will pause for that long and you'll hear small gaps.
The reasons readData may be delayed can vary: if it reads data from a file, it may be stuck waiting on I/O; if it allocates Java objects, it may run into garbage-collector pauses; if it uses some kind of decoder or another kind of audio source with its own buffering, it may periodically stall while refilling that buffer.
In any case, unless you're building some kind of real-time synthesizer that must react to user input as quickly as possible, use a reasonably large buffer size, and never less than what getMinBufferSize returns. I.e.:
sampleRate = 44100; // sampling rate of the source
int bufSize = sampleRate * 4; // 1 second of audio; 4 is the frame size: 2 channels * 2 bytes per sample
bufSize = Math.max(bufSize, AudioTrack.getMinBufferSize(sampleRate, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT)); // never less than getMinBufferSize returns
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
        AudioFormat.CHANNEL_OUT_STEREO,
        AudioFormat.ENCODING_PCM_16BIT,
        bufSize,
        AudioTrack.MODE_STREAM);
As user @pskink said:
Most likely your sampleRate (or any other parameter passed to the
AudioTrack constructor) is invalid.
So I would start by checking what value the sample rate is actually being set to.
For reference, you can also set the speed of an AudioTrack by calling its setPlaybackParams method:
public void setPlaybackParams(PlaybackParams params)
If you check the AudioTrack docs, you can see the PlaybackParams docs and set the speed and pitch of the output audio; that object is then passed to setPlaybackParams on your AudioTrack.
However, it is unlikely that you will need this if your only issue is the sampleRate passed to the original constructor (since we cannot see where the sampleRate variable comes from).
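For illustration only, a minimal sketch of applying PlaybackParams (API 23+); the 1.0f values are simply the defaults and are placeholders here:

PlaybackParams params = new PlaybackParams();
params.setSpeed(1.0f);   // 1.0f = normal speed; other values resample the output
params.setPitch(1.0f);   // 1.0f = original pitch
audioTrack.setPlaybackParams(params);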
I'm trying to record some audio from the microphone using AudioRecord.
The recording works, but the volume is way too loud and I'm getting horrible clipping.
I tried to use AutomaticGainControl, but it is not available on my device.
Is there any other way to lower the volume either automatically or manually?
This is the code I'm using:
sampleRate = 44100
channel = AudioFormat.CHANNEL_IN_MONO
encoding = AudioFormat.ENCODING_PCM_16BIT
audioRecord = new AudioRecord(MediaRecorder.AudioSource.DEFAULT, // also tried VOICE_RECOGNITION
        sampleRate, channel, encoding,
        bufferSize)
audioRecord.startRecording()
Turns out what I was hearing wasn't clipping caused by volume, but by the wrong byte order. The clip was recorded little-endian and I was playing it back as big-endian. Because of that, low-volume samples were merely amplified and otherwise unaffected, since their most significant bits were all 0; but when the volume was higher the MSBs were not 0, and the overall values got corrupted into white noise.
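As an illustration of that kind of fix (a hypothetical helper, not the code from this question, assuming the raw data arrives as a byte[] of 16-bit little-endian PCM): decode the bytes with an explicit byte order before handing them to AudioTrack.write(short[], ...).

// Decode 16-bit little-endian PCM bytes into shorts regardless of platform default order.
static short[] toShortsLittleEndian(byte[] pcm, int length) {
    short[] out = new short[length / 2];
    java.nio.ByteBuffer.wrap(pcm, 0, length)
            .order(java.nio.ByteOrder.LITTLE_ENDIAN)
            .asShortBuffer()
            .get(out);
    return out;
}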
I use the following code in a Thread to capture raw audio samples from the microphone and play it back through the speaker.
public void run() {
    short[] lin = new short[SIZE_OF_RECORD_ARRAY];
    int num = 0;
    // am = (AudioManager) this.getSystemService(Context.AUDIO_SERVICE); // -> MOVED THESE TO init()
    // am.setMode(AudioManager.MODE_IN_COMMUNICATION);
    record.startRecording();
    track.play();
    while (passThroughMode) {
    // while (!isInterrupted()) {
        num = record.read(lin, 0, SIZE_OF_RECORD_ARRAY);
        for (i = 0; i < lin.length; i++)
            lin[i] *= WAV_SAMPLE_MULTIPLICATION_FACTOR;
        track.write(lin, 0, num);
    }
    // /*
    record.stop();
    track.stop();
    record.release();
    track.release();
    // */
}
where record is an AudioRecord and track is an AudioTrack. I need to know in detail (and in a simplified way, if possible) how AudioRecord stores PCM data and how AudioTrack plays PCM data. This is how I have understood it so far:
As the while() loop runs continuously, record obtains SIZE_OF_RECORD_ARRAY samples (1024 for now), as shown in the figure. The samples are saved contiguously in the lin[] array of shorts (16-bit shorts, as I am using 16-bit PCM encoding). This is done by record.read(). Then track.write() passes these samples to the audio output, which the hardware plays. Is this correct, or am I missing something here?
As for how the samples are laid out in memory: they're just arrays of linear approximations to a sound wave, taken at discrete times (like your figure shows). In the case of stereo, the samples are interleaved (LRLRLRLR...).
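As a small illustration of that interleaving (a hypothetical helper, not part of the question's code), splitting an interleaved stereo buffer into its two channels looks like this:

// Deinterleave an LRLR... stereo buffer into separate left/right channel arrays.
static void deinterleave(short[] interleaved, short[] left, short[] right) {
    for (int i = 0, frame = 0; i + 1 < interleaved.length; i += 2, frame++) {
        left[frame]  = interleaved[i];      // even indices = left channel
        right[frame] = interleaved[i + 1];  // odd indices  = right channel
    }
}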
When it comes to the path the audio takes, you're essentially right, although there are a few more steps involved:
Writing data to your Java AudioTrack causes it to make a JNI (Java Native Interface) call to a native helper class, which in turn calls the native AudioTrack class.
The AudioTracks are owned by AudioFlinger, which periodically takes data from all the AudioTracks on a given output thread (after they have been mixed by the AudioMixer) and writes it to the audio HAL output stream class.
From there the data goes to the user-space ALSA library, and through a couple of intermediate steps to the kernel-space PCM driver. From there it typically passes through some kind of DSP that applies various acoustic compensation filters, and eventually makes its way to the hardware codec, which controls the speaker DAC and amplifiers.
When recording from the internal microphone(s) you'd have more or less the same steps, except that they'd be done in the opposite order.
Note that some of these steps (essentially everything from the audio HAL and below) are platform-specific, and therefore might differ between platforms from different vendors (and even different platforms from the same vendor).
I'm trying to develop an application like iRig for Android, so the first step is to capture the mic input and play it back at the same time.
I have it working, but the problem is that I get some latency that makes this unusable, and if I start processing the buffer I'm afraid it will become totally unusable.
I use AudioRecord and AudioTrack like this:
new Thread(new Runnable() {
    public void run() {
        while (mRunning) {
            mRecorder.read(mBuffer, 0, mBufferSize);
            // TODO: apply filters to the buffer here and then play it modified
            mPlayer.write(mBuffer, 0, mBufferSize);
            // Log.v("MY AMP","ARA");
        }
    }
});
And the initialization is done this way:
// ==================== INITIALIZE ========================= //
public void initialize() {
    mBufferSize = AudioRecord.getMinBufferSize(mHz,
            AudioFormat.CHANNEL_CONFIGURATION_MONO,
            AudioFormat.ENCODING_PCM_16BIT);
    mBufferSize2 = AudioTrack.getMinBufferSize(mHz,
            AudioFormat.CHANNEL_CONFIGURATION_MONO,
            AudioFormat.ENCODING_PCM_16BIT);
    mBuffer = new byte[mBufferSize];
    Log.v("MY AMP", "Buffer size:" + mBufferSize);
    mRecorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
            mHz,
            AudioFormat.CHANNEL_CONFIGURATION_MONO,
            AudioFormat.ENCODING_PCM_16BIT,
            mBufferSize);
    mPlayer = new AudioTrack(AudioManager.STREAM_MUSIC,
            mHz,
            AudioFormat.CHANNEL_CONFIGURATION_MONO,
            AudioFormat.ENCODING_PCM_16BIT,
            mBufferSize2,
            AudioTrack.MODE_STREAM);
}
Do you know how to get a faster response?
Thanks!
Android's AudioTrack/AudioRecord classes have high latency due to their minimum buffer sizes.
The reason for those buffer sizes, according to Google, is to minimize dropouts when GC pauses occur (which is a wrong decision in my opinion; you can optimize your own memory management).
What you want to do is use OpenSL, which is available from 2.3. It contains native APIs for streaming audio.
Here's some docs:
http://mobilepearls.com/labs/native-android-api/opensles/index.html
Just a thought, but shouldn't you be reading < mBufferSize?
As mSparks pointed out, streaming should be done using a smaller read size: you don't need to read the full buffer to stream data!
int read = mRecorder.read(mBuffer, 0, 256); /* or any other magic number */
if (read > 0) {
    mPlayer.write(mBuffer, 0, read);
}
This will drastically reduce your latency. If mHz is 44100 and you are in MONO configuration, with 256 samples your latency will be no less than 1000 * 256 / 44100 milliseconds ≈ 5.8 ms.
(256 / 44100 is the conversion from samples to seconds, so multiplying by 1000 gives you milliseconds.)
The remaining problem is the internal implementation of the player; you don't have control over that from Java. Hope this helps someone :)
My first instinct was to suggest initializing the AudioTrack in static mode rather than streaming mode, since static mode has notably lower latency. However, static mode is more appropriate for short sounds that fit entirely in memory, rather than sound you are capturing from elsewhere. But just as a wild guess: what if you set the AudioTrack to static mode and feed it discrete chunks of your input audio?
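To make that guess concrete, here is a minimal sketch of basic static-mode usage (a hypothetical helper with an arbitrary 44100 Hz mono 16-bit format); whether repeatedly feeding freshly captured chunks this way is practical is exactly the open question:

// MODE_STATIC: the entire clip is written up front, then played from memory.
static void playClipStatic(byte[] clipData) {
    AudioTrack staticTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100,
            AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
            clipData.length, AudioTrack.MODE_STATIC);
    staticTrack.write(clipData, 0, clipData.length);  // must complete before play()
    staticTrack.play();
    // staticTrack.reloadStaticData() can rewind and replay the same buffer later.
}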
If you want tighter control over audio, I'd recommend taking a look at OpenSL ES for Android. The learning curve will be a bit steeper, but you get much more fine-grained control and lower latency.