Android AudioTrack latency with playback

I am trying to play raw sound data in Android using the AudioTrack class. I am using the write method, but I noticed that there is latency between the moment write returns and the moment the sound is actually played. To keep it simple, let us pipe an AudioRecord into the AudioTrack, as in the following pseudocode:
// init AudioRecord (capture) and AudioTrack (playback) with matching format
audioRecord.startRecording();
audioTrack.play();
while (true) {
    byte[] buffer = new byte[1000];
    int read = audioRecord.read(buffer, 0, buffer.length); // blocking read from the mic
    audioTrack.write(buffer, 0, read);                      // blocking write to the output
}
I expect a latency of read / sampleRate seconds, but the sound is actually played about 0.5 seconds later than that. I really need the audio to be played with minimum latency, so does anyone have an explanation of what is going on? Is there any available solution, or should I accept this as a hardware limitation?

I'm assuming your goal is to come up with some interactive audio solution (that is, where sound is played in response to some user action), because in this scenario low latency really matters.
On Android, the lowest latency is achieved through the OpenSL ES API, which is available to native (C/C++) code via the NDK. The only Java-side mechanism that can achieve low latency is the SoundPool class, but it is limited in the kinds of sounds it can play.
For more information, see the page on high-performance audio, and also check out this SO answer: Low-latency audio playback on Android
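Since SoundPool is mentioned, here is a minimal Java sketch of how short, pre-loaded sounds can be triggered with low latency; the resource name R.raw.click is a placeholder assumption for illustration:

import android.media.AudioAttributes;
import android.media.SoundPool;

// Build a SoundPool for short, latency-sensitive sounds (API 21+)
SoundPool soundPool = new SoundPool.Builder()
        .setMaxStreams(4)
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_GAME)
                .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
                .build())
        .build();

// Load the clip up front; playing before loading completes will be silent
int clickId = soundPool.load(context, R.raw.click, 1); // R.raw.click is a placeholder resource

// Later, in response to a user action:
soundPool.play(clickId, 1.0f, 1.0f, 1, 0, 1.0f); // left/right volume, priority, no loop, normal rate

Pre-loading is what makes this path fast; SoundPool is only suitable for short clips that fit in memory, not for streaming raw PCM.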

Related

What is the amplification factor required for android's media recorder to match the output of iOS's AVAudioRecorder?

I have a cross-platform (iOS and Android) app in which I record audio clips and send them to a server for some machine learning operations. In my iOS app I use AVAudioRecorder for recording, and in the Android app I use MediaRecorder. On the mobile side I initially use the m4a format because of size constraints; after the clips reach the server, I convert them to wav format before using them in the ML operations.
My problem is that on iOS, AVAudioRecorder applies some amplification to the raw audio data by default before the developer gets access to it, whereas on Android, MediaRecorder applies no such amplification. In other words, on iOS I never get the raw audio stream from the microphone, while on Android I always get only the raw microphone stream. The distinction is clearly visible if you record the same audio source on an iPhone and an Android phone side by side and then import both recordings into Audacity for visual comparison. I have attached a sample screenshot below.
In the image, the first track is the Android recording and the second is the iOS recording. Listening through headphones I can only vaguely tell them apart, but when the data points are visualized the difference is clear. These discrepancies are bad for the ML operations.
Clearly there is some amplification factor involved on the iPhone that I would like to reproduce on Android as well.
Is anyone aware of the amplification factor? OR are there any other possible alternatives?
It's quite possible that the difference is the effect of Automatic Gain Control (AGC).
You can disable this on iOS by setting your app's AVAudioSession mode to AVAudioSessionModeMeasurement, which you do once in your application, usually at startup. This disables a great deal of input signal processing.
Reading your problem description, though, you might be better off enabling AGC on Android instead.
If neither of these yields results, you might want to gain-scale both signals so they are just below clipping.
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setMode(AVAudioSessionModeMeasurement)
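On the Android side, a rough sketch of enabling the platform AGC effect might look like the following. It assumes you capture with an existing AudioRecord instance (the effect attaches to an audio session, so it does not apply to MediaRecorder output), and AGC availability varies by device:

import android.media.audiofx.AutomaticGainControl;

// Attach the platform AGC effect to an existing AudioRecord's session, if the device supports it
if (AutomaticGainControl.isAvailable()) {
    AutomaticGainControl agc = AutomaticGainControl.create(audioRecord.getAudioSessionId());
    if (agc != null) {
        agc.setEnabled(true); // apply automatic gain to the captured signal
    }
}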

How to record the microphone to a more compressed format during a WebRTC call on Android?

I have an app that makes calls using WebRTC. During a call I need to record the microphone. WebRTC has a WebRTCAudioRecord object for recording audio, but the resulting file is very large (16-bit PCM). I want to record to a smaller size.
I've tried MediaRecorder, but it doesn't work because WebRTC is already recording and MediaRecorder cannot access the microphone at the same time during the call.
Has anyone done this, or have any idea that could help me?
WebRTC is considered a comparatively strong pre-processing tool for audio and video. Its native development includes fully optimized C and C++ classes designed to maintain good speech quality and intelligibility.
Reference link: https://github.com/jitsi/webrtc/tree/master/examples
As the problem states:
I want to record but to a smaller size. I've tried MediaRecorder and it doesn't work because WebRTC is recording and MediaRecorder does not have permission to record while calling.
First of all, to reduce the size of your recorded data (audio bytes), you should look at speech codecs, which shrink the recorded data while keeping sound quality at an acceptable level. Well-known speech codecs include:
OPUS
SPEEX
G.711 (G-series speech codecs)
As far as the size of the raw audio data is concerned, it basically depends on the sample rate and the duration (time) of each recorded chunk or audio packet.
Suppose time = 40 ms at a sample rate of 8000 Hz with 16-bit mono samples ---then---> recorded data = 640 bytes (or 320 shorts)
Size of recorded data is **directly proportional** to both Time and Sample rate.
Sample Rate = 8000 or 16000 etc. (greater the sample rate, greater would be the size)
To see more detail, visit: fundamentals of audio data representation. Note that WebRTC mainly processes audio in 10 ms chunks for pre-processing, so at 8000 Hz the packet size comes down to 160 bytes.
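As a rough illustration of that arithmetic, here is a small Java sketch (the specific sample rate and chunk length are just example values):

// Size of one uncompressed PCM chunk = sampleRate * (durationMs / 1000) * bytesPerSample * channels
int sampleRate = 8000;      // Hz (example value)
int durationMs = 40;        // chunk length in milliseconds (example value)
int bytesPerSample = 2;     // 16-bit PCM
int channels = 1;           // mono
int chunkBytes = sampleRate * durationMs / 1000 * bytesPerSample * channels;
System.out.println(chunkBytes + " bytes per chunk"); // prints "640 bytes per chunk"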
Secondly, using multiple recorder instances at the same time is practically impossible: since WebRTC is already recording from the microphone, a MediaRecorder instance will not be able to do anything, as described in this answer: audio-record-multiple-audio-at-a-time. WebRTC manages audio bytes through the following steps:
1. Push input PCM data into `ProcessCaptureStream` to process in place.
2. Get the processed PCM data from `ProcessCaptureStream` and send to far-end.
3. The far end pushed the received data into `ProcessRenderStream`.
I maintain a complete tutorial on audio processing with WebRTC; for more details see Android-Audio-Processing-Using-Webrtc.
There are two parts to the solution:
Get the raw PCM audio frames from WebRTC
Save them to a local file in a compressed format so they can be played back later
For the first part you have to attach a SamplesReadyCallback while creating the audio device module, by calling the setSamplesReadyCallback method of JavaAudioDeviceModule. This callback will give you the raw audio frames captured from the mic by WebRTC's AudioRecord.
For the second part you have to encode the raw frames and write into a file. Check out this sample from google on how to do it - https://android.googlesource.com/platform/frameworks/base/+/master/packages/SystemUI/src/com/android/systemui/screenrecord/ScreenInternalAudioRecorder.java#234
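For the first part, a minimal sketch of attaching the callback might look like the following; it assumes the org.webrtc Android SDK, and writeToEncoder is a hypothetical helper standing in for whatever encoder/file writer you use for the second part:

import android.content.Context;
import org.webrtc.audio.AudioDeviceModule;
import org.webrtc.audio.JavaAudioDeviceModule;

// Build the audio device module with a callback that receives every raw PCM frame
// captured from the microphone by WebRTC's internal AudioRecord.
AudioDeviceModule adm = JavaAudioDeviceModule.builder(appContext)
        .setSamplesReadyCallback(samples -> {
            byte[] pcm = samples.getData();            // raw 16-bit PCM bytes
            int sampleRate = samples.getSampleRate();
            int channels = samples.getChannelCount();
            writeToEncoder(pcm, sampleRate, channels); // hypothetical helper that feeds an encoder
        })
        .createAudioDeviceModule();

// Then hand the module to the PeerConnectionFactory so WebRTC uses it for capture, e.g.:
// PeerConnectionFactory factory = PeerConnectionFactory.builder()
//         .setAudioDeviceModule(adm)
//         .createPeerConnectionFactory();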

What is the best AudioSource setting for calls?

I am using the following code in my messenger calling app :
this.audioRecord = new AudioRecord(
MediaRecorder.AudioSource.DEFAULT,
Constants.SAMPLE_RATE,
AudioFormat.CHANNEL_IN_MONO,
AudioFormat.ENCODING_PCM_16BIT,
Constants.BUFFER_SIZE_RECORDING);
Is this the best setting for audio in calls? I have a couple of issues with echoes. I tried AudioSource.MIC and VOICE_COMMUNICATION, but they perform worse. I wonder if changing any of the other parameters would improve the audio quality? Any ideas about the best values for a calling app? Also, I often don't hear any audio at all on a Nexus 6 or Pixel 2.
Audio on Android is always a tough problem because manufacturers put in different audio chips with different capabilities in all phones.
That being said VOICE_COMMUNICATION should be your best bet. It is "Microphone audio source tuned for voice communications such as VoIP. It will, for instance, take advantage of echo cancellation or automatic gain control if available."
So it should already use AcousticEchoCanceler and NoiseSuppressor to get rid of echoes and other disturbing noises. In the end, it comes down to your use case and whether you want filtered or unfiltered audio.
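If you want to attach these effects explicitly rather than rely on the audio source tuning, a sketch along these lines is possible; it assumes your existing audioRecord instance, and both effects are device-dependent and may be unavailable:

import android.media.audiofx.AcousticEchoCanceler;
import android.media.audiofx.NoiseSuppressor;

// Explicitly attach echo cancellation and noise suppression to the capture session, if supported
int sessionId = audioRecord.getAudioSessionId();

if (AcousticEchoCanceler.isAvailable()) {
    AcousticEchoCanceler aec = AcousticEchoCanceler.create(sessionId);
    if (aec != null) aec.setEnabled(true);
}
if (NoiseSuppressor.isAvailable()) {
    NoiseSuppressor ns = NoiseSuppressor.create(sessionId);
    if (ns != null) ns.setEnabled(true);
}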
You can also try to increase the sampling rate (Constants.SAMPLE_RATE of 48000 should be best, since that is the native sampling rate of most modern phones) and the bit depth (ENCODING_PCM_16BIT to ENCODING_PCM_FLOAT) to get a better signal. Note that supported sampling rates differ from phone to phone; to find out what your phone supports, adapt the solution from the audio sampling rates discussion. More information about sampling rates is covered in the Sampling Audio docs.
As for the problem that you often don't hear anything: that can happen if your gain is too low (possible with AudioSource.MIC) or if your recorder is not ready yet (I'm making an educated guess here since I don't know your code).

Gap buffering in Android devices

I'm building a buffering engine to play streams from a URL. I need to buffer both MP3 and AAC (on devices that support it), so I can't pass the URL directly to MediaPlayer. I tried this approach: I have two synchronized threads, one that keeps creating files from buffered data and another that plays the files already created. The problem is that when MediaPlayer switches from one file to the next, there is a small gap. How can I remove it? It is very annoying.
Maybe my method is wrong; if so, can anyone suggest a working approach that doesn't chop the sound?
Thank you very much in advance.
It seems you are trying to implement gapless playback, right?
To do this, you first need to define the level of gapless playback you want to achieve: should it work across file formats / codecs, and across audio attributes such as sample rate, number of channels, etc.?
With your approach, you will surely see gaps between streams that differ in file format, compression, or audio attributes.
To achieve true gapless playback at the application level (my approach), you need to do the following:
Implement a custom stack that takes the input files, decodes them, and produces PCM samples. This stack will contain parsers (MP3, AAC) and decoders (MP3, AAC..).
Pass the PCM samples through a resampler so that all streams end up with the same sample rate.
Add buffering modules at the input (file) and output (resampled PCM data).
Use the AudioTrack class of the Android SDK for playout (a sketch follows below).
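As a minimal sketch of that last step, feeding decoded and resampled PCM into an AudioTrack in streaming mode might look like this; the sample rate, channel config, and the readDecodedPcm source are assumptions for illustration:

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

int sampleRate = 44100; // the common rate everything was resampled to (example value)
int channelConfig = AudioFormat.CHANNEL_OUT_STEREO;
int encoding = AudioFormat.ENCODING_PCM_16BIT;

int bufferSize = AudioTrack.getMinBufferSize(sampleRate, channelConfig, encoding);
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
        channelConfig, encoding, bufferSize, AudioTrack.MODE_STREAM);
track.play();

// Continuously pull resampled PCM from the buffering module and push it to the track.
// readDecodedPcm is a hypothetical method of your decode/resample pipeline.
byte[] pcm = new byte[bufferSize];
int n;
while ((n = readDecodedPcm(pcm)) > 0) {
    track.write(pcm, 0, n); // blocking write keeps playback continuous, hence gapless
}
track.stop();
track.release();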
If you stick to one file format, codec, and set of audio attributes, then at the application level you can concatenate all the files in the playlist and hand the result to MediaPlayer for playback. (Since audio streams are relatively small, this solution can be practical; the only obstacle would be differing stream attributes. If the audio OMX components within the Android multimedia stack support dynamic reconfiguration, this should be no issue at all.)
Shash

Improve Android Audio Recording quality?

Is there any way to record audio in high quality?
And how can I detect that the user is saying something? In the built-in audio recording application you can see such an indicator (I don't know the right name for it; an input level meter, perhaps).
At the moment, a big reason for poor-quality audio recordings on Android is the codec used by the MediaRecorder class (the AMR-NB codec). However, you can get access to uncompressed audio via the AudioRecord class and record it into a file directly.
The Rehearsal Assistant app does this to save uncompressed audio into a WAV file - take a look at the RehearsalAudioRecord class source code.
The RehearsalAudioRecord class also provides a getMaxAmplitude method, which you can use to detect the maximum audio level since the last time you called the method (MediaRecorder also provides this method).
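If you record with AudioRecord yourself, a simple level indicator can be derived from each buffer you read. Here is a rough sketch, assuming 16-bit PCM and an already-configured, recording audioRecord instance:

// Read a buffer of 16-bit PCM samples and compute the peak amplitude for a level meter
short[] buffer = new short[1024];
int read = audioRecord.read(buffer, 0, buffer.length);

int maxAmplitude = 0;
for (int i = 0; i < read; i++) {
    int abs = Math.abs(buffer[i]);
    if (abs > maxAmplitude) maxAmplitude = abs;
}
// maxAmplitude is in the range 0..32767; map it to your indicator widget,
// e.g. a ProgressBar: levelBar.setProgress(maxAmplitude * 100 / 32767);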
For recording and monitoring, you can also use the built-in sound recorder activity.
Here's a snippet of code:
Intent recordIntent = new Intent(MediaStore.Audio.Media.RECORD_SOUND_ACTION);
startActivityForResult(recordIntent, REQUEST_CODE_RECORD);
For a perfect working example of how to record audio which includes an input monitor, download the open source Ringdroid project: https://github.com/google/ringdroid
Look at the screenshots and you'll see the monitor.
For making the audio higher quality, you'd need a better mic. The built-in mic can only capture so much (and it's not that good). Again, look at the Ringdroid project and glean some info from there. At that point you could implement some normalization and amplification routines to improve the sound.
A simple answer:
For the sample rate, 48000 Hz sounds almost the same as 16000 Hz in terms of perceived quality.
For the bit rate, 96 kbps is much better than 16 kbps.
You can try stereo (channelCount = 2), but it makes little difference.
So, on Android phones, simply set a higher audio bit rate and you will get better quality, as sketched below.
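A minimal sketch of such a MediaRecorder configuration might look like this; the output path and the specific rate values are example assumptions:

import android.media.MediaRecorder;

MediaRecorder recorder = new MediaRecorder();
recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
recorder.setAudioSamplingRate(44100);      // Hz
recorder.setAudioEncodingBitRate(96000);   // 96 kbps, the main knob for quality here
recorder.setAudioChannels(1);              // mono; 2 for stereo
recorder.setOutputFile("/sdcard/recording.m4a"); // example path
recorder.prepare();                        // note: prepare() throws IOException
recorder.start();
// ... later: recorder.stop(); recorder.release();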
