Android Superpowered SDK Record and Playback simultaneously

My goal is to play a local file while recording the device's microphone input with low latency.
I chose the Superpowered library because, according to its documentation, it provides low-latency audio.
I've created the player using SuperpoweredAdvancedAudioPlayer and SuperpoweredAndroidAudioIO and it plays fine.
SuperpoweredAndroidAudioIO has a constructor with the parameters boolean enableInput and boolean enableOutput. Currently I'm using enableInput == false and enableOutput == true. Setting both parameters to true has no effect.
I wonder if it is possible to record one file and play another file simultaneously?
There is also a SuperpoweredRecorder class in the library, but its documentation says it is not for direct writing to disk, and that I need to use the createWAV, fwrite, and closeWAV methods.
I've tried to implement a recorder separately, but the quality is not good (the result plays two to three times faster than the real recording, and the sound is distorted).
Here is the simplest piece of code for recording I used:
void SuperpoweredFileRecorder::start(const char *destinationPath) {
file = createWAV(destinationPath, sampleRate, 2);
audioIO = new SuperpoweredAndroidAudioIO(sampleRate, bufferSize, true, false, audioProcessing, NULL, bufferSize); // Start audio input/output.
}
void SuperpoweredFileRecorder::stop() {
closeWAV(file);
audioIO->stop();
}
static bool audioProcessing(void *clientdata, short int *audioInputOutput, int numberOfSamples, int samplerate) {
fwrite(audioInputOutput, sizeof(short int), numberOfSamples, file);
return false;
}
Probably I cannot use Superpowered for that purpose and need to just make recording with OpenSL ES directly.
Thanks in advance!

After experiments I found the solution.
SuperpoweredRecorder works fine for recording tracks.
I've created two separate SuperpoweredAndroidAudioIO sources: one for playback and another for the recorder. After some synchronization work it works well (I reduced the latency to a very low level, so it suits my needs).
Here is a code snippet with the idea I implemented:
https://bitbucket.org/snippets/kasurd/Mynnp/nativesuperpoweredrecorder-with
Hope it helps somebody!

You can do this with one instance of the SuperpoweredAndroidAudioIO with enableInput and enableOutput set to true.
The audio processing callback (audioProcessing() in your case) receives audio (microphone) in the audioInputOutput parameter. Just pass that to your SuperpoweredRecorder, and it will write it onto disk.
After that, do your SuperpoweredAdvancedAudioPlayer processing, and convert the result into audioInputOutput. That will go to the audio output.
So it's like, in pseudo-code:
audioProcessing(audioInputOutput) {
recorder->process(audioInputOutput)
player->process(some_buffer)
float_to_short_int(some_buffer, audioInputOutput)
}
Never do any fwrite in the audio processing callback, as it must complete within a very short time, and disk operations may be too slow.
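One common way to keep disk writes out of the callback is to hand the samples to a separate writer thread through a lock-free single-producer/single-consumer ring buffer. The following is a minimal, generic sketch of that idea, not Superpowered code; the class name SampleRing and its push/pop methods are hypothetical, and it assumes exactly one audio callback thread and one writer thread.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-producer/single-consumer ring buffer of 16-bit samples.
// The audio callback calls push(); a separate writer thread calls pop() and
// performs the slow fwrite() there.
class SampleRing {
public:
    // One slot is kept empty to distinguish "full" from "empty".
    explicit SampleRing(size_t capacity) : buf_(capacity + 1) {}

    // Called from the audio callback. Never blocks; returns how many samples
    // were stored (fewer than count if the ring is full).
    size_t push(const int16_t *src, size_t count) {
        size_t head = head_.load(std::memory_order_relaxed);
        size_t tail = tail_.load(std::memory_order_acquire);
        size_t written = 0;
        while (written < count) {
            size_t next = (head + 1) % buf_.size();
            if (next == tail) break;  // full: drop the remaining samples
            buf_[head] = src[written++];
            head = next;
        }
        head_.store(head, std::memory_order_release);
        return written;
    }

    // Called from the writer thread. Returns how many samples were copied out.
    size_t pop(int16_t *dst, size_t maxCount) {
        size_t tail = tail_.load(std::memory_order_relaxed);
        size_t head = head_.load(std::memory_order_acquire);
        size_t read = 0;
        while (read < maxCount && tail != head) {
            dst[read++] = buf_[tail];
            tail = (tail + 1) % buf_.size();
        }
        tail_.store(tail, std::memory_order_release);
        return read;
    }

private:
    std::vector<int16_t> buf_;
    std::atomic<size_t> head_{0};  // next write position (producer-owned)
    std::atomic<size_t> tail_{0};  // next read position (consumer-owned)
};
```

In this sketch the callback would call push() with the microphone samples and return immediately, while the writer thread periodically calls pop() and passes the result to fwrite.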

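For me this works when I double numberOfSamples in the fwrite call:
fwrite(audioInputOutput, sizeof(short int), numberOfSamples * 2, file);
This gives clear stereo output, presumably because numberOfSamples counts frames and each interleaved stereo frame holds two short int values.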

Related

How to amplify Audio Data in Oboe onAudioReady Method?

I want to amplify the audioData that is recorded by microphone using Oboe Library.
I created AudioEngine.cpp like this: https://github.com/google/oboe/blob/master/samples/LiveEffect/src/main/cpp/LiveEffectEngine.cpp
Here is the method that receives audioData:
DataCallbackResult
AudioEngine::onAudioReady(AudioStream *oboeStream, void *audioData, int32_t numFrames) {
/* some code */
// add your audio processing here
return DataCallbackResult::Continue;
}
In the LiveEffect sample both the recording and playback streams are AudioFormat::I16 i.e. 16 bit integers. On this line you're casting to float:
auto *outputData = static_cast<float *>(audioData);
This is going to cause the distortion you hear so instead just cast to int16_t and multiply by a constant amplitude.
Make sure to check that the scaled up sample value isn't above INT16_MAX otherwise you'll get wraparound and distortion.
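As a rough sketch of that advice, the helper below scales 16-bit samples in place and clamps them to the int16_t range. The function name amplifyInt16 and the gain parameter are hypothetical and not part of Oboe; the sketch assumes the stream format really is AudioFormat::I16.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: scale 16-bit samples in place by `gain`, clamping to
// the int16_t range so that loud samples saturate instead of wrapping around.
void amplifyInt16(int16_t *samples, int32_t numSamples, float gain) {
    for (int32_t i = 0; i < numSamples; ++i) {
        float scaled = static_cast<float>(samples[i]) * gain;
        scaled = std::min(32767.0f, std::max(-32768.0f, scaled));
        samples[i] = static_cast<int16_t>(scaled);
    }
}
```

Inside onAudioReady this could be called on static_cast<int16_t *>(audioData), with numSamples set to numFrames multiplied by the stream's channel count.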

Android oboe c++ Some sounds distorted on playback

I'm using the Android oboe library for high performance audio in a music game.
In the assets folder I have 2 .raw files (both 48000Hz 16 bit PCM wavs and about 60kB)
std_kit_sn.raw
std_kit_ht.raw
These are loaded into memory as SoundRecordings and added to a Mixer. kSampleRateHz is 48000:
stdSN= SoundRecording::loadFromAssets(mAssetManager, "std_kit_sn.raw");
stdHT= SoundRecording::loadFromAssets(mAssetManager, "std_kit_ht.raw");
mMixer.addTrack(stdSN);
mMixer.addTrack(stdHT);
// Create a builder
AudioStreamBuilder builder;
builder.setFormat(AudioFormat::I16);
builder.setChannelCount(1);
builder.setSampleRate(kSampleRateHz);
builder.setCallback(this);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
LOGD("After creating a builder");
// Open stream
Result result = builder.openStream(&mAudioStream);
if (result != Result::OK){
LOGE("Failed to open stream. Error: %s", convertToText(result));
}
LOGD("After openstream");
// Reduce stream latency by setting the buffer size to a multiple of the burst size
mAudioStream->setBufferSizeInFrames(mAudioStream->getFramesPerBurst() * 2);
// Start the stream
result = mAudioStream->requestStart();
if (result != Result::OK){
LOGE("Failed to start stream. Error: %s", convertToText(result));
}
LOGD("After starting stream");
They are called appropriately to play with standard code (as per Google tutorials) at required times:
stdSN->setPlaying(true);
stdHT->setPlaying(true); //Nasty Sound
The audio callback is standard (as per Google tutorials):
DataCallbackResult SoundFunctions::onAudioReady(AudioStream *mAudioStream, void *audioData, int32_t numFrames) {
// Play the stream
mMixer.renderAudio(static_cast<int16_t*>(audioData), numFrames);
return DataCallbackResult::Continue;
}
The std_kit_sn.raw plays fine. But std_kit_ht.raw has a nasty distortion. Both play with low latency. Why is one playing fine and the other has a nasty distortion?
I loaded your sample project and I believe the distortion you hear is caused by clipping/wraparound during mixing of sounds.
The Mixer object from the sample is a summing mixer. It just adds the values of each track together and outputs the sum.
You need to add some code to reduce the volume of each track to avoid exceeding the limits of an int16_t (although you're welcome to file a bug on the oboe project and I'll try to add this in an upcoming version). If you exceed this limit you'll get wraparound which is causing the distortion.
Additionally, your app is hardcoded to run at 22050 frames/sec. This will result in sub-optimal latency across most mobile devices because the stream is forced to upsample to the audio device's native frame rate. A better approach would be to leave the sample rate undefined when opening the stream - this will give you the optimal frame rate for the current audio device - then use a resampler on your source files to supply audio at this frame rate.
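To illustrate the volume-reduction idea, here is a minimal sketch of a summing mixer that applies a per-track gain and clamps the sum. It is not the Mixer class from the Oboe sample; the function name mixTracks and the trackGain parameter are hypothetical, and all tracks are assumed to hold at least numSamples samples.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical summing mixer: accumulates the tracks in a float, applies a
// per-track gain, and clamps the result to the int16_t range.
void mixTracks(const std::vector<const int16_t *> &tracks, float trackGain,
               int16_t *out, int32_t numSamples) {
    for (int32_t i = 0; i < numSamples; ++i) {
        float sum = 0.0f;
        for (const int16_t *track : tracks) {
            sum += static_cast<float>(track[i]) * trackGain;
        }
        // Saturate instead of letting the 16-bit value wrap around.
        sum = std::min(32767.0f, std::max(-32768.0f, sum));
        out[i] = static_cast<int16_t>(sum);
    }
}
```

A gain of 1.0f divided by the number of tracks guarantees the sum never clips, at the cost of a quieter mix; a higher gain relies on the clamp for occasional peaks.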

audio latency issues

In the application I want to create, I face some technical obstacles. I have two music tracks in the application. For example, a user imports the background music as the first track. The second track is a voice recorded by the user to the rhythm of the first track, which is played through the device speaker (or headphones). At this point we face latency. After recording and playing back in the app, the user hears a loss of synchronisation between the tracks, which occurs because of the microphone and speaker latencies.
Firstly, I try to detect the delay by filtering the input sound. I use android’s AudioRecord class, and the method read(). This method fills my short array with audio data.
I found that the initial values of this array are zeros so I decided to cut them out before I will start to write them into the output stream.
So I consider those zeros as a „warmup” latency of the microphone. Is this approach correct? This operation gives some results, but it doesn’t resolve the problem, and at this stage, I’m far away from that.
But the worse case is with the delay between starting the speakers and playing the music. This delay I cannot filter or detect. I tried to create some calibration feature which counts the delay. I play a „beep” sound through the speakers, and when I start to play it, I also begin to measure time. Then, I start recording and listen for this sound being detected by the microphone. When I recognise this sound in the app, I stop measuring time. I repeat this process several times, and the final value is the average from those results. That is how I try to measure the latency of the device. Now, when I have this value, I can simply shift the second track backwards to achieve synchronisation of both records (I will lose some initial milliseconds of the recording, but I skip this case, for now, there are some possibilities to fix it).
I thought that this approach would resolve the problem, but it turned out this is not as simple as I thought. I found two issues here:
1. Delay while playing two tracks simultaneously
2. Random variation in device audio latency.
The first: I play two tracks using AudioTrack class and I run method play() like this:
val firstTrack = //creating a track
val secondTrack = //creating a track
firstTrack.play()
secondTrack.play()
This code causes delays at the stage of playing tracks. Now, I don't even have to think about latency while recording; I cannot play two tracks simultaneously without delays. I tested this with an external audio file (not recorded in my app): I start the same audio file twice using the code above, and I can hear a delay. I also tried it with the MediaPlayer class, and I got the same results. In this case, I even tried starting the tracks when the OnPreparedListener callback is invoked:
val firstTrack = //AudioPlayer
val secondTrack = //AudioPlayer
secondTrack.setOnPreparedListener {
firstTrack.start()
secondTrack.start()
}
And it doesn’t help.
I know that there is one more class provided by Android called SoundPool. According to the documentation, it may be better at playing tracks simultaneously, but I can't use it because it supports only small audio files, and I can't accept that limitation.
How can I resolve this problem? How can I start playing two tracks precisely at the same time?
The second: Audio latency is not deterministic - sometimes it is smaller, and sometimes it’s huge, and it’s out of my hands. So measuring device latency can help but again - it cannot resolve the problem.
To sum up: is there any solution that can give me the exact latency per device (or per app session), or some other trigger that detects the actual delay, to provide the best synchronisation while playing back two tracks at the same time?
Thank you in advance!
Synchronising audio for karaoke apps is tough. The main issue you seem to be facing is variable latency in the output stream.
This is almost certainly caused by "warm up" latency: the time it takes from hitting "play" on your backing track to the first frame of audio data being rendered by the audio device (e.g. headphones). This can have large variance and is difficult to measure.
The first (and easiest) thing to try is to use MODE_STREAM when constructing your AudioTrack and prime it with bufferSizeInBytes of data prior to calling play. This should result in lower, more consistent "warm up" latency.
A better way is to use the Android NDK to have a continuously running audio stream which is just outputting silence until the moment you hit play, then start sending audio frames immediately. The only latency you have here is the continuous output latency.
If you decide to go down this route I recommend taking a look at the Oboe library (full disclosure: I am one of the authors).
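The continuously running stream idea can be sketched independently of any audio API. In the illustration below the struct GatedPlayer and its members are hypothetical names; it assumes a mono 16-bit track and that render() is called from the audio callback for every buffer, while another thread flips the playing flag.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>

// Hypothetical track player driven by an always-running audio callback.
// While `playing` is false it renders silence, so the stream stays "warm".
struct GatedPlayer {
    const int16_t *track = nullptr;  // mono 16-bit source data
    int32_t trackFrames = 0;         // length of the track in frames
    int32_t position = 0;            // only touched by the audio thread
    std::atomic<bool> playing{false};

    // Call this from the audio callback for every buffer.
    void render(int16_t *out, int32_t numFrames) {
        std::memset(out, 0, sizeof(int16_t) * numFrames);  // default: silence
        if (!playing.load(std::memory_order_acquire)) return;
        int32_t remaining = trackFrames - position;
        int32_t n = remaining < numFrames ? remaining : numFrames;
        if (n > 0) {
            std::memcpy(out, track + position, sizeof(int16_t) * n);
            position += n;
        }
    }
};
```

Because the stream never stops, setting playing to true takes effect on the very next callback, so only the steady-state output latency remains.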
To answer one of your specific questions...
Is there a way to calculate the latency of the audio output stream programmatically?
Yes. The easiest way to explain this is with a code sample (this is C++ for the AAudio API but the principle is the same using Java AudioTrack):
// Get the index and time that a known audio frame was presented for playing
int64_t existingFrameIndex;
int64_t existingFramePresentationTime;
AAudioStream_getTimestamp(stream, CLOCK_MONOTONIC, &existingFrameIndex, &existingFramePresentationTime);
// Get the write index for the next audio frame
int64_t writeIndex = AAudioStream_getFramesWritten(stream);
// Calculate the number of frames between our known frame and the write index
int64_t frameIndexDelta = writeIndex - existingFrameIndex;
// Calculate the time which the next frame will be presented
int64_t frameTimeDelta = (frameIndexDelta * NANOS_PER_SECOND) / sampleRate_;
int64_t nextFramePresentationTime = existingFramePresentationTime + frameTimeDelta;
// Assume that the next frame will be written into the stream at the current time
int64_t nextFrameWriteTime = get_time_nanoseconds(CLOCK_MONOTONIC);
// Calculate the latency
*latencyMillis = (double) (nextFramePresentationTime - nextFrameWriteTime) / NANOS_PER_MILLISECOND;
A caveat: This method relies on accurate timestamps being reported by the audio hardware. I know this works on Google Pixel devices but have heard reports that it isn't so accurate on other devices so YMMV.
Following the answer of donturner, here's a Java version (that also uses other methods depending on the SDK version)
/** The audio latency has not been estimated yet */
private static long AUDIO_LATENCY_NOT_ESTIMATED = Long.MIN_VALUE+1;
/** The audio latency default value if we cannot estimate it */
private static long DEFAULT_AUDIO_LATENCY = 100L * 1000L * 1000L; // 100ms
/**
* Estimate the audio latency
*
* Not accurate at all, depends on SDK version, etc. But that's the best
* we can do.
*/
private static long estimateAudioLatency(AudioTrack track, long audioFramesWritten) {
long estimatedAudioLatency = AUDIO_LATENCY_NOT_ESTIMATED;
// First method. SDK >= 19.
if (Build.VERSION.SDK_INT >= 19 && track != null) {
AudioTimestamp audioTimestamp = new AudioTimestamp();
if (track.getTimestamp(audioTimestamp)) {
// Calculate the number of frames between our known frame and the write index
long frameIndexDelta = audioFramesWritten - audioTimestamp.framePosition;
// Calculate the time which the next frame will be presented
long frameTimeDelta = _framesToNanoSeconds(frameIndexDelta);
long nextFramePresentationTime = audioTimestamp.nanoTime + frameTimeDelta;
// Assume that the next frame will be written at the current time
long nextFrameWriteTime = System.nanoTime();
// Calculate the latency
estimatedAudioLatency = nextFramePresentationTime - nextFrameWriteTime;
}
}
// Second method. SDK >= 18.
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED && Build.VERSION.SDK_INT >= 18) {
Method getLatencyMethod;
try {
getLatencyMethod = AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
estimatedAudioLatency = (Integer) getLatencyMethod.invoke(track, (Object[]) null) * 1000000L;
} catch (Exception ignored) {}
}
// If no method has successfully given us a value, let's try a third method
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
AudioManager audioManager = (AudioManager) CRT.getInstance().getSystemService(Context.AUDIO_SERVICE);
try {
Method getOutputLatencyMethod = audioManager.getClass().getMethod("getOutputLatency", int.class);
estimatedAudioLatency = (Integer) getOutputLatencyMethod.invoke(audioManager, AudioManager.STREAM_MUSIC) * 1000000L;
} catch (Exception ignored) {}
}
// No method gave us a value. Let's use a default value. Better than nothing.
if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
estimatedAudioLatency = DEFAULT_AUDIO_LATENCY;
}
return estimatedAudioLatency;
}
private static long _framesToNanoSeconds(long frames) {
return frames * 1000000000L / SAMPLE_RATE;
}
The Android MediaPlayer class is notoriously slow to begin audio playback. In an app I was creating, there was a delay of more than one second before an audio clip began playing. I resolved it by switching to ExoPlayer, which resulted in playback starting within 100ms. I've also read that ffmpeg has an even faster audio startup time than ExoPlayer, but I haven't used it so I can't make any promises.

Modifying in-call voice playback in Android custom ROM

I would like to modify Android OS (official image from AOSP) to add preprocessing to a normal phone call playback sound.
I've already achieved this filtering for app audio playback (by modifying HAL and audioflinger).
I'm OK with targeting only a specific device (Nexus 5X). Also, I only need to filter playback - I don't care about recording (uplink).
UPDATE #1:
To make it clear - I'm OK with modifying Qualcomm-specific drivers, or whatever part that it is that runs on Nexus 5X and can help me modify in-call playback.
UPDATE #2:
I'm attempting to create a Java layer app that routes the phone playback to the music stream in real time.
I've already succeeded in installing it as a system app, getting permissions for initializing AudioRecord with AudioSource.VOICE_DOWNLINK. However, the recording gives blank samples; it doesn't record the voice call.
This is the code inside my worker thread:
// Start recording
int recBufferSize = AudioRecord.getMinBufferSize(44100, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT);
mRecord = new AudioRecord(MediaRecorder.AudioSource.VOICE_DOWNLINK, 44100, AudioFormat.CHANNEL_IN_STEREO, AudioFormat.ENCODING_PCM_16BIT, recBufferSize);
// Start playback
int playBufferSize = AudioTrack.getMinBufferSize(44100, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);
mTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT, playBufferSize, AudioTrack.MODE_STREAM);
mRecord.startRecording();
mTrack.play();
int bufSize = 1024;
short[] buffer = new short[bufSize];
int res;
while (!interrupted())
{
// Pull recording buffers and play back
res = mRecord.read(buffer, 0, bufSize, AudioRecord.READ_NON_BLOCKING);
mTrack.write(buffer, 0, res, AudioTrack.WRITE_BLOCKING);
}
// Stop recording
mRecord.stop();
mRecord.release();
mRecord = null;
// Stop playback
mTrack.stop();
mTrack.release();
mTrack = null;
I'm running on a Nexus 5X, my own AOSP custom ROM, Android 7.1.1. I need to find the place which will allow call recording to work - probably somewhere in hardware/qcom/audio/hal in platform code.
Also I've been looking at the function voice_check_and_set_incall_rec_usecase at hardware/qcom/audio/hal/voice.c However, I wasn't able to make sense of it (how to make it work the way I want it to).
UPDATE #3:
I've opened a more-specific question about using AudioSource.VOICE_DOWNLINK, which might draw the right attention and will eventually help me solve this question's problem as well.
There are several possible issues that come to mind. The blank buffer might indicate that you have the wrong source selected. Also, according to https://developer.android.com/reference/android/media/AudioRecord.html#AudioRecord(int,%20int,%20int,%20int,%20int) you might not always get an exception even if something is wrong with the configuration, so you might want to confirm that your object has been initialized properly. If all else fails, you could also try calling
mRecord.setPreferredDevice() with the AudioDeviceInfo whose type is AudioDeviceInfo.TYPE_BUILTIN_EARPIECE (the method takes an AudioDeviceInfo object, not the type constant itself)
to route the phone's built-in earpiece directly to the input of your recorder. Yeah, it's kinda dirty and hacky, but perhaps it suits the purpose.
The other thing that puzzled me is that, instead of using the builder class, you've configured the object directly via its constructor. Is there a specific reason why you don't want to use AudioRecord.Builder (there's even a nice example at https://developer.android.com/reference/android/media/AudioRecord.Builder.html ) instead?

Need to understand how AudioRecord and AudioTrack work for raw PCM capture and playback

I use the following code in a Thread to capture raw audio samples from the microphone and play it back through the speaker.
public void run(){
short[] lin = new short[SIZE_OF_RECORD_ARRAY];
int num = 0;
// am = (AudioManager) this.getSystemService(Context.AUDIO_SERVICE); // -> MOVED THESE TO init()
// am.setMode(AudioManager.MODE_IN_COMMUNICATION);
record.startRecording();
track.play();
while (passThroughMode) {
// while (!isInterrupted()) {
num = record.read(lin, 0, SIZE_OF_RECORD_ARRAY);
for (int i = 0; i < num; i++) // only scale the samples that were actually read
lin[i] *= WAV_SAMPLE_MULTIPLICATION_FACTOR;
track.write(lin, 0, num);
}
// /*
record.stop();
track.stop();
record.release();
track.release();
// */
}
where record is an AudioRecord and track is an AudioTrack. I need to know in detail (and in a simplified way if possible) how AudioRecord stores PCM data and how AudioTrack plays PCM data. This is how I have understood it so far:
As the while() loop is continuously running, record obtains SIZE_OF_RECORD_ARRAY number of samples (which is 1024 for now) as shown in the figure. The samples get saved contiguously in the lin[] array of shorts (16 bit shorts, as I am using 16 bit PCM encoding). This is done by record.read(). Then track.write() places these samples in the speaker which is played by the hardware. Is this correct or am I missing something here?
As for how the samples are laid out in memory; they're just arrays of linear approximations to a sound wave, taken at discrete times (like your figure shows). In the case of stereo, the samples will be interleaved (LRLRLRLR...).
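The interleaved layout can be illustrated with a small sketch that splits a stereo buffer into its two channels. It is written in C++ purely for illustration (the same indexing applies to a Java short[] filled by AudioRecord.read()), and the function name deinterleaveStereo is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits an interleaved stereo buffer (L R L R ...) into two mono channels.
// `numFrames` is the number of L/R pairs, so the input holds 2 * numFrames
// samples.
void deinterleaveStereo(const int16_t *interleaved, size_t numFrames,
                        std::vector<int16_t> &left,
                        std::vector<int16_t> &right) {
    left.resize(numFrames);
    right.resize(numFrames);
    for (size_t frame = 0; frame < numFrames; ++frame) {
        left[frame] = interleaved[2 * frame];       // even indices: left
        right[frame] = interleaved[2 * frame + 1];  // odd indices: right
    }
}
```

For mono capture there is nothing to split: every element of the array is one consecutive sample.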
When it comes to the path the audio takes, you're essentially right, although there are a few more steps involved:
Writing data to your Java AudioTrack causes it to make a JNI (Java Native Interface) call to a native helper class, which in turn calls the native AudioTrack class.
The AudioTracks are owned by the AudioFlinger, which periodically takes data from all the AudioTracks on a given output thread (which have been mixed by the AudioMixer) and writes it to the audio HAL output stream class.
From there the data goes to the user-space ALSA library, and through a couple of intermediate steps to the kernel-space PCM driver. It then continues from there, typically going through some kind of DSP that applies various acoustic compensation filters, and eventually making its way to the hardware codec, which controls the speaker DAC and amplifiers.
When recording from the internal microphone(s) you'd have more or less the same steps, except that they'd be done in the opposite order.
Note that some of these steps (essentially everything from the audio HAL and below) are platform-specific, and therefore might differ between platforms from different vendors (and even different platforms from the same vendor).
