Android audio downsampling

I need to record 10 seconds of audio and then perform a convolution with another signal. The recording needs a sampling rate of 512 Hz. Since my phone doesn't support that rate (I doubt any phone does), I have to record at a higher rate and then downsample to 512 Hz. For recording I use AudioRecord, and the only sample rate guaranteed to work is 44100 Hz. Every library or code sample I have found performs the downsampling by reading from and writing to a file. I need this to be very fast, because I have to do it a couple of times per second (at least twice). Is there a way to downsample raw PCM data held in a byte array, and to do it quickly?

Any reason you are aiming for a sample rate of 512 Hz? That seems a bit low to be useful!
I am not sure which language you are using, but I use libsoxr from C++. A quick search turns up libresample4j for Java. Both of these let you do the resampling in real time, without having to save to a file first.
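To illustrate that the resampling can be done entirely in memory, here is a minimal Java sketch (not taken from either library above) that linearly interpolates 16-bit little-endian mono PCM from one rate to another. It applies no anti-aliasing low-pass filter, so a real resampler such as libresample4j or libsoxr will sound better; this only shows the shape of the loop and that no file I/O is needed.

// Minimal sketch: naive linear-interpolation downsampler for 16-bit mono PCM
// held in memory. No anti-aliasing filter is applied, so a proper resampler
// (libresample4j, libsoxr) is preferable for real use.
public final class NaiveResampler {

    // Converts little-endian 16-bit PCM bytes to shorts and resamples them.
    public static short[] downsample(byte[] pcm, int inRate, int outRate) {
        short[] in = new short[pcm.length / 2];
        java.nio.ByteBuffer.wrap(pcm)
                .order(java.nio.ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(in);

        double ratio = (double) inRate / outRate;   // e.g. 44100 / 512 ≈ 86.13
        int outLen = (int) (in.length / ratio);
        short[] out = new short[outLen];

        for (int i = 0; i < outLen; i++) {
            double pos = i * ratio;                 // fractional source index
            int idx = (int) pos;
            double frac = pos - idx;
            int next = Math.min(idx + 1, in.length - 1);
            out[i] = (short) ((1.0 - frac) * in[idx] + frac * in[next]);
        }
        return out;
    }
}

At 44100 Hz, ten seconds of 16-bit mono audio is 882,000 bytes, and this loop touches each output sample once, so running it a couple of times per second is not a problem.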

Related

Microphone input RMS measurement on Android device

I want to measure the RMS value of the input audio signal over periods of 100-200 ms on an Android device without recording the audio to a file. This is not for measuring the noise level; I only need the RMS value from the ADC. Please suggest an appropriate approach. Should I look into the AudioRecord class, the Oboe library, or something completely different? I am an amateur with very basic Android/Kotlin knowledge. Thank you!
I tried to find examples of Kotlin code for recording audio from a microphone into a byte array, but I did not succeed.
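As a rough starting point, here is a sketch of the AudioRecord approach in Java (the question asks for Kotlin, but it translates directly). It assumes a 48 kHz mono 16-bit stream and the RECORD_AUDIO permission; the block size, iteration count and logging are only placeholders.

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Read ~100 ms blocks from the microphone and compute their RMS, never touching a file.
public class RmsMeter {
    private static final int SAMPLE_RATE = 48000;                // assumption: device supports 48 kHz
    private static final int BLOCK_SAMPLES = SAMPLE_RATE / 10;   // 100 ms of mono 16-bit audio

    public void run() {
        int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, Math.max(minBuf, BLOCK_SAMPLES * 2));

        short[] block = new short[BLOCK_SAMPLES];
        recorder.startRecording();
        for (int i = 0; i < 100; i++) {                          // ~10 seconds of measurements
            int read = recorder.read(block, 0, block.length);
            double sumSquares = 0;
            for (int n = 0; n < read; n++) {
                sumSquares += (double) block[n] * block[n];
            }
            double rms = Math.sqrt(sumSquares / read);           // 0..32767 for 16-bit PCM
            android.util.Log.d("RmsMeter", "RMS = " + rms);
        }
        recorder.stop();
        recorder.release();
    }
}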

Real time audio analysis Android

I've got a rather complicated problem that I need to solve at work. It's pretty far out of my remit of "Android App Developer" - I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions, such as hardcoding the BPM into the app in a map keyed by the audio resource's URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly; it assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement to stream external audio from streaming service apps such as Spotify, a static analysis solution is pointless: it wouldn't work for the streaming case, while a solution for the streaming case would work for both cases.
Therefore, I have come to the conclusion that I need to analyze the streamed audio on the fly, perhaps with FFT or peak-detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there); it is about the basic starting point of such a solution:
How might I go about getting A) the raw bytes of streamed audio for both the local file case and the external streaming service app case, and B) how might I process these bytes into a data structure representing the audio stream in a way amenable to running audio analysis algorithms on it?
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I have no idea how to formulate a more coherent one.
Any help would be greatly appreciated!
I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming a local file, or when streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes into "something useful" I'd look at the audio format to tell you how to read the incoming bytes. The format should tell you how many channels there are (mono or stereo), the encoding (signed PCM is common, it might be normalized floats), the number of bits per value (16 is common) and the byte order (big-endian vs. little-endian).
I know there are posts, including some on Stack Overflow, that explain how to convert raw audio bytes to PCM values based on this information; they should be reachable via search. I think signed normalized floats are the most common representation used for processing audio signals.
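As a hedged illustration of that last step, here is a small Java helper that assumes the common case of 16-bit signed little-endian PCM and converts it to normalized floats in [-1, 1], with an optional stereo-to-mono fold-down for analysis. Check the actual AudioFormat of your stream before relying on these assumptions.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Convert raw 16-bit signed little-endian PCM bytes into normalized floats in [-1, 1],
// which is a convenient representation for FFTs, peak detection and other analysis.
public final class PcmToFloat {

    public static float[] convert(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        float[] samples = new float[raw.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort() / 32768f;   // scale the 16-bit range to [-1, 1)
        }
        return samples;
    }

    // For interleaved stereo, average the two channels to get a mono analysis signal.
    public static float[] toMono(float[] interleavedStereo) {
        float[] mono = new float[interleavedStereo.length / 2];
        for (int i = 0; i < mono.length; i++) {
            mono[i] = 0.5f * (interleavedStereo[2 * i] + interleavedStereo[2 * i + 1]);
        }
        return mono;
    }
}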

How to record microphone to more compressed format during WebRTC call on Android?

I have an app that makes calls using WebRTC, and during a call I need to record the microphone. WebRTC has an object, WebRTCAudioRecord, for recording audio, but the resulting audio file is very large (16-bit PCM). I want to record to a smaller size.
I've tried MediaRecorder, but it doesn't work because WebRTC is already recording and MediaRecorder does not have permission to record during a call.
Has anyone done this, or have any idea that could help me?
WebRTC is considered a comparatively strong pre-processing tool for audio and video.
Its native code base includes fully optimized C and C++ classes, designed to maintain good speech quality and intelligibility.
Reference link: https://github.com/jitsi/webrtc/tree/master/examples
As the problem states:
I want to record, but at a smaller size. I've tried MediaRecorder and it doesn't work, because WebRTC is already recording and MediaRecorder does not have permission to record during a call.
First of all, to reduce or minimize the size of your recorded data (audio bytes), you should look at speech codecs, which shrink the recorded data while keeping sound quality at an acceptable level. Well-known speech codecs include:
OPUS
SPEEX
G.711 (G-series speech codecs)
As far as the size of the audio data is concerned, it depends on the sample rate and on the duration of each recorded chunk or audio packet.
Suppose the sample rate is 8000 Hz and the chunk duration is 40 ms; then the recorded data is 640 bytes (or 320 shorts).
The size of the recorded data is directly proportional to both the duration and the sample rate.
Sample rate = 8000, 16000 Hz, etc. (the greater the sample rate, the greater the size).
For more detail, see the fundamentals of audio data representation. Note that WebRTC mainly processes audio in 10 ms chunks for pre-processing, which at 8000 Hz brings the packet size down to 160 bytes.
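The arithmetic above can be captured in a tiny helper; the two calls below reproduce the 640-byte and 160-byte figures, assuming 16-bit mono PCM.

// Size of one uncompressed PCM chunk: sampleRate * seconds * bytesPerSample * channels.
public final class ChunkSize {
    public static int bytes(int sampleRate, double seconds, int bytesPerSample, int channels) {
        return (int) (sampleRate * seconds * bytesPerSample * channels);
    }

    public static void main(String[] args) {
        // 8000 Hz, 40 ms, 16-bit (2 bytes), mono -> 640 bytes (320 shorts)
        System.out.println(bytes(8000, 0.040, 2, 1));
        // 8000 Hz, 10 ms (WebRTC's processing frame), 16-bit, mono -> 160 bytes
        System.out.println(bytes(8000, 0.010, 2, 1));
    }
}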
Secondly, if you want to use multiple audio recorder instances at the same time, that is practically impossible: since WebRTC is already recording from the microphone, a MediaRecorder instance cannot do anything, as this answer explains: audio-record-multiple-audio-at-a-time. WebRTC has the following methods for managing audio bytes:
1. Push input PCM data into ProcessCaptureStream to process it in place.
2. Get the processed PCM data from ProcessCaptureStream and send it to the far end.
3. The far end pushes the received data into ProcessRenderStream.
I maintain a complete tutorial on audio processing with WebRTC; for more details see Android-Audio-Processing-Using-Webrtc.
There are two parts for the solution:
Get the raw PCM audio frames from webrtc
Save them to a local file in a compressed format so that they can be played back later
For the first part, attach a SamplesReadyCallback while creating the audio device module, by calling the setSamplesReadyCallback method of JavaAudioDeviceModule. This callback gives you the raw audio frames that WebRTC's AudioRecord captures from the mic.
For the second part, encode the raw frames and write them to a file. Check out this sample from Google on how to do it: https://android.googlesource.com/platform/frameworks/base/+/master/packages/SystemUI/src/com/android/systemui/screenrecord/ScreenInternalAudioRecorder.java#234
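A minimal sketch of the first part is below, based on the JavaAudioDeviceModule API in the org.webrtc Android SDK; package and method names can differ between WebRTC versions, and MyAacEncoder is a hypothetical placeholder for the MediaCodec/MediaMuxer encoder from the second part.

import android.content.Context;
import org.webrtc.audio.AudioDeviceModule;     // package may differ between WebRTC builds
import org.webrtc.audio.JavaAudioDeviceModule;

// Build the audio device module with a samples-ready callback so that every raw
// capture buffer WebRTC reads from the mic is also handed to our encoder.
public final class CallRecorderSetup {

    public static AudioDeviceModule create(Context context, MyAacEncoder encoder) {
        return JavaAudioDeviceModule.builder(context)
                .setSamplesReadyCallback(samples -> {
                    // samples.getData() is raw 16-bit PCM; sample rate and channel
                    // count are also available on the AudioSamples object.
                    encoder.feed(samples.getData(),
                                 samples.getSampleRate(),
                                 samples.getChannelCount());
                })
                .createAudioDeviceModule();
        // Pass the returned module to PeerConnectionFactory.builder()
        // .setAudioDeviceModule(...) when creating the factory.
    }

    // Hypothetical encoder facade; in practice this would wrap MediaCodec/MediaMuxer
    // producing AAC, as in the linked ScreenInternalAudioRecorder sample.
    public interface MyAacEncoder {
        void feed(byte[] pcm, int sampleRate, int channelCount);
    }
}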

Android AudioTrack latency with playback

I am trying to play raw sound data using the AudioTrack class on Android. I use the write method, but I noticed that there is latency between the moment write returns and the moment the sound is actually played. To keep it simple, let us use the AudioRecord class with the following pseudocode:
// init AudioTrack
// init AudioRecord
while (true) {
    byte[] buffer = new byte[1000];
    int read = audioRecord.read(buffer, 0, buffer.length);
    audioTrack.write(buffer, 0, read);
}
I expect a latency of read / sample rate seconds, but the sound is actually played about 0.5 seconds later than that. I really need the audio to play with minimal latency. Does anyone have an explanation of what is going on, and is there a solution, or should I accept this as a hardware limitation?
I'm assuming your goal is to come up with some interactive audio solution (that is, where sound is played in response to some user action), because in this scenario low latency really matters.
On Android, to achieve the lowest latency you need to use the OpenSL ES API, which is available to native (C++) code via the NDK. The only Java-side mechanism that can achieve low latency is the SoundPool class, but it has limitations on what kinds of sounds you can play.
For more information, see the page on high-performance audio, and also check out this SO answer: Low-latency audio playback on Android
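Not mentioned in the answer above, but worth noting: on API 26+ the Java-side AudioTrack.Builder can at least request a low-latency output path. The sketch below is written under that assumption; it is not a replacement for OpenSL ES or AAudio, which remain the lowest-latency options.

import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

// Create an AudioTrack with a small buffer and the low-latency performance mode
// (API 26+) to keep the write-to-speaker delay down on the Java side.
public final class LowLatencyTrack {

    public static AudioTrack create(int sampleRate) {
        AudioFormat format = new AudioFormat.Builder()
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .setSampleRate(sampleRate)
                .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                .build();
        int minBuf = AudioTrack.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
        return new AudioTrack.Builder()
                .setAudioAttributes(new AudioAttributes.Builder()
                        .setUsage(AudioAttributes.USAGE_MEDIA)
                        .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                        .build())
                .setAudioFormat(format)
                .setBufferSizeInBytes(minBuf)                                 // keep the buffer small
                .setPerformanceMode(AudioTrack.PERFORMANCE_MODE_LOW_LATENCY)  // API 26+
                .setTransferMode(AudioTrack.MODE_STREAM)
                .build();
    }
}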

Streaming multiple OGG simultaneously in Android

I need to be able to play two or more (let's say, up to 5) short ogg files simultaneously. And by simultaneously I mean in perfect synchrony. I am able to load them to SoundPool and play, but this sometimes creates a noticeable difference in playback start time, which I want to get rid of.
From my understanding this can be avoided by mixing the PCM data into one buffer and playing that. But OGG files are not PCM and need to be decoded efficiently before playing, and the latency must be very low, ideally starting as soon as the user presses the button. So I figured I need a way to stream OGG into PCM, and as I receive the buffers I would mix them and feed them to AudioTrack. My requirement is Android 2.3.3+, so I cannot use any of the new codecs provided in Jelly Bean.
Also, although the OGG files themselves are small, there are a lot of them, so keeping them all decoded in memory (in SoundPool, or pre-decoded) may cause problems too.
Can someone give me a tip on where to dig? Can OpenSL ES do that for me? Or should I think about integrating ffmpeg? And is it even possible to stream simultaneous files with low latency?
Thanks
You can play sounds using AssetPlayers, but this sometimes creates a noticeable difference in playback start time, yes...
So I recommend decoding the OGG with Ogg Vorbis (like here) and then using the resulting PCM buffer with a BufferPlayer.
By the way, check out these OpenSL ES wrappers:
https://github.com/Suvitruf/Android-ndk/tree/master/OpenSLES
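For the mixing step itself, here is a minimal sketch that sums already-decoded 16-bit PCM buffers (same sample rate and channel layout) with clamping, so the result can be written to a single AudioTrack and all sounds start in sync. The OGG decoding is left to the Vorbis decoder mentioned above.

// Mix several already-decoded 16-bit PCM buffers (same sample rate and channel
// layout) into one buffer, clamping to avoid overflow, so a single AudioTrack
// can play them in perfect sync.
public final class PcmMixer {

    public static short[] mix(short[][] sources, int length) {
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            int sum = 0;
            for (short[] src : sources) {
                if (i < src.length) {
                    sum += src[i];
                }
            }
            // Clamp to the 16-bit range instead of letting the sum wrap around.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }
}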
