I am trying to understand what the values obtained with AudioRecord.read() actually mean.
I am trying to create an app that will start recording sound when it detects an impulse (so I thought to set a threshold, where any sound above it is considered an impulse).
The problem is that I don't really know what the values stored in "data" represent when I call this method:
read = recorder.read(data, 0, bufferSize);
These are some of the values that I obtain:
[96, 2, 101, 3, 101, 2, 110, 1, -41, 2, -80, 2, -117, 2, 119, 2, -94, 0 .........]
The idea is to set the threshold from these values, but first I need to know what they represent.
Can you guys help me with this?
The data depends on the parameters you passed to the constructor: AudioRecord(int audioSource, int sampleRateInHz, int channelConfig, int audioFormat, int bufferSizeInBytes)
sampleRateInHz is the number of samples per second. channelConfig is either MONO or STEREO, meaning 1 or 2 channels. audioFormat is PCM8 or PCM16, meaning 8 or 16 bits per sample.
So the data is an array of samples. Each sample contains one value per channel, and each value is 8 or 16 bits, depending on what you asked for. No data is skipped; it is always a fixed-size format.
So if you chose 1 channel and 8 bits, each byte is a single sample, and you get sampleRateInHz samples per second. If you choose 16 bits, each sample is 2 bytes long. If you use 2 channels, the values are interleaved: channel 1, then channel 2, for each sample.
The individual values are the amplitude of the sound data when sampled at the requested frequency. See http://en.wikipedia.org/wiki/Pulse-code_modulation for more information on how it works.
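For example, with 16-bit PCM read into a byte[], each pair of bytes is one little-endian sample (low byte first), which is why every second value in your dump is small: those are the high bytes of quiet samples. A rough Kotlin sketch of reassembling the samples and testing them against a threshold (THRESHOLD and the impulse handling are placeholders of mine, not an API):
val read = recorder.read(data, 0, bufferSize) // data is a ByteArray
var i = 0
while (i + 1 < read) {
    // low byte is unsigned, high byte carries the sign
    val sample = (data[i].toInt() and 0xFF) or (data[i + 1].toInt() shl 8)
    if (Math.abs(sample) > THRESHOLD) {
        // impulse detected - start recording here
    }
    i += 2
}
Alternatively, reading into a ShortArray via the read(short[], int, int) overload gives you the samples directly, with no byte pairing.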
Related
I have an Android App where I get Heart Rate Measurements from a Polar H10 Device.
I'm totally lost on how to interpret the heart rate. Various links to the bluetooth.com site are resulting in 404 errors unfortunately.
The characteristic's value is, for example:
[16, 59, 83, 4]
From what I understood, the second byte (59) is the heart rate in BPM. But this does not seem to be a plain decimal value, as it goes up to 127 and then continues at -127, -126, -125, ... It is not hex either.
I tried (in Kotlin):
characteristic.value[1].toUInt()
characteristic.value[1].toInt()
characteristic.value[1].toShort()
characteristic.value[1].toULong()
characteristic.value[1].toDouble()
All values freak out as soon as the -127 appears.
Do I have to convert the 59 to binary (59 = 111011) and read it from there? Please give me some insight.
### Edit (12th April 2021) ###
What I do to get those values is a BluetoothDevice.connectGatt().
Then hold the GATT.
In order to get heart rate values I look for service 0x180d, its characteristic 0x2a37, and its only descriptor 0x2902.
Then I enable notifications by setting 0x01 on the descriptor. I then get ongoing events in the GattClientCallback.onCharacteristicChanged() callback. I will add a screenshot below with all data.
From what I understood, the response should be 6 bytes long instead of 4, right? What am I doing wrong?
In the picture you can see the characteristic at the very top. It is linked to the service 180d, and at the bottom the characteristic holds the 4-byte value.
See Heart Rate Value in BLE for the links to the documents. As in that answer, here's the decode:
Byte 0 - Flags: 16 (0001 0000)
Bits are numbered from LSB (0) to MSB (7).
Bit 0 - Heart Rate Value Format: 0 => UINT8 beats per minute
Bit 1-2 - Sensor Contact Status: 00 => Not supported or detected
Bit 3 - Energy Expended Status: 0 => Not present
Bit 4 - RR-Interval: 1 => One or more values are present
So the byte after the flags is the heart rate in UINT8 format, and the next two bytes are an RR-interval.
To read this in Kotlin:
characteristic.getIntValue(FORMAT_UINT8, 1)
This returns a heart rate of 59 bpm.
And ignore the other two bytes unless you want the RR.
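If you want to decode it by hand rather than through getIntValue(), a minimal Kotlin sketch of the same logic (variable names are mine, not from any API):
val value = characteristic.value
val flags = value[0].toInt() and 0xFF
// bit 0 of the flags selects the format of the heart rate field
val heartRate = if ((flags and 0x01) == 0) {
    value[1].toInt() and 0xFF // UINT8 at offset 1
} else {
    (value[1].toInt() and 0xFF) or ((value[2].toInt() and 0xFF) shl 8) // UINT16, little-endian
}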
It seems I found a way by retrieving the value as follows:
val hearRateDecimal = characteristic.getIntValue(BluetoothGattCharacteristic.FORMAT_UINT8, 1)
Two things are important:
first - the format of UINT8 (although I wasn't sure when to use UINT8 and when UINT16; I actually thought I needed UINT16 because the first byte is 16, but that byte is the flags field, and its bit 0 says the heart rate value itself is UINT8 - see the answer above)
second - the offset parameter 1
What I now get is an Integer even beyond 127 -> 127, 128, 129, 130, ...
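The reason the raw bytes seemed to misbehave is that Kotlin's Byte is signed, so once the sensor reports 128 bpm the byte wraps into negative numbers. Masking with 0xFF recovers the unsigned value, which is effectively what getIntValue(FORMAT_UINT8, 1) does for you:
val bpm = characteristic.value[1].toInt() and 0xFF // 59, ..., 127, 128, 129, ...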
I'm trying to understand the Superpowered SDK, but I'm new to Android, C++, and audio signals alike. I have the Frequency Domain example from here:
https://github.com/superpoweredSDK/Low-Latency-Android-Audio-iOS-Audio-Engine/tree/master/Examples_Android/FrequencyDomain
running on my Nexus 5X. In the FrequencyDomain.cpp file:
static SuperpoweredFrequencyDomain *frequencyDomain;
static float *magnitudeLeft, *magnitudeRight, *phaseLeft, *phaseRight, *fifoOutput, *inputBufferFloat;
static int fifoOutputFirstSample, fifoOutputLastSample, stepSize, fifoCapacity;
#define FFT_LOG_SIZE 11 // 2^11 = 2048
static bool audioProcessing(void * __unused clientdata, short int *audioInputOutput, int numberOfSamples, int __unused samplerate) {
SuperpoweredShortIntToFloat(audioInputOutput, inputBufferFloat, (unsigned int)numberOfSamples); // Converting the 16-bit integer samples to 32-bit floating point.
frequencyDomain->addInput(inputBufferFloat, numberOfSamples); // Input goes to the frequency domain.
// In the frequency domain we are working with 1024 magnitudes and phases for every channel (left, right), if the fft size is 2048.
while (frequencyDomain->timeDomainToFrequencyDomain(magnitudeLeft, magnitudeRight, phaseLeft, phaseRight)) {
// You can work with frequency domain data from this point.
// This is just a quick example: we remove the magnitude of the first 20 bins, meaning total bass cut between 0-430 Hz.
memset(magnitudeLeft, 0, 80);
memset(magnitudeRight, 0, 80);
I understand how the first 20 bins are 0-430 Hz from here:
How do I obtain the frequencies of each value in an FFT?
but I don't understand the value of 80 in memset... being 4 * 20, is it 4 bytes for a float times 20 bins? Does magnitudeLeft hold data for all the frequencies? How would I then remove, for example, 10 bins of frequencies from the middle, or the highest ones from the end? Thank you!
Every value in magnitudeLeft and magnitudeRight is a float, which is 32 bits, i.e. 4 bytes.
memset takes a number-of-bytes parameter, so 20 bins * 4 bytes = 80 bytes.
memset clears the first 20 bins this way.
Both magnitudeLeft and magnitudeRight represent the full frequency range with 1024 floats. Their size is always the FFT size divided by two, so 2048 / 2 = 1024 in this example.
Removing from the middle and the top looks something like:
memset(&magnitudeLeft[index_of_first_bin_to_remove], 0, number_of_bins * sizeof(float));
Note that the first parameter is not multiplied by sizeof(float): the compiler knows magnitudeLeft points to floats, so indexing it already yields the correct address.
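For example, the 430 Hz figure in the comment implies a 44.1 kHz sample rate: each of the 1024 bins then covers 44100 / 2048 ≈ 21.5 Hz, so 20 bins span roughly 0-430 Hz. Assuming those numbers, zeroing 10 bins from the middle of the range would be memset(&magnitudeLeft[507], 0, 10 * sizeof(float)); and zeroing the highest 10 bins would be memset(&magnitudeLeft[1014], 0, 10 * sizeof(float)); (the indices are illustrative; derive them from the bin width at your actual sample rate).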
I'm developing a VoIP application that runs at a sampling rate of 48 kHz. Since it uses Opus, which works at 48 kHz internally, as its codec, and most current Android hardware natively runs at 48 kHz, AEC is the only piece of the puzzle I'm missing. I've already found the WebRTC implementation, but I can't seem to figure out how to make it work. It looks like it corrupts memory randomly and crashes the whole thing sooner or later. When it doesn't crash, the sound is kind of chunky, as if it were quieter for half of the frame. Here's my code that processes a 20 ms frame:
webrtc::SplittingFilter* splittingFilter;
webrtc::IFChannelBuffer* bufferIn;
webrtc::IFChannelBuffer* bufferOut;
webrtc::IFChannelBuffer* bufferOut2;
// ...
splittingFilter=new webrtc::SplittingFilter(1, 3, 960);
bufferIn=new webrtc::IFChannelBuffer(960, 1, 1);
bufferOut=new webrtc::IFChannelBuffer(960, 1, 3);
bufferOut2=new webrtc::IFChannelBuffer(960, 1, 3);
// ...
int16_t* samples=(int16_t*)data;
float* fsamples[3];
float* foutput[3];
int i;
float* fbuf=bufferIn->fbuf()->bands(0)[0];
// convert the data from 16-bit PCM into float
for(i=0;i<960;i++){
fbuf[i]=samples[i]/(float)32767;
}
// split it into three "bands" that the AEC needs and for some reason can't do itself
splittingFilter->Analysis(bufferIn, bufferOut);
// split the frame into 6 consecutive 160-sample blocks and perform AEC on them
for(i=0;i<6;i++){
fsamples[0]=&bufferOut->fbuf()->bands(0)[0][160*i];
fsamples[1]=&bufferOut->fbuf()->bands(0)[1][160*i];
fsamples[2]=&bufferOut->fbuf()->bands(0)[2][160*i];
foutput[0]=&bufferOut2->fbuf()->bands(0)[0][160*i];
foutput[1]=&bufferOut2->fbuf()->bands(0)[1][160*i];
foutput[2]=&bufferOut2->fbuf()->bands(0)[2][160*i];
int32_t res=WebRtcAec_Process(aecState, (const float* const*) fsamples, 3, foutput, 160, 20, 0);
}
// put the "bands" back together
splittingFilter->Synthesis(bufferOut2, bufferIn);
// convert the processed data back into 16-bit PCM
for(i=0;i<960;i++){
samples[i]=(int16_t) (CLAMP(fbuf[i], -1, 1)*32767);
}
If I comment out the actual echo cancellation and just do the float conversion and band splitting back and forth, it doesn't corrupt the memory, doesn't sound weird, and runs indefinitely. (I do pass the far-end/speaker signal into the AEC; I just didn't want to clutter the question by including it.)
I've also tried Android's built-in AEC. While it does work, it upsamples the captured signal from 16 kHz.
Unfortunately, there is no free AEC package that supports 48 kHz. So either move to 32 kHz or use a commercial AEC package at 48 kHz.
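If the built-in AEC mentioned in the question is acceptable despite its resampling, attaching it to an AudioRecord-based capture session is short; a minimal Kotlin sketch, assuming recorder is your AudioRecord instance:
import android.media.audiofx.AcousticEchoCanceler

if (AcousticEchoCanceler.isAvailable()) {
    // tie the platform echo canceller to the recorder's audio session
    val aec = AcousticEchoCanceler.create(recorder.audioSessionId)
    aec?.setEnabled(true)
}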
I'm a bit confused about using getMinBufferSize() and AudioRecord.read() while recording from the MIC of the phone.
I understand that getMinBufferSize() gives you the minimum number of bytes required to create the AudioRecord object (for 1 second?).
bufferSize= AudioRecord.getMinBufferSize(RECORDER_SAMPLERATE,
RECORDER_CHANNELS,
RECORDER_AUDIO_ENCODING);
Then, when they call AudioRecord.read(), they pass "bufferSize" as the number of bytes to read.
read = recorder.read(data, 0, bufferSize);
Here are my questions:
1- Why does bufferSize come back as 8192? I guess it's 8 * 1024, but I would like to know exactly what calculation is being made (I'm using an 8000 Hz sample rate, MONO channel, and 16-bit PCM).
2- I suppose that bufferSize is the amount of data that I can store in 1 second of recording, but what if I want to read more than 1 second? Should I multiply this value by the number of seconds?
I guess you have an array of size 8192.
If those are 16-bit PCM samples, that is 8192 * 16 bits, which is around 131000 bits.
The data produced in one second is 128000 bits (= 8000 samples * 1 channel * 16 bits), so the minimum buffer size corresponds to roughly one second of audio.
(Note that getMinBufferSize() returns its result in bytes, so if the 8192 is bytes, the buffer holds about half a second instead; the exact minimum is device dependent.)
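For question 2: you don't enlarge the buffer; you call read() in a loop and append each chunk until you have recorded as long as you want. A rough Kotlin sketch, assuming isRecording is your own flag and out is any java.io.OutputStream:
val data = ByteArray(bufferSize)
recorder.startRecording()
while (isRecording) {
    val read = recorder.read(data, 0, bufferSize)
    if (read > 0) out.write(data, 0, read) // append this chunk to the recording
}
recorder.stop()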
I am trying to calculate the db value of sound coming from the microphone in Android.
I have used the AudioRecord class to get 16-bit PCM data from the microphone.
//init a recorder instance
recorder = new AudioRecord(MediaRecorder.AudioSource.MIC, RECORDER_SAMPLERATE, RECORDER_CHANNELS, RECORDER_AUDIO_ENCODING, bufferSize);
recorder.startRecording();
// and then read 16 bit PCM data
recorder.read(data, 0, bufferSize);
The range of values for this data is from -32768 to 32767 (signed 16-bit).
I believe that these are the quantized values, but I would like to find out what the corresponding voltage value is. What is the voltage range for a microphone in Android? Is it more or less the same for all smartphones?
E.g.: say the microphone output is 0 to +5 V; then 32767 would correspond to +5 V DC.
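For reference, what can be computed without knowing the voltage mapping is the level relative to digital full scale (dBFS); a minimal Kotlin sketch, assuming data is a ShortArray and read is the sample count returned by read():
var sum = 0.0
for (i in 0 until read) {
    val s = data[i] / 32768.0 // normalise to -1.0..1.0
    sum += s * s
}
val rms = Math.sqrt(sum / read)
val dbfs = 20 * Math.log10(rms) // 0 dBFS at full scale, negative below
Mapping dBFS to an absolute voltage or SPL requires the device's microphone sensitivity, which varies between phones.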