I am analyzing accelerometer data through FFT as it was suggested that I get information on frequency from the output of FFT.
How is the output of the FFT correlated with frequency information?
The FFT function is passed an array of values (all real numbers).
The FFT function gives back 2 arrays of the same size - one for the real part and one for the imaginary part.
I read up on some of the previous posts and am still confused as to how you can extract frequency information from the output arrays of the FFT.
1. Is the output array an array of frequencies? Is the array ordered?
2. What does each index of the output array mean? It was suggested that you can compute the magnitude at each index: sqrt(real[i] * real[i] + img[i] * img[i])
3. Is the magnitude at each index somehow related to the index in the input array - or is this a frequency?
4. How do I find the dominant frequency?
The FFT gives you a complex pair in each frequency bin.
The first bin in the FFT is like the DC part of your signal (0 Hz), the second bin is at Fs / N, where Fs is the sample rate and N is the window size of the FFT; the next bin is at 2 * Fs / N, and so on.
To get the power contained in a bin, you need its magnitude.
As for the dominant frequency: it is the one at the highest peak in magnitude.
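Put together, the steps above can be sketched as follows. This is a minimal illustration, assuming the FFT output arrives as two parallel arrays `re[]` and `im[]` (as described in the question) along with the sample rate; the method and class names are made up for the example.

```java
public class DominantFrequency {
    /**
     * Returns the dominant frequency in Hz, given the real and imaginary
     * FFT outputs of an N-point FFT and the sample rate fs.
     * Only the first N/2 bins are scanned: for a real-valued input
     * the upper half of the spectrum mirrors the lower half.
     */
    static double dominantFrequency(double[] re, double[] im, double fs) {
        int n = re.length;          // FFT window size N
        int bestBin = 1;            // skip bin 0 (the DC component)
        double bestMag = 0;
        for (int i = 1; i < n / 2; i++) {
            double mag = Math.sqrt(re[i] * re[i] + im[i] * im[i]);
            if (mag > bestMag) {
                bestMag = mag;
                bestBin = i;
            }
        }
        return bestBin * fs / n;    // bin index i corresponds to i * Fs / N Hz
    }
}
```

For example, with N = 16 and Fs = 8000 Hz, a peak in bin 3 means the dominant frequency is 3 * 8000 / 16 = 1500 Hz.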
Related
I'm having trouble understanding the Android Visualizer's
onFftDataCapture(Visualizer visualizer,
byte[] fft,
int samplingRate)
How do I know which frequency is represented by which byte in fft?
For example, would the index be a representation of frequency? Like index = frequency?
For index 0, frequency is 0 Hz, index 1 is 1 Hz, etc?
FFT produces complex numbers, where each number reflects frequency domain information about a bucket containing a continuous range of frequencies. The usual approach for interpreting FFT data is as follows:
first determine the frequency bucket size by dividing the sampling rate by the length of the FFT table; let's call it bucketSize;
then each bucket with index i (assuming it's 0-based) contains information about frequencies in the range from i * bucketSize to (i + 1) * bucketSize Hz;
for real-valued signals, the values of the second half of the FFT table (for buckets of frequencies above samplingRate / 2) will be just a mirror of the first half, so they are usually discarded;
also, for the first bucket (index 0) and for the last one (the bucket at frequency samplingRate / 2), the value of the FFT table will be real as well (assuming a real-valued signal);
to find the magnitude (signal level) for the frequencies in the bucket, one needs to take the complex value from the FFT table for this bucket, say it's a + ib, and calculate sqrt(a*a + b*b).
Now back to the results of onFftDataCapture. Here the fft array contains complex numbers as consecutive pairs of bytes, except for the first two elements, so fft[2] and fft[3] comprise the complex number for the first bucket, fft[4] and fft[5] -- for the second, and so on. Whereas fft[0] is the FFT value (real) for the 0-th frequency (DC), and fft[1] is for the last frequency.
Because, as I've mentioned, for real-valued signals the second half of the FFT table doesn't bring any benefit, it's not provided in the fft array. But since each FFT bucket takes two array cells, the bucket size in Hz still works out to samplingRate / fft.length. Note that fft.length is actually the capture size set by the setCaptureSize method of the Visualizer.
The magnitude of the bucket of frequencies can be easily calculated using Math.hypot function, e.g. for the first bucket it's Math.hypot(fft[2], fft[3]). For the DC bucket it's simply Math.abs(fft[0]), and for last frequency bucket it's Math.abs(fft[1]).
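The unpacking described above can be sketched in one helper. This is illustrative code under the byte layout described in this answer (DC in fft[0], the highest frequency in fft[1], then consecutive real/imaginary pairs); the class and method names are hypothetical.

```java
public class VisualizerFft {
    /**
     * Converts the packed byte array delivered by onFftDataCapture into
     * one magnitude per frequency bucket. Bucket i (0-based) covers
     * frequencies around i * samplingRate / fft.length Hz.
     */
    static double[] fftToMagnitudes(byte[] fft) {
        int n = fft.length;
        double[] magnitudes = new double[n / 2 + 1];
        magnitudes[0] = Math.abs(fft[0]);       // DC bucket (real only)
        magnitudes[n / 2] = Math.abs(fft[1]);   // last frequency bucket (real only)
        for (int i = 1; i < n / 2; i++) {
            byte re = fft[2 * i];               // real part of bucket i
            byte im = fft[2 * i + 1];           // imaginary part of bucket i
            magnitudes[i] = Math.hypot(re, im); // sqrt(re*re + im*im)
        }
        return magnitudes;
    }
}
```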
I am trying to develop GCC_PHAT algorithm on Android devices.
For FFT I used this library.
The idea is to correlate two audio files (16-bit PCM mono) to find the delay between them. With Matlab it works perfectly.
My first problem is the FFT output: it gives numbers whose magnitude is larger than 32768. For example:
fft re -20830.895138576154
fft re -30639.569794501647
fft re -49850.48597621472
fft re -49335.28275604235
fft re -96060.94916529073
fft re -91409.17426504416
fft re -226903.051428709
Is there a way to normalize these numbers to an interval of [-1,1]?
The library's forward transform definition does match Matlab's, so you should get matching values after the forward transform (not that it is critical since G_PHAT does get normalized to [-1,1]).
However, the same cannot be said of the inverse transform. Indeed from
the code comments on inverseTransform:
This transform does not perform scaling, so the inverse is not a true inverse.
And from the library webpage:
This FFT does not perform any scaling. So for a vector of length n, after performing a transform and an inverse transform on it, the result will be the original vector multiplied by n (plus approximation errors).
So, to get values matching Matlab's FFT/IFFT implementation you would need to divide the result of the IFFT by n.
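To see why the 1/n factor matters, here is a self-contained sketch using a naive DFT and an unscaled inverse DFT that mirrors the behaviour quoted above (this is illustrative code, not the library itself): the round trip returns the input multiplied by n, and dividing by n recovers the original values, matching Matlab's ifft.

```java
public class UnscaledIfftDemo {
    // Naive forward DFT of a real-valued input; fills the re/im output arrays.
    static void dft(double[] x, double[] re, double[] im) {
        int n = x.length;
        for (int k = 0; k < n; k++) {
            for (int t = 0; t < n; t++) {
                double ang = -2 * Math.PI * k * t / n;
                re[k] += x[t] * Math.cos(ang);
                im[k] += x[t] * Math.sin(ang);
            }
        }
    }

    // Naive inverse DFT *without* the 1/n scaling, like the library quoted above.
    // For a real-valued original signal the result is n times the input.
    static double[] unscaledIdft(double[] re, double[] im) {
        int n = re.length;
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            for (int k = 0; k < n; k++) {
                double ang = 2 * Math.PI * k * t / n;
                x[t] += re[k] * Math.cos(ang) - im[k] * Math.sin(ang);
            }
        }
        return x;
    }
}
```

After dft followed by unscaledIdft, element t equals n * original[t]; dividing each element by n reproduces what Matlab's ifft would return.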
I'm searching for a way to convert Wave files into a list of numbers in order to cross-correlate the resulting vectors (like the numbers you get when you read a wave file in MATLAB):
0.6653
-0.8445
0.9589
-0.9999
0.9643
-0.8547
0.6797
-0.4525
0.1907
0.0858
-0.3557
0.5983
-0.7951
0.9309
-0.9953
0.9835
-0.8962
0.7402
-0.5275
0.2742
Is there a way to do that in Android, or even in C/C++? I really don't know how to start.
The WAVE file format is fairly simple, especially if you're interested in linear PCM encoded data. Descriptions of the format are available from various sources, such as here and here.
Using these, you should be able to decode:
the header ("RIFF" chunk)
the "fmt " chunk which contains various information such as the number of bytes per samples, the number of channels (ie. mono, stereo, or more), sampling rate, etc.
and the "data" chunk which if the main thing that you'll want to look at in order to create a MATLAB-like vector.
If you're dealing with single channel (ie. mono) WAVE files, the data is fairly straightforward to decode. The number of samples should then corresponds (give-or-take a few bytes for padding) to the size of the data block divided by the number of bytes per sample (the number of bits per sample is available from the "fmt " chunk). The mapping of the samples' integer to a [0-1] floating point value can be done by multiplying by a constant (eg. 1.0/128 for 1-byte-per-sample).
For multi-channel WAVE files, keep in mind that channel data are interleaved (e.g. sample 1 left/right, sample 2 left/right, ...)
Note also that there are a number of tutorial/samples floating around (such as this sample in C or this sample in Java), and various open source sound libraries which you may use as a starting point.
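For the common case of 16-bit linear PCM, the conversion from the raw bytes of the "data" chunk to a MATLAB-like vector can be sketched like this. It assumes mono, little-endian samples (as in standard WAVE files); parsing of the RIFF and "fmt " headers is left out, and the names are made up for the example.

```java
public class PcmToDoubles {
    /**
     * Converts the raw bytes of a 16-bit mono little-endian PCM "data"
     * chunk into samples normalized to [-1, 1], like MATLAB's wavread.
     */
    static double[] pcm16ToDoubles(byte[] data) {
        double[] samples = new double[data.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int lo = data[2 * i] & 0xFF;   // low byte, treated as unsigned
            int hi = data[2 * i + 1];      // high byte, keeps the sign
            short s = (short) ((hi << 8) | lo);
            samples[i] = s / 32768.0;      // full scale maps to [-1, 1)
        }
        return samples;
    }
}
```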
I recorded an audio sample, and I want to apply FFT to it.
I did all the steps needed in order to use FFT in Android, such as getting the JTransforms library and everything else needed...
and within the code, I first defined the FFT:
DoubleFFT_1D fft = new DoubleFFT_1D(1024);
and inside the code, after reading the audio file (stored as PCM), I applied FFT to it using the following instruction:
fft.complexForward(audio_file_in_double_format);
Here is my question:
First of all, the number (1024) used as the parameter of the FFT definition - what is it based on, and what does it mean?
Does it mean that the FFT will be applied to only 1024 samples?
And what will be the output of the FFT function? I know that it will give complex numbers, so is it going to give a result array double the size of the input?
I need help understanding how this FFT function works.
The code is working fine for me, but I need to understand, because I am inputting the whole audio file into the FFT function, which is a lot bigger than 1024 samples. So is it applying FFT to the first 1024 samples and ignoring the rest, or what?
I am using the AudioRecord class to analyze raw PCM bytes as they come in from the mic.
So that's working nicely. Now I need to convert the PCM bytes into decibels.
I have a formula that converts sound pressure in Pa into dB:
db = 20 * log10(Pa / refPa)
So the question is: the bytes I am getting from the AudioRecord buffer - what are they? Amplitude? Sound pressure in pascals? Or what?
I tried putting the values into the formula, but it comes back with very high dB values, so I do not think that's right.
thanks
Disclaimer: I know little about Android.
Your device is probably recording in mono at 44,100 samples per second (maybe less) using two bytes per sample. So your first step is to combine pairs of bytes in your original data into two-byte integers (I don't know how this is done in Android).
You can then compute the decibel value (relative to the peak) of each sample by first taking the normalized absolute value of the sample and passing it to your Db function:
double db = 20 * Math.log10(Math.abs(sampleVal) / 32768.0);
A value near the peak (e.g. +32767 or -32768) will have a Db value near 0. A value of 3277 (0.1) will have a Db value of -20; a value of 327 (.01) will have a Db value of -40 etc.
The problem is likely the definition of the "reference" sound pressure at the mic. I have no idea what it would be or if it's available.
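The two steps above (pairing bytes into 16-bit samples, then computing dB relative to full scale) can be sketched as follows. This assumes 16-bit little-endian mono PCM, which is what AudioRecord typically delivers; the helper names are made up for the example, and the result is dB relative to full scale, not calibrated SPL.

```java
public class PcmDecibels {
    /**
     * Level of one 16-bit sample in dB relative to full scale:
     * 0 dB at the peak, negative for everything quieter.
     */
    static double sampleToDb(short sampleVal) {
        return 20 * Math.log10(Math.abs(sampleVal) / 32768.0);
    }

    /** Pairs little-endian bytes from an AudioRecord buffer into 16-bit samples. */
    static short[] bytesToSamples(byte[] buffer) {
        short[] samples = new short[buffer.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = (short) ((buffer[2 * i] & 0xFF) | (buffer[2 * i + 1] << 8));
        }
        return samples;
    }
}
```

For instance, sampleToDb((short) 3277) comes out near -20 dB, matching the worked values above.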
The only audio application I've ever used, defined 0db as "full volume", when the samples were at + or - max value (in unsigned 16 bits, that'd be 0 and 65535). To get this into db I'd probably do something like this:
// assume input_sample is in the range 0 to 65535
sample = (input_sample * 10.0) - 327675.0   // center the unsigned range around 0
db = 20 * log10(abs(sample) / 327675.0)
I don't know if that's right, but it feels right to the mathematically challenged me. As the input_sample approaches the "middle" (silence), the dB value will tend toward negative infinity.
Now that I think about it, though, if you want SPL or something similar, that might require different trickery, like doing an RMS evaluation between the zero crossings - again, something I could only guess at, because I have no idea how it really works.
The reference pressure in Leq (sound pressure level) calculations is 20 micro-Pascal (rms).
To measure absolute Leq levels, you need to calibrate your microphone using a calibrator. Most calibrators fit 1/2" or 1/4" microphone capsules, so I have my doubts about calibrating the microphone on an Android phone. Alternatively, you may be able to use the microphone sensitivity (mV/Pa) and then calibrate the voltage level going into the ADC. Even less reliable results could be had from comparing the Android values with the measured sound level of a diffuse stationary sound field, using a sound level meter.
Note that in Leq calculations you normally use the RMS values. A single sample's value doesn't mean much.
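Since a single sample doesn't mean much, the level is normally estimated as an RMS over a block of samples. A minimal sketch (again in dB relative to full scale, not calibrated SPL; getting absolute dB SPL would require a calibration offset measured as described above):

```java
public class RmsLevel {
    /**
     * RMS level of a block of 16-bit samples, in dB relative to full scale.
     * NOT calibrated SPL: an externally measured calibration offset would
     * have to be added to turn this into absolute Leq values.
     */
    static double rmsDb(short[] samples) {
        double sumSquares = 0;
        for (short s : samples) {
            sumSquares += (double) s * s;   // accumulate squared sample values
        }
        double rms = Math.sqrt(sumSquares / samples.length);
        return 20 * Math.log10(rms / 32768.0);
    }
}
```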
I held my sound level meter right next to the mic on my Google Ion and went 'Woooooo!' and noted that clipping occurred at about 105 dB SPL. Hope this helps.
The units are whatever units are used for the reference reading. In the formula, the reading is divided by the reference reading, so the units cancel out and no longer matter.
In other words, decibels are a way of comparing two things; they are not an absolute measurement. When you see them used as if they were absolute, the comparison is with the quietest sound the average human can hear.
In our case, it is a comparison to the highest reading the device handles (thus, every other reading is negative, or less than the maximum).