I've trying to calculate peak frequency from android mic buffer as per this How to get frequency from fft result? .
Unfortunatly i'm getting wrong values. Even i played a tone in 18Khz,but i'm not getting correct peak frequency.
This is my code,
int sampleRate=44100,bufferSize=4096;
AudioRecord audioRec=new AudioRecord(AudioSource.MIC,sampleRate,AudioFormat.CHANNEL_CONFIGURATION_MONO,AudioFormat.ENCODING_PCM_16BIT,bufferSize);
audioRec.startRecording();
audioRec.read(bufferByte, 0,bufferSize);
for(int i=0;i<bufferByte.length;i++){
bufferDouble2[i]=(double)bufferByte[i];
}
//here window techniq is appliying
for(int i=0;i<bufferSize-1;i++){
bufferDouble[2*i]=bufferDouble2[i];
bufferDouble[2*i+1] = 0.0;}
DoubleFFT_1D fft = new DoubleFFT_1D(bufferSize); //instance of DoubleFFT_1D class
fft.realForwardFull(bufferDouble); //this is where it falls
//here calculating magnitude
for (int i = 0; i < (bufferSize/2)-1; i++) {
double real= bufferDouble[2*i];
double imag=bufferDouble[2*i+1];
mag[i] =Math.sqrt((real*real)+(imag*imag));
}
//calculating maximum frequency
double max_mag =0.0;
int max_index = -1;
for(int j=0;j<(bufferSize/2)-1;j++){
if(mag[j]>max_mag){
max_mag = mag[j];
max_index = j;
}
}
final int peak=max_index * sampleRate/bufferSize;
Log.v(TAG2, "Peak Frequency = " +max_index * sampleRate/bufferSize);
This is the output while listining 18Khz sound in phone
Help me and please don't give theoretical answers.
Thank you
Related
I am building an app that needs to be able to display a real-time spectral analyzer. Here is the version I was able to successfully make on iOS:
I am using Wendykierp JTransforms library to perform the FFT calculations, and have managed to capture audio data and execute the FFT functions. See below:
short sData[] = new short[BufferElements2Rec];
int result = audioRecord.read(sData, 0, BufferElements2Rec);
try
{
//Initiate FFT
DoubleFFT_1D fft = new DoubleFFT_1D(sData.length);
//Convert sample data from short[] to double[]
double[] fftSamples = new double[sData.length];
for (int i = 0; i < sData.length; i++) {
//IMPORTANT: We cannot simply cast the short value to double.
//As a double is only 2 bytes (values -32768 to 32768)
//We must divide by 32768 before we cast to Double.
fftSamples[i] = (double) sData[i] / 32768;
}
//Perform fft calcs
fft.realForward(fftSamples);
//TODO - Convert FFT data into 20 "bands"
} Catch (Exception e)
{
}
In iOS, I was using a library (Tempi-FFT) which had built in functionality for calculating magnitude, frequency, and providing averaged data for any given number of bands (I am using 20 bands as you can see in the image above). It seems I don't have that luxury with this library and I need to calculate this myself.
Looking for any good examples or tutorials on how to interperate the data returned by the FFT calculations. Here is some sample data I am receiving:
-11387.0, 183.0, -384.9121475854448, -224.66315714636642, -638.0173005872095, -236.2318653974911, -1137.1498541119106, -437.71599514435786, 1954.683405957685, -2142.742125980924 ...
Looking for simple explanation of how to interpret this data. Some other questions I have looked at that I was either unable to understand, or did not provide information on how to determine a given number of bands:
Power Spectral Density from jTransforms DoubleFFT_1D
How to develop a Spectrum Analyser from a realtime audio?
Your question can be split into two parts: finding the magnitude of all frequencies (interpreting the output) and averaging the frequencies into bands
Finding the magnitude of all frequencies:
I won't go into the intricacies of the Fast Fourier Transform/Discrete Fourier Transform (if you would like to gain a basic understanding see this video), but know that there is a real and an imaginary part of each output.
The documentation of the realForward function describes where both the imaginary and the real parts are located in the output array (I'm assuming you have an even sample size):
a[2*k] = Re[k], 0 <= k < n / 2
a[2*k+1] = Im[k], 0 < k < n / 2
a[1] = Re[n/2]
a is equivalent to your fftSamples, which means we can translate this documentation into code as follows (I've changed Re and Im to realPart and imaginaryPart respectively):
int n = fftSamples.length;
double[] realPart = new double[n / 2];
double[] imaginaryPart = new double[n / 2];
for(int k = 0; k < n / 2; k++) {
realPart[k] = fftSamples[k * 2];
imaginaryPart[k] = fftSamples[k * 2 + 1];
}
realPart[n / 2] = fftSamples[1];
Now we have the real and imaginary parts of each frequency. We could plot these on an x-y coordinate plane using the real part as the x value and the imaginary part as the y value. This creates a triangle, and the length of the triangle's hypotenuse is the magnitude of the frequency. We can use the pythagorean theorem to get this magnitude:
double[] spectrum = new double[n / 2];
for(int k = 1; k < n / 2; k++) {
spectrum[k] = Math.sqrt(Math.pow(realPart[k], 2) + Math.pow(imaginaryPart[k], 2));
}
spectrum[0] = realPart[0];
Note that the 0th index of the spectrum doesn't have an imaginary part. This is the DC component of the signal (we won't use this).
Now, we have an array with the magnitudes of each frequency across your spectrum (If your sampling frequency is 44100Hz, this means you now have an array with the magnitudes of the frequencies between 0Hz and 44100Hz, and if you have 441 values in your array, then each index value represents a 100Hz step.)
Averaging the frequencies into bands:
Now that we've converted the FFT output to data that we can use, we can move on to the second part of your question: finding the averages of different bands of frequencies. This is relatively simple. We just need to split the array into different bands and find the average of each band. This can be generalized like so:
int NUM_BANDS = 20; //This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = (n / 2) / NUM_BANDS;
for(int i = 0; i < NUM_BANDS; i++) {
//Add up each part
double total;
for(int j = samplesPerBand * i ; j < samplesPerBand * (i+1); j++) {
total += spectrum[j];
}
//Take average
bands[i] = total / samplesPerBand;
}
Final Code:
And that's it! You now have an array called bands with the average magnitude of each band of frequencies. The code above is purposefully not optimized in order to show how each step works. Here is a shortened and optimized version:
int numFrequencies = fftSamples.length / 2;
double[] spectrum = new double[numFrequencies];
for(int k = 1; k < numFrequencies; k++) {
spectrum[k] = Math.sqrt(Math.pow(fftSamples[k*2], 2) + Math.pow(fftSamples[k*2+1], 2));
}
spectrum[0] = fftSamples[0];
int NUM_BANDS = 20; //This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = numFrequencies / NUM_BANDS;
for(int i = 0; i < NUM_BANDS; i++) {
//Add up each part
double total;
for(int j = samplesPerBand * i ; j < samplesPerBand * (i+1); j++) {
total += spectrum[j];
}
//Take average
bands[i] = total / samplesPerBand;
}
//Use bands in view!
This has been a really long answer, and I haven't tested the code yet (though I do plan to). Feel free to comment if you find any mistakes.
I'm building small android app for kids, which simply plays melody, then child tries to sing back. I record and compare with song for saying you sang correctly etc.
So far, I managed to record and calculate real time data from microphone with TarsosDSP.I also projected into a graph. But I can't get the FFT values of melodies. I'm really new on audio processing area.
Can someone give me an example of how to calculate FFT of simple mp3 file?
Thanks.
After you get the song into bytes, then you can use this code.
I used the JTranforms library and it works perfectly, you can compare it to the function used by Matlab.
here is my code with comments referencing how matlab transforms any signal and gets the frequency amplitudes (https://la.mathworks.com/help/matlab/ref/fft.html)
first, add the following in the build.gradle (app)
implementation 'com.github.wendykierp:JTransforms:3.1'
and here it is the code for for transforming a simple sine wave, works like a charm
double Fs = 8000;
double T = 1/Fs;
int L = 1600;
double freq = 338;
double sinValue_re_im[] = new double[L*2]; // because FFT takes an array where its positions alternate between real and imaginary
for( int i = 0; i < L; i++)
{
sinValue_re_im[2*i] = Math.sin( 2*Math.PI*freq*(i * T) ); // real part
sinValue_re_im[2*i+1] = 0; //imaginary part
}
// matlab
// tf = fft(y1);
DoubleFFT_1D fft = new DoubleFFT_1D(L);
fft.complexForward(sinValue_re_im);
double[] tf = sinValue_re_im.clone();
// matlab
// P2 = abs(tf/L);
double[] P2 = new double[L];
for(int i=0; i<L; i++){
double re = tf[2*i]/L;
double im = tf[2*i+1]/L;
P2[i] = sqrt(re*re+im*im);
}
// P1 = P2(1:L/2+1);
double[] P1 = new double[L/2]; // single-sided: the second half of P2 has the same values as the first half
System.arraycopy(P2, 0, P1, 0, L/2);
// P1(2:end-1) = 2*P1(2:end-1);
System.arraycopy(P1, 1, P1, 1, L/2-2);
for(int i=1; i<P1.length-1; i++){
P1[i] = 2*P1[i];
}
// f = Fs*(0:(L/2))/L;
double[] f = new double[L/2 + 1];
for(int i=0; i<L/2+1;i++){
f[i] = Fs*((double) i)/L;
}
Spectrum Analysis using Java, Sampling Frequency, Folding Frequency, and the FFT Algorithm http://www.developer.com/java/other/article.php/3380031
You can implement the AudioProcessor from ExoPlayer and use this FFT wrapper to get the data Noise
I am using android platform, from the following reference question I come to know that using AudioRecord class which returns raw data I can filter range of audio frequency depends upon my need but for that I will need algorithm, can somebody please help me out to find algorithm to filter range b/w 14,400 bph and 16,200 bph.
I tried "JTransform" but i don't know can I achieve this with JTransform or not ? Currently I am using "jfftpack" to display visual effects which works very well but i can't achieve audio filter using this.
Reference here
help appreciated Thanks in advance.
Following is my code as i mentioned above i am using "jfftpack" library to display you may find this library reference in the code please don't get confuse with that
private class RecordAudio extends AsyncTask<Void, double[], Void> {
#Override
protected Void doInBackground(Void... params) {
try {
final AudioRecord audioRecord = findAudioRecord();
if(audioRecord == null){
return null;
}
final short[] buffer = new short[blockSize];
final double[] toTransform = new double[blockSize];
audioRecord.startRecording();
while (started) {
final int bufferReadResult = audioRecord.read(buffer, 0, blockSize);
for (int i = 0; i < blockSize && i < bufferReadResult; i++) {
toTransform[i] = (double) buffer[i] / 32768.0; // signed 16 bit
}
transformer.ft(toTransform);
publishProgress(toTransform);
}
audioRecord.stop();
audioRecord.release();
} catch (Throwable t) {
Log.e("AudioRecord", "Recording Failed");
}
return null;
/**
* #param toTransform
*/
protected void onProgressUpdate(double[]... toTransform) {
canvas.drawColor(Color.BLACK);
for (int i = 0; i < toTransform[0].length; i++) {
int x = i;
int downy = (int) (100 - (toTransform[0][i] * 10));
int upy = 100;
canvas.drawLine(x, downy, x, upy, paint);
}
imageView.invalidate();
}
There are a lot of tiny details in this process that can potentially hang you up here. This code isn't tested and I don't do audio filtering very often so you should be extremely suspicious here. This is the basic process you would take for filtering audio:
Get audio buffer
Possible audio buffer conversion (byte to float)
(optional) Apply windowing function i.e. Hanning
Take the FFT
Filter frequencies
Take inverse FFT
I'm assuming you have some basic knowledge of Android and audio recording so will cover steps 4-6 here.
//it is assumed that a float array audioBuffer exists with even length = to
//the capture size of your audio buffer
//The size of the FFT will be the size of your audioBuffer / 2
int FFT_SIZE = bufferSize / 2;
FloatFFT_1D mFFT = new FloatFFT_1D(FFT_SIZE); //this is a jTransforms type
//Take the FFT
mFFT.realForward(audioBuffer);
//The first 1/2 of audioBuffer now contains bins that represent the frequency
//of your wave, in a way. To get the actual frequency from the bin:
//frequency_of_bin = bin_index * sample_rate / FFT_SIZE
//assuming the length of audioBuffer is even, the real and imaginary parts will be
//stored as follows
//audioBuffer[2*k] = Re[k], 0<=k<n/2
//audioBuffer[2*k+1] = Im[k], 0<k<n/2
//Define the frequencies of interest
float freqMin = 14400;
float freqMax = 16200;
//Loop through the fft bins and filter frequencies
for(int fftBin = 0; fftBin < FFT_SIZE; fftBin++){
//Calculate the frequency of this bin assuming a sampling rate of 44,100 Hz
float frequency = (float)fftBin * 44100F / (float)FFT_SIZE;
//Now filter the audio, I'm assuming you wanted to keep the
//frequencies of interest rather than discard them.
if(frequency < freqMin || frequency > freqMax){
//Calculate the index where the real and imaginary parts are stored
int real = 2 * fftBin;
int imaginary = 2 * fftBin + 1;
//zero out this frequency
audioBuffer[real] = 0;
audioBuffer[imaginary] = 0;
}
}
//Take the inverse FFT to convert signal from frequency to time domain
mFFT.realInverse(audioBuffer, false);
I'm beginner android programmer. (my home language is not English so, my English is poor.)
i want to make app, get frequency recorded human voice and show note like " C3 " or "G#4"...
so, i want to detect human voice frequency , but it is too difficult.
i try use FFT, it detect piano(or guitar) sound pretty good (some part, over octave4, it didn't detect low frequency piano (or guitar) sound.), but it can't detect human voice.
(i use piano program used general midi)
I found lots of information, but i can't understand.
most of people say use pitch detect algorithm and link just wiki.
Please tell me in detail about pitch detect algorithm.
(actually i want example code :(
or
is there any idea to use my app?
HERE IS MY SOURCE CODE:
public void Frequency(double[] array) {
int sampleSize = array.length;
double[] win = window.generate(sampleSize);
// signals for fft input
double[] signals = new double[sampleSize];
for (int i = 0; i < sampleSize; i++) {
signals[i] = array[i] * win[i];
}
double[] fftArray = new double[sampleSize * 2];
for (int i = 0; i < sampleSize - 1; i++) {
fftArray[2 * i] = signals[i];
fftArray[2 * i + 1] = 0;
}
FFT.complexForward(fftArray);
getFrequency(fftArray);
}
private void getFrequency(double[] array) {
// ========== Value ========== //
int RATE = sampleRate;
int CHUNK_SIZE_IN_SAMPLES = RECORDER_BUFFER_SIZE;
int MIN_FREQUENCY = 50; // HZ
int MAX_FREQUENCY = 2000; // HZ
int min_frequency_fft = Math.round(MIN_FREQUENCY * CHUNK_SIZE_IN_SAMPLES / RATE);
int max_frequency_fft = Math.round(MAX_FREQUENCY * CHUNK_SIZE_IN_SAMPLES / RATE);
// ============================ //
double best_frequency = min_frequency_fft;
double best_amplitude = 0;
for (int i = min_frequency_fft; i <= max_frequency_fft; i++) {
double current_frequency = i * 1.0 * RATE / CHUNK_SIZE_IN_SAMPLES;
double current_amplitude = Math.pow(array[i * 2], 2) + Math.pow(array[i * 2 + 1], 2);
double normalized_amplitude = current_amplitude * Math.pow(MIN_FREQUENCY * MAX_FREQUENCY, 0.5) / current_frequency;
if (normalized_amplitude > best_amplitude) {
best_frequency = current_frequency;
best_amplitude = normalized_amplitude;
}
}
FrequencyArray[FrequencyArrayIndex] = best_frequency;
FrequencyArrayIndex++;
}
I refer to this : http://code.google.com/p/android-guitar-tuner/
Pitch_detection_algorithm
use Jtransforms
The Wikipedia page on pitch detection links to another Wikipedia page explaining autocorrelation: http://en.m.wikipedia.org/wiki/Autocorrelation#section_3 , which is one of many pitch estimation methods you could try.
Running the example code you posted can show that FFT peak frequency estimation is quite poor at musical pitch detection and estimation for many common pitched sounds.
I've been playing with this now for sometime, I cant work out what I am meant to be doing here.
I am reading in PCM audio data into an audioData array:
recorder.read(audioData,0,bufferSize); //read the PCM audio data into the audioData array
I want to use Piotr Wendykier's JTransform library in order to preform an FFT on my PCM data in order to obtain the frequency.
import edu.emory.mathcs.jtransforms.fft.DoubleFFT_1D;
At the moment I have this:
DoubleFFT_1D fft = new DoubleFFT_1D(1024); // 1024 is size of array
for (int i = 0; i < 1023; i++) {
a[i]= audioData[i];
if (audioData[i] != 0)
Log.v(TAG, "audiodata=" + audioData[i] + " fft= " + a[i]);
}
fft.complexForward(a);
I cant make sense of how to work this, can somebody give me some pointers? Will i have to perform any calculations after this?
I'm sure I'm way off, anything would be greatly appreciated!
Ben
If you're just looking for the frequency of a single sinusoidal tone in the input waveform then you need to find the FFT peak with the largest magnitude, where:
Magnitude = sqrt(re*re + im*im)
The index i of this largest magnitude peak will tell you the approximate frequency of your sinusoid:
Frequency = Fs * i / N
where:
Fs = sample rate (Hz)
i = index of peak
N = number of points in FFT (1024 in this case)
Since I've spent some hours on getting this to work here's a complete implementation in Java:
import org.jtransforms.fft.DoubleFFT_1D;
public class FrequencyScanner {
private double[] window;
public FrequencyScanner() {
window = null;
}
/** extract the dominant frequency from 16bit PCM data.
* #param sampleData an array containing the raw 16bit PCM data.
* #param sampleRate the sample rate (in HZ) of sampleData
* #return an approximation of the dominant frequency in sampleData
*/
public double extractFrequency(short[] sampleData, int sampleRate) {
/* sampleData + zero padding */
DoubleFFT_1D fft = new DoubleFFT_1D(sampleData.length + 24 * sampleData.length);
double[] a = new double[(sampleData.length + 24 * sampleData.length) * 2];
System.arraycopy(applyWindow(sampleData), 0, a, 0, sampleData.length);
fft.realForward(a);
/* find the peak magnitude and it's index */
double maxMag = Double.NEGATIVE_INFINITY;
int maxInd = -1;
for(int i = 0; i < a.length / 2; ++i) {
double re = a[2*i];
double im = a[2*i+1];
double mag = Math.sqrt(re * re + im * im);
if(mag > maxMag) {
maxMag = mag;
maxInd = i;
}
}
/* calculate the frequency */
return (double)sampleRate * maxInd / (a.length / 2);
}
/** build a Hamming window filter for samples of a given size
* See http://www.labbookpages.co.uk/audio/firWindowing.html#windows
* #param size the sample size for which the filter will be created
*/
private void buildHammWindow(int size) {
if(window != null && window.length == size) {
return;
}
window = new double[size];
for(int i = 0; i < size; ++i) {
window[i] = .54 - .46 * Math.cos(2 * Math.PI * i / (size - 1.0));
}
}
/** apply a Hamming window filter to raw input data
* #param input an array containing unfiltered input data
* #return a double array containing the filtered data
*/
private double[] applyWindow(short[] input) {
double[] res = new double[input.length];
buildHammWindow(input.length);
for(int i = 0; i < input.length; ++i) {
res[i] = (double)input[i] * window[i];
}
return res;
}
}
FrequencyScanner will return an approximation of the dominant frequency in the presented sample data.
It applies a Hamming window to it's input to allow passing in arbitrary samples from an audio stream.
Precision is achieved by internally zero padding the sample data before doing the FFT transform.
(I know there are better - and far more complex - ways to do this but the padding approach is sufficient for my personal needs).
I testet it against raw 16bit PCM samples created from reference sounds for 220hz and 440hz and the results match.
Yes you need to use realForward function instead of complexForward, because you pass it a real array and not a complex array from doc.
EDIT:
Or you can get the real part and perform complex to complex fft like this :
double[] in = new double[N];
read ...
double[] fft = new double[N * 2];
for(int i = 0; i < ffsize; ++i)
{
fft[2*i] = mic[i];
fft[2*i+1] = 0.0;
}
fft1d.complexForward(fft);
I try and I compare results with matlab, and I don't get same results... (magnitude)
If you're looking for the FFT of an Audio input ( 1D, real data ), should'nt you be using the 1D REAL Fft?