Calculating a song's FFT in Android

I'm building a small Android app for kids: it plays a melody, then the child tries to sing it back. I record the singing and compare it with the song to tell whether it was sung correctly, and so on.
So far I have managed to record and process real-time data from the microphone with TarsosDSP, and I have plotted it on a graph. But I can't get the FFT values of the melodies. I'm really new to audio processing.
Can someone give me an example of how to calculate the FFT of a simple mp3 file?
Thanks.

Once you have the song decoded into samples, you can use the code below.
I used the JTransforms library and it works perfectly; you can compare its output against MATLAB's fft function.
Here is my code, with comments referencing how MATLAB transforms a signal and gets the frequency amplitudes (https://la.mathworks.com/help/matlab/ref/fft.html).
First, add the following to build.gradle (app):
implementation 'com.github.wendykierp:JTransforms:3.1'
And here is the code for transforming a simple sine wave; it works like a charm:
// requires: import org.jtransforms.fft.DoubleFFT_1D;
double Fs = 8000;   // sampling frequency
double T = 1 / Fs;  // sampling period
int L = 1600;       // signal length in samples
double freq = 338;  // sine frequency in Hz

// complexForward takes an array whose positions alternate between real and
// imaginary parts, so the array is twice the signal length.
double[] sinValue_re_im = new double[L * 2];
for (int i = 0; i < L; i++) {
    sinValue_re_im[2 * i] = Math.sin(2 * Math.PI * freq * (i * T)); // real part
    sinValue_re_im[2 * i + 1] = 0;                                  // imaginary part
}

// MATLAB: tf = fft(y1);
DoubleFFT_1D fft = new DoubleFFT_1D(L);
fft.complexForward(sinValue_re_im);
double[] tf = sinValue_re_im.clone();

// MATLAB: P2 = abs(tf/L);
double[] P2 = new double[L];
for (int i = 0; i < L; i++) {
    double re = tf[2 * i] / L;
    double im = tf[2 * i + 1] / L;
    P2[i] = Math.sqrt(re * re + im * im);
}

// MATLAB: P1 = P2(1:L/2+1);
// Single-sided spectrum: the second half of P2 mirrors the first half.
double[] P1 = new double[L / 2 + 1];
System.arraycopy(P2, 0, P1, 0, L / 2 + 1);

// MATLAB: P1(2:end-1) = 2*P1(2:end-1);
for (int i = 1; i < P1.length - 1; i++) {
    P1[i] = 2 * P1[i];
}

// MATLAB: f = Fs*(0:(L/2))/L;
double[] f = new double[L / 2 + 1];
for (int i = 0; i < L / 2 + 1; i++) {
    f[i] = Fs * ((double) i) / L;
}
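To read a dominant frequency out of the result, something like this could follow (untested sketch using the P1 and f arrays above; with the 338 Hz sine the peak lands on the nearest 5 Hz bin, i.e. 340 Hz):
// Find the bin with the largest single-sided amplitude and report its frequency.
int peakIndex = 0;
for (int i = 1; i < P1.length; i++) {
    if (P1[i] > P1[peakIndex]) {
        peakIndex = i;
    }
}
double peakFrequency = f[peakIndex];  // bins are Fs/L = 5 Hz apart, so ~340 Hz here
double peakAmplitude = P1[peakIndex]; // below 1.0 because 338 Hz leaks across neighbouring bins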

Spectrum Analysis using Java, Sampling Frequency, Folding Frequency, and the FFT Algorithm http://www.developer.com/java/other/article.php/3380031

You can implement ExoPlayer's AudioProcessor interface and use this FFT wrapper, Noise, to get the data.

Related

AudioRecorder | Interpreting FFT data for Spectrum Analyzer

I am building an app that needs to be able to display a real-time spectrum analyzer. Here is the version I was able to successfully make on iOS (screenshot omitted).
I am using the wendykierp JTransforms library to perform the FFT calculations, and have managed to capture audio data and execute the FFT functions. See below:
short sData[] = new short[BufferElements2Rec];
int result = audioRecord.read(sData, 0, BufferElements2Rec);
try {
    // Initiate FFT
    DoubleFFT_1D fft = new DoubleFFT_1D(sData.length);
    // Convert sample data from short[] to double[]
    double[] fftSamples = new double[sData.length];
    for (int i = 0; i < sData.length; i++) {
        // IMPORTANT: We cannot simply cast the short value to double.
        // A short is only 2 bytes (values -32768 to 32767),
        // so we divide by 32768 to normalize into [-1, 1) before storing it as a double.
        fftSamples[i] = (double) sData[i] / 32768;
    }
    // Perform FFT calcs
    fft.realForward(fftSamples);
    // TODO - Convert FFT data into 20 "bands"
} catch (Exception e) {
}
In iOS, I was using a library (Tempi-FFT) which had built in functionality for calculating magnitude, frequency, and providing averaged data for any given number of bands (I am using 20 bands as you can see in the image above). It seems I don't have that luxury with this library and I need to calculate this myself.
Looking for any good examples or tutorials on how to interpret the data returned by the FFT calculations. Here is some sample data I am receiving:
-11387.0, 183.0, -384.9121475854448, -224.66315714636642, -638.0173005872095, -236.2318653974911, -1137.1498541119106, -437.71599514435786, 1954.683405957685, -2142.742125980924 ...
Looking for a simple explanation of how to interpret this data. Some other questions I have looked at that I was either unable to understand, or that did not explain how to determine a given number of bands:
Power Spectral Density from jTransforms DoubleFFT_1D
How to develop a Spectrum Analyser from a realtime audio?
Your question can be split into two parts: finding the magnitude of all frequencies (interpreting the output) and averaging the frequencies into bands
Finding the magnitude of all frequencies:
I won't go into the intricacies of the Fast Fourier Transform/Discrete Fourier Transform (if you would like to gain a basic understanding see this video), but know that there is a real and an imaginary part of each output.
The documentation of the realForward function describes where both the imaginary and the real parts are located in the output array (I'm assuming you have an even sample size):
a[2*k] = Re[k], 0 <= k < n / 2
a[2*k+1] = Im[k], 0 < k < n / 2
a[1] = Re[n/2]
a is equivalent to your fftSamples, which means we can translate this documentation into code as follows (I've changed Re and Im to realPart and imaginaryPart respectively):
int n = fftSamples.length;
// Use n/2 + 1 slots so the Re[n/2] term (packed into fftSamples[1]) has somewhere to go.
double[] realPart = new double[n / 2 + 1];
double[] imaginaryPart = new double[n / 2 + 1];
realPart[0] = fftSamples[0]; // Re[0] (DC term); Im[0] is always 0 for a real input
for (int k = 1; k < n / 2; k++) {
    realPart[k] = fftSamples[k * 2];
    imaginaryPart[k] = fftSamples[k * 2 + 1];
}
realPart[n / 2] = fftSamples[1]; // Re[n/2] is stored in a[1] by realForward
Now we have the real and imaginary parts of each frequency. We could plot these on an x-y coordinate plane using the real part as the x value and the imaginary part as the y value. This creates a triangle, and the length of the triangle's hypotenuse is the magnitude of the frequency. We can use the pythagorean theorem to get this magnitude:
double[] spectrum = new double[n / 2];
for (int k = 1; k < n / 2; k++) {
    spectrum[k] = Math.sqrt(Math.pow(realPart[k], 2) + Math.pow(imaginaryPart[k], 2));
}
spectrum[0] = realPart[0];
Note that the 0th index of the spectrum doesn't have an imaginary part. This is the DC component of the signal (we won't use this).
Now we have an array with the magnitudes of each frequency across your spectrum. (If your sampling frequency is 44100 Hz, the array covers 0 Hz up to the Nyquist frequency of 22050 Hz, and each index step corresponds to sampleRate / n Hz; for example, with n = 882 samples the half-spectrum has 441 values spaced 50 Hz apart.)
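For reference, something like this (untested sketch, assuming a 44100 Hz sample rate and the n and spectrum variables above) maps each index to its frequency:
// Frequency (in Hz) represented by bin k of an n-point FFT sampled at sampleRate Hz.
double sampleRate = 44100;
double[] binFrequencies = new double[spectrum.length];
for (int k = 0; k < spectrum.length; k++) {
    binFrequencies[k] = k * sampleRate / n;
}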
Averaging the frequencies into bands:
Now that we've converted the FFT output to data that we can use, we can move on to the second part of your question: finding the averages of different bands of frequencies. This is relatively simple. We just need to split the array into different bands and find the average of each band. This can be generalized like so:
int NUM_BANDS = 20; // This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = (n / 2) / NUM_BANDS;
for (int i = 0; i < NUM_BANDS; i++) {
    // Add up each part
    double total = 0;
    for (int j = samplesPerBand * i; j < samplesPerBand * (i + 1); j++) {
        total += spectrum[j];
    }
    // Take average
    bands[i] = total / samplesPerBand;
}
Final Code:
And that's it! You now have an array called bands with the average magnitude of each band of frequencies. The code above is purposefully not optimized in order to show how each step works. Here is a shortened and optimized version:
int numFrequencies = fftSamples.length / 2;
double[] spectrum = new double[numFrequencies];
for (int k = 1; k < numFrequencies; k++) {
    spectrum[k] = Math.sqrt(Math.pow(fftSamples[k * 2], 2) + Math.pow(fftSamples[k * 2 + 1], 2));
}
spectrum[0] = fftSamples[0];

int NUM_BANDS = 20; // This can be any positive integer.
double[] bands = new double[NUM_BANDS];
int samplesPerBand = numFrequencies / NUM_BANDS;
for (int i = 0; i < NUM_BANDS; i++) {
    // Add up each part
    double total = 0;
    for (int j = samplesPerBand * i; j < samplesPerBand * (i + 1); j++) {
        total += spectrum[j];
    }
    // Take average
    bands[i] = total / samplesPerBand;
}
// Use bands in view!
This has been a really long answer, and I haven't tested the code yet (though I do plan to). Feel free to comment if you find any mistakes.

Incorrect peak frequency in JTransform

I've been trying to calculate the peak frequency from the Android mic buffer, as in How to get frequency from fft result?.
Unfortunately I'm getting wrong values. Even when I play an 18 kHz tone, I don't get the correct peak frequency.
This is my code:
int sampleRate = 44100, bufferSize = 4096;
AudioRecord audioRec = new AudioRecord(AudioSource.MIC, sampleRate,
        AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
audioRec.startRecording();
audioRec.read(bufferByte, 0, bufferSize);
for (int i = 0; i < bufferByte.length; i++) {
    bufferDouble2[i] = (double) bufferByte[i];
}
// here the windowing technique is applied
for (int i = 0; i < bufferSize - 1; i++) {
    bufferDouble[2 * i] = bufferDouble2[i];
    bufferDouble[2 * i + 1] = 0.0;
}
DoubleFFT_1D fft = new DoubleFFT_1D(bufferSize); // instance of DoubleFFT_1D class
fft.realForwardFull(bufferDouble); // this is where it falls
// here calculating magnitude
for (int i = 0; i < (bufferSize / 2) - 1; i++) {
    double real = bufferDouble[2 * i];
    double imag = bufferDouble[2 * i + 1];
    mag[i] = Math.sqrt((real * real) + (imag * imag));
}
// calculating maximum frequency
double max_mag = 0.0;
int max_index = -1;
for (int j = 0; j < (bufferSize / 2) - 1; j++) {
    if (mag[j] > max_mag) {
        max_mag = mag[j];
        max_index = j;
    }
}
final int peak = max_index * sampleRate / bufferSize;
Log.v(TAG2, "Peak Frequency = " + max_index * sampleRate / bufferSize);
This is the output while listening to an 18 kHz tone on the phone (output not shown).
Help me and please don't give theoretical answers.
Thank you

Android AudioRecord filter range of frequency

I am using the Android platform. From the reference question below I learned that the AudioRecord class returns raw data, so I can filter a range of audio frequencies to suit my needs, but for that I need an algorithm. Can somebody please help me find an algorithm to filter the range between 14,400 bph and 16,200 bph?
I tried JTransforms, but I don't know whether I can achieve this with JTransforms or not. Currently I am using jfftpack to display visual effects, which works very well, but I can't achieve the audio filtering with it.
Reference here
Help appreciated. Thanks in advance.
The following is my code. As I mentioned above, I am using the jfftpack library for the display; you may find the library reference in the code, please don't get confused by that.
private class RecordAudio extends AsyncTask<Void, double[], Void> {
    @Override
    protected Void doInBackground(Void... params) {
        try {
            final AudioRecord audioRecord = findAudioRecord();
            if (audioRecord == null) {
                return null;
            }
            final short[] buffer = new short[blockSize];
            final double[] toTransform = new double[blockSize];
            audioRecord.startRecording();
            while (started) {
                final int bufferReadResult = audioRecord.read(buffer, 0, blockSize);
                for (int i = 0; i < blockSize && i < bufferReadResult; i++) {
                    toTransform[i] = (double) buffer[i] / 32768.0; // signed 16 bit
                }
                transformer.ft(toTransform);
                publishProgress(toTransform);
            }
            audioRecord.stop();
            audioRecord.release();
        } catch (Throwable t) {
            Log.e("AudioRecord", "Recording Failed");
        }
        return null;
    }

    /**
     * @param toTransform
     */
    protected void onProgressUpdate(double[]... toTransform) {
        canvas.drawColor(Color.BLACK);
        for (int i = 0; i < toTransform[0].length; i++) {
            int x = i;
            int downy = (int) (100 - (toTransform[0][i] * 10));
            int upy = 100;
            canvas.drawLine(x, downy, x, upy, paint);
        }
        imageView.invalidate();
    }
}
There are a lot of tiny details in this process that can potentially hang you up. This code isn't tested, and I don't do audio filtering very often, so you should be extremely suspicious of it. This is the basic process you would take for filtering audio:
Get audio buffer
Possible audio buffer conversion (byte to float)
(optional) Apply a windowing function, e.g. Hann/Hanning (see the sketch below)
Take the FFT
Filter frequencies
Take inverse FFT
I'm assuming you have some basic knowledge of Android and audio recording, so I will cover steps 4-6 here.
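Steps 2-3 aren't shown below; if you want a window (step 3), something like this could be applied first (untested sketch, assuming your samples are already in the float[] audioBuffer scaled to [-1, 1]):
// Apply a Hann window in place to reduce spectral leakage before the FFT.
for (int i = 0; i < audioBuffer.length; i++) {
    double w = 0.5 * (1.0 - Math.cos(2.0 * Math.PI * i / (audioBuffer.length - 1)));
    audioBuffer[i] *= (float) w;
}
Then, for steps 4-6: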
// it is assumed that a float array audioBuffer exists with even length equal to
// the capture size of your audio buffer
// The FFT length equals the length of audioBuffer and yields audioBuffer.length / 2 complex bins
int FFT_SIZE = bufferSize;
FloatFFT_1D mFFT = new FloatFFT_1D(FFT_SIZE); // this is a jTransforms type

// Take the FFT (in place: audioBuffer now holds the packed spectrum)
mFFT.realForward(audioBuffer);

// audioBuffer now contains bins that represent the frequency content of your wave.
// To get the actual frequency from a bin:
// frequency_of_bin = bin_index * sample_rate / FFT_SIZE
// Assuming the length of audioBuffer is even, the real and imaginary parts are
// stored as follows:
// audioBuffer[2*k] = Re[k], 0 <= k < n/2
// audioBuffer[2*k+1] = Im[k], 0 < k < n/2

// Define the frequencies of interest
float freqMin = 14400;
float freqMax = 16200;

// Loop through the FFT bins and filter frequencies
for (int fftBin = 0; fftBin < FFT_SIZE / 2; fftBin++) {
    // Calculate the frequency of this bin assuming a sampling rate of 44,100 Hz
    float frequency = (float) fftBin * 44100F / (float) FFT_SIZE;
    // Now filter the audio. I'm assuming you wanted to keep the
    // frequencies of interest rather than discard them.
    if (frequency < freqMin || frequency > freqMax) {
        // Calculate the indices where the real and imaginary parts are stored
        int real = 2 * fftBin;
        int imaginary = 2 * fftBin + 1;
        // zero out this frequency
        audioBuffer[real] = 0;
        audioBuffer[imaginary] = 0;
    }
}

// Take the inverse FFT to convert the signal from the frequency domain back to the time domain
// (pass 'true' so JTransforms scales the result by 1/n and the amplitude is preserved)
mFFT.realInverse(audioBuffer, true);

How to detect voice frequency?

I'm a beginner Android programmer. (My home language is not English, so my English is poor.)
I want to make an app that gets the frequency of a recorded human voice and shows the note, like "C3" or "G#4".
So I want to detect the human voice frequency, but it is too difficult.
I tried using an FFT; it detects piano (or guitar) sounds pretty well (in some cases, above octave 4, it didn't detect low-frequency piano or guitar sounds), but it can't detect the human voice.
(I used a piano program based on General MIDI.)
I found lots of information, but I can't understand it.
Most people say to use a pitch detection algorithm and just link the wiki.
Please tell me about pitch detection algorithms in detail.
(Actually, I want example code.)
Or is there any other approach I could use in my app?
HERE IS MY SOURCE CODE:
public void Frequency(double[] array) {
    int sampleSize = array.length;
    double[] win = window.generate(sampleSize);
    // signals for fft input
    double[] signals = new double[sampleSize];
    for (int i = 0; i < sampleSize; i++) {
        signals[i] = array[i] * win[i];
    }
    double[] fftArray = new double[sampleSize * 2];
    for (int i = 0; i < sampleSize; i++) {
        fftArray[2 * i] = signals[i];
        fftArray[2 * i + 1] = 0;
    }
    FFT.complexForward(fftArray);
    getFrequency(fftArray);
}

private void getFrequency(double[] array) {
    // ========== Values ========== //
    int RATE = sampleRate;
    int CHUNK_SIZE_IN_SAMPLES = RECORDER_BUFFER_SIZE;
    int MIN_FREQUENCY = 50; // Hz
    int MAX_FREQUENCY = 2000; // Hz
    int min_frequency_fft = Math.round(MIN_FREQUENCY * CHUNK_SIZE_IN_SAMPLES / RATE);
    int max_frequency_fft = Math.round(MAX_FREQUENCY * CHUNK_SIZE_IN_SAMPLES / RATE);
    // ============================ //
    double best_frequency = min_frequency_fft;
    double best_amplitude = 0;
    for (int i = min_frequency_fft; i <= max_frequency_fft; i++) {
        double current_frequency = i * 1.0 * RATE / CHUNK_SIZE_IN_SAMPLES;
        double current_amplitude = Math.pow(array[i * 2], 2) + Math.pow(array[i * 2 + 1], 2);
        double normalized_amplitude = current_amplitude * Math.pow(MIN_FREQUENCY * MAX_FREQUENCY, 0.5) / current_frequency;
        if (normalized_amplitude > best_amplitude) {
            best_frequency = current_frequency;
            best_amplitude = normalized_amplitude;
        }
    }
    FrequencyArray[FrequencyArrayIndex] = best_frequency;
    FrequencyArrayIndex++;
}
I referred to this: http://code.google.com/p/android-guitar-tuner/
Pitch_detection_algorithm
(using JTransforms)
The Wikipedia page on pitch detection links to another Wikipedia page explaining autocorrelation: http://en.m.wikipedia.org/wiki/Autocorrelation#section_3, which is one of many pitch estimation methods you could try.
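For example, a very naive autocorrelation estimate could look something like this (untested sketch, assuming a double[] buffer of time-domain samples and an int sampleRate; real detectors add normalization, thresholds, and interpolation):
// Naive autocorrelation pitch estimate: find the lag with the strongest self-similarity.
int minLag = sampleRate / 1000; // ~1000 Hz upper bound on pitch
int maxLag = sampleRate / 50;   // ~50 Hz lower bound on pitch
int bestLag = minLag;
double bestCorr = 0;
for (int lag = minLag; lag <= maxLag; lag++) {
    double corr = 0;
    for (int i = 0; i + lag < buffer.length; i++) {
        corr += buffer[i] * buffer[i + lag];
    }
    if (corr > bestCorr) {
        bestCorr = corr;
        bestLag = lag;
    }
}
double estimatedPitch = (double) sampleRate / bestLag; // in Hz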
Running the example code you posted can show that FFT peak frequency estimation is quite poor at musical pitch detection and estimation for many common pitched sounds.

Android audio FFT to display fundamental frequency

I have been working on an Android project for awhile that displays the fundamental frequency of an input signal (to act as a tuner). I have successfully implemented the AudioRecord class and am getting data from it. However, I am having a hard time performing an FFT on this data to get the fundamental frequency of the input signal. I have been looking at the post here, and am using FFT in Java and Complex class to go with it.
I have successfully used the FFT function found in FFT in Java, but I am not sure if I am obtaining the correct results. For the magnitude of the FFT (sqrt[re*re + im*im]) I am getting values that start high, around 15000 Hz, and then slowly diminish to about 300 Hz. That doesn't seem right.
Also, as far as the raw data from the mic goes, the data seems fine, except that the first 50 values or so are always the number 3, unless I hit the tuning button again while still in the application and then I only get about 15. Is that normal?
Here is a bit of my code.
First of all, I convert the short data (obtained from the microphone) to a double using the following code which is from the post I have been looking at. This snippet of code I do not completely understand, but I think it works.
// Conversion from short to double
double[] micBufferData = new double[bufferSizeInBytes]; // size may need to change
final int bytesPerSample = 2; // As it is 16bit PCM
final double amplification = 1.0; // choose a number as you like
for (int index = 0, floatIndex = 0; index < bufferSizeInBytes - bytesPerSample + 1; index += bytesPerSample, floatIndex++) {
    double sample = 0;
    for (int b = 0; b < bytesPerSample; b++) {
        int v = audioData[index + b];
        if (b < bytesPerSample - 1 || bytesPerSample == 1) {
            v &= 0xFF;
        }
        sample += v << (b * 8);
    }
    double sample32 = amplification * (sample / 32768.0);
    micBufferData[floatIndex] = sample32;
}
The code then continues as follows:
// Create Complex array for use in FFT
Complex[] fftTempArray = new Complex[bufferSizeInBytes];
for (int i = 0; i < bufferSizeInBytes; i++) {
    fftTempArray[i] = new Complex(micBufferData[i], 0);
}
// Obtain array of FFT data
final Complex[] fftArray = FFT.fft(fftTempArray);
final Complex[] fftInverse = FFT.ifft(fftTempArray);
// Create an array of magnitudes of fftArray
double[] magnitude = new double[fftArray.length];
for (int i = 0; i < fftArray.length; i++) {
    magnitude[i] = fftArray[i].abs();
}
fft.setTextColor(Color.GREEN);
fft.setText("fftArray is " + fftArray[500] + " and fftTempArray is " + fftTempArray[500] + " and fftInverse is " + fftInverse[500] + " and audioData is " + audioData[500] + " and magnitude is " + magnitude[1] + ", " + magnitude[500] + ", " + magnitude[1000] + " Good job!");
for (int i = 2; i < samples; i++) {
    fft.append(" " + magnitude[i] + " Hz");
}
That last bit is just to check what values I am getting (and to keep me sane!). In the post referred to above, it talks about needing the sampling frequency and gives this code:
private double ComputeFrequency(int arrayIndex) {
    return ((1.0 * sampleRate) / (1.0 * fftOutWindowSize)) * arrayIndex;
}
How do I implement this code? I don't really understand where fftOutWindowSize and arrayIndex come from.
Any help is greatly appreciated!
Dustin
Recently I've been working on a project which requires almost the same thing. You probably don't need any help anymore, but I will give my thoughts anyway; maybe someone will need this in the future.
I'm not sure whether the short-to-double function works; I don't understand that snippet of code either. It was written for byte-to-double conversion.
In the code "double[] micBufferData = new double[bufferSizeInBytes];", I think the size of micBufferData should be "bufferSizeInBytes / 2", since every sample takes two bytes and the size of micBufferData should be the number of samples.
FFT algorithms do require an FFT window size, and it has to be a power of 2. However, many algorithms can accept an arbitrary number of samples as input and will handle the rest; the documentation of those algorithms should state the input requirements. In your case, the size of the Complex array can be the input to the FFT algorithm. I don't really know the details of the FFT algorithm, but I think the inverse transform is not needed here.
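If you do need a power-of-two window, something like this could trim the buffer (untested sketch, assuming the micBufferData array from the question):
// Round the sample count down to the nearest power of two for the FFT window.
int windowSize = Integer.highestOneBit(micBufferData.length);
double[] windowed = new double[windowSize];
System.arraycopy(micBufferData, 0, windowed, 0, windowSize);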
To use the code you gave at the end, you should first find the peak index in the sample array. I used a double array as input instead of Complex, so in my case it is something like:
double maxVal = -1;
int maxIndex = -1;
for (int j = 0; j < mFftSize / 2; ++j) {
    double v = fftResult[2 * j] * fftResult[2 * j] + fftResult[2 * j + 1] * fftResult[2 * j + 1];
    if (v > maxVal) {
        maxVal = v;
        maxIndex = j;
    }
}
2*j is the real part and 2*j+1 is the imaginary part. maxIndex is the index of the peak magnitude you want (more detail here); use it as the input to the ComputeFrequency function. The return value is the frequency of the sample array you want.
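In other words (assuming fftOutWindowSize and sampleRate are set to the FFT size and sample rate used above):
// Convert the peak bin index into a frequency in Hz.
double peakFrequencyHz = ComputeFrequency(maxIndex); // = maxIndex * sampleRate / fftOutWindowSize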
Hopefully it can help someone.
You should pick an FFT window size depending on your time versus frequency resolution requirements, and not just use the audio buffer size when creating your FFT temp array.
The array index is your int i, as used in your magnitude[i] print statement.
The fundamental pitch frequency for music is often different from FFT peak magnitude, so you may want to research some pitch estimation algorithms.
I suspect that the strange results you're getting are because you might need to unpack the FFT. How this is done will depend on the library that you're using (see here for docs on how it's packed in GSL, for example). The packing may mean that the real and imaginary components are not in the positions in the array that you expect.
For your other questions about window size and resolution, if you're creating a tuner then I'd suggest trying a window size of about 20ms (eg 1024 samples at 44.1kHz). For a tuner you need quite high resolution, so you could try zero-padding by a factor of 8 or 16 which will give you a resolution of 3-6Hz.
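The zero-padding could look something like this (untested sketch, assuming 1024 time-domain samples in a double[] samples and the JTransforms DoubleFFT_1D class used elsewhere on this page):
// Zero-pad 1024 samples by a factor of 8 so the spectrum is sampled on a finer grid.
int paddedLength = 1024 * 8; // 8192-point FFT -> 44100 / 8192 ≈ 5.4 Hz bin spacing
double[] fftData = new double[paddedLength]; // the tail stays zero (Java zero-initializes arrays)
System.arraycopy(samples, 0, fftData, 0, 1024);
DoubleFFT_1D fft = new DoubleFFT_1D(paddedLength);
fft.realForward(fftData); // packed spectrum for bins 0 .. paddedLength/2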
