I really fail at FFT and now I'm in need to communicate from the headphone jack of my Android to the Arduino there's currently a library for Arduino (talks about it in the blog post Real-time spectrum analyzer powered by Arduino) and one for Android too!
How should I start? How should I build audio signals which ultimately can be turned into FFTs and the Arduino can analyse the same using the library and I can actuate anything?
You are asking a very fuzzy question: "How should I build audio signals which ultimately can be turned into FFTs and the Arduino can analyse the same using the library and I can actuate anything?". I am going to help you think through the problem - asking yourself the right questions is essential to get any answers.
Presumably, your audio signals are "coming from somewhere" - i.e. they are sound. This means that you need to convert them into a stream of numbers first.
problem #1: converting audio signal into a stream of numbers
This breaks down into three separate sub problems:
Getting the signal to the right amplitude
Choosing the sampling rate needed
Digitizing and storing the data for later processing
Items (1) and (3) are related, since you need to know how you are going to digitize the signal before you can choose the right amplitude. For example, if you have a microphone as your sound input source, you will need to amplify the signal (and maybe add some automatic gain control) before feeding it into an ADC (analog to digital converter) that has a 5 V input range, since the microphone may have an output in the mV range. Without more information about the hardware you are using, there's not a lot to add here. It sounds from your tag that you are trying to do that inside an Android device - in which case I wonder how you intend to move the digital signal to the Arduino (over USB?).
The second point, "choosing the sampling rate", is actually very important. A sound signal contains many different frequencies - think of them as keys on the piano. In order to detect a high frequency, you need to sample the signal "faster than it is changing". There is a formal theorem called "Nyquist's Theorem" that states that you have to sample at 2x the highest frequency that is present in your signal. Note - it's not just "that you are interested in", but "that is present". If you sample a high frequency signal with a low frequency sample clock, it will appear "aliased" - it wil show up in your output as something completely different. So before you digitize a signal you have to decide what the frequencies of interest are, and remove all higher frequencies with a filter. Let's say you are interested in frequencies up to 500 Hz (about 1 octave above middle C on a piano). To give your filter a chance to work, you might choose to cut off all frequencies above 1 kHz (filters "roll off" - i.e. they increase in strength over a range of frequencies), and would sample at 2 kHz. This means you get 2000 samples per second, and you need to figure out where to put them on your Arduino (memory fills up quickly on the little board.)
Problem #2: analyzing the signal
Assuming that you have somehow captured a digital signal, your next task is analyzing it. The FFT is basicaly some clever math that tells you, for a given sound sample, "what keys on the piano were hit, and how hard". It breaks the sound signal into a series of frequency "bins", and determines how much energy is in each bin (it also computes the phase, but let's keep it simple). So if the input of a FFT algorithm is a sound sample, the output is an array of values telling you what frequencies were present in the signal. This is approximate, since it will find the "nearest bin". Sticking with the same analogy - if you were hitting a piano that's out of tune, the algorithm won't return "out of tune", but rather "a bit of C, and a bit of C sharp", since it cannot actually measure anything in between. The accuracy of an FFT is determined by the sampling frequency (which gives you the upper limit on the frequency you can detect) and the sample length: the longer you "listen" so the sample, the more subtle the differences you can "hear". So you have another trade-off to consider: if your audio signal changes rapidly, you have to sample for a short time (to capture the quick changes); but if you need an accurate frequency, you have to sample for a long time. For example if you are writing a Morse decoder, your sampling has to be short compared to a pause between "dits" and "dashes" - or they will slur together. Figuring out that a morse tone is present is pretty easy though, since there will be a single tone (one bin in the FFT) that is much larger than the others.
Exactly how you implement these things depends on your application. The third step, "doing something with it", requires you to decide what is a meaningful signal. Again, if you are making a Morse decoder, you would perhaps turn an LED ON when a single tone is present (one or two bins in the FFT have much bigger value than the mean of the others), and OFF when it is not (all noise - lots of bins with approximately the same size). But without a LOT more information from you, there's not much more one can say to help you.
You might learn a lot from reading the following articles:
http://www.arduinoos.com/2010/10/sound-capture/
http://www.arduinoos.com/2010/10/fast-fourier-transform-fft/
http://interface.khm.de/index.php/lab/experiments/frequency-measurement-library/
Related
I am using noise meter to read noise in decibels. When I run the app it is recording almost 120 readings per second. I don't want those many recordings. Is there any way to specify that I want only one or two recordings per second like that. Thanks in advance. noise_meter package.
I am using code from git hub which is already written using noise_meter github repo noise_meter example
I tried to calculate no. of samples using sample rate which is 40100 in the package. but I can't understand it.
As you see in the source code , audio streamer uses a fixed size buffer of a new thousand and an audio sample rate of 41000, and includes this comment Uses a buffer array of size 512. Whenever buffer is full, the content is sent to Flutter. So, small audio blocks will arrive at the consumer frequently (as you might expect from a streamer). It doesn't seem possible to adjust this.
The noise meter package simply takes each block of audio and calculates the noise level, so the rate of arrival of those is exactly the same as rate of arrival of audio blocks from the underlying package.
Given the simplicity of the noise meter calculation, you could replace it with your own code directly on top of audio streamer. You just need to collect multiple blocks of audio together before performing the simple decibel calculation.
Alternatively you could simply discard N out of each N+1 samples.
The idea is Phone A sends a sound signal and bluetooth signal at the same time and Phone B will calculate the delay between the two signals.
In practice I am getting inconsistent results with delays from 90ms-160ms.
I tried optimizing both ends as much as possible.
On the output end:
Tone is generated once
Bluetooth and audio output each have their own thread
Bluetooth only outputs after AudioTrack.write and AudioTrack is in streaming mode so it should start outputting before the write is even completed.
On the receiving end:
Again two separate threads
System time is recorded before each AudioRecord.read
Sampling specs:
44.1khz
Reading entire buffer
Sampling 100 samples at a time using fft
Taking into account how many samples transformed since initial read()
Your method relies on basically zero latency throughout the whole pipeline, which is realistically impossible. You just can't synchronize it with that degree of accuracy. If you could get the delays down to 5-6ms, it might be possible, but you'll beat your head into your keyboard before that happens. Even then, it could only possibly be accurate to 1.5 meters or so.
Consider the lower end of the delays you're receiving. In 90ms, sound can travel slightly over 30m. That's the very end of the marketed bluetooth range, without even considering that you'll likely be in non-ideal transmitting conditions.
Here's a thread discussing low latency audio in Android. TL;DR is that it sucks, but is getting better. With the latest APIs and recent devices, you may be able to get it down to 30ms or so, assuming you run some hand-tuned audio functions. No simple AudioTrack here. Even then, that's still a good 10-meter circular error probability.
Edit:
A better approach, assuming you can synchronize the devices' clocks, would be to embed a timestamp into the audio signal, using a simple am/fm modulation or pulse train. Then you could decode it at the other end and know when it was sent. You still have to deal with the latency problem, but it simplifies the whole thing nicely. There's no need for bluetooth at all, since it isn't really a reliable clock anyway, since it can be regarded as having latency problems of its own.
This gives you a pretty good approach
http://netscale.cse.nd.edu/twiki/pub/Main/Projects/Analyze_the_frequency_and_strength_of_sound_in_Android.pdf
You have to create an 1 kHz sound with some amplitude (measure in dB) and try to measure the amplitude of the sound arrived to the other device. From the sedation you might be able to measure the distance.
As I remember: a0 = 20*log (4*pi*distance/lambda) where a0 is the sedation and lambda is given (you can count it from the 1kHz)
But in such a sensitive environment, the noise might spoil the whole thing, just an idea, how I would do if I were you.
I am working a phone recording software (android) which record a conversation between 2 people on a phone call. The output of each phone call is an audio file of which contains the sound from both the caller and callee.
However, most of the time, the voice from the phone that this software run on is clearer than the other. Users request me to make the 2 sound equally clear.
So the problem I have now is: I have a sound file containing voices from 2 sources with different volume, what should I do make the volume of voice from those 2 sources equally regarding the noise should not be increased. Given that this is a phone call so at a specific time there is only one person speaking.
I see at least 1 straight solution for this: making a program analyzing the wave form of the sound file, identifying parts of the sound file coming from the source having smaller voice and increase it to a level seemingly balance with the another. However this will be not an easy one to implement and I also hope that there would be better solution out there. Do you have any suggestion for me?
Thank you.
Well, the first thing to do is to get rid of all of the noise that you do not care about.
The spectrum that you would want to use is: 300 Hz to 3500 Hz
You can cut all of the other frequencies which would substantially cut your noise. You can then apply an autoequalization gain profile or even tap into the DSP profiles available on several devices.
I would also take a look at this whitepaper if you have a chance. (IEEE or ACM membership required).
An Auto-Equalization System Based on DirectShow Technology and Its Application in Audio Broadcast System of Radio Station
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=5384659&contentType=Conference+Publications&searchWithin%3Dp_Authors%3A.QT.Bai+Xinyue.QT.
This is how I have solved this problem:
1. I decode the audio into a series of Integer value thank to the storing WAV format.
The result be [xi] ; 0 < xi < 255
2. Then I have to decide 2 custom value:
- Noise threshold? if xi > threshold => it is not noise (pretty naive!)
- How long should sound be a chunk of human voice?
I myself choose the first value to 5 and the second value to 100ms
3. My algorithm will analyze the [xi] in to [Yi] with each Y is an array of x and each Y represent a chunk of human sound.
After that, I apply k-mean with k=2 and got 2 different cluster of Y, one belongs to the person whose voice is louder and the other belongs to the one with softer voice.
4. What left is pretty straight forward, I have to decide a parameter M, each x belong to a Y of the softer voice will multiply with M and I get the final result.
Hello I was wondering if using the android tone generator class would it be possible to create a tone in one device and listen for this same tone in another device. If this is possible I do have a few other questions.
Taking backround noise into consideration is it possible to listen for only this specific tone?
Would this process be resource intensive?
Could I use a tone that would be inaudable to the human ear or close to it?
Lastly could I use a tone that could only be heard with a couple of feet from the sending device?
Thanks very much for yer time guys and girls :)
Edit >
Thanks For adding the audio processing tag sabastian. Much better discription.
It would be CPU intensive, yes.
The way to it is quite simple: you need a permanent recorder which puts the received data into a FFT (fast fourier transform). FFT basically does one thing: splits the audio into a frequency/power-scale. With this "background noise cleaned" result you can check things like "was there a tone with 1000Hz playing for at least 2 seconds" - and act accordingly.
There is a reasonable speed FFT implementation here: http://www.badlogicgames.com/wordpress/?p=449
FFT can also be used (actually, IS used) for detection of dualtone dialing (DTMF) - 2 frequencies at same time is much better than just using one (as the error rate drop significantly and you can go to shorter duration for the tone sending/detecting).
"Inaudible" won't be possible, as (a) the speaker can not produce such sounds (b) you are limited in sampling rate - so also limited in both producing and recording such high frequencies.
"couple of feet" will be naturally imposed (not very loud speaker, not very good microphone).
Have a look at this other question: "Android: Need to record mic input". I think you can modify that for your task, then with sound bytes you can have filtering or FFT.
Hope it helps
I'm trying to build a gadget that detects pistol shots using Android. It's a part of a training aid for pistol shooters that tells how the shots are distributed in time and I use a HTC Tattoo for testing.
I use the MediaRecorder and its getMaxAmplitude method to get the highest amplitude during the last 1/100 s but it does not work as expected; speech gives me values from getMaxAmplitude in the range from 0 to about 25000 while the pistol shots (or shouting!) only reaches about 15000. With a sampling frequency of 8kHz there should be some samples with considerably high level.
Anyone who knows how these things work? Are there filters that are applied before registering the max amplitude. If so, is it hardware or software?
Thanks,
/George
It seems there's an AGC (Automatic Gain Control) filter in place. You should also be able to identify the shot by its frequency characteristics. I would expect it to show up across most of the audible spectrum, but get a spectrum analyzer (there are a few on the app market, like SpectralView) and try identifying the event by its frequency "signature" and amplitude. If you clap your hands what do you get for max amplitude? You could also try covering the phone with something to muffle the sound like a few layers of cloth
It seems like AGC is in the media recorder. When I use AudioRecord I can detect shots using the amplitude even though it sometimes reacts on sounds other than shots. This is not a problem since the shooter usually doesn't make any other noise while shooting.
But I will do some FFT too to get it perfect :-)
Sounds like you figured out your agc problem. One further suggestion: I'm not sure the FFT is the right tool for the job. You might have better detection and lower CPU use with a sliding power estimator.
e.g.
signal => square => moving average => peak detection
All of the above can be implemented very efficiently using fixed point math, which fits well with mobile android platforms.
You can find more info by searching for "Parseval's Theorem" and "CIC filter" (cascaded integrator comb)
Sorry for the late response; I didn't see this question until I started searching for a different problem...
I have started an application to do what I think you're attempting. It's an audio-based lap timer (button to start/stop recording, and loud audio noises for lap setting). It' not finished, but might provide you with a decent base to get started.
Right now, it allows you to monitor the signal volume coming from the mic, and set the ambient noise amount. It's also using the new BSD license, so feel free to check out the code here: http://code.google.com/p/audio-timer/. It's set up to use the 1.5 API to include as many devices as possible.
It's not finished, in that it has two main issues:
The audio capture doesn't currently work for emulated devices because of the unsupported frequency requested
The timer functionality doesn't work yet - was focusing on getting the audio capture first.
I'm looking into the frequency support, but Android doesn't seem to have a way to find out which frequencies are supported without trial and error per-device.
I also have on my local dev machine some extra code to create a layout for the listview items to display "lap" information. Got sidetracked by the frequency problem though. But since the display and audio capture are pretty much done, using the system time to fill in the display values for timing information should be relatively straightforward, and then it shouldn't be too difficult to add the ability to export the data table to a CSV on the SD card.
Let me know if you want to join this project, or if you have any questions.