How can I detect when the user blows into the device microphone? This would then be used to trigger some action by the app.
The job of detecting when a user blows into the microphone is separable into two parts: (1) taking input from the microphone and (2) listening for a blowing sound.
The noise/sound of someone blowing into the mic is made up of low-frequency sounds. We’ll use a low-pass filter to reduce the high-frequency sounds coming in on the mic; when the level of the filtered signal spikes, we’ll know someone’s blowing into the mic.
Source:
http://mobileorchard.com/tutorial-detecting-when-a-user-blows-into-the-mic/
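For illustration, here's a minimal sketch of that approach with AudioRecord (not the tutorial's exact code): a one-pole low-pass filter followed by an RMS level check. The filter coefficient and the spike threshold are made-up values you'd have to tune per device, and the app needs the RECORD_AUDIO permission.

    import android.media.AudioFormat;
    import android.media.AudioRecord;
    import android.media.MediaRecorder;

    public class BlowDetector {
        private static final int SAMPLE_RATE = 8000;        // low frequencies only, so 8 kHz is plenty
        private static final double ALPHA = 0.05;           // one-pole low-pass coefficient (assumed value)
        private static final double SPIKE_THRESHOLD = 4000; // RMS level that counts as a blow (tune this)

        public interface Listener { void onBlow(); }

        public void listen(Listener listener) {
            int bufSize = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                    AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
            AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC,
                    SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                    AudioFormat.ENCODING_PCM_16BIT, bufSize);
            short[] buffer = new short[bufSize];
            double filtered = 0;
            record.startRecording();
            while (!Thread.currentThread().isInterrupted()) {
                int read = record.read(buffer, 0, buffer.length);
                double sumSquares = 0;
                for (int i = 0; i < read; i++) {
                    // One-pole low-pass: suppress high frequencies, keep the low rumble of a blow.
                    filtered += ALPHA * (buffer[i] - filtered);
                    sumSquares += filtered * filtered;
                }
                double rms = Math.sqrt(sumSquares / Math.max(read, 1));
                if (rms > SPIKE_THRESHOLD) {
                    listener.onBlow(); // level of the low-passed signal spiked
                }
            }
            record.stop();
            record.release();
        }
    }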
EDIT
And here is a small SoundMeter class for Android:
http://code.google.com/p/android-labs/source/browse/trunk/NoiseAlert/src/com/google/android/noisealert/SoundMeter.java?r=2
I would compute an FFT and compare the spectrum with that of other common sounds. The blow will likely resemble white noise. Without seeing the spectra of a blow, of speech, and of white noise, I have no idea how to tell one from another.
Related
I am developing an Android application which records audio in PCM using the AudioRecord API. I want to adjust the mic sensitivity to low, medium, or high, as the user chooses in the settings.
Is it possible to adjust the mic sensitivity? Your answers will be highly appreciated :)
Not really. It's usually possible to get at least two different "sensitivities" (acoustic tunings used by the platform) implicitly by using different AudioSources. There should at least be one tuning for handset recording and one for far-field recording. On some devices you might also have different far-field tunings, e.g. one for recording audio a few decimeters away and one for recording audio a few meters away.
The problem is that you can't really know which AudioSource corresponds to which tuning, as there's no standard for it. CAMCORDER typically means far-field, and VOICE_RECOGNITION often means handset mode, but there's no guarantee for it. You should also keep in mind that vendors typically apply automatic gain control, noise reduction, etc that you as a user / app developer can't disable, in order to meet acoustic requirements for their products.
Your best bet would probably be to use a single AudioSource and then do attenuation of the signal in your app to simulate a lower mic sensitivity. You could do amplification as well, but that would be akin to using digital zoom in the camera app (it works, but doesn't look all that good because you're just scaling the existing data).
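As a rough sketch of that last idea: record from a single AudioSource (VOICE_RECOGNITION is often, but not always, the handset tuning, per the caveats above) and scale the PCM samples to simulate lower sensitivity. The gain values here are arbitrary examples.

    // "Sensitivity" simulated in software: scale the recorded 16-bit PCM samples.
    public class SensitivitySimulator {
        public static final float LOW = 0.25f, MEDIUM = 0.5f, HIGH = 1.0f; // arbitrary gains

        // Attenuate (gain < 1) or amplify (gain > 1) PCM in place.
        public static void applyGain(short[] pcm, int count, float gain) {
            for (int i = 0; i < count; i++) {
                int v = (int) (pcm[i] * gain);
                // Clamp to the 16-bit range in case of amplification.
                pcm[i] = (short) Math.max(Math.min(v, Short.MAX_VALUE), Short.MIN_VALUE);
            }
        }
    }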
Hello, I was wondering: using the Android ToneGenerator class, would it be possible to create a tone on one device and listen for that same tone on another device? If this is possible, I do have a few other questions.
Taking background noise into consideration, is it possible to listen for only this specific tone?
Would this process be resource-intensive?
Could I use a tone that would be inaudible to the human ear, or close to it?
Lastly, could I use a tone that could only be heard within a couple of feet of the sending device?
Thanks very much for your time, guys and girls :)
Edit >
Thanks for adding the audio-processing tag, sabastian. Much better description.
It would be CPU intensive, yes.
The way to do it is quite simple: you need a permanent recorder which feeds the received data into an FFT (fast Fourier transform). The FFT basically does one thing: it splits the audio into a frequency/power scale. With this "background noise cleaned" result you can check things like "was there a tone at 1000 Hz playing for at least 2 seconds" - and act accordingly.
There is a reasonable speed FFT implementation here: http://www.badlogicgames.com/wordpress/?p=449
The FFT can also be used (actually, IS used) for detection of dual-tone dialing (DTMF) - two frequencies at the same time work much better than just one, as the error rate drops significantly and you can go to a shorter duration for the tone sending/detecting.
"Inaudible" won't be possible, as (a) the speaker cannot produce such sounds and (b) you are limited in sampling rate - so you're also limited in both producing and recording such high frequencies.
The "couple of feet" limit will be imposed naturally (not a very loud speaker, not a very good microphone).
Have a look at this other question: "Android: Need to record mic input". I think you can modify that for your task; then, with the sound bytes, you can apply filtering or an FFT.
Hope it helps
I would like to listen to the mic (I guess using AudioRecord) and perform some action the very moment a person starts to speak. I know I can buffer audio with AudioRecord, but how do I analyze it?
Well, the difficult part will be getting the phone to recognize that it's a voice. You could set the voice recognition system as the input instead of the mic, which might be able to do that. I don't think so, though, because (I actually read all about this yesterday) the phone doesn't actually do the recognizing; it just opens up a live stream (like a phone call) to the Google servers, and they do the recognizing.
Also, the information that I have found so far points to the conclusion that Android does not support analysis of live audio from the mic. All these other apps that seem to be "live" are actually just taking a bunch of small samples and analyzing them really quickly so that they seem live. A 500 millisecond sample every 300 milliseconds seems to be common.
Luckily, on the side of my programming job, I'm also a sound technician, so I can tell you that (if you were willing to put in the work) there is a way to detect an actual voice as opposed to just sound. Every voice is split into a few distinct ratios of frequencies which all combine to make the voice we hear; each voice's ratios remain pretty constant, while different voices' ratios differ (which is why voice-based passwords work). So, if you were able to take a sample, break it up into frequency bands of about 10 Hz each, and watch the amplitude of each band, then when you got a frequency/amplitude pattern that looked similar to a voice instead of just "white noise", you'd be in business. DOING that, however, doesn't seem like it'd be easy at all. Something similar has been done before with the app called SpectralView, which displays the audio spectrum all broken up.
Also, as you can see by using Voice Search, a voice fluctuates a lot in how loud it is. You could look for that, but it wouldn't be as reliable.
In conclusion, how do you analyze it? Well, you would have to look for a pattern in the frequencies that looks like a voice. How do you do that? Well, to be honest, I don't know for sure. Sorry.
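One well-known heuristic in the same spirit (my suggestion, not from the answer above) is spectral flatness: the geometric mean of the power spectrum divided by its arithmetic mean. It's close to 1.0 for white noise and much lower for harmonic sounds like voice. A sketch, assuming you have already computed a power spectrum with an FFT:

    public final class VoiceHeuristics {
        // Spectral flatness: ~1.0 for white noise, much lower for voiced speech.
        public static double spectralFlatness(double[] powerSpectrum) {
            double logSum = 0, sum = 0;
            int n = 0;
            for (double p : powerSpectrum) {
                if (p <= 0) continue;   // skip empty bins so the log stays defined
                logSum += Math.log(p);
                sum += p;
                n++;
            }
            if (n == 0) return 1.0;     // silence: treat as "not voice"
            double geometricMean = Math.exp(logSum / n);
            double arithmeticMean = sum / n;
            return geometricMean / arithmeticMean;
        }
    }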
I am writing an application that will behave similarly to the existing voice recognition, but will send the sound data to a proprietary web service to perform the speech-recognition part. I am using the standard MediaRecorder (which is AMR-NB encoded), which seems to be perfect for speech recognition. The only data provided by this is the amplitude, via the getMaxAmplitude() method.
I am trying to detect when the person starts to talk, so that when the person stops talking for about 2 seconds I can proceed to send the sound data to the web service. Right now I am using a threshold for the amplitude: if it goes over a value (i.e. 1500), then I assume the person is speaking. My concern is that the amplitude levels may vary by device (i.e. Nexus One vs. Droid), so I am looking for a more standard approach that can be derived from the amplitude values.
P.S.
I looked at graphing-amplitude but it doesn't provide a way to do it with just the amplitude.
Well, this might not be of much help, but how about starting by measuring the ambient noise captured by the device's microphone, and applying the threshold dynamically based on that? That way you would make it adaptive both to different devices' microphones and to the environment the user is in at a given time.
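A minimal sketch of that calibration idea, assuming a MediaRecorder that is already recording; the margin multiplier is an arbitrary starting point:

    // Derive a speaking threshold from the ambient noise at startup.
    public class AdaptiveThreshold {
        private static final double MARGIN = 2.5; // how far above the noise floor counts as speech (tune)

        public static int calibrate(android.media.MediaRecorder recorder, int samples)
                throws InterruptedException {
            long sum = 0;
            for (int i = 0; i < samples; i++) {
                sum += recorder.getMaxAmplitude(); // max amplitude since the previous call
                Thread.sleep(100);                 // sample the ambient noise ~10x per second
            }
            long noiseFloor = sum / samples;
            return (int) (noiseFloor * MARGIN);    // threshold relative to this device/environment
        }
    }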
1500 is too low of a number. Measuring the change in amplitude will work better.
However, it will still result in missed detections.
I fear the only way to solve this problem is to figure out how to recognize a simple word or tone rather than simply detect noise.
There are now multiple VAD (voice activity detection) libraries designed for Android. One of these is:
https://github.com/gkonovalov/android-vad
Most smartphones come with a proximity sensor, and Android has an API for using these sensors. This would be adequate for the job you described. When the user moves the phone near to their ear, you can code the app to start recording. It should be easy enough.
Sensor class for Android
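A minimal sketch of that idea (the Runnable callback is a placeholder for whatever starts your recording):

    import android.content.Context;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;

    // Fires a callback when the phone is held near the ear.
    public class ProximityTrigger implements SensorEventListener {
        private final SensorManager manager;
        private final Sensor proximity;
        private final Runnable onNear; // e.g. starts your recorder

        public ProximityTrigger(Context context, Runnable onNear) {
            this.manager = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
            this.proximity = manager.getDefaultSensor(Sensor.TYPE_PROXIMITY);
            this.onNear = onNear;
        }

        public void start() { manager.registerListener(this, proximity, SensorManager.SENSOR_DELAY_NORMAL); }
        public void stop()  { manager.unregisterListener(this); }

        @Override public void onSensorChanged(SensorEvent event) {
            // Many devices only report two values: 0 (near) and the maximum range (far).
            if (event.values[0] < proximity.getMaximumRange()) {
                onNear.run();
            }
        }

        @Override public void onAccuracyChanged(Sensor sensor, int accuracy) { }
    }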
I'm trying to build a gadget that detects pistol shots using Android. It's a part of a training aid for pistol shooters that tells how the shots are distributed in time and I use a HTC Tattoo for testing.
I use the MediaRecorder and its getMaxAmplitude method to get the highest amplitude during the last 1/100 s, but it does not work as expected; speech gives me values from getMaxAmplitude in the range from 0 to about 25000, while the pistol shots (or shouting!) only reach about 15000. With a sampling frequency of 8 kHz there should be some samples with a considerably high level.
Does anyone know how these things work? Are there filters applied before the max amplitude is registered? If so, are they in hardware or software?
Thanks,
/George
It seems there's an AGC (Automatic Gain Control) filter in place. You should also be able to identify the shot by its frequency characteristics. I would expect it to show up across most of the audible spectrum, but get a spectrum analyzer (there are a few on the app market, like SpectralView) and try identifying the event by its frequency "signature" and amplitude. If you clap your hands, what do you get for max amplitude? You could also try covering the phone with something to muffle the sound, like a few layers of cloth.
It seems like the AGC is in the media recorder. When I use AudioRecord I can detect shots using the amplitude, even though it sometimes reacts to sounds other than shots. This is not a problem, since the shooter usually doesn't make any other noise while shooting.
But I will do some FFT too to get it perfect :-)
Sounds like you figured out your AGC problem. One further suggestion: I'm not sure the FFT is the right tool for the job. You might have better detection and lower CPU use with a sliding power estimator.
e.g.
signal => square => moving average => peak detection
All of the above can be implemented very efficiently using fixed-point math, which fits well with mobile Android platforms.
You can find more info by searching for "Parseval's Theorem" and "CIC filter" (cascaded integrator-comb).
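A minimal fixed-point sketch of that pipeline; the window length is arbitrary, and a simple threshold comparison stands in for real peak detection (which would also want a local maximum and some hold-off time):

    // signal => square => moving average => (simplified) peak detection, in integer math.
    public class SlidingPowerDetector {
        private static final int WINDOW = 64;      // moving-average length, a power of two
        private final long[] ring = new long[WINDOW];
        private long sum = 0;
        private int pos = 0;

        // Returns true when the short-term power crosses the threshold.
        public boolean process(short sample, long threshold) {
            long squared = (long) sample * sample; // instantaneous power
            sum += squared - ring[pos];            // slide the window: add newest, drop oldest
            ring[pos] = squared;
            pos = (pos + 1) & (WINDOW - 1);
            long avgPower = sum >> 6;              // divide by WINDOW = 64 with a shift
            return avgPower > threshold;
        }
    }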
Sorry for the late response; I didn't see this question until I started searching for a different problem...
I have started an application to do what I think you're attempting. It's an audio-based lap timer (button to start/stop recording, and loud audio noises for lap setting). It's not finished, but it might provide you with a decent base to get started.
Right now, it allows you to monitor the signal volume coming from the mic, and set the ambient noise amount. It's also using the new BSD license, so feel free to check out the code here: http://code.google.com/p/audio-timer/. It's set up to use the 1.5 API to include as many devices as possible.
It's not finished, in that it has two main issues:
The audio capture doesn't currently work on emulated devices, because of the unsupported frequency requested.
The timer functionality doesn't work yet; I was focusing on getting the audio capture working first.
I'm looking into the frequency support, but Android doesn't seem to have a way to find out which frequencies are supported without trial and error per-device.
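For what it's worth, the trial and error can at least be automated at startup: AudioRecord.getMinBufferSize() returns an error code for parameter combinations the device rejects. A sketch using the current constant names (the 1.5-era API used AudioFormat.CHANNEL_CONFIGURATION_MONO instead of CHANNEL_IN_MONO):

    import android.media.AudioFormat;
    import android.media.AudioRecord;

    // Probe which sampling rates the device accepts.
    public class SampleRateProbe {
        private static final int[] CANDIDATES = { 8000, 11025, 16000, 22050, 44100 };

        public static int findSupportedRate() {
            for (int rate : CANDIDATES) {
                int bufSize = AudioRecord.getMinBufferSize(rate,
                        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
                if (bufSize > 0) {
                    return rate; // this rate is reported as usable
                }
            }
            return -1;           // nothing worked; fall back or report an error
        }
    }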
I also have, on my local dev machine, some extra code to create a layout for the ListView items to display "lap" information. I got sidetracked by the frequency problem, though. But since the display and audio capture are pretty much done, using the system time to fill in the display values for timing information should be relatively straightforward, and then it shouldn't be too difficult to add the ability to export the data table to a CSV on the SD card.
Let me know if you want to join this project, or if you have any questions.