We are working on a cross-platform project that samples sound volume on smartphones and analyses the result with as high accuracy as possible. The iPhone developer used functionality built into iOS that returns the sound power/volume on a dB scale, calculated by the OS itself. As far as I know there is no equivalent functionality in the Android OS.
As of now I am working on Android with the MediaRecorder class provided by the OS, and I use getMaxAmplitude to measure the sound power/volume. I have seen a lot of answers on the net about how to convert amplitude to a dB scale; the one that sounded most reasonable uses the formula:
20*Math.log10(amplitude/MAX_AMPLITUDE)
but then I must know the maximum value that can be returned by getMaxAmplitude, and that differs between devices. For example, I tested getMaxAmplitude on an HTC Desire and on a Samsung Galaxy S3:
on the HTC it reached 32767 (which, according to some answers, is the documented maximum), while on the S3 it never went beyond 16383 (half of the HTC's value).
Q1:
Is this (the approach discussed above) the correct approach? I read that the correct way to measure sound power/volume is to calculate the RMS and then convert it to dB; is this how it is done on the iPhone?
Q2:
Whether I use the RMS or just the amplitude from getMaxAmplitude, it seems I still need to know the highest amplitude the recording hardware can return. Is there a way to find that out, or a way to work around it?
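For what it's worth, here is a minimal sketch of the conversion in Java; MAX_AMPLITUDE here is an assumed reference of 32767 (the 16-bit full-scale value), which, as the question notes, some devices never actually reach:

    import android.media.MediaRecorder;

    public class AmplitudeToDb {
        // Assumed reference: 16-bit full scale. Readings are only comparable
        // across devices if you calibrate this value per model.
        private static final double MAX_AMPLITUDE = 32767.0;

        // Convert a getMaxAmplitude() reading to dB relative to full scale (<= 0).
        public static double toDbFs(int amplitude) {
            if (amplitude <= 0) return Double.NEGATIVE_INFINITY; // silence
            return 20.0 * Math.log10(amplitude / MAX_AMPLITUDE);
        }

        // Example: poll an already-started MediaRecorder.
        public static double sample(MediaRecorder recorder) {
            return toDbFs(recorder.getMaxAmplitude());
        }
    }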
90dBspl is an rms value in the acoustic domain.
The digital level of 2500 rms in a 16-bit system is approximately -22dBFS rms (actually -22.35), where 0dBFS rms is a full-scale square wave. A full-scale sinusoid in such a system is 0dBFS peak and -3dBFS rms (spanning -32768 to +32767).
A square wave of +/-2500 can be calculated as:
20 * log10(2500 / 32767) ≈ -22.35 dBFS rms
Please note that the peak of a sinusoid is always 3dB higher than its rms level. The only signal whose rms and peak levels are equal is the square wave.
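As a quick numerical check of that statement (a standalone sketch, not part of the original answer):

    public class CrestFactorDemo {
        public static void main(String[] args) {
            int n = 44100; // one second at 44.1 kHz
            double sumSine = 0, sumSquare = 0;
            for (int i = 0; i < n; i++) {
                double s = Math.sin(2 * Math.PI * 1000 * i / 44100.0);
                sumSine += s * s;
                double q = (s >= 0) ? 1.0 : -1.0; // square wave at the same frequency
                sumSquare += q * q;
            }
            double rmsSine = Math.sqrt(sumSine / n);     // ~0.707 -> -3 dBFS rms
            double rmsSquare = Math.sqrt(sumSquare / n); //  1.0   ->  0 dBFS rms
            System.out.printf("sine:   %.2f dBFS rms%n", 20 * Math.log10(rmsSine));
            System.out.printf("square: %.2f dBFS rms%n", 20 * Math.log10(rmsSquare));
        }
    }

Both waves peak at 1.0 (0dBFS peak), so the sine's peak sits 3dB above its rms level while the square's does not.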
Now, Android has a requirement of 30dB linearity around 90dBspl, but this linearity shall be +12dB above 90dBspl and -18dB below the same point. Outside this range there can be compression in different ways, depending on which phone model you test.
The guaranteed highest linear level inside an Android phone is -22dBFS +12dB = -10dBFS rms. Above this level it is uncertain. The most common scenario is that the last 7dB of peak headroom are still linear, leading to an acoustic maximum level of 90dBspl + (22-3 dB) = 109dB spl rms for a sinusoidal without clipping (or 112 dB spl peak).
In some phones you will find a peak limiter that reduces the gain above 102dBspl rms. The outcome of this is that you can still record up to the level of saturation for the microphone. This saturation level varies, but it is common to have like 2% distortion at 120dB spl. Above this level the microphone component starts to saturate and clip.
Looking at the other end of the scale:
Small phone microphones are in general noisy. The latest microphones can have a noise floor at -63dB below 0dBPa (94dBspl), but most microphones are between -58 and -60dB below 0dBPa.
How can this be converted to dBFS rms?
0dBPa rms is 94dB spl rms. From the statement above we know that a 90dBspl rms acoustic level is recorded at a digital level of -22dBFS rms in Android phones, so 0dBPa corresponds to -22dBFS rms + 4dB = -18dBFS rms. A noise floor 63dB below 0dBPa is therefore at -18dB - 63dB = -81dBFS rms. The absolute dynamic range of a 16-bit system can be approximated as 96dB (or 93dB, depending on how you see it), so the noise level is at least 12dB above the quantization noise in the digital file.
This is a very important finding for video recording mode. Unfortunately, many video applications on Android tend to use too high a microphone gain when recording. This leads to clipping when recording loud music concerts and similar situations. We also know that the microphone itself is good up to at least 120dB. So it would be a good idea for any audio system engineer to make a video recording mode that actually uses the whole dynamic range of the microphone. This means the gain should be set at least 8dB lower. It is always possible to raise the rms level of a video recording afterwards if the sound is too soft, but if it is clipped, the recording is damaged forever.
So, my message to you programmers is to implement a video recording mode where the acoustic level of 90dB spl rms is recorded at -30dBFS rms or slightly below. Any maximization can be done afterwards. That way we could record rock concerts with much better sound. Automatic gain control does not help the sound quality: the dynamic range is often too big to be controlled automatically, and you get a lot of pumping in the sound. It is better to implement two different video recording modes, a concert mode and a speech mode. In speech mode (optimized for a person talking at 1m distance) the recording gain could be even higher than -22dBFS rms for 90dBspl; I would say -12dBFS rms for 90dBspl would be a suitable recording level (speech at 1m distance has an rms level of approximately 57dB spl, with peaks 20-30dB higher).
Björn Gröhn
Audio system engineer at Sony mobile Lund, Sweden
Related
The Android Compatibility Definition Document states that:
(1) "Audio input sensitivity SHOULD be set such that a 90 dB sound power level (SPL) source at 1000 Hz yields RMS of 2500 for 16-bit samples."
(2) "PCM amplitude levels SHOULD linearly track input SPL changes over at least a 30 dB range from -18 dB to +12 dB re 90 dB SPL at the microphone."
Questions:
Does (1) include the mic sensitivity plus the internal gain of the Android device needed to achieve an RMS of 2500?
Is (2) about the mic's maximum acoustic level only, or does it also include the internals of the Android device?
Your questions are confusing me. I think you are mixing different levels and gains.
An acoustic level of 90dBspl rms is transferred into the electrical domain through the microphone. The microphone uses a different acoustic unit for this conversion: it measures sound pressure level in dBPa or Pa (94dBspl = 0dBPa). A specified sensitivity of -42dBV/Pa means that if you have 0dBPa, or 1Pa (which is the same sound pressure level), you will get -42dBV out of the microphone in the analog electrical domain. -42dBV = 7.94mV (0dBV = 1V).
Now, from this point there can be different gains, analog and digital. First you can have some analog gain, then an A/D converter. After that you are in the digital domain, where you can have digital gain as well. The Android requirement does not specify these gains; it specifies what final digital level you should have for a given acoustic sound pressure level. You can of course calculate each and every step inside the sound chain, but the easiest way is to set all digital gains to 0dB and perhaps set the analog gain to something around +20dB (if possible), and then produce an acoustic sound source with the proper sound pressure level. You will need a sound pressure level meter and a sinusoidal 1kHz tone played through a loudspeaker at maybe 20cm distance in a fairly non-reverberant, echo-free room.
Now you record the 90dBspl 1kHz tone with your device and analyze the recording in the digital domain. If you can, you should adjust the gain in the analog domain; then the digital headroom will be correct. If you do not know what you are doing, you could easily adjust too much in the digital domain, leading to digital clipping or quantization noise. Digital gain should only be applied once you have done everything you can in the analog domain.
If everything is correctly adjusted, you will have a good match between the 90dBspl rms acoustic level and the recorded digital level of -22dBFS rms, which is the level of 2500 rms in a 16-bit system (this is, however, a very strange way of measuring). 0dBFS rms is a fully saturated square wave in such a system; a fully saturated sinusoid is -3dBFS rms, or 0dBFS peak.
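As an illustration of that analysis step, here is a minimal sketch, assuming the 1kHz calibration tone has already been captured into a short[] of 16-bit PCM samples (e.g. via AudioRecord); the -22.35 dBFS rms target is the CDD figure of 2500 rms discussed above:

    public class CalibrationCheck {
        // RMS level of 16-bit PCM samples in dBFS rms
        // (0 dBFS rms = full-scale square wave).
        public static double rmsDbFs(short[] pcm) {
            double sum = 0;
            for (short s : pcm) sum += (double) s * s;
            double rms = Math.sqrt(sum / pcm.length);
            return 20.0 * Math.log10(rms / 32767.0);
        }

        // Offset between the measured level of the 90 dBspl tone and the
        // target; a positive result means the gain is set too high.
        public static double gainErrorDb(short[] toneRecording) {
            double target = 20.0 * Math.log10(2500.0 / 32767.0); // ~ -22.35 dBFS rms
            return rmsDbFs(toneRecording) - target;
        }
    }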
Be aware that if you have enabled any automatic gain control, you will probably not be able to comply with the linearity requirement.
I am developing an Android app for recording sound. In my app I will display the SPL (Sound Pressure Level) in dB. In my research I came across the claim that mobile hardware can only record sounds up to about 110 dB. The reason given is that mobiles are designed for recording the human voice, which falls in the range of about 60 dB. So if I need to record sounds louder than 110 dB, how will the mobile hardware respond? Do I need to rely on external devices rather than the mobile? Please provide your comments.
Thanks & regards,
Siva.
Your question is in fact about the dynamic range of the audio input of a mobile phone: any value you record must be capable of being represented on the scale used to measure it.
There is an associated question of the largest sound pressure level a particular phone can record, but this is ultimately limited by the dynamic range and the design of the transducer used. Any absolute measure is relative to a calibration point, which in digital audio systems is dB FSD (i.e. the ratio of the sample to the maximum), yielding negative values.
The dynamic range in dB of an ideal PCM system is limited by quantisation noise and is related directly to the bit depth (Q) of the sample:
SQNR = 20 * log10(2^Q) ≈ 6.02Q
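A quick check of this formula for common bit depths (a standalone sketch, not from the original answer):

    public class SqnrDemo {
        public static void main(String[] args) {
            for (int q : new int[] {8, 16, 24}) {
                double sqnr = 20.0 * Math.log10(Math.pow(2, q)); // ~6.02 dB per bit
                System.out.printf("%2d-bit PCM: %.1f dB%n", q, sqnr);
            }
        }
    }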
State-of-the-art ADCs used in pro-audio equipment typically have a 24-bit sample depth, giving an SQNR of 144dB. It's worth noting that in silicon ADCs and DACs, the thermal noise floor of the analogue section of the converter yields a smaller dynamic range than this, and the LSB might as well be random.
AFAIK, Android is using 16-bit PCM, which has a SQNR of 96dB. This is the same performance as the CD Audio standard. A SNR of 110dB wouldn't be bad for pro-audio equipment.
In practice, audio quality is rarely a headline feature of phones and most get nowhere near this. Most users use crappy headphones or the on-board speaker of their phone for voice calls and won't notice the difference. It's an obvious corner to cut from both a cost and power budget point of view for a phone manufacturer.
Additionally, good digital audio design is a black art. Factors such as decoupling of digital signals from ground and the physical proximity of analogue components come into play. In tear-downs of Apple kit you find that they often place the codec right next to the headphone jack, away from the main board of the system. Other, cost-conscious manufacturers don't do this, and it degrades the dynamic range of the system.
In order to get meaningful measurements from the audio input you will need to disable both automatic gain control (AGC) and probably the HPF (high-pass filter, used to remove DC bias and often set with Fc > 100Hz for voice calls).
If your intention is to record absolute SPL, you will need to calibrate the audio system of the device to a set-point. There is no standardisation of this between manufacturers (or even between devices from any given manufacturer). Unless you fancy doing this for all the devices on the market (of which there are a lot), you'll never provide universally accurate measurements.
I'm processing audio using a Samsung Galaxy mini phone and also a Nexus 7 tablet.
I've been using the AudioRecord class, and until now I have been able to correctly analyze audio from frequencies of 200 to ~20000 Hz.
I'm detecting pitch through auto-correlation, based on this code: http://tarsos.0110.be/artikels/lees/YIN_Pitch_Tracker_in_JAVA
I am using a 44100Hz sampling frequency, and I have also tried 8000Hz.
I have not been able to detect pitch at lower frequencies; I can hardly detect 100Hz by pointing the microphone at a speaker.
Does someone know the input frequency response of these devices, or whether they are limited physically or in software?
I would like to be able to detect correctly from at least 50Hz, because I'm trying to build a voice detector and I'm struggling with these low frequencies when trying to detect male voices.
Thank you all.
-Jessica
I can't tell you what the low-frequency limit of these microphones is.
Out of curiosity I did some tests with YIN here...
I'm using a window of 2048 samples with an overlap of 1024, and I can find frequencies above 40Hz in recorded files sampled at 44100Hz, which proves to me that the algorithm can find low frequencies.
You can do tests with your phone using a pure 50Hz sinusoid and see if your code can track it.
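If it helps, here is a sketch of generating such a test tone on an Android device with AudioTrack (this uses the older, pre-API-26 constructor; play the tone through a decent external speaker, since small phone speakers reproduce 50Hz poorly):

    import android.media.AudioFormat;
    import android.media.AudioManager;
    import android.media.AudioTrack;

    public class TestTone {
        // Synthesize and play one second of a pure sinusoid (16-bit mono PCM).
        public static void play(double freqHz) {
            int sampleRate = 44100;
            short[] buf = new short[sampleRate]; // one second of samples
            for (int i = 0; i < buf.length; i++) {
                double s = Math.sin(2 * Math.PI * freqHz * i / sampleRate);
                buf[i] = (short) (s * Short.MAX_VALUE * 0.8); // leave some headroom
            }
            AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
                    AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
                    buf.length * 2, AudioTrack.MODE_STATIC);
            track.write(buf, 0, buf.length); // static mode: write before play
            track.play();
        }
    }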
"The fundamentals of human voices are roughly in the range of 80 Hz to 1100 Hz"
My guess is that the microphones in smartphones are just not that good :-(
I need to find a way to get the current audio output volume while the phone is playing sound over the headphones; this value will then be converted to a decibel level. The Android API does not appear to have any way of accessing a consistent volume level other than a seemingly arbitrary volume-setting level, and I don't see a way to convert that to a standard decibel level or "loudness" measurement. I have seen some ways to use the mic for this, but that won't work well with headsets.
Does anyone know a way to measure either the maximum possible decibel output level (or some standard) to compare against, or possibly the voltage being sent to the headset?
Help is welcome.
Be aware that there are many different meanings of the word 'decibel'. It is a means of representing some quantity (such as intensity/power/loudness) relative to a reference point. For audio signals inside equipment, or in an audio application, there is a peak level of 0dB. When sound is emitted from a speaker, the perceived loudness is measured as a sound pressure level, often written 'dB (SPL)' (or weighted variants such as dBA). When you see tables of values such as rock concerts at 100dB, it is the SPL that is being described. This measurement is itself relative to a reference level.
So what you will have available in the API is the buffer of audio data, from which you can easily obtain the audio level of the raw signal (which has a maximum of 0dB). You can't, however, easily convert this to a physical loudness, because that depends on the hardware: it will differ between one model of phone and the next, and will depend on the headphones too. The only way of doing this is to calibrate the phone by measuring with an SPL meter, but that will only give reasonable results on that particular phone.
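A sketch of that two-step idea; the calibration constant here is hypothetical, standing in for a value you would measure once per phone/headphone combination with an SPL meter:

    public class OutputLevelMeter {
        // Hypothetical per-device constant: the SPL measured when playing a
        // full-scale signal on this particular phone/headphone combination.
        private static final double DB_SPL_AT_FULL_SCALE = 100.0;

        // Peak level of a 16-bit buffer in dBFS (always <= 0).
        public static double peakDbFs(short[] buffer) {
            int peak = 1; // avoid log(0) on a silent buffer
            for (short s : buffer) peak = Math.max(peak, Math.abs((int) s));
            return 20.0 * Math.log10(peak / 32768.0);
        }

        // Estimated loudness; only meaningful on the calibrated device.
        public static double estimatedDbSpl(short[] buffer) {
            return DB_SPL_AT_FULL_SCALE + peakDbFs(buffer);
        }
    }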
I'm doing it like this:
#include <math.h>            /* M_LN2, log() */
#include <SLES/OpenSLES.h>   /* SLmillibel, SL_MILLIBEL_MAX, SL_MILLIBEL_MIN */

/* Map a linear volume in [0, 1] to an OpenSL ES level in millibels. */
SLmillibel gain_to_attenuation(float volume)
{
    SLmillibel volume_mb;
    if (volume >= 1.0f)
        volume_mb = SL_MILLIBEL_MAX;     /* full volume */
    else if (volume <= 0.02f)
        volume_mb = SL_MILLIBEL_MIN;     /* effectively mute */
    else
    {
        volume_mb = M_LN2 / log(1.0f / (1.0f - volume)) * -1000.0f;
        if (volume_mb > 0)               /* the formula should always come out */
            volume_mb = SL_MILLIBEL_MIN; /* negative; mute if something went wrong */
    }
    return volume_mb;
}
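For reference, the returned millibel value is the kind of argument the OpenSL ES SLVolumeItf SetVolumeLevel call expects; it is worth clamping it against GetMaxVolumeLevel for the device first, since implementations are not required to accept the full SLmillibel range.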
I want to determine the volume of an audio signal.
I have found two options:
1. Compute the root mean square (RMS) of the amplitude
2. Find the maximum amplitude
Are there advantages to using #1 or #2?
Here is what I am trying to do:
I want my Android device to analyze audio from the microphone. I want the device to detect a loud noise. The input is a short [].
If you use the maximum amplitude (#2), your volume level is determined by a single sample (which you might not even be able to hear). When calculating a value that correlates with your impression of the loudness of the sound, such as the sound pressure level or the sound power level, you need to use the RMS (#1).
Because your ear is not equally sensitive to all frequencies, a better correlate of your perception can be had by applying an A-weighting to the signal: split (filter) the signal into octave bands, calculate the RMS for each band, and apply the A-weighting.
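A minimal sketch of the RMS approach for the short[] input described above; the -20 dBFS threshold is an arbitrary placeholder you would tune for your definition of "loud":

    public class LoudNoiseDetector {
        private static final double THRESHOLD_DBFS = -20.0; // placeholder; tune empirically

        // RMS of a 16-bit buffer, expressed in dBFS.
        public static double rmsDbFs(short[] buffer) {
            double sum = 0;
            for (short s : buffer) sum += (double) s * s;
            double rms = Math.sqrt(sum / buffer.length);
            return 20.0 * Math.log10(Math.max(rms, 1.0) / 32768.0);
        }

        public static boolean isLoud(short[] buffer) {
            return rmsDbFs(buffer) > THRESHOLD_DBFS;
        }
    }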
If you want to check the volume level, just compute its dB value (I assume the signal is normalized, i.e. 1 == maximum level):
level[n] = -20 * log10(1 / signal[n]);   // equivalently, 20 * log10(signal[n])
However, detecting audio noise is not a trivial task. The most common and simple technique is an algorithm called a noise gate, which basically compares the signal level with some dB threshold value: if the signal level is below the threshold, the output is zeroed. But in this bare form it is unusable in practice; there must also be attack and release times for smooth thresholding, otherwise it would affect the real signal too (music, speech) and produce a kind of clipping. A sketch of such a gate follows the links below.
Check Google, it will give you a lot of resources about NoiseGate algorithm and noise removal techniques:
http://en.wikipedia.org/wiki/Noise_gate
http://www.developer.com/java/other/article.php/3599661/Adaptive-Noise-Cancellation-using-Java.htm
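As promised, a minimal sketch of a noise gate with attack/release smoothing; the coefficients are illustrative, and a real implementation would derive them from the sample rate and the desired time constants:

    public class NoiseGate {
        private final double thresholdDbFs; // gate opens above this level
        private final double attackCoeff;   // smoothing toward "open" (fast, e.g. 0.01)
        private final double releaseCoeff;  // smoothing toward "closed" (slow, e.g. 0.0005)
        private double envelope = 0.0;      // smoothed signal level, 0..1
        private double gain = 0.0;          // current gate gain, 0..1

        public NoiseGate(double thresholdDbFs, double attackCoeff, double releaseCoeff) {
            this.thresholdDbFs = thresholdDbFs;
            this.attackCoeff = attackCoeff;
            this.releaseCoeff = releaseCoeff;
        }

        // Process one 16-bit buffer in place.
        public void process(short[] buffer) {
            double threshold = Math.pow(10.0, thresholdDbFs / 20.0); // to linear
            for (int i = 0; i < buffer.length; i++) {
                double x = buffer[i] / 32768.0;
                // Track the signal envelope with a simple peak follower.
                envelope = Math.max(Math.abs(x), envelope * 0.999);
                double target = (envelope > threshold) ? 1.0 : 0.0;
                // Open quickly (attack), close slowly (release).
                double coeff = (target > gain) ? attackCoeff : releaseCoeff;
                gain += coeff * (target - gain);
                buffer[i] = (short) (x * gain * 32767.0);
            }
        }
    }

For example, new NoiseGate(-40.0, 0.01, 0.0005) is a plausible starting point for 44100Hz buffers; the slow release avoids the pumping and clipping effects mentioned above.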