I'm trying to stream audio recorded on Android to a microcontroller for playback. The audio is recorded using the AudioRecord class and is then sent over UDP. On the receiving side, the microcontroller receives the data and plays it using PWM. There are a couple of problems, though:
I don't know exactly what format the AudioRecord class uses. I'm using ENCODING_PCM_16BIT, but I don't even know whether it's bipolar (signed) or not, and how to convert it to unipolar if it is.
Due to limited bandwidth, I can't send more than 8 bits per sample. Since 8-bit PCM isn't supported on my phone, I've used the 16-bit version, and for the conversion I've just used the upper 8 bits. I'm not sure if that's right.
Since I've used an unusual crystal oscillator for my circuit, the audio has to be sampled at 7.2 kHz. My phone supports 8 kHz sampling, so I just use that and send 90% of the recorded data (using a for loop with a float as the loop variable).
I've hooked up a 2 W speaker to the OC2 pin on my ATmega32 using a 220 Ohm resistor and a 100 nF capacitor to act as a filter (Schematic), but again I'm not sure if it's the correct way to do it.
So, all of this put together produces nothing but noise as output. The only thing that changes when I "make some noise" near the mic is the volume and the pattern of the output noise. The pattern doesn't make any sense, though, and is the same for human voice or music.
This is the piece of code I wrote to convert the data before sending it over UDP:
// Down-sample from 8 kHz to 7.2 kHz by stepping through the buffer in
// increments of 8/7.2, keeping only the upper 8 bits of each 16-bit sample.
float divider = 8 / 7.2f;
int index = 0;
recorder.read(record_buffer, 0, buffer_size);
for (float i = 0; i < buffer_size; i += divider)
{
    send_buffer[index++] = (byte) (record_buffer[(int) i] >> 8);
}
I don't know where to go from here. Any suggestion is appreciated.
Update:
I took RussSchultz's advice, sent a sine wave over UDP, and hooked the output up to my cheap oscilloscope. This is what I get:
No Data : http://i.stack.imgur.com/1XYE6.png
No Data Close-up: http://i.stack.imgur.com/ip0ip.png
Sine : http://i.stack.imgur.com/rhtn0.png
Sine Close-up: http://i.stack.imgur.com/12JxZ.png
There are gaps when I start sending the sine wave, which could be the result of a buffer overflow on the hardware. Since the gaps follow a pattern, it can't be UDP data loss.
So, after working on this for a month, I got it to work.
I don't know exactly what format the AudioRecord class uses. I'm using ENCODING_PCM_16BIT, but I don't even know whether it's bipolar (signed) or not, and how to convert it to unipolar if it is.
Due to limited bandwidth, I can't send more than 8 bits per sample. Since 8-bit PCM isn't supported on my phone, I've used the 16-bit version, and for the conversion I've just used the upper 8 bits. I'm not sure if that's right.
It was bipolar (signed). I had to convert it to 8-bit by adding half the dynamic range to each sample and then taking the upper 8 bits, as sketched below.
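In code, that conversion looks roughly like this (the method name is mine):

// Shift the signed 16-bit range [-32768, 32767] up to [0, 65535],
// then keep the upper 8 bits. Treat the result as unsigned on the uC side.
byte toUnsignedPcm8(short sample) {
    int shifted = sample + 32768;   // add half the 16-bit dynamic range
    return (byte) (shifted >>> 8);  // upper 8 bits, 0..255
}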
Since I've used an unusual crystal oscillator for my circuit, the audio has to be sampled at 7.2 kHz. My phone supports 8 kHz sampling, so I just use that and send 90% of the recorded data (using a for loop with a float as the loop variable).
Even though I have a slight frequency shift, it's still acceptable.
I've hooked up a 2 W speaker to the OC2 pin on my ATmega32 using a 220 Ohm resistor and a 100 nF capacitor to act as a filter (Schematic), but again I'm not sure if it's the correct way to do it.
I changed the filter to an exact 3.6 kHz low-pass RC one (using one of the many online calculators). The speaker should not be connected directly, because it requires more current than a µC can provide. You will still get an output, but the quality is not good at all. What you should do is drive the speaker using a Darlington pair or (as I have) a simple op-amp circuit.
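For reference, the cutoff of an RC low-pass is f_c = 1/(2*pi*R*C), so if you keep the 220 Ohm resistor from the question (an assumption on my part), the capacitor for a 3.6 kHz cutoff works out to roughly 200 nF:

double r = 220.0;    // ohms, resistor from the original schematic
double fc = 3600.0;  // desired cutoff frequency in Hz
double c = 1.0 / (2 * Math.PI * r * fc);
System.out.printf("C = %.0f nF%n", c * 1e9);  // prints: C = 201 nF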
Related
I am using the noise_meter package to read noise in decibels. When I run the app, it records almost 120 readings per second. I don't want that many readings. Is there any way to specify that I want only one or two readings per second? Thanks in advance.
I am using code from GitHub which is already written using noise_meter (the noise_meter example in the GitHub repo).
I tried to calculate the number of samples using the sample rate, which is 44100 in the package, but I can't understand it.
As you can see in the source code, audio_streamer uses a fixed-size buffer and an audio sample rate of 44100, and includes this comment: "Uses a buffer array of size 512. Whenever buffer is full, the content is sent to Flutter." So small audio blocks will arrive at the consumer frequently (as you might expect from a streamer). It doesn't seem possible to adjust this.
The noise_meter package simply takes each block of audio and calculates the noise level, so the rate of arrival of the readings is exactly the same as the rate of arrival of audio blocks from the underlying package.
Given the simplicity of the noise_meter calculation, you could replace it with your own code directly on top of audio_streamer. You just need to collect multiple blocks of audio together before performing the simple decibel calculation, along the lines sketched below.
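A minimal sketch of that accumulation (in Java for illustration, though the packages themselves are Dart; the class and method names are mine, and it assumes 16-bit PCM, emitting one RMS reading in dBFS per second's worth of samples):

import java.util.ArrayList;
import java.util.List;

class NoiseAccumulator {
    private final List<short[]> blocks = new ArrayList<>();
    private int collected = 0;
    private final int samplesPerReading;  // e.g. the sample rate, for 1 reading/s

    NoiseAccumulator(int samplesPerReading) {
        this.samplesPerReading = samplesPerReading;
    }

    // Feed every incoming block; returns a dBFS value once enough samples
    // have accumulated, or null while still collecting.
    Double addBlock(short[] block) {
        blocks.add(block);
        collected += block.length;
        if (collected < samplesPerReading) return null;
        double sumSquares = 0;
        for (short[] b : blocks)
            for (short s : b) sumSquares += (double) s * s;
        double rms = Math.sqrt(sumSquares / collected);
        blocks.clear();
        collected = 0;
        return 20 * Math.log10(rms / 32768.0);  // 0 dBFS = full scale
    }
}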
Alternatively, you could simply discard N out of every N+1 readings.
I've just written some iOS code that uses Audio Units to get a mono float stream from the microphone at the hardware sampling rate.
It's ended up being quite a lot of code! First I have to set up an audio session, specifying a desired sample rate of 48kHz. I then have to start the session and inspect the sample rate that was actually returned. This will be the actual hardware sampling rate. I then have to set up an audio unit, implementing a render callback.
But I am at least able to use the hardware sampling rate (so I can be certain that no information is lost through software re-sampling). And I am also able to set the smallest possible buffer size, so that I achieve minimal latency.
What is the analogous process on android?
How can I get down to the wire?
PS: Nobody has mentioned it yet, but it appears to be possible to work at the JNI level.
The AudioRecord class should be able to help you do what you need from the Java/Kotlin side of things. It will give you raw PCM data at the sampling rate you requested (assuming the hardware supports it). It's up to your app to read the data out of the AudioRecord instance in an efficient and timely manner so that it does not overflow the buffer and drop data. A minimal setup is sketched below.
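A rough sketch in Java (this goes inside your own recording class and assumes the RECORD_AUDIO permission has already been granted; 48000 is only a request, and recorder.getSampleRate() reports what the hardware actually gave you):

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

int sampleRate = 48000;  // a request; check recorder.getSampleRate() after
int minBuf = AudioRecord.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
        sampleRate, AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        minBuf);  // the smallest legal buffer keeps latency minimal

short[] buffer = new short[minBuf / 2];  // minBuf is in bytes, not shorts
recorder.startRecording();
int n = recorder.read(buffer, 0, buffer.length);  // call this in a tight loop;
recorder.stop();                                  // slow reads overflow the
recorder.release();                               // buffer and drop data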
Is there a library for Android which converts bits to sound? For example, we have a number as a BigInteger. This number is converted to bytes, and these bytes are then presented as a short sound at different frequencies (ideally ultrasound). Of course, working in the opposite direction would be great too: the microphone picks up the sound and converts it back into bits.
Has anybody heard of this kind of library?
I really struggle with FFT, and now I need to communicate from the headphone jack of my Android to an Arduino. There is currently a library for the Arduino (discussed in the blog post Real-time spectrum analyzer powered by Arduino) and one for Android too!
How should I start? How should I build audio signals which can ultimately be turned into FFTs, so that the Arduino can analyse them using the library and I can actuate anything?
You are asking a very fuzzy question: "How should I build audio signals which can ultimately be turned into FFTs, so that the Arduino can analyse them using the library and I can actuate anything?" I am going to help you think through the problem - asking yourself the right questions is essential to getting any answers.
Presumably, your audio signals are "coming from somewhere" - i.e. they are sound. This means that you need to convert them into a stream of numbers first.
Problem #1: converting the audio signal into a stream of numbers
This breaks down into three separate sub-problems:
Getting the signal to the right amplitude
Choosing the sampling rate needed
Digitizing and storing the data for later processing
Items (1) and (3) are related, since you need to know how you are going to digitize the signal before you can choose the right amplitude. For example, if you have a microphone as your sound input source, you will need to amplify the signal (and maybe add some automatic gain control) before feeding it into an ADC (analog to digital converter) that has a 5 V input range, since the microphone may have an output in the mV range. Without more information about the hardware you are using, there's not a lot to add here. It sounds from your tag that you are trying to do that inside an Android device - in which case I wonder how you intend to move the digital signal to the Arduino (over USB?).
The second point, choosing the sampling rate, is actually very important. A sound signal contains many different frequencies - think of them as keys on a piano. In order to detect a high frequency, you need to sample the signal faster than it is changing. There is a formal result called Nyquist's theorem which states that you have to sample at at least 2x the highest frequency that is present in your signal. Note - it's not just the highest frequency "that you are interested in", but the highest "that is present". If you sample a high-frequency signal with a low-frequency sample clock, it will appear "aliased" - it will show up in your output as something completely different.
So before you digitize a signal, you have to decide what the frequencies of interest are and remove all higher frequencies with a filter. Let's say you are interested in frequencies up to 500 Hz (about 1 octave above middle C on a piano). To give your filter a chance to work, you might choose to cut off all frequencies above 1 kHz (filters "roll off", i.e. they increase in strength over a range of frequencies), and would sample at 2 kHz. This means you get 2000 samples per second, and you need to figure out where to put them on your Arduino (memory fills up quickly on the little board).
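To put numbers on that example (a back-of-the-envelope sketch; the 2 KB figure is the SRAM of a typical ATmega328-based board):

int maxFreqOfInterest = 500;               // Hz: ~1 octave above middle C
int filterCutoff = 2 * maxFreqOfInterest;  // 1 kHz: room for filter roll-off
int sampleRate = 2 * filterCutoff;         // 2 kHz: Nyquist, >= 2x what's present

// One byte per 8-bit sample -> 2000 bytes/second, which fills the
// 2 KB of SRAM on an ATmega328-based Arduino in about one second.
System.out.println(sampleRate + " samples/s = " + sampleRate + " B/s at 8 bits");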
Problem #2: analyzing the signal
Assuming that you have somehow captured a digital signal, your next task is analyzing it. The FFT is basically some clever math that tells you, for a given sound sample, "what keys on the piano were hit, and how hard". It breaks the sound signal into a series of frequency "bins" and determines how much energy is in each bin (it also computes the phase, but let's keep it simple). So if the input of an FFT algorithm is a sound sample, the output is an array of values telling you what frequencies were present in the signal. This is approximate, since it will find the "nearest bin". Sticking with the same analogy - if you were hitting a piano that's out of tune, the algorithm won't return "out of tune", but rather "a bit of C, and a bit of C sharp", since it cannot actually measure anything in between.
The accuracy of an FFT is determined by the sampling frequency (which gives you the upper limit on the frequency you can detect) and the sample length: the longer you "listen" to the sample, the more subtle the differences you can "hear". So you have another trade-off to consider: if your audio signal changes rapidly, you have to sample for a short time (to capture the quick changes); but if you need an accurate frequency, you have to sample for a long time. For example, if you are writing a Morse decoder, your sampling has to be short compared to a pause between "dits" and "dashes" - or they will slur together. Figuring out that a Morse tone is present is pretty easy though, since there will be a single tone (one bin in the FFT) that is much larger than the others.
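A quick sketch of that trade-off in numbers (the 2 kHz rate comes from the earlier example; the FFT size is an arbitrary choice of mine):

int sampleRate = 2000;  // Hz, from the example above
int fftSize = 256;      // samples per FFT window (arbitrary power of two)

double binWidth = (double) sampleRate / fftSize;   // ~7.8 Hz per bin
double windowMs = 1000.0 * fftSize / sampleRate;   // ~128 ms of "listening"
double maxFreq = sampleRate / 2.0;                 // 1 kHz, the Nyquist limit

// A longer window (bigger fftSize) narrows the bins but blurs fast changes.
System.out.printf("bin %.1f Hz, window %.0f ms, max %.0f Hz%n",
        binWidth, windowMs, maxFreq);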
Exactly how you implement these things depends on your application. The third step, "doing something with it", requires you to decide what a meaningful signal is. Again, if you are making a Morse decoder, you would perhaps turn an LED on when a single tone is present (one or two bins in the FFT have a much bigger value than the mean of the others), and off when it is not (all noise - lots of bins with approximately the same size). But without a LOT more information from you, there's not much more one can say to help you.
You might learn a lot from reading the following articles:
http://www.arduinoos.com/2010/10/sound-capture/
http://www.arduinoos.com/2010/10/fast-fourier-transform-fft/
http://interface.khm.de/index.php/lab/experiments/frequency-measurement-library/
I am working on phone recording software (Android) which records a conversation between two people on a phone call. The output of each phone call is an audio file which contains the sound from both the caller and the callee.
However, most of the time the voice from the phone that this software runs on is clearer than the other. Users have asked me to make the two sides equally clear.
So the problem I have now is: I have a sound file containing voices from two sources at different volumes. What should I do to make the volume of the voices from those two sources equal, given that the noise should not be increased? Since this is a phone call, at any specific time only one person is speaking.
I see at least one straightforward solution: write a program that analyzes the waveform of the sound file, identifies the parts coming from the source with the quieter voice, and amplifies them to a level that seems balanced with the other. However, this will not be easy to implement, and I also hope there is a better solution out there. Do you have any suggestions?
Thank you.
Well, the first thing to do is to get rid of all of the noise that you do not care about.
The spectrum that you would want to use is 300 Hz to 3500 Hz.
You can cut all of the other frequencies, which would substantially reduce your noise. You can then apply an auto-equalization gain profile, or even tap into the DSP profiles available on several devices.
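As a rough illustration (not production DSP), band-limiting speech to roughly 300-3500 Hz with a first-order high-pass followed by a first-order low-pass; the method name is mine, and fs must match your recording's sample rate:

// 'samples' is assumed to be 16-bit PCM.
static void bandLimit(short[] samples, double fs) {
    double dt = 1.0 / fs;
    double rcHp = 1.0 / (2 * Math.PI * 300.0);   // high-pass corner: 300 Hz
    double rcLp = 1.0 / (2 * Math.PI * 3500.0);  // low-pass corner: 3500 Hz
    double aHp = rcHp / (rcHp + dt);
    double aLp = dt / (rcLp + dt);
    double hp = 0, prevIn = 0, lp = 0;
    for (int i = 0; i < samples.length; i++) {
        double in = samples[i];
        hp = aHp * (hp + in - prevIn);           // high-pass stage
        prevIn = in;
        lp += aLp * (hp - lp);                   // low-pass stage
        samples[i] = (short) lp;
    }
}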
I would also take a look at this whitepaper if you have a chance. (IEEE or ACM membership required).
An Auto-Equalization System Based on DirectShow Technology and Its Application in Audio Broadcast System of Radio Station
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=5384659&contentType=Conference+Publications&searchWithin%3Dp_Authors%3A.QT.Bai+Xinyue.QT.
This is how I have solved this problem:
1. I decode the audio into a series of integer values, thanks to the WAV storage format.
The result is [xi], with 0 < xi < 255.
2. Then I have to decide two custom values:
- The noise threshold: if xi > threshold, it is not noise (pretty naive!)
- How long a sound has to be to count as a chunk of human voice.
I chose 5 for the first value and 100 ms for the second.
3. My algorithm analyzes the [xi] into [Yi], where each Y is an array of x values and each Y represents a chunk of human sound.
After that, I apply k-means with k=2 and get two different clusters of Y: one belongs to the person whose voice is louder, and the other to the one with the softer voice.
4. What's left is pretty straightforward: I decide on a parameter M, multiply each x belonging to a Y of the softer voice by M, and get the final result, roughly as sketched below.
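A minimal sketch of that last step in Java (all names are mine; it assumes the 8-bit unsigned samples from step 1, the chunks from step 3 as int arrays, and a per-chunk flag marking the softer k-means cluster):

import java.util.List;

class VoiceBalancer {
    // Boost every chunk assigned to the softer cluster by gain m,
    // scaling around the 8-bit midpoint (128) and clamping to 0..255.
    static void boostSofter(List<int[]> chunks, boolean[] softerCluster, double m) {
        for (int c = 0; c < chunks.size(); c++) {
            if (!softerCluster[c]) continue;   // leave the louder speaker alone
            int[] chunk = chunks.get(c);
            for (int i = 0; i < chunk.length; i++) {
                int boosted = 128 + (int) ((chunk[i] - 128) * m);
                chunk[i] = Math.max(0, Math.min(255, boosted));
            }
        }
    }
}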