Signal-processing (morse code decoding) with Android application

Signal-processing (morse code decoding) with Android application - android

I made test application in Delphi that beeps morse code using Windows API Beep function. Then made an application in Android that stores this morse code in WAV file. Now I want Android application to decode the morse code. Is there some tutorials for sound processing or can somebody post some simple code (think there's no simplicity here) for an example? Or maybe steps that I need to do to get it work?
I also downloaded the JTransforms and jfttw libraries but don't really know where to start.
Regards,
evilone

An FFT is overkill for this - you can just use a simple Goertzel filter to isolate the morse code from background noise, then decode the output of this.

I think an older issues of QST magazine had an article on DSP for Morse/CW decoding several years back. Might want to try and search their archives.
Basically, you need DSP code to determine whether or not a tone is present at any given point in time, and an estimate of the onset and off-time of each tone. Then scale the duration of each tone and the gap times between the tones for the expected code speed, and compare against a table of timings for each Morse code letter to estimate the probability of each or any letter being present.
In the simplest case, you might have a dot-dash-space decision tree. In severe noise and fading plus highly personalized fist/timing you might need some sophisticated statistical and/or adaptive audio pattern matching techniques for decent results.

Related

Compare two voice in android

I am working on one voice messaging application, I need to compare two voice like,
Register with app by record your voice
Sent voice message to
another user by record voice, but first need to compare this voice
to recorded voice in profile.
Its for security purpose and need to know recorded message is from specific user or not.
I tried :
Compare two sound in Android
http://www.dreamincode.net/forums/topic/274280-using-fft-to-compare-two-audio-files-and-then-realtime-comparison/
But not getting idea about voice Comparison.
Please share if anybody know about the same. Didn't find any sample to do this.

Since you indicated it's for security purpose, I'd like to first share a few things on voice biometry :-)
The problem with authenticating someone is that you'd need to be sure he was actually there saying the things that were recorded... and that's a whole different level of complexity than merely comparing voice characteristics.
Algorithms extracting voice features from a sample and later calculating the distance between a new sample and the first one can easily be fooled by a recording made up by an attacker.
Since in your case there's a human recipient, creating a message made up of chopped words or sentences from random conversations is actually quite difficult and time consuming. But not completely impossible...
There are very good sounding softwares created for the music industry that will e.g. take some voice audio input and make it sound (intonation and time wise) like a second audio sample (a guide, made by the fraudster). Vocalign Pro by SynchroArts does this to help get perfect backing vocal tracks. You could further tweak the audio by hand using other voice editing softwares and achieve an acceptable level of quality that wouldn't be immediately detected by the recipient.
Depending on what the attacker wants your user to say, the process complexity could range from an hour to a day provided he has all the recording material he wants...
To fight against this type of attack, you need to detect the audio sample has been edited. The digital edition will leave unnatural traces. E.g. in the background noise surrounding the voice.
AFAICT, only the best commercial softwares achieve this level security check, but I can't tell how far they go in the detection of such edits.
From a pure security perspective, you'd also need to be sure the device was not compromised. So these voice verification check should happen server side and not on the phone itself.
Please note these are general considerations and it all depends on what sort of security measures you actually need for your use case. My car alarm is certainly not unbreakable but it helps raising the bar so fewer attackers could potentially steal it...
Another thing to consider is that biometry is by definition a statistical process and it will yield a certain percentage of false positives and false negatives. By changing the acceptance threshold, you'll be able to lower one of them at the cost of raising the other.
Selecting an appropriate threshold will require you to have a fair amount of test data. Say 1 minute recording of at least 200 speakers to start getting a picture.
One more thing I think you'll need to consider is the inherent variability of the human voice. People may be sick which in some cases might render the voice unrecognizable. Also the emotional state might play a role: sadness or anger will yield different sounding voices...
And last but not least, the surrounding noise might pose a problem. Say the user enrolled while at home and later records a message while on the go in a busy city environment, the system might have troubles making sure it's actually the same person speaking. The signal to noise ratio is definitely going to be one of your main issues. Small tip: depending on the distance of the microphone to the mouth, the ratio will be quite different. You'll get way better result when the user puts the phone close to its face like in a regular phone conversation than when the user looks at the screen while recording the message.
Voice variability and signal to noise ratio are probably the main reasons behind false negative results.
Hopefully, you now have a better understanding of the challenges awaiting you and I can start sharing some pointers for open source and commercial libraries.
AFAIK, there are no open source libraries that includes fraudster detection...
You may want to check Nuance Communication for state-of-the-art. There are plenty other vendors, just check with Google, I only mentioned Nuance because of it's reputation.
There is an OSS library called Alize (written in C++, under LGPL license) which uses an algorithm called MFCC (Mel Frequency Cepstrum Coefficients). MFCC is known to bring excellent results. Expect a steep learning curve as this software is aimed at researchers willing to improve the state-of-the-art on this topic and the vocabulary used is very specific.
I wrote an OSS library named Recognito (Java, Apache 2.0) aimed at regular developers so you should be able to test it in a matter of minutes. The lib is very young and I first focused on it's API before improving the algorithms. The algorithm I use for the moment is called Linear Predictive Coding (LPC) and is known to bring good results (and I do have good results, provided recordings yield the same level of quality :-)). I'm currently in the process of releasing a new version including a likelihood coefficient in the match results. MFCC implementation is on the road map.
There is plenty of javadoc and the code should be very straightforward...
https://github.com/amaurycrickx/recognito
Recognito has a dependency on javax.sound packages for audio file handling. You may want to check this post for what it takes to use it in Android: Voice matching in android
Given many people need something for android, I'll do something about it in the near future instead of saying how one should modify the lib :-)
HTH

Animate object to move based on modulation of a sound?

I am trying to animate an object (mouth, eyebrow and some other expressions) to move in accordance to incoming sound. I was thinking to read sound modulation and detect changes and to animate movement of objects.
Is this a good approach?
If not, how should I approach to coding such feature? Does this have to be done in OpenGL or I can use Android SDK and its animations?

I assume you mean Frequency Modulation, as opposed to Amplitude Modulation? Perhaps a combination of both? That might be pretty neat. I don't think there's any reason NOT do to this...
As to whether its a "good" approach or not? I think it might be pretty cool.
The android SDK has pretty robust animation built in at this point. You should be able to do the animations using the SDK just fine. If you get to the point where you really want to go wild with it you might need to step down a layer for speed concerns.
Look at:
https://stackoverflow.com/questions/17163446/what-is-the-best-2d-game-engine-for-android/17166794#17166794
For a pretty robust coverage of 2d Games in Android, which might point you in the right direciton.

Very interesting idea. I think there are two parts to the problem: input (sound) and output (graphics). For the graphics side, I don't think you need OpenGL per se. You could check out this guide to game development: http://www.techrepublic.com/blog/software-engineer/the-abcs-of-android-game-development-prepare-the-canvas/2157/
I think it is relevant because it deals with moving graphics in real time.
For the sound, it would be nice to have an integer based on the frequency and modulation of the sound. For analyzing the sound, perhaps this library could help you out: http://code.google.com/p/musicg/
It can:
- Read amplitude-time domain data
- Read frequency-time domain data
Then, you could progress through the amplitude data of a sound in real time and update the graphics accordingly.

Real-time audio denoise using FFT on android

I'm thinking of starting a android project, which records audio signals and does some processing to denoise. My quesion is, as many (nearly all) denoising algorithms involve FFT, is it possible for me to do a real-time program? By real-time I mean the program do recording and processing at the same time, so I could save my time when I finish recording.
I have made a sample project, which applies fourier transformation to the audio signal and implement a simple algorithm called sub-spectrum. But I found that it is difficult to implement this algorithm in real time, which means after I press the 'stop' button, it takes me a while to do the processing and save the file (I'm also wondering how do these commercial recorder programs record sound and at the same time save it). I know that my FFT may not be the fastest, but I'd like to know whether I could achieve 'real-time', if I fully optimized it or use the fastest FFT code? Thanks a lot!

It sounds like you are talking about broadband denoising. So I'll address my question to that. There are other kinds of denoising, from simple filtering to adaptive filtering to dynamic range expanding and probably others.
I don't think anyone can answer this question with a simple yes or no. You will have to try it and see what can be done.
First off, there are a variety of FFT implementations, including FFTW, of varying speed you could try. Some are faster than others, but at the end of the day they are all going to deliver comparable results.
This is one place where native C/C++ will outperform Java/Dalvik code because it can truly take advantage of vector code. For that to work, you'll probably need to write some assembler, or find some code that is already android optimized. I'm not aware of an android optimized FFT, but I'm sure it exists.
The real performance win will come from how you structure your overall denoising algorithm. All denoising I'm familiar with is extremely processor intensive and probably won't work on a phone in real-time, although it might on a tablet. That's just a(n educated) guess, though.

Appropriate audio capture and noise reduction

In my android application I need to capture the user's speech from the microphone and then pass it to the server. Currently, I use the MediaRecorder class. However, it doesn't satisfy my needs, because I want to make glowing effect, based on the current volume of input sound, so I need an AudioStream, or something like that, I guess. Currently, I use the following:
this.recorder = new MediaRecorder();
this.recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
this.recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
this.recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
this.recorder.setOutputFile(FILENAME);
I am writing using API level 7, so I don't see any other AudioEncoders, but AMR Narrow Band. Maybe that's the reason of awful noise which I hear in my recordings.
The second problem I am facing is poor sound quality, noise, so I want to reduct (cancel, suppress) it, because it is really awful, especially on my noname chinese tablet. This should be server-side, because, as far as I know, requiers a lot of resources, and not all of the modern gadgets (especially noname chinese tablets) can do that as fast as possible. I am free to choose, which platform to use on the server, so it can be ASP.NET, PHP, JSP, or whatever helps me to make the sound better. Speaking about ASP.NET, I have come across a library, called NAudio, may be it can help me in some way. I know, that there is no any noise reduction solution built in the library, but I have found some examples on FFT and auto-corellation using it, so it may help.
To be honest, I have never worked with sound this close before and I have no idea where to start. I have googled a lot about noise reduction techniques, code examples and found nothing. You guys are my last hope.
Thanks in advance.

Have a look at this article.
Long story short, it uses MediaRecorder.AudioSource.VOICE_RECOGNITION instead of AudioSource.MIC, which gave me really good results and noise in the background did reduce very much.
The great thing about this solution is, it can be used with both AudioRecord and MediaRecorder class.

For audio capture you can use the AudioRecord class. This lets you record raw audio, i.e. you are not restricted to "narrow band" and you can also measure the volume.

Many smartphones have two microphones, one is the MIC you are using, the other one is near camera for video shooting, called CAMCORDER. You can get data from both of them to do noise reduction. There are many papers talking about audio noise reduction with multiple microphones.
Ref: http://developer.android.com/reference/android/media/MediaRecorder.AudioSource.html
https://www.google.com/search?q=noise+reduction+algorithm+with+two+mic

Microphone input

I'm trying to build a gadget that detects pistol shots using Android. It's a part of a training aid for pistol shooters that tells how the shots are distributed in time and I use a HTC Tattoo for testing.
I use the MediaRecorder and its getMaxAmplitude method to get the highest amplitude during the last 1/100 s but it does not work as expected; speech gives me values from getMaxAmplitude in the range from 0 to about 25000 while the pistol shots (or shouting!) only reaches about 15000. With a sampling frequency of 8kHz there should be some samples with considerably high level.
Anyone who knows how these things work? Are there filters that are applied before registering the max amplitude. If so, is it hardware or software?
Thanks,
/George

It seems there's an AGC (Automatic Gain Control) filter in place. You should also be able to identify the shot by its frequency characteristics. I would expect it to show up across most of the audible spectrum, but get a spectrum analyzer (there are a few on the app market, like SpectralView) and try identifying the event by its frequency "signature" and amplitude. If you clap your hands what do you get for max amplitude? You could also try covering the phone with something to muffle the sound like a few layers of cloth

It seems like AGC is in the media recorder. When I use AudioRecord I can detect shots using the amplitude even though it sometimes reacts on sounds other than shots. This is not a problem since the shooter usually doesn't make any other noise while shooting.
But I will do some FFT too to get it perfect :-)

Sounds like you figured out your agc problem. One further suggestion: I'm not sure the FFT is the right tool for the job. You might have better detection and lower CPU use with a sliding power estimator.
e.g.
signal => square => moving average => peak detection
All of the above can be implemented very efficiently using fixed point math, which fits well with mobile android platforms.
You can find more info by searching for "Parseval's Theorem" and "CIC filter" (cascaded integrator comb)

Sorry for the late response; I didn't see this question until I started searching for a different problem...
I have started an application to do what I think you're attempting. It's an audio-based lap timer (button to start/stop recording, and loud audio noises for lap setting). It' not finished, but might provide you with a decent base to get started.
Right now, it allows you to monitor the signal volume coming from the mic, and set the ambient noise amount. It's also using the new BSD license, so feel free to check out the code here: http://code.google.com/p/audio-timer/. It's set up to use the 1.5 API to include as many devices as possible.
It's not finished, in that it has two main issues:
The audio capture doesn't currently work for emulated devices because of the unsupported frequency requested
The timer functionality doesn't work yet - was focusing on getting the audio capture first.
I'm looking into the frequency support, but Android doesn't seem to have a way to find out which frequencies are supported without trial and error per-device.
I also have on my local dev machine some extra code to create a layout for the listview items to display "lap" information. Got sidetracked by the frequency problem though. But since the display and audio capture are pretty much done, using the system time to fill in the display values for timing information should be relatively straightforward, and then it shouldn't be too difficult to add the ability to export the data table to a CSV on the SD card.
Let me know if you want to join this project, or if you have any questions.

Develop Reference

The Android operating system is a mobile operating system that was developed by Google (GOOGL?) to be primarily used for touchscreen devices, cell phones, and tablets.