How to make voice recognition check against a local database in Android? - android

Do you remember how old cellphones let you record a spoken shortcut to call a person?
I am trying to build an Android app with that function. The user records a word or sound they want to control the application with, and the voice recognizer only checks whether the sound it hears matches the one previously recorded.
Does anyone know how to build this, or know of a guide? I've been searching for months without finding a satisfying solution.
Thanks

You need to convert both the reference sounds and the newly recorded sound to features. To do that, split the audio into frames and extract either an FFT spectrum or, directly, mel-frequency cepstral coefficients (MFCC). Any MFCC library out there will do.
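To make the framing step concrete, here is a minimal sketch in Java. It only splits 16-bit PCM samples into overlapping frames and applies a Hamming window; the FFT or MFCC step would still come from a library. The class name `Framing`, the method name, and the frame and hop sizes are assumptions picked for illustration, not something taken from a particular library.

```java
import java.util.ArrayList;
import java.util.List;

public class Framing {
    /**
     * Splits 16-bit PCM samples into overlapping frames and applies a Hamming window.
     * frameSize and hop are in samples and frameSize must be at least 2
     * (for example 400 and 160 for 25 ms frames every 10 ms at 16 kHz).
     * Trailing samples that do not fill a whole frame are dropped.
     */
    public static List<double[]> frames(short[] pcm, int frameSize, int hop) {
        // Precompute the Hamming window once.
        double[] window = new double[frameSize];
        for (int i = 0; i < frameSize; i++) {
            window[i] = 0.54 - 0.46 * Math.cos(2 * Math.PI * i / (frameSize - 1));
        }
        List<double[]> out = new ArrayList<>();
        for (int start = 0; start + frameSize <= pcm.length; start += hop) {
            double[] frame = new double[frameSize];
            for (int i = 0; i < frameSize; i++) {
                // Scale samples to [-1, 1) before windowing.
                frame[i] = window[i] * pcm[start + i] / 32768.0;
            }
            out.add(frame);
        }
        return out;
    }
}
```

Each returned frame is what you would hand to the FFT or MFCC routine, giving one feature vector per frame.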
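Once you have the features, you can compare them with the DTW algorithm. You can find some details here:
http://en.wikipedia.org/wiki/Dynamic_time_warping
DTW returns a distance between the recorded sound and each reference. Pick the reference with the smallest distance, and compare that distance against a threshold to decide whether it is a real match; that gives you the right person to call.
A similar question is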
Simplest algorithm of measuring how similar of two short audio
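
As a rough illustration of the DTW comparison described above, here is a minimal Java sketch. It assumes each sound has already been turned into a sequence of feature vectors of equal dimension (one per frame). The class name `Dtw`, the Euclidean local distance, and the `bestMatch` helper with its threshold parameter are assumptions for the example; a real app would need to tune the threshold on its own recordings and would probably normalize the cost by path length.

```java
import java.util.Arrays;

public class Dtw {
    /** Euclidean distance between two feature vectors of equal length. */
    static double dist(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Classic O(n*m) DTW cost between two sequences of feature vectors. */
    public static double distance(double[][] x, double[][] y) {
        int n = x.length, m = y.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double d = dist(x[i - 1], y[j - 1]);
                // Extend the cheapest of: diagonal match, step in x only, step in y only.
                cost[i][j] = d + Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
            }
        }
        return cost[n][m];
    }

    /** Index of the closest template, or -1 if no template is under the threshold. */
    public static int bestMatch(double[][] query, double[][][] templates, double threshold) {
        int best = -1;
        double bestCost = threshold;
        for (int k = 0; k < templates.length; k++) {
            double c = distance(query, templates[k]);
            if (c < bestCost) {
                bestCost = c;
                best = k;
            }
        }
        return best;
    }
}
```

Here `templates` would hold one feature sequence per recorded shortcut, and a return value of -1 means the app should treat the input as "no match" instead of calling someone.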

Related

Converting audio file into text file using pocketsphinx

Good day! I'm new to Android app development and I really need help. I'm developing a speech-to-text app, though it's not like the usual STT apps available in app stores. I'm using pocketsphinx for offline speech recognition and conversion, with Android Studio as the IDE.
My app has three main features:
Record - where the user records his/her speech. The recording is saved to the device's storage.
Library - where the user can see his/her recordings and the text files converted from them. It also holds the convert feature, which turns a recording into a text file.
Edit - where the user can edit his/her audio and text files. Only cut, delete, and modify (text only) are available.
My main question: is it actually possible to convert recorded speech into text with pocketsphinx? To clarify, I've tried the pocketsphinx demos, and in them whatever you speak into the device is converted directly. My idea is different: you record your speech first and convert it to text whenever you want. I'm not sure whether that's possible. If yes, could someone explain how? If no, could someone explain other ways to achieve my idea? Thanks in advance!

How to implement voice recognition with my own (tiny) dictionary?

I would like to implement offline voice recognition in my app, for two purposes:
1. For a small set of commands (play, stop, previous, next and a couple of others);
2. For a list of a few hundred bird names.
For (1), it seems to me a bad idea (slower and more resource-consuming) to use Android's full voice recognition. In my mind, it would be easier to tell my app to interpret only a few words, that is, to use my own dictionary and tell the app to "use only these 10 words".
(2) is similar to (1), but with a few hundred words instead of 10.
Does this make sense, and if so, is there an easy way to implement it? Is it worth it?
Thanks!
L.
You can implement your app using CMUSphinx on Android. The CMUSphinx tutorial is here:
http://cmusphinx.sourceforge.net/wiki/tutorial
Language models for recognizing a limited set of words are described here:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
You can use keyword spotting mode to recognize a few commands.
Pocketsphinx on Android is described here:
http://cmusphinx.sourceforge.net/wiki/tutorialandroid
The demo shows how to switch recognition modes, for example from a 10-word set to a few hundred words, as you intend.
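One way to restrict recognition to a fixed word list is a JSGF grammar, which the language model tutorial covers. As a sketch, the grammar text can be generated from the app's own list of commands or bird names. The class `CommandGrammar`, the rule name `<item>`, and the grammar name passed in are made up for this example, and no pocketsphinx calls are shown; how to load the resulting grammar is covered in the Android tutorial linked above. Every word also has to be present in the pronunciation dictionary.

```java
import java.util.List;

public class CommandGrammar {
    /** Builds a JSGF grammar that accepts exactly one of the given words or phrases. */
    public static String jsgf(String grammarName, List<String> phrases) {
        StringBuilder sb = new StringBuilder();
        sb.append("#JSGF V1.0;\n");                               // JSGF header line
        sb.append("grammar ").append(grammarName).append(";\n");  // grammar declaration
        sb.append("public <item> = ")                             // one public rule listing the alternatives
          .append(String.join(" | ", phrases))
          .append(";\n");
        return sb.toString();
    }
}
```

For the command set in the question, `CommandGrammar.jsgf("commands", Arrays.asList("play", "stop", "previous", "next"))` yields a three-line grammar whose rule is `public <item> = play | stop | previous | next;`. The same helper could be fed the list of bird names to produce the larger grammar.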

Android AudioRecorder with UI

I want to create an audio recorder. I tried the code from http://developer.android.com/guide/topics/media/audio-capture.html and it works fine: it records and plays back what I said. Now I want an audio recorder with a UI like the one Google provides for RecognizerIntent, so that you can see or monitor that your voice is being recorded (sorry, I can't find the right term, but I hope you understand what I mean). Do you know any tutorials or links that can help me? Thanks for your help!
I think you mean that you want some feedback that your voice is being picked up? If so, perhaps this project, Audalyzer, might be a good place to start looking.
Libraries within the package offer you a dB reading, a one and two-dimensional wave form, and an FFT plot.
There is a simpler WaveformControl you may find easier to understand and modify for your needs.
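If a simple level meter is enough as feedback, the dB reading can be computed directly from the recorded samples. The following is a minimal sketch, not Audalyzer's code: it assumes 16-bit PCM samples in a `short[]` buffer (on Android such a buffer would typically be filled by `AudioRecord.read(short[], int, int)`), and the class and method names are made up for the example.

```java
public class LevelMeter {
    /**
     * RMS level of the first `length` 16-bit PCM samples, in dB relative to
     * full scale (0 dBFS is the maximum amplitude, quieter signals are negative).
     */
    public static double rmsDbfs(short[] pcm, int length) {
        if (length <= 0) {
            return Double.NEGATIVE_INFINITY;
        }
        double sumSquares = 0;
        for (int i = 0; i < length; i++) {
            double s = pcm[i] / 32768.0;  // scale to [-1, 1)
            sumSquares += s * s;
        }
        double rms = Math.sqrt(sumSquares / length);
        return 20 * Math.log10(rms);      // negative infinity for pure silence
    }
}
```

Calling this for each buffer read from the microphone and mapping the result to the height of a bar or the width of a progress bar gives the user a visible sign that their voice is being picked up.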

Android Continuous Speech recognition? [duplicate]

I googled around and found the regular speech API from Google, but I don't think it is what I need. I need continuous voice recognition and the ability to launch other actions when a specific word is spoken. Is there anything in the Android SDK that I can use?
If not: is it possible to use third-party libraries? (If yes: which ones, and what do I have to consider when integrating a third-party library?)
Edit: I thought about this again. I only have to recognize one 'word' (which probably won't be in Google's speech databases). I have the chance to record it, which means I can continuously match the incoming audio stream against my recording. That should work without a database. But I'm new to Android development. Do you have suggestions for APIs to use for recording and matching? Or is there a better way to continuously wait for a specific 'word' to occur and then trigger further actions?
By the way, in case that wasn't clear: once the reaction is done, the app should keep recording and watch for the word to occur again.
Is there anything in the Android SDK that I can use?
No, sorry.
