I am trying to implement voice recognition in Android. I have followed various tutorials, which say that we need to create an intent with RecognizerIntent.ACTION_RECOGNIZE_SPEECH and start the activity for result. When we speak, we get a set of candidate values back from the Google server, which we read via RecognizerIntent.EXTRA_RESULTS. This is working fine. What I need, though, is to provide a set of strings to the voice recognition engine, so that when we say something, it matches what we said against the provided set and returns only the one matching string. Can this be done?
This can be done with pocketsphinx; see the keyword matching mode in the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialandroid
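If you stay with the Google recognizer instead, one possible workaround is to post-filter its candidate list yourself. The sketch below is not a pocketsphinx or Android API; it only assumes you already have the candidate strings (for example, the list read via RecognizerIntent.EXTRA_RESULTS) and picks the closest entry from your own set. The function name and the 0.6 cutoff are illustrative assumptions.

```python
import difflib


def match_to_vocabulary(candidates, vocabulary, cutoff=0.6):
    """Return the vocabulary entry closest to any candidate, or None.

    candidates: strings returned by the recognizer (assumed input).
    vocabulary: the fixed set of strings the app accepts.
    cutoff: minimum similarity ratio (0..1); an arbitrary starting value.
    """
    best_word, best_score = None, 0.0
    for candidate in candidates:
        text = candidate.strip().lower()
        for word in vocabulary:
            score = difflib.SequenceMatcher(None, text, word.lower()).ratio()
            if score >= cutoff and score > best_score:
                best_word, best_score = word, score
    return best_word


commands = ["play", "stop", "next"]
print(match_to_vocabulary(["plate", "play", "pray"], commands))
```

Unlike a restricted grammar inside the decoder, this only cleans up the output afterwards, so it cannot recover a word the recognizer never proposed in any form.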
Related
Is it possible to restrict the Google Speech API to only recognize from a given set of words? Alternatively, is it possible to "ban" certain words? If not, is it possible with any other speech API that supports German?
I do know that I can set hint phrases via a speech context. Still, it tends to recognize a different word.
As an example, I use the API mostly for the German language. I want to recognize the word "stärker" (which is also listed as a speech context hint), but the API mostly transcribes it as "Stärke" unless I pronounce the "r" at the end unnaturally strongly. So, is it possible to prevent the speech API from transcribing that word, for instance?
Thanks in advance!
No, that's not possible with Google Speech API. I don't know if it's possible with other speech recognition services.
The Google Speech API supports a set of parameters in RecognitionConfig.
There is an optional boolean parameter called "profanityFilter", which filters out profanities.
https://cloud.google.com/speech-to-text/docs/reference/rest/v1beta1/RecognitionConfig
I would like to implement offline voice recognition in my app. But I want it for two purposes:
1. For a small set of commands (play, stop, previous, next and a couple of others);
2. For a list of a few hundred bird names.
To implement (1), using Android's full voice recognition seems like a bad idea to me (slower and more resource-intensive). It would be easier to tell my app to interpret only a few words, that is, to use my own dictionary and tell the app to "use only these 10 words".
Implementing (2) is similar to (1), but with a few hundred words instead of 10.
Does this make sense, and if so, is there an easy way to implement it? Is it worth it?
Thanks!
L.
You can implement your app using CMUSphinx on Android. The CMUSphinx tutorial is here:
http://cmusphinx.sourceforge.net/wiki/tutorial
Language models for recognizing a limited set of words are described here:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
You can use keyword spotting mode to recognize a few commands.
Pocketsphinx on Android is described here:
http://cmusphinx.sourceforge.net/wiki/tutorialandroid
The demo shows how to switch between recognition modes, from about 10 words to a few hundred words, as you intend.
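As a rough illustration of what a restricted vocabulary looks like, a grammar in JSGF format (one of the search types CMUSphinx accepts) can be generated from a plain word list. The helper below is a hypothetical sketch, not part of CMUSphinx; the grammar and rule names are arbitrary, and every word used must also be present in the pronunciation dictionary you load.

```python
def build_jsgf(grammar_name, phrases):
    """Build a JSGF grammar that accepts exactly one of the given phrases."""
    cleaned = [p.strip().lower() for p in phrases if p.strip()]
    alternatives = " | ".join(cleaned)
    return (
        "#JSGF V1.0;\n"
        "grammar {name};\n"
        "public <item> = {alts};\n"
    ).format(name=grammar_name, alts=alternatives)


# A small command grammar and a (shortened) bird-name grammar.
print(build_jsgf("commands", ["play", "stop", "previous", "next"]))
print(build_jsgf("birds", ["barn owl", "blue jay", "robin"]))
```

You would write each string to its own file and register the two as separate searches, switching between them as the demo does for its different modes.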
I have started programming with sl4a (in QPython) and it is really great. I tried the droid.recognizeSpeech function, and it works fine too, but I would like it to run in the background, listening for a keyword, like Google's 'OK Google'.
I looked around but could not find anything, and I don't know how to implement it.
So my question is: is it possible to make speech recognition listen continuously in the background for a keyword, and if so, how?
I've toyed with the idea of doing this, but never found any useful practical application for it. So here's a summary of my research, I hope it's enough to get you started:
1. The speech recognizer facade has multiple parameters. Usually, everyone passes None for all of them except the first. Here is the facade as documented:
recognizeSpeech:
Recognizes user's speech and returns the most likely result.
prompt (String) text prompt to show to the user when asking them to speak (optional)
language (String) language override to inform the recognizer that it should expect speech in a language different than the one set in the java.util.Locale.getDefault() (optional)
languageModel (String) informs the recognizer which speech model to prefer (see android.speech.RecognizerIntent) (optional)
returns: (String) An empty string in case the speech cannot be recognized.
So you're looking for the languageModel parameter in this case. That option is restricted to two types: a web search model and a free-form speech model. You want the free-form speech model here. Here's a little more info on this model from the horse's mouth:
Google on the Free-Form Language Model
Once you've looked at the free-form speech model, what should help you is Chrome's continuous speech recognition model, which should share many characteristics with the free-form language model. I hope this sets you in the right direction.
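For a starting point, here is a minimal sketch of a polling loop around recognizeSpeech. It is not true background listening: as far as I know, each call still opens the recognition prompt, so treat it as an approximation. The function name wait_for_keyword and the max_attempts limit are my own; the recognizer is passed in as a callable so the loop can be tried without a device.

```python
def wait_for_keyword(recognize, keyword, max_attempts=5):
    """Call recognize() repeatedly until the keyword is heard.

    recognize: zero-argument callable returning the recognized text
               (an empty string or None when nothing was recognized).
    Returns the attempt number on success, or None if never heard.
    """
    keyword = keyword.lower()
    for attempt in range(1, max_attempts + 1):
        heard = recognize() or ""
        if keyword in heard.lower().split():
            return attempt
    return None


# On the device this might look like the following (assumption: the
# module is `androidhelper` in QPython, `android` in plain SL4A):
#   import androidhelper
#   droid = androidhelper.Android()
#   wait_for_keyword(
#       lambda: droid.recognizeSpeech("Say the keyword", None, None).result,
#       "computer")

# Desktop dry run with canned recognizer output:
canned = iter(["", "hello there", "ok computer"])
print(wait_for_keyword(lambda: next(canned), "computer"))
```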
Doing some research, I have found a few different speech-to-text APIs for Android:
Pocket Sphinx
Android Native API
I have the following requirements:
Must be able to support offline speech recognition (I'm not sure if the Android API can do this).
Must be able to detect and respond immediately to every word said. I would prefer this to detecting an entire sentence, although I could split the returned sentence into an array and get each word.
The detection needs to be processed in the background (no popups or anything, as the Android API seems to do).
Can someone recommend an API that meets my requirements?
Pocketsphinx meets all your requirements. What you call the "Android Native API" is basically a set of interface definitions and it does not contain the notion of offline/online.
You can also implement these interfaces using Pocketsphinx, since it supports things like partial results, confidence scores, n-best results etc. This way the implementation becomes available to any Android app. Maybe somebody has done it already, but I'm not aware of it.
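To respond to each word as it arrives, one option is to compare successive partial results and act only on the words that are new. The sketch below assumes the partial hypotheses are available as plain strings; the function name is hypothetical and nothing here is a Pocketsphinx call. Note that a partial hypothesis can revise earlier words, which is why the comparison stops at the first difference.

```python
def new_words(previous_partial, current_partial):
    """Return the words in current_partial that follow the shared prefix."""
    prev = previous_partial.split()
    curr = current_partial.split()
    common = 0
    for old, new in zip(prev, curr):
        if old != new:
            break
        common += 1
    return curr[common:]


# Simulated stream of partial hypotheses.
last = ""
for partial in ["turn", "turn of", "turn off the", "turn off the light"]:
    print(new_words(last, partial))
    last = partial
```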
I need to use speech input to insert text. How can I detect a keyword while I'm speaking?
Can I do this with Android Speech Input, or do I need an external library?
Any ideas ?
Thanks
Keyword detection is a different task from speech recognition. While the latter tries to understand the text being spoken and checks all possible word combinations, keyword spotting usually checks only two hypotheses: either the keyword is here or garbage is here. It is far more efficient to check for the presence of a keyword, but it requires a custom algorithm. You can implement one with an open source speech recognition toolkit like CMUSphinx:
http://cmusphinx.sourceforge.net
It runs on Android too; see "Voice command keyword listener in Android" for how to integrate it.
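To make the two-hypothesis idea concrete, here is a toy sketch of the decision rule only. It assumes some acoustic model has already produced a log-likelihood for "keyword" and for "garbage" on each audio window; the scores, the function name, and the threshold of 10.0 are made up for illustration and are not CMUSphinx values.

```python
def spot_keyword(window_scores, threshold=10.0):
    """Return indices of windows where the keyword hypothesis wins.

    window_scores: list of (keyword_log_likelihood, garbage_log_likelihood)
                   pairs, one per audio window (hypothetical inputs).
    threshold: required margin of keyword over garbage; higher values
               mean fewer false alarms but more missed keywords.
    """
    hits = []
    for index, (keyword_ll, garbage_ll) in enumerate(window_scores):
        if keyword_ll - garbage_ll > threshold:
            hits.append(index)
    return hits


scores = [(-150.0, -120.0), (-100.0, -130.0), (-118.0, -120.0)]
print(spot_keyword(scores))
```

The threshold plays the same role as the per-keyword sensitivity you tune in a real keyword spotter.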
Absolutely.
See this for some code that detects the "magic word"
Just launch an Intent with ACTION_RECOGNIZE_SPEECH and then check the results for your keyword. Checking for the keyword can be complicated, but this code should get you started.
https://github.com/gmilette/Say-the-Magic-Word-
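The checking step can be sketched independently of Android. The snippet below is not taken from that repository; it is a minimal illustration that assumes you already hold the list of candidate transcripts returned for ACTION_RECOGNIZE_SPEECH and simply looks for any of your keywords as a whole word. The function name is hypothetical.

```python
import re


def find_keyword(results, keywords):
    """Return the first keyword found as a whole word in any transcript."""
    wanted = {k.lower() for k in keywords}
    for transcript in results:
        for word in re.findall(r"\w+", transcript.lower()):
            if word in wanted:
                return word
    return None


candidates = ["police open says me", "please open sesame now"]
print(find_keyword(candidates, ["sesame"]))
```

Scanning every candidate rather than only the top one helps when the recognizer ranks a mishearing first.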
I used the Snowboy library for this task.
Website: https://snowboy.kitt.ai
GitHub: https://github.com/kitt-ai/snowboy
It is a C library, but it can be included in Android code through JNI. The only downside is that you have to train it with audio samples if you want to use a keyword other than the ones that come with the library.