I have a few questions about Google Speech Recognition on Android. I have developed an Android application using the Google Speech Recognition online service. Now, to improve it further, I need answers to the questions below.
Can I add a "custom dictionary" either in offline recognition or online recognition?
Can I command it to do grammar-based recognition, keyword recognition and keyphrase recognition? Right now there seem to be no options for these, only general recognition.
Can I change the "listen timeout"? No matter how much I change it, it simply does not work.
It's not possible with the Android Speech API, but you can use the CMU Sphinx project for all of the above.
This is the correct answer, because I have tried and done it.
Can I add a "custom dictionary" either in offline recognition or
online recognition?
Not possible
Can I command it to do grammar-based recognition, keyword recognition
and keyphrase recognition? Right now there seem to be no options
for these, only general recognition.
You can't command it to do grammar-based recognition, but you can detect keywords and keyphrases; you have to write custom code with if-else conditions to do that. To do this, you need to detect and convert the speech word by word, instead of waiting until the user has finished the entire sentence and the Android voice recognition service closes automatically to give you the result. This is possible, and it is known as "mid speech interim".
For keyword recognition see this video
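The if-else matching on interim text described above could be sketched roughly like this. It is plain Java with no Android dependency, and the class name KeywordSpotter and its spot method are made up for illustration. The assumption is that your recognizer callback hands you the partial transcript as a String each time it updates.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: checks an interim (partial) transcript for known phrases.
final class KeywordSpotter {
    private final List<String> phrases;

    KeywordSpotter(List<String> phrases) {
        this.phrases = phrases;
    }

    // Returns the first configured phrase that occurs as whole words
    // in the partial transcript, or null if none is present yet.
    String spot(String partialTranscript) {
        String padded = " " + normalize(partialTranscript) + " ";
        for (String phrase : phrases) {
            if (padded.contains(" " + normalize(phrase) + " ")) {
                return phrase;
            }
        }
        return null;
    }

    // Lower-cases and collapses whitespace so matching ignores case and spacing.
    private static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).trim().replaceAll("\\s+", " ");
    }
}
```

You would call spot(...) on every interim update and act as soon as it returns something other than null, rather than waiting for the final result. Padding with spaces keeps "stop" from firing on a word like "unstoppable".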
Can I change the "listen timeout"? No matter how much I change it,
it simply does not work.
No, but you can code it in a tricky way to do continuous recognition. The above YouTube video does continuous recognition as well. For an application that does the same, refer to this link.
Related
I am trying to get continuous speech recognition working using pocketsphinx. I tried following their tutorial (https://cmusphinx.github.io/wiki/tutorialandroid/), although to me it is somewhat vague and I could not get it to work. I am now trying a different approach: starting with pocketsphinx-android-demo (https://github.com/cmusphinx/pocketsphinx-android-demo) and then reducing its limitations (so that it can recognize for longer and recognize more words). I have figured out where the output of the recognition goes, although the demo is still only able to recognize weather words (I removed the digit and phones demos). I have discovered that the activation phrase can have an infinite vocabulary, but I can't figure out what is limiting the vocabulary of the actual recognition. Here is the GitHub link to my project in case it is helpful: https://github.com/Michaelszeng/pocket-sphinx_App_mk2/commits/master
Does anybody know what is limiting the demo recognizer's vocabulary or how I could remove that limitation?
I would like to implement offline voice recognition in my app. But I want it for two purposes:
(1) For a small set of commands (play, stop, previous, next and a couple of others);
(2) For a list of a few hundred bird names.
To implement (1), it seems to me a bad idea (slower and resource-consuming) to use the full force of Android's voice recognition. In my mind, it would be easier to tell my app to only interpret a few words. That is, to use my own dictionary, telling my app to "use only these 10 words".
To implement (2) is similar to (1), but with a few hundred instead of 10.
Does this make sense, and if so, is there an easy way to implement it? Is it worth it?
Thanks!
L.
You can implement your app using CMUSphinx on Android. The CMUSphinx tutorial is here:
http://cmusphinx.sourceforge.net/wiki/tutorial
The language models for recognizing a limited set of words are described here:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
You can use keyword spotting mode to recognize a few commands.
Pocketsphinx on Android is described here:
http://cmusphinx.sourceforge.net/wiki/tutorialandroid
The demonstration includes a way to switch recognition modes from 10 words to a few hundred words, as you intend.
Doing some research, I have found some different speech-to-text APIs for Android.
Pocket Sphinx
Android Native API
I have the following requirements:
Must be able to support offline speech recognition (I'm not sure if the Android API can do this)
Must be able to detect and respond immediately to every word said. I would rather have this than detecting an entire sentence. I could split the returned sentence into an array, though, and get each word.
The detection needs to be processed in the background (no popups or anything, as the Android API seems to do)
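For the second requirement, splitting the returned text and reacting only to the words not yet seen could look something like the sketch below. It is plain Java, the WordStream name is made up, and it assumes the recognizer hands you a hypothesis string that only grows at the end. Real recognizers may revise earlier words, which this does not handle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: turns successive partial hypotheses into a stream of new words.
final class WordStream {
    private int delivered = 0; // number of words already handed out

    // Returns the words in the hypothesis that earlier calls have not returned yet.
    List<String> newWords(String hypothesis) {
        String trimmed = hypothesis.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        String[] words = trimmed.split("\\s+");
        if (words.length <= delivered) {
            return new ArrayList<>();
        }
        List<String> fresh =
                new ArrayList<>(Arrays.asList(words).subList(delivered, words.length));
        delivered = words.length;
        return fresh;
    }

    // Call when a new utterance starts.
    void reset() {
        delivered = 0;
    }
}
```

Each time a partial result arrives you would pass it to newWords(...) and respond to whatever comes back, then call reset() once the utterance ends.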
Can someone recommend an API that meets my requirements?
Pocketsphinx meets all your requirements. What you call the "Android Native API" is basically a set of interface definitions and it does not contain the notion of offline/online.
You can also implement these interfaces using Pocketsphinx, since it supports things like partial results, confidence scores, n-best results etc. This way the implementation becomes available to any Android app. Maybe somebody has done it already, but I'm not aware of it.
I'm using the Speech Recognizer Intent in Android. Is there a way to add your own customized words or phrases to Android's speech recognition 'dictionary'?
No. You can only use the two language models supported.
The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html and LANGUAGE_MODEL_FREE_FORM or LANGUAGE_MODEL_WEB_SEARCH.
http://developer.android.com/resources/articles/speech-input.html says:
You can make sure your users have the
best experience possible by requesting
the appropriate language model:
free_form for dictation, or web_search
for shorter, search-like phrases. We
developed the "free form" model to
improve dictation accuracy for the
voice keyboard, while the "web search"
model is used when users want to
search by voice
Michael is correct, you cannot change the Language Model.
However, you can use "sounds like" algorithms to process the results from Android and match words it doesn't know.
See my answer here:
speech recognition reduce possible search results
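As one illustration of a "sounds like" approach, here is a small sketch using classic American Soundex. Soundex is just my pick for the example, not necessarily what the linked answer uses, and the class and method names are made up. The idea is to compare the phonetic code of the word Android returned against the codes of the words you actually expect.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper: American Soundex plus a simple vocabulary lookup.
final class Soundex {
    // Digit codes for A..Z; '0' marks letters that are not coded (vowels, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    // Returns the 4-character Soundex code, or "" if the word has no letters A-Z.
    static String encode(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not a plain letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                prev = code;
            } else if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            } else {
                if (code != '0' && code != prev) {
                    out.append(code);
                }
                prev = code; // a vowel resets prev, so a repeated code counts again
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    // Returns the first vocabulary entry that sounds like the heard word, or null.
    static String bestMatch(String heard, List<String> vocabulary) {
        String target = encode(heard);
        if (target.isEmpty()) {
            return null;
        }
        for (String candidate : vocabulary) {
            if (encode(candidate).equals(target)) {
                return candidate;
            }
        }
        return null;
    }
}
```

You would run bestMatch(...) over each alternative the recognizer returns and keep the first hit. Soundex is coarse, so for a larger vocabulary you would probably want a finer phonetic code or an edit-distance check on top.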
Does anyone have experience with java voice recognition and localization?
I'm thinking of building an Android application with some basic voice recognition options, but I want to implement localization for it based on some translation tool, maybe Google Translate, so that users can update their "dictionary" with new languages from a remote dictionary. This project is in its first phase and I'm still brainstorming, so does anyone have experience with this, or is something like that even possible?
Why not just use Android's built-in speech recognition? It's REALLY easy (you just set up an Intent then catch it when it returns) and the results are surprisingly good.
android.speech
I'm not sure exactly what you're trying to do, but this will allow you to specify the language to recognize.