I'm using the Speech Recognizer Intent in Android. Is there a way to add your own customized words or phrases to Android's speech recognition "dictionary"?
No. You can only use the two language models supported.
The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html and LANGUAGE_MODEL_FREE_FORM or LANGUAGE_MODEL_WEB_SEARCH.
http://developer.android.com/resources/articles/speech-input.html says:
You can make sure your users have the best experience possible by requesting the appropriate language model: free_form for dictation, or web_search for shorter, search-like phrases. We developed the "free form" model to improve dictation accuracy for the voice keyboard, while the "web search" model is used when users want to search by voice.
Michael is correct: you cannot change the language model.
However, you can use "sounds like" algorithms to process the results from Android and match words it doesn't know.
See my answer here:
speech recognition reduce possible search results
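As a rough illustration of the "sounds like" idea, the sketch below uses American Soundex, one common phonetic algorithm, to compare each word the recognizer returned against an application vocabulary. The class and method names are made up for this example, the hypotheses are assumed to be plain strings already pulled out of the recognizer results, and Soundex is tuned for English, so other languages would need a different phonetic key.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: matches recognizer output to known words by Soundex code. */
final class SoundsLikeMatcher {
    // Soundex digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String DIGITS = "01230120022455012623010202";

    private SoundsLikeMatcher() {}

    /** Returns the 4-character American Soundex code, or "" if the word has no A-Z letters. */
    static String soundex(String word) {
        String letters = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        StringBuilder code = new StringBuilder().append(letters.charAt(0));
        char previous = DIGITS.charAt(letters.charAt(0) - 'A');
        for (int i = 1; i < letters.length() && code.length() < 4; i++) {
            char letter = letters.charAt(i);
            if (letter == 'H' || letter == 'W') {
                continue; // H and W do not separate letters that share a code
            }
            char digit = DIGITS.charAt(letter - 'A');
            if (digit != '0' && digit != previous) {
                code.append(digit);
            }
            previous = digit; // a vowel resets this, so a repeated code after it is kept
        }
        while (code.length() < 4) {
            code.append('0');
        }
        return code.toString();
    }

    /** Returns the first vocabulary word that sounds like any word in any hypothesis. */
    static Optional<String> match(List<String> hypotheses, List<String> vocabulary) {
        for (String hypothesis : hypotheses) {
            for (String heard : hypothesis.trim().split("\\s+")) {
                String heardCode = soundex(heard);
                if (heardCode.isEmpty()) {
                    continue; // nothing to compare (empty or non-Latin token)
                }
                for (String known : vocabulary) {
                    if (heardCode.equals(soundex(known))) {
                        return Optional.of(known);
                    }
                }
            }
        }
        return Optional.empty();
    }
}
```

For instance, `SoundsLikeMatcher.match(Arrays.asList("open sfinks now"), Arrays.asList("Nuance", "Sphinx"))` yields `Sphinx`, because "sfinks" and "Sphinx" both encode to S152.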
Is it possible to restrict the Google Speech API to only recognize from a given set of words? Alternatively, is it possible to "ban" certain words? If not, is it possible with any other speech API that supports German?
I do know that I can set hint phrases via a speech context. Even so, it tends to recognize a different word.
As an example, I use the API mostly for German. I want to recognize the word "stärker" (which is also listed as a speech context hint), but the API mostly transcribes it as "Stärke" unless I pronounce the "r" at the end unnaturally strongly. So is it possible to prevent the speech API from transcribing that word, for instance?
Thanks in advance!
No, that's not possible with Google Speech API. I don't know if it's possible with other speech recognition services.
The Google Speech API supports a set of parameters in RecognitionConfig.
There is an optional boolean parameter called "profanityFilter", which filters out profanities.
https://cloud.google.com/speech-to-text/docs/reference/rest/v1beta1/RecognitionConfig
I have a few questions about Google Speech Recognition in Android. I have developed an Android application using the Google Speech Recognition online service. To improve it further, I need answers to the questions below.
Can I add a "custom dictionary" either in offline recognition or online recognition?
Can I command it to do grammar-based recognition, keyword recognition and keyphrase recognition? Right now there seem to be no such options, only general recognition.
Can I change the "listen timeout"? No matter how much I change it, it simply does not work.
It's not possible with the Android Speech API, but you can use the CMU Sphinx project for all of the above.
This is the correct answer, because I have tried and done it.
Can I add a "custom dictionary" either in offline recognition or online recognition?
Not possible
Can I command it to do grammar-based recognition, keyword recognition and keyphrase recognition? Right now there seem to be no such options, only general recognition.
You can't command it to do grammar-based recognition, but you can detect keywords and keyphrases; you have to write custom code with if-else conditions to do that. To do this, you need to detect and convert the speech word by word, instead of waiting until the user has finished the entire sentence and the Android voice recognition service closes automatically to give you the result. This is possible, and it is known as "mid speech interim".
For keyword recognition see this video
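The word-by-word keyword check described above could look roughly like this. The sketch assumes the partial transcripts arrive as plain strings, for example from `RecognitionListener.onPartialResults` after requesting `RecognizerIntent.EXTRA_PARTIAL_RESULTS`. The `KeywordSpotter` class itself is hypothetical and contains no Android code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: looks for known keywords or keyphrases in partial transcripts. */
final class KeywordSpotter {
    // Normalized phrases, kept in the order they were supplied.
    private final List<String> phrases = new ArrayList<>();

    KeywordSpotter(List<String> keyphrases) {
        for (String phrase : keyphrases) {
            String normalized = normalize(phrase);
            if (!normalized.isEmpty()) {
                phrases.add(normalized);
            }
        }
    }

    // Lower-cases the text and turns every run of non-letter, non-digit
    // characters into a single space, so spacing and punctuation do not matter.
    private static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    /** Returns the first configured phrase that occurs as whole words in the transcript. */
    Optional<String> firstMatch(String partialTranscript) {
        // Padding with spaces means "stop" does not match inside "stopping".
        String padded = " " + normalize(partialTranscript) + " ";
        for (String phrase : phrases) {
            if (padded.contains(" " + phrase + " ")) {
                return Optional.of(phrase);
            }
        }
        return Optional.empty();
    }
}
```

Calling `firstMatch` on every partial transcript lets the app react as soon as a phrase appears, without waiting for the final result. The returned phrase is the normalized (lower-case) form.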
Can I change the "listen timeout"? No matter how much I change it, it simply does not work.
No, but you can work around it in code to get continuous recognition. The YouTube video above does continuous recognition as well. For an application that does the same, refer to this link.
I have an iOS and an Android app that do Speech to Text recognition using Nuance Mobile SDK. Both apps work perfectly when using a Nuance sandbox appID.
However, if I upload a phrase file, the Nuance server always returns zero results, as I can verify in the "didFinishWithResults" methods on both Android and iOS.
This is the phrase file I upload as a custom vocabulary to Nuance:
<phrases>
<phrase>two on to</phrase>
<phrase>bet 1 3 for</phrase>
<phrase>...and some other phrases.</phrase>
</phrases>
My Nuance custom dictionary is set to:
Topic:WebSearch
Grammar recognition mode: YES (<== Apps work perfectly when set to NO)
Vocabulary Weight: 90
Nuance's documentation claims that:
It is important to note that custom vocabularies are different than the constrained speech
recognition grammars many developers are familiar with. Using constrained grammars results in
high accuracy of words that are in the grammar and low (or no) recognition of words or phrases
that are not in the grammar. With custom vocabularies our large language models are still used
and the vocabulary simply adjusts the recognition probabilities of that large model. As a result,
using a vocabulary will not change your results as much as a conventional grammar would. For
example, you can expect to still get word results that are not in the vocabulary or “out of
grammar.” The grammar recognition mode feature makes the vocabulary act much more like a
traditional recognition grammar but as we are still using the underlying language model even
then you may get “out of grammar” results.
So my question is: what am I doing wrong to always get zero results from Nuance ASR when the custom vocabulary's grammar recognition mode is set to YES?
Nuance's customer support is totally useless, they probably outsource a bunch of people with just an FAQ in hand that have no idea about anything.
I hope somebody can help me out on this one.
I am developing an Android app that uses speech-to-text recognition. I have used RecognizerIntent and I know about the link
http://developer.android.com/reference/android/speech/RecognizerIntent.html#EXTRA_LANGUAGE
But this gives me US English. I want the speech recognizer to recognize Indian English, as I need the app to recognize Indian names. Is it possible?
As the linked document says, the value is an "IETF language tag (as defined by BCP 47)". Which values are actually supported depends on the speech recognizer that you are using. For example, Google's recognizer supports en-IN, so if you are using Google's recognizer you could try setting the value of EXTRA_LANGUAGE to en-IN and test whether Indian names are recognized.
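A small sketch of building such a tag with the standard `java.util.Locale` API, rather than typing the string by hand, might look like this. `LanguageTagExample` and `tagFor` are illustrative names, and the Android call appears only in a comment because the sketch is plain Java. Whether a given recognizer accepts the tag still has to be tested.

```java
import java.util.Locale;

/** Hypothetical helper: builds the BCP 47 tag to pass as EXTRA_LANGUAGE. */
final class LanguageTagExample {
    private LanguageTagExample() {}

    /**
     * Builds a BCP 47 tag such as "en-IN" from a language and a region code.
     * Locale.Builder throws IllformedLocaleException if either part is malformed.
     *
     * On Android the result would then be attached to the recognizer intent, e.g.
     * intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, tagFor("en", "IN"));
     */
    static String tagFor(String language, String region) {
        return new Locale.Builder()
                .setLanguage(language)
                .setRegion(region)
                .build()
                .toLanguageTag();
    }
}
```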
I would like to integrate speech recognition into my Android application.
I am aware Google provides two language models (free form for dictation and web search for short phrases).
However, my app will have a finite number of possible words (maybe a few thousand). Is it possible to specify the vocabulary, limiting it to these words, in the hope of achieving more accurate results?
My immediate thoughts would be to use the web search language model and then check the results of this against my vocabulary.
Any thoughts appreciated.
I think your intuition is correct and you've answered your own question.
The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html
You can get back results using these recognizer models and then classify or filter the results to find what best matches your limited vocabulary. There are different techniques to do this and they can range from simple parsing to complex statistical models.
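At the simple end of that range, one option is to pick the vocabulary entry with the smallest edit distance to any returned hypothesis. The sketch below is only one possible filter, not a method prescribed by the recognizer. `VocabularyFilter`, `closest`, and the `maxDistance` cutoff are names and choices made up for this example, and for a vocabulary of a few thousand entries a plain linear scan like this is assumed to be fast enough.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: maps recognizer hypotheses onto a fixed vocabulary. */
final class VocabularyFilter {
    private VocabularyFilter() {}

    /** Levenshtein distance: minimum number of single-character edits turning a into b. */
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning "" into the first j characters of b
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning the first i characters of a into ""
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Returns the vocabulary entry nearest to any hypothesis, if within maxDistance edits. */
    static Optional<String> closest(List<String> hypotheses, List<String> vocabulary, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String hypothesis : hypotheses) {
            String heard = hypothesis.trim().toLowerCase(Locale.ROOT);
            for (String entry : vocabulary) {
                int distance = editDistance(heard, entry.toLowerCase(Locale.ROOT));
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = entry;
                }
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Rejecting anything beyond `maxDistance` matters as much as finding the nearest entry: without the cutoff, every utterance would be forced onto some vocabulary word.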
The only other alternative I've seen is to use some other speech recognition engine on a server that can accept your dedicated language model. However, this is costly and complex, and it is the approach used by commercial speech companies like VLingo, Dragon, or Microsoft's Bing.
You can use open-source models like VoxForge or cheap ones like LumenVox.
Some have been ported to Android. I forget by whom.
I answered pretty much the same question before - please check here: Building openears compatible language model
and here:
Typically, you need very large text corpora to generate useful language models.
If you just have a small amount of training data, your language model will be over-fitted, which means that it will not generalize.