Speech recognition language model - Android

I would like to integrate speech recognition into my Android application.
I am aware Google provides two language models (free form for dictation and web search for short phrases).
However, my app will have a finite number of possible words (maybe a few thousand). Is it possible to specify the vocabulary, limiting it to these words, in the hope of achieving more accurate results?
My immediate thoughts would be to use the web search language model and then check the results of this against my vocabulary.
Any thoughts appreciated.

I think your intuition is correct and you've answered your own question.
The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html
You can get back results using these recognizer models and then classify or filter the results to find what best matches your limited vocabulary. There are different techniques to do this and they can range from simple parsing to complex statistical models.
The only other alternative I've seen is to use other speech recognition software on a server that can accept your dedicated language model. However, this is costly and complex, and it is the approach used by commercial speech companies like Vlingo, Dragon, and Microsoft's Bing.
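To make the filtering idea concrete, here is a minimal sketch (the class and method names are mine, not part of any Android API) that matches each recognizer hypothesis against a fixed vocabulary by Levenshtein edit distance; in a real app the hypotheses would come from the recognizer's result list:

```java
import java.util.List;

// Illustrative sketch: filter recognizer output against a known
// vocabulary using edit distance. Names are not from any Android API.
class VocabularyMatcher {
    // Classic dynamic-programming Levenshtein edit distance.
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Returns the vocabulary word closest to the hypothesis,
    // or null if nothing is within maxDistance edits.
    static String bestMatch(String hypothesis, List<String> vocabulary, int maxDistance) {
        String best = null;
        int bestDist = maxDistance + 1;
        for (String word : vocabulary) {
            int dist = editDistance(hypothesis.toLowerCase(), word.toLowerCase());
            if (dist < bestDist) {
                bestDist = dist;
                best = word;
            }
        }
        return best;
    }
}
```

You would call bestMatch on each string the recognizer returns and discard any result whose closest vocabulary word is too many edits away.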

You can use open-source models like VoxForge or inexpensive ones like LumenVox.
Some have been ported to Android; I forget by whom.

I answered pretty much the same question before - please check here: Building openears compatible language model
and here:
Typically you need very large text corpora to generate useful language models.
If you just have a small amount of training data, your language model will be over-fitted, which means it will not generalize.

Related

Is it possible to make a voice-recognizing Android app which doesn't use the internet?

I know that Android already has its own voice recognition API, but it uses the internet. I would like to have something like a compiled library with a few voice commands that I could use offline. Is that possible?
Yes, it is possible; you would have to train a model to convert speech to commands.
A simple approach could be to break the speech at intervals where the amplitude is at a minimum; this will give you the words in the speech. Match these words against your model with some probability; if the matched probability is higher than some threshold value, your model can then help execute the commands related to that word. You will have to ship this model within your application, and I would recommend updating the model as you train it further.
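A very rough sketch of that splitting step, assuming the audio has already been decoded into a normalized double[] buffer (names are illustrative; real endpointing works on smoothed per-frame energies rather than raw samples):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of amplitude-based segmentation: split a signal
// into candidate "word" regions wherever the amplitude exceeds a
// threshold, treating quiet stretches as gaps between words.
class AmplitudeSegmenter {
    // Returns [start, end) sample-index pairs of regions whose
    // absolute amplitude is above the threshold.
    static List<int[]> segments(double[] signal, double threshold) {
        List<int[]> result = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a segment"
        for (int i = 0; i < signal.length; i++) {
            boolean loud = Math.abs(signal[i]) > threshold;
            if (loud && start < 0) {
                start = i;                      // a segment begins
            } else if (!loud && start >= 0) {
                result.add(new int[]{start, i}); // a segment ends
                start = -1;
            }
        }
        if (start >= 0) result.add(new int[]{start, signal.length});
        return result;
    }
}
```

Each returned segment would then be passed to the matching stage described above.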

Nuance Mobile always returns zero results if I enable Grammar Recognition Mode

I have an iOS and an Android app that do Speech to Text recognition using Nuance Mobile SDK. Both apps work perfectly when using a Nuance sandbox appID.
However, if I upload a phrase file, the Nuance server always returns zero results, as I can verify in the "didFinishWithResults" methods on both Android and iOS.
This is the phrase file I upload as a custom vocabulary to Nuance:
<phrases>
<phrase>two on to</phrase>
<phrase>bet 1 3 for</phrase>
<phrase>...and some other phrases.</phrase>
</phrases>
My Nuance custom dictionary is set to:
Topic:WebSearch
Grammar recognition mode: YES (<== Apps work perfectly when set to NO)
Vocabulary Weight: 90
Nuance's documentation claims that:
It is important to note that custom vocabularies are different than the constrained speech recognition grammars many developers are familiar with. Using constrained grammars results in high accuracy of words that are in the grammar and low (or no) recognition of words or phrases that are not in the grammar. With custom vocabularies our large language models are still used and the vocabulary simply adjusts the recognition probabilities of that large model. As a result, using a vocabulary will not change your results as much as a conventional grammar would. For example, you can expect to still get word results that are not in the vocabulary or "out of grammar." The grammar recognition mode feature makes the vocabulary act much more like a traditional recognition grammar but as we are still using the underlying language model even then you may get "out of grammar" results.
So my question is: what am I doing wrong to always get zero results from Nuance ASR when the custom vocabulary's grammar recognition mode is set to YES?
Nuance's customer support is totally useless; they have probably outsourced it to a bunch of people with just an FAQ in hand who have no idea about anything.
I hope somebody can help me out on this one.

Algorithm for very simple voice/speech recognition

I'm writing a game for Google Glass, but unfortunately the SpeechRecognizer API isn't available in the current builds of the Google Glass GDK.
So I've been thinking about implementing an algorithm for a very simple voice recognition.
Let's say I want to recognize only: "Yes" and "No".
Do you know any example code or any helpful resources to help me implement this?
Is it so hard that I should drop the idea and go with a big framework like CMUSphinx?
What about recognizing: up, down, right, left, or numbers from 1 to 10?
As far as I know, a common approach is to transform the signal into the frequency domain using a fast Fourier transform (FFT) and analyze it there. You also need a dictionary of spoken words to correlate against in the frequency domain.
Please see these links:
CMU Sphinx has a Java implementation.
David Wagner has a good article and a MATLAB implementation.
P.S. If you speak Russian, have a look at this article - it is very simple, with Java examples.
P.P.S. Honestly, I have never used this framework, but if you have only superficial knowledge of speech recognition, the most robust and easiest way is to use an existing complete solution such as a framework or library; otherwise you will need to spend time acquiring the necessary background. In that case, you can read this article.
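To illustrate the frequency-domain idea mentioned above, here is a naive textbook DFT sketch that finds the dominant frequency bin of a frame (illustrative only; a real system would use an optimized FFT library and compare whole spectra against stored word templates):

```java
// Illustrative sketch: a plain O(N^2) DFT over one audio frame,
// used to locate the strongest frequency component.
class SpectrumPeak {
    // Magnitude of bin k of an N-point DFT of x.
    static double magnitude(double[] x, int k) {
        double re = 0, im = 0;
        for (int n = 0; n < x.length; n++) {
            double angle = 2 * Math.PI * k * n / x.length;
            re += x[n] * Math.cos(angle);
            im -= x[n] * Math.sin(angle);
        }
        return Math.sqrt(re * re + im * im);
    }

    // Index of the strongest bin in the first half of the spectrum
    // (the second half mirrors the first for real-valued signals).
    static int dominantBin(double[] x) {
        int best = 0;
        double bestMag = -1;
        for (int k = 0; k < x.length / 2; k++) {
            double m = magnitude(x, k);
            if (m > bestMag) {
                bestMag = m;
                best = k;
            }
        }
        return best;
    }
}
```

Comparing the spectra of incoming frames against spectra of pre-recorded words ("yes", "no", ...) is the correlation step the answer refers to.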

Android Voice Recognition (in Search Widget) using given word list

Is it possible to restrict the voice search widget to look for a match near a given set of words? For example, if I am using it to search over a list of names, the results are not meaningful, as names are often "corrected" to common words.
If you use 3rd party Android recognition from Nuance (The people behind DragonDictate), it supports a "grammar mode" where you can somewhat restrict the phrases that will be recognised during recognition.
Importantly, if you add unusual names into a Custom Vocabulary, they SHOULD become recognizable (complex pronunciation issues aside).
You can find information if you dig through:
http://dragonmobile.nuancemobiledeveloper.com ,
looking for 'Custom Vocabularies'. Grammar mode is essentially a special mode of custom vocabularies.
At the time of writing, there was a document here that makes some mention of grammar mode:
http://dragonmobile.nuancemobiledeveloper.com/downloads/custom_vocabulary/Guide_to_Custom_Vocabularies_v1.5.pdf - It only really becomes clear when you try to progress in their provisioning web GUI.
You have to set up an account, and jump through other hoops, but there is a free tier. This is the only potential way I have found to constrain a recognition vocabulary.
Well, short of running up PocketSphinx, but that is still described as a 'Research' 'PreAlpha'.
No, I don't work for Nuance. Not sure anyone does. They may have all been eaten by zombies. You would guess as much reading their support forums. They never reply.
No, unfortunately this is not possible.
You could also look at recognition from AT&T.
They have a very feature-rich web API, including full grammar support. I only found out about it recently!
1,000,000 transactions per month for free. Generous!
Look for 'AT&T API Program'. Weird name.
Link at time of writing:
http://developer.att.com/apis/speech
Unfortunately for me, no Australian accent language models at time of writing. Only US and UK English. Boooo.
EDIT: Some months after I submitted the above, AT&T retired the service mentioned. It seems everyone just wants a 'dumbed down' API where you just call a recognizer, and it returns words. Sure, that is of course the holy grail, but a properly designed, constrained grammar will generally work better. As someone with speech skills, the minimalism of the common Speech APIs today is really frustrating...

Android: Speech Recognition Append Dictionary?

I'm using the Speech Recognizer Intent in Android. Is there a way to add your own customized words or phrases to Android's speech recognition 'dictionary'?
No. You can only use the two language models supported.
The built-in speech recognition provided by Google only supports the dictation and search language models. See http://developer.android.com/reference/android/speech/RecognizerIntent.html and LANGUAGE_MODEL_FREE_FORM or LANGUAGE_MODEL_WEB_SEARCH.
http://developer.android.com/resources/articles/speech-input.html says:
You can make sure your users have the best experience possible by requesting the appropriate language model: free_form for dictation, or web_search for shorter, search-like phrases. We developed the "free form" model to improve dictation accuracy for the voice keyboard, while the "web search" model is used when users want to search by voice.
Michael is correct, you cannot change the Language Model.
However, you can use "sounds like" algorithms to process the results from Android and match words it doesn't know.
See my answer here:
speech recognition reduce possible search results
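One classic "sounds like" code is Soundex; a simplified sketch might look like this (names are illustrative, and it omits the standard rule that treats 'h' and 'w' specially):

```java
// Illustrative, simplified Soundex: words that sound alike map to the
// same 4-character code, so recognizer output can be bucketed with
// known vocabulary words even when the spelling differs.
class SoundsLike {
    // Map a letter to its Soundex digit; '0' means "skipped letter".
    static char code(char c) {
        switch (Character.toLowerCase(c)) {
            case 'b': case 'f': case 'p': case 'v': return '1';
            case 'c': case 'g': case 'j': case 'k':
            case 'q': case 's': case 'x': case 'z': return '2';
            case 'd': case 't': return '3';
            case 'l': return '4';
            case 'm': case 'n': return '5';
            case 'r': return '6';
            default:  return '0'; // vowels and h, w, y
        }
    }

    static String soundex(String word) {
        if (word.isEmpty()) return "";
        StringBuilder out = new StringBuilder();
        out.append(Character.toUpperCase(word.charAt(0))); // keep first letter
        char prev = code(word.charAt(0));
        for (int i = 1; i < word.length() && out.length() < 4; i++) {
            char c = code(word.charAt(i));
            if (c != '0' && c != prev) out.append(c); // skip repeats and vowels
            prev = c;
        }
        while (out.length() < 4) out.append('0'); // pad to 4 characters
        return out.toString();
    }
}
```

Comparing soundex(recognizerResult) with the codes of your word list catches many near-miss recognitions that exact string matching would reject.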

Categories

Resources