I have started with speech recognition using Android, SL4A and Python, and so far it works fine.
My user is only supposed to input numbers between 0 and 9 with his voice. Is there a way to tell Android to only search among those numbers and therefore reduce the recognition time (and probably the errors)?
No. You cannot change what Google returns. You can only process the results.
Fortunately, you can process the results to increase the chance of a match.
For example, you could use a phonetic matching algorithm like Soundex.
Using Soundex or something similar, if the recognizer hears something like "true" your code could still recognize it as the number 2.
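A rough illustration of that post-processing step, assuming the Apache Commons Codec library for the Soundex implementation (the class name, the score cutoff, and the digit word list are just illustrative):

    import org.apache.commons.codec.EncoderException;
    import org.apache.commons.codec.language.Soundex;

    // Sketch: pick the digit whose spoken form is phonetically closest to a
    // recognition candidate.
    public class DigitMatcher {

        private static final String[] DIGIT_WORDS = {
            "zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"
        };

        private final Soundex soundex = new Soundex();

        // Returns the best-matching digit, or -1 if nothing scores well enough.
        public int matchDigit(String candidate) throws EncoderException {
            int bestDigit = -1;
            int bestScore = 0;
            for (int digit = 0; digit < DIGIT_WORDS.length; digit++) {
                // difference() returns 0..4; 4 means the Soundex codes are identical.
                int score = soundex.difference(candidate, DIGIT_WORDS[digit]);
                if (score > bestScore) {
                    bestScore = score;
                    bestDigit = digit;
                }
            }
            return bestScore >= 3 ? bestDigit : -1;
        }
    }

You would run each candidate string the recognizer returns through something like matchDigit() and keep the first one that scores.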
I would really appreciate a little help with voice control for Android. I am making voice-controlled chess, but voice recognition keeps identifying words that I don't want.
Example: "King to C7" > "Pink 2 See 7"
So, is there a way to filter only the words that I want to use, something like a whitelist? Thanks for every response.
Unfortunately, this isn't possible because of the way speech-to-text works. Limiting the speech recognition to a small whitelist would require retraining the neural network (and that isn't something an individual can do).
I am building a "Voice Calculator" application which takes voice input and displays a result based on it.
I don't want to use Google's servers for voice recognition; is there any way I can achieve my goal?
I want to take input such as "two plus three multiply four hundred twenty two minus one hundred", so I would like to record and compare every word,
so that it can be converted into text which can be used to perform the calculation.
Can anyone guide me on how to achieve this? I am done with designing the calculator and its functionality.
I hope I was able to explain my doubt; looking for help. Thank you.
I have used the Google API for voice recognition; although I wanted an offline version, I had to rely on Google's recognition.
Have a look at the Voice Recognition for Android example.
Granted, not many devices support it yet, but Jelly Bean will allow you to download Google's voice recognition to the device for offline use.
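For the calculation side of the question, once the recognizer has turned the speech into text, a simple word-to-number mapping is enough for basic expressions. A minimal sketch, with the simplifying assumptions that numbers are single words and the expression is evaluated strictly left to right (no operator precedence, no compound numbers like "four hundred twenty two"):

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Minimal sketch of turning recognized words into a result.
    public class SpokenCalculator {

        private static final Map<String, Integer> NUMBERS = new HashMap<>();
        static {
            String[] words = {"zero", "one", "two", "three", "four",
                              "five", "six", "seven", "eight", "nine", "ten"};
            for (int i = 0; i < words.length; i++) {
                NUMBERS.put(words[i], i);
            }
        }

        public static int evaluate(String spoken) {
            List<String> tokens = Arrays.asList(spoken.toLowerCase().split("\\s+"));
            int result = 0;
            String pendingOp = "plus";
            for (String token : tokens) {
                if (NUMBERS.containsKey(token)) {
                    int value = NUMBERS.get(token);
                    switch (pendingOp) {
                        case "plus":     result += value; break;
                        case "minus":    result -= value; break;
                        case "multiply": result *= value; break;
                        case "divide":   result /= value; break;
                    }
                } else {
                    pendingOp = token;   // "plus", "minus", "multiply", "divide"
                }
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(evaluate("two plus three multiply four")); // 20 (left to right)
        }
    }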
I don't have much experience with Android, but was asked by a hearing-impaired friend if there is a way to essentially "stream" voice to text on a mobile device. I've used and looked into the built-in Android API, but it seems that it only sends the speech off for processing after the speech input is completed. I'm looking for something that works continuously (similar to how Dragon works with Microsoft Word).
Perhaps there is already an app that does this. If not, is there a way to implement this with the current Android OS/API?
Any suggestions appreciated.
As you've mentioned, the speech-to-text recognition is sent to Google for processing. This can take enormous computing power, which current devices simply can't handle (yet). Because everything is processed server-side, you won't be able to do immediate speech recognition in real time directly on the phone.
It's possible that somebody has created a 3rd-party library to do this, but I'm not aware of any. Even so, it would probably have some significant limitations or reduced accuracy.
You can use this Extra for the Recognizer Intent:
EXTRA_PARTIAL_RESULTS: optional boolean to indicate whether partial results should be returned by the recognizer as the user speaks (default is false).
http://developer.android.com/reference/android/speech/RecognizerIntent.html#EXTRA_PARTIAL_RESULTS
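Partial results are typically consumed through the SpeechRecognizer API with a RecognitionListener rather than by waiting for the recognition activity to finish. A rough sketch of wiring that up (requires the RECORD_AUDIO permission; error handling is omitted):

    import android.content.Context;
    import android.content.Intent;
    import android.os.Bundle;
    import android.speech.RecognitionListener;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;
    import java.util.ArrayList;

    // Rough sketch: stream partial hypotheses as the user speaks.
    public class StreamingRecognizer {

        public void startListening(Context context) {
            SpeechRecognizer recognizer = SpeechRecognizer.createSpeechRecognizer(context);

            recognizer.setRecognitionListener(new RecognitionListener() {
                @Override
                public void onPartialResults(Bundle partialResults) {
                    ArrayList<String> hypotheses =
                            partialResults.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                    // Update the UI with the latest hypothesis while the user is still talking.
                }

                @Override
                public void onResults(Bundle results) {
                    // The final transcript arrives here once the user stops speaking.
                }

                // Remaining callbacks left empty for brevity.
                @Override public void onReadyForSpeech(Bundle params) { }
                @Override public void onBeginningOfSpeech() { }
                @Override public void onRmsChanged(float rmsdB) { }
                @Override public void onBufferReceived(byte[] buffer) { }
                @Override public void onEndOfSpeech() { }
                @Override public void onError(int error) { }
                @Override public void onEvent(int eventType, Bundle params) { }
            });

            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
            recognizer.startListening(intent);
        }
    }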
Hi folks,
I have a strange problem with voice recognition on a Google Nexus One phone running firmware 2.2.1. Voice recognition gives multiple interpretations of the spoken word: when I say "Hello", the results received are "hello, hotels, photos, fomdem, honda", when I expect only "hello".
The same thing works fine on firmware 2.1, which gives a satisfactory result.
What has to be done to avoid this issue? Any suggestions are helpful.
Best regards,
Vinayak
I can't explain the different behavior between versions, but have you looked at http://developer.android.com/reference/android/speech/RecognizerIntent.html#EXTRA_MAX_RESULTS ?
The intent accepts a max-results parameter which tells the recognizer how many candidate strings to return to the client. Typically in speech recognition, the client may need to offer the user a disambiguation step (like "did you say 'hello' or 'hotel'?"). If you only want the most likely candidate, set EXTRA_MAX_RESULTS to 1.
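A minimal sketch of setting that extra on the recognition intent (the activity and request code here are just placeholders):

    import android.app.Activity;
    import android.content.Intent;
    import android.speech.RecognizerIntent;

    // Sketch: ask the recognizer for only the single most likely candidate.
    public class SingleResultActivity extends Activity {

        private static final int REQUEST_SPEECH = 1;  // arbitrary request code

        private void startRecognition() {
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);
            startActivityForResult(intent, REQUEST_SPEECH);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            super.onActivityResult(requestCode, resultCode, data);
            if (requestCode == REQUEST_SPEECH && resultCode == RESULT_OK) {
                // With EXTRA_MAX_RESULTS = 1 the returned list holds a single entry.
                String topResult = data.getStringArrayListExtra(
                        RecognizerIntent.EXTRA_RESULTS).get(0);
                // Use topResult here.
            }
        }
    }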
I am writing an application that will behave similarly to the existing voice recognition, but will send the sound data to a proprietary web service to perform the speech recognition part. I am using the standard MediaRecorder (AMR-NB encoded), which seems to be perfect for speech recognition. The only data it provides is the amplitude, via the getMaxAmplitude() method.
I am trying to detect when the person starts to talk, so that when the person stops talking for about 2 seconds I can proceed to send the sound data to the web service. Right now I am using a threshold for the amplitude: if it goes over a value (e.g. 1500), then I assume the person is speaking. My concern is that the amplitude levels may vary by device (e.g. Nexus One vs. Droid), so I am looking for a more standard approach that can be derived from the amplitude values.
P.S.
I looked at graphing-amplitude but it doesn't provide a way to do it with just the amplitude.
Well, this might not be of much help, but how about starting by measuring the background noise picked up by the device's microphone, and applying the threshold dynamically based on that? That way you would adapt to the different devices' microphones and also to the environment the user is in at a given time.
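A small sketch of that idea; the running-average calibration and the speech factor are illustrative choices rather than recommended values:

    // Sketch of a dynamic amplitude threshold based on the ambient noise level.
    public class AmplitudeGate {

        private static final double SPEECH_FACTOR = 3.0; // tune per device/environment

        private double noiseFloor = 0;
        private int calibrationSamples = 0;

        // Feed this readings from MediaRecorder.getMaxAmplitude() while the user
        // is silent, so it learns the ambient noise level of this device and room.
        public void calibrate(int amplitude) {
            noiseFloor = (noiseFloor * calibrationSamples + amplitude) / (calibrationSamples + 1);
            calibrationSamples++;
        }

        // True if a reading looks like speech rather than background noise.
        public boolean isSpeech(int amplitude) {
            return amplitude > Math.max(noiseFloor, 1) * SPEECH_FACTOR;
        }
    }

Feed it getMaxAmplitude() readings for a second or so before the user starts talking, then poll isSpeech() while recording.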
1500 is too low a number. Measuring the change in amplitude will work better.
However, it will still produce misdetections.
I fear the only way to solve this problem is to figure out how to recognize a simple word or tone rather than simply detecting noise.
There are now multiple VAD libraries designed for Android. One of them is:
https://github.com/gkonovalov/android-vad
Most smartphones come with a proximity sensor, and Android has an API for using these sensors (a rough sketch follows below). This would be adequate for the job you described: when the user moves the phone near his ear, you can have the app start recording. It should be easy enough.
Sensor class for Android
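A rough sketch of that approach; the startRecording/stopRecording hooks are hypothetical placeholders for your own recording code:

    import android.content.Context;
    import android.hardware.Sensor;
    import android.hardware.SensorEvent;
    import android.hardware.SensorEventListener;
    import android.hardware.SensorManager;

    // Sketch: start/stop recording when the phone is held near the ear.
    public class ProximityTrigger implements SensorEventListener {

        private final SensorManager sensorManager;
        private final Sensor proximitySensor;

        public ProximityTrigger(Context context) {
            sensorManager = (SensorManager) context.getSystemService(Context.SENSOR_SERVICE);
            proximitySensor = sensorManager.getDefaultSensor(Sensor.TYPE_PROXIMITY);
        }

        public void start() {
            sensorManager.registerListener(this, proximitySensor,
                    SensorManager.SENSOR_DELAY_NORMAL);
        }

        public void stop() {
            sensorManager.unregisterListener(this);
        }

        @Override
        public void onSensorChanged(SensorEvent event) {
            // Most proximity sensors report a small value (often 0) when something
            // is close and the maximum range when nothing is near.
            boolean nearEar = event.values[0] < proximitySensor.getMaximumRange();
            if (nearEar) {
                // startRecording();  // hypothetical hook into the recording code
            } else {
                // stopRecording();
            }
        }

        @Override
        public void onAccuracyChanged(Sensor sensor, int accuracy) { }
    }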