I've got voice recognition running in my Android app. However, the user experience is poor due to the lag when the Google voice recognition dialog is launched. The app relies on receiving a response from the user after a prompt, but when the RecognizerIntent is started, it takes a couple of seconds before the user can actually provide his/her response. It doesn't feel 'natural' enough.
What I'd like to find out is if anyone knows a way to sort of 'warm start' the recognizer activity so that it is ready immediately following my prompt.
I'm happy to provide more details if it would be helpful.
Related
I'm looking to add voice commands to an Android app that will be running on a tablet as a kiosk. I don't want the user to have to push a button to start listening, because the user is doing something more important (e.g. driving a car, flying a plane, or performing brain surgery), and the command itself could be completed with a single button push anyway.
I see tutorials describing how to add speech to text and have the user push a button and get the text, but nothing allowing the wake word "Okay, Google" to start the voice recognition (much less a custom wake word).
I looked at using the Google Voice Actions to start with "Okay, Google" and then send something to my app (register an intent), but that has to be trained to one specific user (at least for the tablet I tried it on). I'll have different users every day (maybe more than one a day) and no opportunity for training the device.
I've worked with CMUSphinx and found it to be too unreliable for spotting a wake word.
Is there a way to add "Okay, Google" as a way to start listening to text inside my app?
Got it working using PocketSphinx for offline wake word recognition. After the wake word is detected, I hand the microphone over to IBM Watson's Speech to Text service, which works over the internet and comes back with pretty reliable results.
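The hand-off described above can be sketched as a small two-state machine. This is only an illustration in plain Java: the class `WakeWordPipeline`, its method names, and the wake phrase "hey kiosk" are made up for the example, and neither the PocketSphinx nor the Watson API is actually called. In a real app the two callbacks would be wired to whatever listeners those libraries provide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: offline wake-word stage, then online command stage. */
public class WakeWordPipeline {
    public enum Stage { WAITING_FOR_WAKE_WORD, CAPTURING_COMMAND }

    private final String wakeWord;
    private final List<String> commands = new ArrayList<>();
    private Stage stage = Stage.WAITING_FOR_WAKE_WORD;

    public WakeWordPipeline(String wakeWord) {
        this.wakeWord = wakeWord.toLowerCase(Locale.ROOT);
    }

    /** Feed each hypothesis from the offline keyword spotter here. */
    public void onOfflineHypothesis(String text) {
        if (stage == Stage.WAITING_FOR_WAKE_WORD
                && text.toLowerCase(Locale.ROOT).contains(wakeWord)) {
            // A real app would stop the offline recognizer at this point and
            // give the microphone to the online speech-to-text service.
            stage = Stage.CAPTURING_COMMAND;
        }
    }

    /** Feed the final transcript from the online service here. */
    public void onOnlineTranscript(String text) {
        if (stage != Stage.CAPTURING_COMMAND) {
            return; // no wake word was heard, so ignore stray transcripts
        }
        commands.add(text);
        // Hand the microphone back to the offline spotter.
        stage = Stage.WAITING_FOR_WAKE_WORD;
    }

    public Stage stage() { return stage; }

    public List<String> commands() { return commands; }

    public static void main(String[] args) {
        WakeWordPipeline pipeline = new WakeWordPipeline("hey kiosk");
        pipeline.onOfflineHypothesis("hey kiosk");
        pipeline.onOnlineTranscript("open the checklist");
        System.out.println(pipeline.commands());
    }
}
```

Keeping the stage in one place means only one recognizer is expected to own the microphone at a time, which is the main thing to get right when switching between the two engines.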
Unfortunately, what you are trying to achieve is not possible. If I understood your concept correctly, a third-party app would wake the device and act on a set of commands, which is very bad from a security point of view.
The closest you can get is to follow the Voice Actions API: https://developers.google.com/voice-actions/system/
I have been searching for the last 9 or 10 days, but I haven't managed to find any understandable code. I want to start my main activity through a trigger word like "ok google" or "open": my application receives the command and then performs some action. How can this be done? Please provide sample code.
Thanks very much in advance.
So you want to have a voice command that can open your app when it's not open?
The only way you are allowed to do this is to use an accessibility service. You will have to create your own service and implement the voice recognition in it. The user will then have to turn the service on manually on their phone, and based on the voice command you can do whatever you want.
Basically:
Step 1) Develop Accessibility service
Step 2) Merge speech recognition with the service. Here is an example of speech recognition.
Step 3) User MUST turn on accessibility service
You are basically trying to create an app like TalkBack, but only for your own app, which is no easy task. But I'm sure you can figure it out.
Hope this helps. Good luck!
I've implemented a relatively simple test application for a customer who owns a container repair facility. His aim is to give his operators a tool with (possibly) voice interaction. The app basically works well, using Google's Speech API. The annoying problem is the notification sounds: one when you launch the recognition intent, and a subsequent timeout notification if the user doesn't speak within 4 seconds. I'm intercepting all the errors, so I can relaunch the recognizer, but hearing this pair of notifications every 4 seconds is not very comfortable, especially while you're waiting for the next container to check. A partial workaround could be to implement a sound trigger like the "Ok Google" feature found, for example, on my Samsung S6, but I haven't been able to find any information about that. The app is written with Xamarin, but it has already been ported to Android Studio in order to test the Nuance library, so if there is no way to implement an "Ok Google" style trigger under Xamarin, any Java suggestion would also be very welcome. Obviously I don't need "Ok Google" itself but another trigger, like "inizio" or "start check", that is, a user-defined trigger (or set of triggers).
Thanks.
Rodolfo
I'm trying to make a voice recognition app that listens and alerts you right when a specific word is said. All I've seen online is starting and stopping the recognition, then parsing the results. Is there a way to do what I'm looking for?
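One possible approach, assuming the recognizer you use can deliver text hypotheses while it is listening, is to check every hypothesis against a set of trigger phrases and raise the alert on the first match. The sketch below is plain Java with no speech library involved; the class `TriggerWordSpotter` and its method names are hypothetical, and the triggers "inizio" and "start check" are just sample values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical helper: finds user-defined trigger phrases in a transcript. */
public class TriggerWordSpotter {
    private final List<String> triggers;

    public TriggerWordSpotter(String... triggers) {
        this.triggers = Arrays.stream(triggers)
                .map(TriggerWordSpotter::normalize)
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    /** Lower-cases the text and turns punctuation and repeated spaces into single spaces. */
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                .trim();
    }

    /**
     * Returns the first configured trigger (in configuration order) that
     * occurs in the hypothesis as a whole word or phrase, if any.
     */
    public Optional<String> firstTriggerIn(String hypothesis) {
        // Padding with spaces makes "start check" match only on word
        // boundaries, so "restart checking" is not a hit.
        String padded = " " + normalize(hypothesis) + " ";
        for (String trigger : triggers) {
            if (padded.contains(" " + trigger + " ")) {
                return Optional.of(trigger);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TriggerWordSpotter spotter = new TriggerWordSpotter("inizio", "start check");
        System.out.println(spotter.firstTriggerIn("ok, start check now"));
        System.out.println(spotter.firstTriggerIn("restart checking"));
    }
}
```

Plain string matching like this is only as good as the transcripts it receives, so it does not replace a dedicated keyword spotter; it just shows where the "alert on a specific word" decision could live.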
I have implemented the RecognizerIntent and called Google's voice recognition service. It works fine and I get results. However, sometimes if I mumble or am too far away from my device, I get the message "Didn't catch that. Try speaking again." Is there a way to bypass this and not show the message, as I don't want the users to have to press OK to continue?
Thanks
Use SpeechRecognizer if you want to be in control of the UI.
If I understand correctly then you are launching the RecognizerIntent and this gets handled by one of the activities in the Google Search app (or Google Voice Search, or whatever it happens to be called currently). Now, since this is an activity, it takes over the UI, i.e. it pops up a dialog box, shows a prompt and a VU meter, etc. In case of an error condition, it could in principle return control to your app by sending one of the error codes e.g. RESULT_NO_MATCH (as the documentation suggests), but it chooses not to. Instead it pops up a "try again" message. The only way to return to your activity is by pressing BACK, or hoping that the recognition succeeds.
If you want to control more of the user experience, then use SpeechRecognizer. This way you are calling a service and then interacting with it via callbacks. You will be in full control of the UI. Or almost: e.g. the Google app makes a beep when the recognition starts, and there is no way to turn it off and provide your own beep.
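Once the errors arrive in your own callback, the app decides what to do with them instead of a dialog. Below is a minimal sketch of such a decision in plain Java, so that it can be tried without the Android SDK. The integer values mirror the error constants documented on `android.speech.SpeechRecognizer`; the class `RecognitionRetryPolicy`, the `Action` enum, and the retry limit are assumptions made for this example, not part of any API.

```java
/** Hypothetical policy: decide how to react to a recognizer error code. */
public class RecognitionRetryPolicy {
    // Mirrors of the constants documented on android.speech.SpeechRecognizer.
    public static final int ERROR_SPEECH_TIMEOUT = 6;
    public static final int ERROR_NO_MATCH = 7;
    public static final int ERROR_RECOGNIZER_BUSY = 8;

    public enum Action { RESTART_NOW, RESTART_AFTER_DELAY, SHOW_ERROR }

    private final int maxSilentRetries;
    private int consecutiveFailures = 0;

    public RecognitionRetryPolicy(int maxSilentRetries) {
        this.maxSilentRetries = maxSilentRetries;
    }

    /** Intended to be called from the listener's onError(int) callback. */
    public Action onError(int errorCode) {
        consecutiveFailures++;
        if (consecutiveFailures > maxSilentRetries) {
            return Action.SHOW_ERROR; // stop looping silently forever
        }
        switch (errorCode) {
            case ERROR_NO_MATCH:
            case ERROR_SPEECH_TIMEOUT:
                // Mumbled or no speech: listen again without bothering the user.
                return Action.RESTART_NOW;
            case ERROR_RECOGNIZER_BUSY:
                return Action.RESTART_AFTER_DELAY;
            default:
                // Network, audio, permission errors and so on.
                return Action.SHOW_ERROR;
        }
    }

    /** Intended to be called when results arrive; resets the failure counter. */
    public void onResults() {
        consecutiveFailures = 0;
    }

    public static void main(String[] args) {
        RecognitionRetryPolicy policy = new RecognitionRetryPolicy(3);
        System.out.println(policy.onError(ERROR_NO_MATCH));
    }
}
```

In an Android app, the returned action would typically translate into calling `startListening` again right away, posting a delayed restart, or showing your own error UI.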
Note that this answer is specific to (a specific version of) Google's voice recognition service, which is not part of Android proper. It implements part of the RecognizerIntent/SpeechRecognizer API, but its different versions differ in API coverage and exact behavior. So this answer might become wrong in the future.