I'm looking to add voice commands to an Android app that will run on a tablet as a kiosk. I don't want the user to have to push a button, because the user is doing something more important (e.g. driving a car, flying a plane, or performing brain surgery), and each command is something that could otherwise be completed by a single button push.
I see tutorials describing how to add speech to text and have the user push a button and get the text, but nothing allowing the wake word "Okay, Google" to start the voice recognition (much less a custom wake word).
I looked at using the Google Voice Actions to start with "Okay, Google" and then send something to my app (register an intent), but that has to be trained to one specific user (at least for the tablet I tried it on). I'll have different users every day (maybe more than one a day) and no opportunity for training the device.
I've worked with CMUSphinx and found it to be too unreliable for spotting a wake word.
Is there a way to add "Okay, Google" as a way to start listening to text inside my app?
Got it working using PocketSphinx for offline wake word recognition. I then hand the microphone over to IBM Watson's Speech to Text service, which works over the internet and comes back with pretty reliable results.
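A minimal sketch of the two-stage flow described above, written as plain Java so it stays independent of any particular recognizer. The class name `WakeWordGate`, its method names, and the idea of feeding it text hypotheses are assumptions for illustration; they are not part of the PocketSphinx or Watson APIs, and the actual wiring to those libraries is left out.

```java
import java.util.Locale;
import java.util.Optional;

/**
 * Hypothetical helper: an offline wake-word check gates a second,
 * online recognizer. Only the hand-off logic is modelled here.
 */
class WakeWordGate {
    enum State { WAITING_FOR_WAKE_WORD, LISTENING_FOR_COMMAND }

    private final String wakeWord;
    private State state = State.WAITING_FOR_WAKE_WORD;

    WakeWordGate(String wakeWord) {
        this.wakeWord = wakeWord.toLowerCase(Locale.ROOT);
    }

    State state() {
        return state;
    }

    /**
     * Feed each hypothesis from the offline spotter.
     * Returns true when the caller should hand the microphone
     * to the online recognizer.
     */
    boolean onLocalHypothesis(String hypothesis) {
        if (state == State.WAITING_FOR_WAKE_WORD
                && hypothesis != null
                && hypothesis.toLowerCase(Locale.ROOT).contains(wakeWord)) {
            state = State.LISTENING_FOR_COMMAND;
            return true;
        }
        return false;
    }

    /**
     * Feed the transcript returned by the online recognizer.
     * The gate always goes back to waiting for the wake word afterwards.
     */
    Optional<String> onCloudTranscript(String transcript) {
        if (state != State.LISTENING_FOR_COMMAND) {
            return Optional.empty();
        }
        state = State.WAITING_FOR_WAKE_WORD;
        if (transcript == null || transcript.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(transcript.trim());
    }
}
```

In an app, `onLocalHypothesis` would be called from the offline recognizer's result callback and `onCloudTranscript` from the online service's callback; releasing and reacquiring the microphone between the two stages is the part that needs care on a real device.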
Unfortunately, what you are trying to achieve is not possible. If I understand your concept correctly, a third-party app would wake the device and act on a set of commands, which is very bad from a security point of view.
The closest you can get is to follow the Voice Actions API: https://developers.google.com/voice-actions/system/
I'm able to integrate Android widgets with Google Assistant, and I want to add a voice command experience.
For example, take the CREATE_CALL intent. If the user tries to call Alice by saying "call Alice" with some app, and there are two Alices in my app, is it possible for me to respond with a widget showing both Alices, ask the user by voice which one they mean, and let the user choose which one to actually call, all by voice? Can it be done with the SpeechRecognizer API?
Broadly speaking, App Actions do not have a voice conversation experience. There are some tricks you can pull that might head in that direction, but they are largely outside of the App Action Widget experience itself.
Can I respond with a widget showing that there are multiple matches?
Yes, you can send back a Control Widget that might allow them to choose which user they mean.
Can they speak which user?
Probably not in the way you're thinking. To use your example, they can re-invoke the CREATE_CALL BII using any of the phrases, but you can't prompt them with "Who did you mean, exactly?" and have them just say the name.
Can I use the SpeechRecognizer API?
Not as part of a widget.
Widgets get embedded in the conversation with the Assistant.
In theory (and this is on my list to eventually test and figure out), you should be able to deep link to an Android Intent in cases such as this and open a view. While there, you could use SpeechRecognizer or just open the microphone to send audio somewhere. But this isn't done using the Widget itself.
In this scenario, SpeechRecognizer just does the Speech To Text (STT) or Automatic Speech Recognition (ASR) part of the processing. To actually match this up to phrases to determine an Intent, you would need a Natural Language Understanding (NLU) module such as Dialogflow. (But you may not need the SpeechRecognizer in that particular case, since Dialogflow can also take an audio stream to do the ASR part for you.)
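To illustrate the split between the ASR step and the NLU step, here is a small sketch of what happens after a transcript already exists. It is a toy stand-in for an NLU module, not Dialogflow and not part of App Actions: the class `CallIntentMatcher`, its methods, and the trigger verbs ("call", "dial", "phone") are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical, rule-based stand-in for an NLU step: it takes text that a
 * speech recognizer has already produced and extracts a "call" request.
 */
class CallIntentMatcher {
    // Assumed trigger verbs followed by a free-form contact name.
    private static final Pattern CALL = Pattern.compile(
            "^\\s*(?:call|dial|phone)\\s+(.+?)\\s*$", Pattern.CASE_INSENSITIVE);

    /** Returns the spoken contact name, or null if this is not a call request. */
    static String extractCallee(String transcript) {
        if (transcript == null) {
            return null;
        }
        Matcher m = CALL.matcher(transcript);
        return m.matches() ? m.group(1) : null;
    }

    /**
     * Returns every contact whose name contains the spoken name,
     * ignoring case. More than one result means the app has to disambiguate.
     */
    static List<String> matchContacts(String callee, List<String> contacts) {
        List<String> matches = new ArrayList<>();
        if (callee == null) {
            return matches;
        }
        String needle = callee.toLowerCase(Locale.ROOT);
        for (String contact : contacts) {
            if (contact.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(contact);
            }
        }
        return matches;
    }
}
```

With contacts "Alice Smith", "Alice Jones", and "Bob", the transcript "call Alice" yields two matches, which is the case where the app would need to show both entries and ask again. A real NLU service handles far more phrasing variety than a single regular expression can.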
I am trying to develop an app with voice commands that trigger different actions within my app. To fire up the speech listener module, I want to use something like Google's "OK Google" command, which works without any manual touch input. This will help make my app completely hands-free.
Instead of re-creating what Google's service does, I wanted to know if it's possible to receive an event when "OK Google" is triggered.
Note that this only has to work while my app is running, not when it is closed.
Android Speech Recognition Without Dialog might be what you are looking for. Check it out.
I've implemented a relatively simple test application for a customer who owns a container repair facility. His aim is to give his operators a tool with (possibly) voice interaction. The app basically works well, using Google's Speech API. The annoying problem is the notification sound played when you launch the recognition intent, and the subsequent timeout notification if the user doesn't speak within 4 seconds. I'm intercepting all the errors, so I can relaunch the recognizer, but it's not very comfortable hearing this pair of notifications every 4 seconds, especially while you're waiting for the next container to check. A partial workaround could be to implement a sound trigger like the "Ok Google" feature found, for example, on my Samsung S6, but I'm not able to find any information about that. The app is written with Xamarin, but it has already been ported to Android Studio in order to test the Nuance library, so if there is no way to implement an "Ok Google"-style trigger under Xamarin, any Java suggestion would also be very welcome. Obviously I don't need "Ok Google" itself but another trigger, like "inizio" or "start check", that is, a user-defined trigger (or set of triggers).
Thanks.
Rodolfo
I am thinking about making an app that I can use to control my Arduino robot (over bluetooth/wifi) using voice commands. But to make the experience fluid, I will need the Android app speech recognition to be continuously running. If I want the robot to stop, I don't want to press a button, wait for the speech recognition dialog to appear, say my command "STOP", release the button, wait for the parser to parse it, and then send the stop command.
I would rather just have the Speech to Text in continuous listen mode when I am controlling my robot. And when it hears keywords, it sends them.
Can I do this in Android? I did some googling, and I found the recognizer intent, but all of the examples I found use a button trigger and pretty much followed the scenario I described above.
It can be done. Look at the link below. It also has some example code :)
You can make it listen, and when it recognizes speech you get the words back. Check whether any of them is a keyword, and then make the robot do what you want.
http://viralpatel.net/blogs/android-speech-to-text-api/
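The keyword check suggested in this answer could look roughly like the sketch below. It only covers the step after recognition, where a transcript is scanned for known keywords. The class `KeywordCommandMapper` and the command strings are hypothetical and are not taken from the linked tutorial.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical helper: maps spoken keywords to robot commands by
 * scanning a recognized transcript word by word.
 */
class KeywordCommandMapper {
    private final Map<String, String> commands = new HashMap<>();

    /** Registers a single-word keyword; returns this for chaining. */
    KeywordCommandMapper register(String keyword, String command) {
        commands.put(keyword.toLowerCase(Locale.ROOT), command);
        return this;
    }

    /**
     * Returns the command for the first word in the transcript
     * that is a registered keyword, if any.
     */
    Optional<String> match(String transcript) {
        if (transcript == null) {
            return Optional.empty();
        }
        // Split on anything that is not a letter or digit, so
        // punctuation does not stick to the words.
        String[] words = transcript.toLowerCase(Locale.ROOT)
                .split("[^\\p{L}\\p{Nd}]+");
        for (String word : words) {
            String command = commands.get(word);
            if (command != null) {
                return Optional.of(command);
            }
        }
        return Optional.empty();
    }
}
```

Matching whole words rather than substrings avoids triggering "stop" on a word like "nonstop". The returned command string would then be written to the Bluetooth or Wi-Fi connection to the robot.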
I've got an idea for an Android app. I want to be able to say commands and have the application listen for them and perform some action.
For example, I want my app to sit idle and listen for my voice, when it hears me say "start", the app will start doing something until I say "stop".
The idea is to lay the phone down and not have to physically touch it in order to control my app.
Would this be possible with any current APIs? If so which ones should I look into?
You can take a look at the Google voice commands.
http://www.google.com/mobile/voice-actions/
Alternatively, if you want to customise your application, you can use the Google voice service and write an activity that invokes the voice service and returns the result to you.
Check out the link below for the sample application.
http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html
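Once an activity receives recognition results, the "start" and "stop" behaviour from the question could be handled along these lines. This is a sketch under assumptions: the class `StartStopController` is hypothetical, and it expects a list of candidate transcripts ordered best first, such as the list Android returns under `RecognizerIntent.EXTRA_RESULTS`. The Android wiring itself is not shown.

```java
import java.util.List;
import java.util.Locale;

/**
 * Hypothetical controller: toggles a running flag based on the
 * candidate transcripts returned by a speech recognizer.
 */
class StartStopController {
    private boolean running = false;

    boolean isRunning() {
        return running;
    }

    /**
     * Candidates are assumed to be ordered best first. The first
     * candidate that is exactly "start" or "stop" decides the outcome.
     * Returns true if the running state changed.
     */
    boolean onResults(List<String> candidates) {
        for (String candidate : candidates) {
            String text = candidate.trim().toLowerCase(Locale.ROOT);
            if (text.equals("start")) {
                return setRunning(true);
            }
            if (text.equals("stop")) {
                return setRunning(false);
            }
        }
        return false;
    }

    private boolean setRunning(boolean value) {
        boolean changed = running != value;
        running = value;
        return changed;
    }
}
```

Letting the best-ranked command decide, rather than the first one that would change state, keeps a lower-ranked misrecognition from stopping the app unexpectedly.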