I noticed that as soon as a voice recognition activity starts, text-to-speech output stops.
I understand the rationale: TTS output could be "heard" by the voice recognition engine and interfere with its proper operation.
My question: Is this behavior hard-coded into the system, or can it be modified by a setting or parameter (in the API)?
Must the activity use recognition and TTS simultaneously? If the recognition can wait (functionally speaking), have the event launch the RecognizerIntent only from onUtteranceCompleted.
This is pure speculation, but there must be some shared resource that TTS and recognition can only use one at a time (both APIs come from android.speech.*).
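The "wait for onUtteranceCompleted" idea above can be sketched without any Android classes. This is only a minimal model, assuming a hypothetical helper named SpeechGate: the app would call onSpeakStarted() when it issues a TTS request, call onUtteranceCompleted() from the TTS completion callback, and pass whatever launches the recognizer as a Runnable.

```java
/** Hypothetical helper (not an Android API): defers recognition until TTS has finished. */
class SpeechGate {
    private boolean speaking = false;
    private Runnable pendingRecognition = null;

    /** Call when a TTS utterance is handed to the engine. */
    synchronized void onSpeakStarted() {
        speaking = true;
    }

    /** Call from the TTS completion callback; runs any deferred recognition request. */
    void onUtteranceCompleted() {
        Runnable toRun;
        synchronized (this) {
            speaking = false;
            toRun = pendingRecognition;
            pendingRecognition = null;
        }
        if (toRun != null) {
            toRun.run();
        }
    }

    /** Runs the launcher immediately if TTS is idle, otherwise holds it until completion. */
    void requestRecognition(Runnable launcher) {
        synchronized (this) {
            if (speaking) {
                pendingRecognition = launcher; // the latest request replaces any earlier one
                return;
            }
        }
        launcher.run();
    }
}
```

In a real app the completion callback may arrive off the main thread, so the launcher would probably need to post back to the UI thread before starting the recognizer; that part is left out here.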
Related
I have an app that runs an offline voice recognition service listening for one keyword.
When the keyword is spoken, it triggers the Google speech recognition service, which displays its "Speak Now" dialog and returns the text of the spoken sentence.
I would like to know two things:
How can the app process Google speech-to-text when it is not in the foreground or the screen is locked?
How can I avoid the "Speak Now" dialog? (I would like to use a custom UI component.)
Thanks for any advice.
If you want to run speech recognition in the background, I would strongly advise you to stay away from Google speech. You can currently run speech recognition in the background, but it will cause a speech activation sound to be triggered every 3-5 seconds. See the question below:
Android Speech Recognition as a service on Android 4.1 & 4.2
Currently this sound plays on the music channel for some reason, so if you try to mute it, music will be muted as well.
If you want to implement this the "nice" way, I would suggest you take a look at CMUSphinx.
[Possibly a duplicate] But I didn't find answers to my questions below.
Is it possible to run voice recognition as a service?
I would like to implement something like this: I need to call a number through voice recognition even though my phone is in sleep mode.
Is there any sensor to detect the voice apart from the voice recognition?
I'm working with voice recognition, and I think it's impossible to run voice recognition as a service, because:
Performance problem: to run as a service you must call the voice recognizer continuously.
No API support: to run as a service you must use a Service and call the voice recognizer continuously.
So, look for a solution other than voice recognition.
Introduction
Android provides two ways for me to use speech recognition.
The first way is by an Intent, as in this question: Intent example. A new Activity is pushed onto the top of the stack, which listens to the user, hears some speech, attempts to transcribe it (normally via the cloud), then returns the result to my app via an onActivityResult call.
The second is by getting a SpeechRecognizer, like the code here: SpeechRecognizer example. Here, it looks like the speech is recorded and transcribed on some other thread, then callbacks bring me the results. And this is done without leaving my Activity.
I would like to understand the pros and cons of these two ways of doing speech recognition.
What I've got so far
Using the Intent:
is simple to code
avoids reinventing the wheel
gives consistent user experience of speech recognition across the device
but
might be slow because it creates a new activity with its own window
Using the SpeechRecognizer:
lets me retain control of UI in my app
gives me extra possibilities of things to respond to (documentation)
but
can only be called from the main thread
more control requires more error-checking.
In addition to all this, I'd add at least this point:
SpeechRecognizer is better for hands-free user interfaces, since your app actually gets to respond to error conditions like "No matches" and perhaps restart itself. When you use the Intent, the app beeps and shows a dialog that the user must press to continue.
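To make the "respond to error conditions and perhaps restart itself" point concrete, here is a small sketch of a restart policy in plain Java. Everything in it is hypothetical: the RecognitionError enum only stands in for the recognizer's error codes, and RestartPolicy and its retry limit are illustrative choices, not part of any Android API.

```java
/** Illustrative stand-ins for recognizer error conditions (not the real constants). */
enum RecognitionError { NO_MATCH, SPEECH_TIMEOUT, RECOGNIZER_BUSY, NETWORK, INSUFFICIENT_PERMISSIONS }

/** Hypothetical policy: decides whether a hands-free app should start listening again. */
class RestartPolicy {
    private final int maxConsecutiveRetries;
    private int consecutiveFailures = 0;

    RestartPolicy(int maxConsecutiveRetries) {
        this.maxConsecutiveRetries = maxConsecutiveRetries;
    }

    /** Call when a recognition result arrives: a success resets the failure count. */
    void onResults() {
        consecutiveFailures = 0;
    }

    /** Call from the error callback; returns true if listening should be restarted. */
    boolean onError(RecognitionError error) {
        switch (error) {
            case NO_MATCH:
            case SPEECH_TIMEOUT:
                // The user said nothing usable; retry silently up to the limit.
                consecutiveFailures++;
                return consecutiveFailures <= maxConsecutiveRetries;
            default:
                // Other errors are treated as non-recoverable in this sketch.
                return false;
        }
    }
}
```

With the Intent approach there is no hook for a policy like this, because the recognizer's own dialog handles the error and waits for the user.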
My summary is as follows:
SpeechRecognizer
Show different UI or no UI at all. Do you really want your app's UI to beep? Do you really want your UI to show a dialog when there is an error and wait for user to click?
App can do something else while speech recognition is happening
Can recognize speech while running in the background or from a service
Can handle errors better
Can access low level speech stuff like the raw audio or the RMS. Analyze that audio or use the loudness to make some kind of flashing light to indicate the app is listening
Intent
Consistent, and easy to use UI for users
Easy to program
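The "use the loudness to make some kind of flashing light" item in the SpeechRecognizer list could look roughly like the sketch below. It assumes a hypothetical ListeningIndicator helper that is fed each loudness reading the recognizer reports; the dB range is passed in as a parameter because the range actually reported is an assumption that would need to be measured on a device.

```java
/** Hypothetical helper: turns recognizer loudness readings into a 0..1 indicator level. */
class ListeningIndicator {
    private final float minDb;
    private final float maxDb;
    private final float smoothing; // 0 = no smoothing; closer to 1 = slower response
    private float level = 0f;

    ListeningIndicator(float minDb, float maxDb, float smoothing) {
        if (maxDb <= minDb) {
            throw new IllegalArgumentException("maxDb must exceed minDb");
        }
        if (smoothing < 0f || smoothing >= 1f) {
            throw new IllegalArgumentException("smoothing must be in [0, 1)");
        }
        this.minDb = minDb;
        this.maxDb = maxDb;
        this.smoothing = smoothing;
    }

    /** Feed one loudness reading; returns the new indicator level in [0, 1]. */
    float update(float rmsDb) {
        float clamped = Math.max(minDb, Math.min(maxDb, rmsDb));
        float target = (clamped - minDb) / (maxDb - minDb);
        // Exponential smoothing keeps the indicator from flickering.
        level = smoothing * level + (1f - smoothing) * target;
        return level;
    }
}
```

The returned level could drive the alpha or size of whatever custom view signals that the app is listening.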
The main difference is UI. SpeechRecognizer doesn't have any so you are responsible for creating one.
I once wrote a prototype in which a receiver listened for the headset button and then activated speech recognition to listen for some commands. The screen was not activated, so I had to use SpeechRecognizer (my UI was some prerecorded sounds and text-to-speech).
The second difference is that SpeechRecognizer is capable of constant listening. The Intent version will always end execution after some period. For example, SpeechRecognizer is used by the speech recognition "keyboard" so you can dictate an SMS.
In such a case you will receive partial results only (in normal mode SpeechRecognizer gives only final results).
One thing that the other answers have not mentioned: if multiple speech recognizers are installed on the device, then how the user switches between them depends on whether the Intent or the SpeechRecognizer is used.
With the Intent, the standard Activity selection dialog pops up. The user can choose the recognizer to be used, and optionally set it globally as the default recognizer to avoid the dialog in the future.
With SpeechRecognizer, the user can set and configure the default recognizer in the global settings (Language and input -> Voice recognizer on ICS).
So, depending on which interface is used the documentation about setting the default recognizer and switching between recognizers should be different. (In most cases though there is just one recognizer, Google Voice Search, so this might not be a big issue in practice.)
I want to build an Android application that will recognize my voice, convert it into text, and show what I just spoke in a toast. I am able to do this by using a button that launches the voice recognizer for me. But now I want to make it work on the basis of my voice only.
The application should trigger the voice recognizer and start listening to me only when I start speaking, and should stop listening when it senses silence, just like the Talking Tom application. That app records the voice, but I want to recognize it using the voice recognizer. Something like this:
if (no silence)
    Launch Recognizer
else if (silence)
    Stop Recognizer
    Show toast
The main problem is how to sense whether the user is speaking before launching the voice recognizer. Is there any way to sense noise intensity?
Secondly, is there any way to launch the voice recognizer in the background?
Is it possible to detect an audio signal (someone starting to speak) in a background service, which then immediately launches the voice recognizer to recognize the speech?
Most speech recognizers already have an endpointer to detect the start-of-speech and end-of-speech. Endpointers usually try to read the ambient noise level to determine a baseline for silence and to adapt the signal-to-noise ratio. But if the input noise level changes, it might trigger the endpointer's start-of-speech. If listening all the time with a sensitive microphone, the endpointer might also pick up someone speaking next to you instead of you.
As such, using a speech button is a good practice to announce when you wish to talk. Trying to get the recognizer to listen all of the time is probably not what you want to do, or should be left up to researchers.
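For readers who have not seen one, the endpointer behaviour described above can be sketched as a simple energy-based detector. This is only a toy model under stated assumptions: the class name SimpleEndpointer, the ratio against the noise floor, the hangover frame count, and the adaptation rate are all illustrative, and real recognizers use considerably more robust methods.

```java
/** Hypothetical energy-based endpointer; thresholds and frame counts are illustrative. */
class SimpleEndpointer {
    enum Event { NONE, START_OF_SPEECH, END_OF_SPEECH }

    private final double speechRatio;  // how far above the noise floor counts as speech
    private final int hangoverFrames;  // quiet frames required before declaring end-of-speech
    private final double adaptRate;    // how quickly the noise floor follows quiet frames
    private double noiseFloor;
    private boolean inSpeech = false;
    private int quietFrames = 0;

    SimpleEndpointer(double initialNoiseFloor, double speechRatio,
                     int hangoverFrames, double adaptRate) {
        this.noiseFloor = initialNoiseFloor;
        this.speechRatio = speechRatio;
        this.hangoverFrames = hangoverFrames;
        this.adaptRate = adaptRate;
    }

    /** Feed the energy (for example, RMS amplitude) of one audio frame. */
    Event process(double frameEnergy) {
        boolean loud = frameEnergy > noiseFloor * speechRatio;
        if (!inSpeech) {
            if (loud) {
                inSpeech = true;
                quietFrames = 0;
                return Event.START_OF_SPEECH;
            }
            // Adapt the silence baseline only while no speech is detected.
            noiseFloor += adaptRate * (frameEnergy - noiseFloor);
            return Event.NONE;
        }
        if (loud) {
            quietFrames = 0;
            return Event.NONE;
        }
        quietFrames++;
        if (quietFrames >= hangoverFrames) {
            inSpeech = false;
            quietFrames = 0;
            return Event.END_OF_SPEECH;
        }
        return Event.NONE;
    }

    double noiseFloor() {
        return noiseFloor;
    }
}
```

The sketch also shows the weakness mentioned above: a sudden jump in ambient noise exceeds the baseline ratio and is reported as start-of-speech, because the detector cannot tell noise from a voice.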
OK, I have figured it out. I used the MediaRecorder class for this. When the application launches, I start recording audio using MediaRecorder (or you can provide a button to start and stop the whole process). I check the amplitude of the audio being recorded by the MediaRecorder. If the amplitude passes a predefined threshold, I pause the recording and launch the voice recognition activity. In onActivityResult I resume the recorder again.
if (mRecorder != null) {
    // getMaxAmplitude() returns the peak amplitude sampled since the previous call
    int amplitude = mRecorder.getMaxAmplitude();
    Log.d("AMPL", String.valueOf(amplitude));
    if (amplitude > 20000) { // threshold for "someone is speaking"
        onRecord(false); // stop recording before launching the recognizer
        // Launch the recognizer activity
        Intent intent = new Intent(this, VoiceRecognizer.class);
        startActivityForResult(intent, 12112);
    }
}
Alternatively: You can also use RecognitionListener interface as referred in this SO post.
I have an Android application that uses speech recognition in an Activity. The GUI doesn't do anything except contain the speech recognition objects. I would like to port this over to a service so I can talk to the application while it's running in the background.
However, as far as I know, the speech recognition service has to use onActivityResult, which is unavailable for Services. Is there a way to either contain an Activity in a Service such that its GUI is not displayed, or perform speech recognition in a service instead of an activity?
See Google's voice search speech recognition service; it might have some useful links to information. I don't think you can do non-GUI voice recognition, because the recognizer is only exposed as the recognizer intent.
I don't think that Google wants people to call this service directly, and it likely violates some terms of service somewhere if you do, but check out http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/ to see the service behind Chrome speech recognition which I suspect is similar to Android.
What if you have your service wake up an activity when it detects any incoming audio signal, one that acts like a widget taking up only a small part of the screen or even just a single pixel, and then call voice recognition from that invisible activity?
Just an idea; I don't remember if a widget can be an activity or if you can make an activity that doesn't take up the screen.