RecognitionListener: OnPartialResults vs OnResults - android

In terms of performance and usability, what is the best approach? What are the main differences between these two methods?
I currently have an implementation in "onResults" that is constantly listening and compares the result against a couple of strings, taking a distinct action for each word detected. However, it sometimes fails to recognize the words and sometimes doesn't seem to listen at all. If I moved the logic to "onPartialResults", would that improve usability?

onResults is called when SpeechRecognizer finishes listening.
onPartialResults is called when SpeechRecognizer detects a new word you have spoken, even before listening ends.
Both should return the same result for single spoken words, but if your speech is longer, onResults may adjust the output to make it a little more grammatically correct (though only a little).
Which one you use depends on your purpose, but onResults gives the more accurate results.
If you want to map spoken words to actions, write your own matcher that picks the best match rather than requiring exact equality, because recognition is not always exact.
More about onResults and onPartialResults at developer.android.com
Important: to get partial results, you have to add extra to recognizer intent:
intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
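For illustration, here is a rough sketch of how the two callbacks could be wired together. The command words, the contains()-based matching, and the context variable are my own placeholders (not from your code), and the RECORD_AUDIO permission must be granted:
SpeechRecognizer recognizer = SpeechRecognizer.createSpeechRecognizer(context);

Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true); // required for onPartialResults

recognizer.setRecognitionListener(new RecognitionListener() {
    @Override
    public void onPartialResults(Bundle partialResults) {
        // Fires while the user is still speaking; the hypothesis may be revised later.
        match(partialResults.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION));
    }

    @Override
    public void onResults(Bundle results) {
        // Fires once when listening ends; usually the more accurate hypothesis.
        match(results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION));
    }

    // Placeholder matcher: look for the command word instead of requiring exact equality.
    private void match(ArrayList<String> hypotheses) {
        if (hypotheses == null || hypotheses.isEmpty()) return;
        String text = hypotheses.get(0).toLowerCase(Locale.US);
        if (text.contains("play")) { /* take the "play" action */ }
        else if (text.contains("stop")) { /* take the "stop" action */ }
    }

    // No-op implementations for the remaining RecognitionListener callbacks.
    @Override public void onReadyForSpeech(Bundle params) {}
    @Override public void onBeginningOfSpeech() {}
    @Override public void onRmsChanged(float rmsdB) {}
    @Override public void onBufferReceived(byte[] buffer) {}
    @Override public void onEndOfSpeech() {}
    @Override public void onError(int error) {}
    @Override public void onEvent(int eventType, Bundle params) {}
});

recognizer.startListening(intent);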

Related

Android SpeechRecognizer stopListening() has no effect?

I have a problem when using SpeechRecognizer.stopListening() on Android, after having invoked startListening(). It simply seems to have no effect. The audio continues to be processed, and the recognition results are returned, just as if stopListening() had not been invoked.
Has anyone else had similar problems? May I be doing something wrong?
A possible clue: Immediately after invoking stopListening(), onError() is called with SpeechRecognizer.ERROR_CLIENT. Perhaps this means that the stop invocation failed?
The problem appears both when stopListening() is invoked before the start of speech is detected and when it is invoked while speech is being processed.
startListening() and stopListening() are both invoked from the main thread.
Tested on both Android 5 and 6 on at least two different devices.
The stopListening() behaviour you describe is expected. This call does not prevent the recognizer from processing the speech input captured up to that point.
It is also expected that it will call onError() with SpeechRecognizer.ERROR_CLIENT. If you don't wish to handle this as an 'error', then at the point in your code where you call stopListening() add a simple boolean value:
deliberatelyCalledStop = true;
And then inside the error handling, you'll know if this has been thrown as an expected outcome and you can disregard it.
In order to shut down the recognition service and ignore all speech input detected thus far, you need to call:
recognizer.cancel();
recognizer.destroy(); // if you want to destroy it
These methods will achieve what you want.
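If it helps, here is a small sketch pulling together the flag for the expected ERROR_CLIENT and the stop/cancel distinction; the class and method names are only illustrative:
import android.speech.SpeechRecognizer;

public class RecognizerController {

    private final SpeechRecognizer recognizer;
    private boolean deliberatelyCalledStop = false;

    public RecognizerController(SpeechRecognizer recognizer) {
        this.recognizer = recognizer;
    }

    // Stop listening but keep processing the audio captured so far; onResults() still fires.
    public void finishListening() {
        deliberatelyCalledStop = true;
        recognizer.stopListening();
    }

    // Shut down and ignore all speech input detected so far.
    public void abortListening() {
        recognizer.cancel();
        recognizer.destroy(); // only if you are completely done with the recognizer
    }

    // Call this from your RecognitionListener.onError() implementation.
    public boolean isExpectedStopError(int error) {
        if (error == SpeechRecognizer.ERROR_CLIENT && deliberatelyCalledStop) {
            deliberatelyCalledStop = false;
            return true; // expected side effect of stopListening(), not a real failure
        }
        return false;
    }
}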
Please be aware, at the time of writing, there are some pretty major bugs with Google's Recognition Service.
I recommend you read my answer to this question and then check out the gist of the bugs that I've already reported.

Is it possible to have 2 different SpeechRecognizers working at the same time?

When a SpeechRecognizer is processing an input sound, it takes a moment to be responsive again. I am intending to start a second SpeechRecognizer right when the first one calls onEndOfSpeech() to be responsive all the time but I am getting ERROR_RECOGNIZER_BUSY. I might be doing something wrong or it's just that only one SpeechRecognizer can be working at a time.
Does anyone know if this is possible?
By the way, the speech recognizers I created are services, so there is no GUI.

Android Receive Broadcast on Audio Focus change

I am trying to write an app that detects whenever any app on the device starts or stops playing music. My app is not a media player, it's more of a dashboard functionality, so I have no need of requesting audio focus, but I want to know when other apps do so I can update my status text accordingly.
I believe that AudioManager.isMusicActive() provides exactly the information I want; however, since I am writing an always-on service, I would like to avoid polling it constantly. I need the information in near real time, so it would essentially be a 1-second poll in perpetuity.
I'm looking for a way to detect when STREAM_MUSIC is being used by another app.
Some ways I have thought about doing it:
(1) Again, a perpetual poll using either Timer or Handler to constantly poll the isMusicActive() function. Not ideal from a battery or processor management perspective. I could use a flag variable in the UI Activity to control when the poll runs (it isn't really necessary when the UI isn't in the foreground, anyways), but I still think I'm using more processor/battery time than I'd need to.
(2) Listen for a broadcast of some kind, but none of the Android audio broadcasts seem to really fit the bill. (that I could find)
(3) I could, I suppose, request audio focus and just never play any audio and never give it up. Theoretically, since I am starting this in an always on service I believe that should allow my app to sit at the bottom of the LIFO audio focus stack and I would be notified via the AudioManager.OnAudioFocusChangeListener mechanism in basically the opposite way of its intended purpose (i.e. turn on my app when I lose audio focus and turn it off when I gain audio focus back). However, I'm not entirely sure how doing something like this would function in real-life usage. I feel like abusing the audio focus methodology for something like this could very easily result in negative user experiences in situations I haven't even thought of.
(4) Is there a way to use the AudioManager.OnAudioFocusChangeListener (or similar) without needing to request audio focus at all?
Are there any other ways I could go about doing this? Just a pointer in the right direction would be incredibly helpful!
I needed a similar thing, so I did a bit of research.
Unfortunately, there seems to be no way to accomplish this other than requesting audio focus in your app.
Material for study:
Check the Android source code in frameworks/base/media/java/android/media/, where you can find the actual implementation of requestAudioFocus in the MediaFocusControl class.
Other classes in that folder show how broadcast events are sent for volume changes, Bluetooth on/off, etc., but there is no broadcast event at all related to audio being enabled/disabled.
More information on the topic:
http://developer.android.com/training/managing-audio/audio-focus.html
http://developer.android.com/training/managing-audio/audio-output.html
Other than that, there doesn't seem to be any documented way to know when the audio hardware starts or stops playing.
Requesting audio focus without actually playing audio should not consume any battery. The one ugly side effect is that the request stops whatever audio is currently playing.
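For what it's worth, here is a sketch of what option (3) from the question might look like inside an always-on service. The service name is made up, and the caveat above still applies: the initial request interrupts whatever is currently playing.
import android.app.Service;
import android.content.Context;
import android.content.Intent;
import android.media.AudioManager;
import android.os.IBinder;

public class AudioWatcherService extends Service {

    private final AudioManager.OnAudioFocusChangeListener focusListener =
            new AudioManager.OnAudioFocusChangeListener() {
                @Override
                public void onAudioFocusChange(int focusChange) {
                    if (focusChange == AudioManager.AUDIOFOCUS_LOSS
                            || focusChange == AudioManager.AUDIOFOCUS_LOSS_TRANSIENT
                            || focusChange == AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK) {
                        // Another app requested focus: audio playback has most likely started.
                    } else if (focusChange == AudioManager.AUDIOFOCUS_GAIN) {
                        // Focus came back: the other app gave it up, playback most likely stopped.
                    }
                }
            };

    @Override
    public void onCreate() {
        super.onCreate();
        AudioManager audioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);
        // Request focus once and never play anything; we only care about the callbacks.
        audioManager.requestAudioFocus(
                focusListener, AudioManager.STREAM_MUSIC, AudioManager.AUDIOFOCUS_GAIN);
    }

    @Override
    public IBinder onBind(Intent intent) {
        return null;
    }
}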

Why did RecognitionListener stop working in JellyBean?

For everyone using Android's voice recognition API, there used to be a handy RecognitionListener you could register that would push various events to your callbacks. In particular, there was the following onBufferReceived(byte[]) method:
public abstract void onBufferReceived (byte[] buffer)
Since: API Level 8. More sound has been received. The purpose of this function is to allow giving feedback to the user regarding the captured audio. There is no guarantee that this method will be called.
Parameters: buffer - a buffer containing a sequence of big-endian 16-bit integers representing a single channel audio stream. The sample rate is implementation dependent.
Although the method explicitly states that there is no guarantee it will be called, in ICS and prior it would effectively be called 100% of the time: regularly enough, at least, that by concatenating all the bytes received this way, you could reconstruct the entire audio stream and play it back.
For some reason, however, in the Jelly Bean SDK this magically stopped working. There's no notice of deprecation and the code still compiles, but onBufferReceived is now never called. Technically this isn't breaking their API (since it says there's "no guarantee" the method will be called), but clearly it is a breaking change for a lot of things that depended on this behaviour.
Does anybody know why this functionality was disabled, and if there's a way to replicate its behaviour on Jellybean?
Clarification: I realize that the whole RecognizerIntent thing is an interface with multiple implementations (including some available on the Play Store), and that they can each choose what to do with RecognitionListener. I am specifically referring to the default Google implementation that the vast majority of Jellybean phones use.
Google does not call this method in their Jelly Bean speech app (QuickSearchBox). It's simply not in the code. Unless there is an official comment from a Google engineer, I cannot give a definitive answer as to why they did this. I did search the developer forums but did not see any commentary about this decision.
The ICS default for speech recognition comes from Google's VoiceSearch.apk. You can decompile this apk and find that there is an Activity handling an intent with the action *android.speech.action.RECOGNIZE_SPEECH*. In this apk I searched for "onBufferReceived" and found a reference to it in com.google.android.voicesearch.GoogleRecognitionService$RecognitionCallback.
With Jelly Bean, Google renamed VoiceSearch.apk to QuickSearch.apk and added a lot of new features to the app (e.g. offline dictation). You would expect to still find an onBufferReceived call, but for some reason it is completely gone.
I too was using the onBufferReceived method and was disappointed that the (non-guaranteed) call to the method was dropped in Jelly Bean. Well, if we can't grab the audio with onBufferReceived(), maybe there is a possibility of running an AudioRecord simultaneously with voice recognition; a sketch of the idea is below. Has anyone tried this? If not, I'll give it a whirl and report back.
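For reference, roughly what I have in mind (a sketch only; whether AudioRecord can actually share the microphone with SpeechRecognizer is exactly the part that is untested, and on many devices a second consumer fails to initialize or records silence):
int sampleRate = 16000; // a common speech rate; the recognizer's own rate is implementation dependent
int bufferSize = AudioRecord.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

AudioRecord audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC,
        sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);

if (audioRecord.getState() == AudioRecord.STATE_INITIALIZED) {
    audioRecord.startRecording();
    short[] buffer = new short[bufferSize / 2];
    // Read in a loop on a worker thread; each read yields roughly what onBufferReceived used to provide.
    int read = audioRecord.read(buffer, 0, buffer.length);
    audioRecord.stop();
    audioRecord.release();
}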
I ran into the same problem. The reason I didn't just accept that "this does not work" is that Google Now's "note-to-self" records the audio and sends it to you. What I found in logcat while running the "note-to-self" operation was:
02-20 14:04:59.664: I/AudioService(525): AudioFocus requestAudioFocus() from android.media.AudioManager#42439ca8com.google.android.voicesearch.audio.ByteArrayPlayer$1#424cca50
02-20 14:04:59.754: I/AbstractCardController.SelfNoteController(8675): #attach
02-20 14:05:01.006: I/AudioService(525): AudioFocus abandonAudioFocus() from android.media.AudioManager#42439ca8com.google.android.voicesearch.audio.ByteArrayPlayer$1#424cca50
02-20 14:05:05.791: I/ActivityManager(525): START u0 {act=com.google.android.gm.action.AUTO_SEND typ=text/plain cmp=com.google.android.gm/.AutoSendActivity (has extras)} from pid 8675
02-20 14:05:05.821: I/AbstractCardView.SelfNoteCard(8675): #onViewDetachedFromWindow
This makes me believe that Google releases the audio focus from Google Now (the recognizerIntent) and uses an audio recorder or something similar once the note-to-self tag appears in onPartialResults. I cannot confirm this; has anyone else tried to make this work?
I have a service that implements RecognitionListener, and I also override the onBufferReceived(byte[]) method. I was investigating why speech recognition is much slower to call onResults() on <= ICS. The only difference I could find was that onBufferReceived is called on phones running ICS or earlier. On Jelly Bean, onBufferReceived() is never called and onResults() is called significantly faster, and I think it's because of the overhead of calling onBufferReceived every second or millisecond. Maybe that's why they did away with onBufferReceived()?

Audio Signal when Voice Search Dialog is Ready to Accept Input?

The Google Voice Search comes with a significant delay from the moment you call it via startActivityForResult() until its dialog box is displayed, ready to take your speech.
This requires the user to always look at the screen, waiting for the dialog box to be displayed, before speaking.
It would be nice to add a 'ding' sound or some other non-visual cue to when Voice Search is ready to accept speech input.
Is this possible at all?
If so, how do I go about doing that?
OK, this will complicate your program; however, if you really want that signal, consider implementing speech recognition with a SpeechRecognizer object instead of calling the intent or making your own activity.
(warning: much of this is speculation, including the order of calls)
Perhaps the delay is in instantiating resources before actually listening. If my theory is correct, then you could call setRecognitionListener(RecognitionListener listener) (the latency passes), create a recognizerIntent object (maybe some more latency passes), and finally, in (an overridden) startListening(Intent recognizerIntent), play a "PING!" before calling the super method.
It is up to you whether you would like to wrap all this functionality in a new activity, which is probably recommended, or to tack the latency onto the UI.
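A sketch of the non-visual cue itself, assuming a context variable and using ToneGenerator purely as an example sound source; tying the tone to onReadyForSpeech() plays it at the moment the engine actually starts accepting input:
final ToneGenerator tone = new ToneGenerator(AudioManager.STREAM_NOTIFICATION, 80);

SpeechRecognizer recognizer = SpeechRecognizer.createSpeechRecognizer(context);
recognizer.setRecognitionListener(new RecognitionListener() {
    @Override
    public void onReadyForSpeech(Bundle params) {
        tone.startTone(ToneGenerator.TONE_PROP_BEEP, 150); // audible cue: start speaking now
    }

    // The remaining callbacks are no-ops here; onResults() would receive the recognized text.
    @Override public void onBeginningOfSpeech() {}
    @Override public void onRmsChanged(float rmsdB) {}
    @Override public void onBufferReceived(byte[] buffer) {}
    @Override public void onEndOfSpeech() {}
    @Override public void onError(int error) {}
    @Override public void onResults(Bundle results) {}
    @Override public void onPartialResults(Bundle partialResults) {}
    @Override public void onEvent(int eventType, Bundle params) {}
});

Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
recognizer.startListening(intent);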
