I am trying to develop an application for Android Wear that, on a button click, asks the user to speak something and sends it to a web server. I also need to offer a list of pre-defined templates, similar to how Hangouts works.
What I have tried:
Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Send to server");
// SPEECH_REQUEST_CODE is an arbitrary int request code defined in my activity
startActivityForResult(intent, SPEECH_REQUEST_CODE);
This works, but I cannot supply the user a set of pre-defined templates.
Reading this - https://developer.android.com/training/wearables/notifications/voice-input.html - I see that it is possible to do this in a notification, but a notification is not in the foreground. I need this UI to be modal/blocking, so a notification is not good for my use case.
What are my options? How can I implement this?
Unfortunately, other than Receiving Voice Input in a Notification, there is no way to use voice recognition with pre-defined text responses.
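That notification route does at least support canned responses: RemoteInput.setChoices() attaches pre-defined templates alongside free-form voice input. A minimal sketch using the support-v4 APIs follows (the choice strings, drawables and replyPendingIntent are placeholders, not code from your project):

// Pre-defined templates offered alongside free-form voice input
String[] choices = {"Yes", "No", "Maybe"}; // placeholder templates
RemoteInput remoteInput = new RemoteInput.Builder("extra_voice_reply")
        .setLabel("Send to server")
        .setChoices(choices)
        .build();

NotificationCompat.Action action = new NotificationCompat.Action.Builder(
        R.drawable.ic_reply, "Reply", replyPendingIntent) // placeholder icon and PendingIntent
        .addRemoteInput(remoteInput)
        .build();

Notification notification = new NotificationCompat.Builder(this)
        .setSmallIcon(R.drawable.ic_mic) // placeholder icon
        .setContentTitle("Voice input")
        .extend(new NotificationCompat.WearableExtender().addAction(action))
        .build();
NotificationManagerCompat.from(this).notify(1, notification);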
Based on the documentation: Adding Voice Capabilities
Voice actions are an important part of the wearable experience. They let users carry out actions hands-free and quickly. Wear provides two types of voice actions:
System-provided
These voice actions are task-based and are built into the Wear platform. You filter for them in the activity that you want to start when the voice action is spoken. Examples include "Take a note" or "Set an alarm".
App-provided
These voice actions are app-based, and you declare them just like a launcher icon. Users say "Start [YourAppName]" to use these voice actions, and an activity that you specify starts.
Also, as discussed in questions 24543484 and 22630600, both askers implemented a notification in their Android apps to get the voice input.
Hope this helps.
I am doing speech recognition using a third-party cloud service on Android, and it works well with the Android SpeechRecognizer API. Code below:
Intent recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);
// accept partial results if they come
recognizerIntent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
// need to have a calling package for it to work
if (!recognizerIntent.hasExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE)) {
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
            "com.example.speechrecognition");
}
recognizer = SpeechRecognizer.createSpeechRecognizer(context);
recognizer.setRecognitionListener(this);
recognizer.startListening(recognizerIntent);
At the same time, I want to record the audio with different audio settings, such as frequency, channels, audio format, etc., and then continuously analyze this audio buffer. I use AudioRecord for that purpose (set up roughly as shown below), and it works well, but only if I turn off speech recognition.
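For reference, my recorder setup looks roughly like this (the 16 kHz mono PCM values are only an example, since I vary frequency, channels and format; the RECORD_AUDIO permission is assumed):

int sampleRate = 16000;
int bufferSize = AudioRecord.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC,
        sampleRate, AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT, bufferSize);
record.startRecording(); // this is the call that fails while SpeechRecognizer is active
short[] buffer = new short[bufferSize / 2];
int read = record.read(buffer, 0, buffer.length); // buffer is then analyzed continuously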
If I record audio and run speech recognition at the same time, this error occurs:
E/AudioRecord: start() status -38
How can I implement this kind of feature? I also tried native audio (OpenSL ES's SLRecordItf), but that doesn't work either.
As the comments state, only one client may access the microphone at a time.
For the SpeechRecognizer, the attached RecognitionListener has an onBufferReceived(byte[] buffer) callback, but unfortunately Google's native recognition service does not supply any audio data to it, which is very frustrating.
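To illustrate, the callback is declared on the interface, but with Google's service it simply never fires (a sketch; recognizer is the SpeechRecognizer from the question):

recognizer.setRecognitionListener(new RecognitionListener() {
    @Override public void onBufferReceived(byte[] buffer) {
        // Declared to deliver raw audio, but Google's recognition service
        // never calls it in practice, so it cannot be relied upon
    }
    @Override public void onReadyForSpeech(Bundle params) {}
    @Override public void onBeginningOfSpeech() {}
    @Override public void onRmsChanged(float rmsdB) {}
    @Override public void onEndOfSpeech() {}
    @Override public void onError(int error) {}
    @Override public void onResults(Bundle results) {}
    @Override public void onPartialResults(Bundle partialResults) {}
    @Override public void onEvent(int eventType, Bundle params) {}
});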
Your only alternative is to use an external service, which won't be free. Google's new Cloud Speech API has an Android example.
I need to give users a method to launch their phone's voice assistant from my app, be it Google Now or anything else.
When searching on how to do this I keep finding explanations on how to get voice input while I just want to launch Google Now in "listening" mode. This question clearly asks for the same thing but the accepted answer explains how to open voice input:
How to programmatically initiate a Google Now voice search?
I know this can't be a rare use case; how can it be done?
startActivity(new Intent(Intent.ACTION_VOICE_COMMAND).setFlags(Intent.FLAG_ACTIVITY_NEW_TASK));
This worked for me and seems more intuitive.
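Note that not every device ships an activity handling ACTION_VOICE_COMMAND, so a guarded call (a sketch) avoids a crash on those devices:

try {
    startActivity(new Intent(Intent.ACTION_VOICE_COMMAND)
            .setFlags(Intent.FLAG_ACTIVITY_NEW_TASK));
} catch (ActivityNotFoundException e) {
    // No voice assistant handles ACTION_VOICE_COMMAND on this device
}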
It is not very clear what exactly you are trying to achieve, but I hope the following will be helpful.
The code given at How to programmatically initiate a Google Now voice search? will launch the default speech recognizer (or "voice assistant" as you have put it).
The following, however, will explicitly open Google's speech recognizer (if available):
Intent intent = new Intent(Intent.ACTION_MAIN);
intent.setClassName("com.google.android.googlequicksearchbox",
        "com.google.android.googlequicksearchbox.VoiceSearchActivity");
try {
    startActivity(intent);
} catch (ActivityNotFoundException anfe) {
    Log.d(TAG, "Google Voice Search is not found");
}
It looks as though Google has made offline speech recognition available from Google Now for third-party apps. It is being used by the app named Utter.
Has anyone seen an implementation of simple voice commands with this offline speech recognition? Do you just use the regular SpeechRecognizer API, and does it work automatically?
Google did quietly enable offline recognition in that Search update, but there is (as yet) no API or additional parameter available within the SpeechRecognizer class. {See the edit at the bottom of this post.} The functionality is available with no additional coding; however, the user's device needs to be configured correctly for it to begin working, and this is where the problem lies. I would imagine it is why a lot of developers assume they are 'missing something'.
Also, Google have restricted certain Jelly Bean devices from using offline recognition due to hardware constraints. Which devices this applies to is not documented; in fact, nothing is documented, so configuring the capabilities for the user has proved to be a matter of trial and error (for them). It works for some straight away. For those for whom it doesn't, this is the 'guide' I supply them with.
1. Make sure the default Android Voice Recogniser is set to Google, not Samsung/Vlingo.
2. Uninstall any offline recognition files you already have installed from the Google Voice Search settings.
3. Go to your Android application settings and see if you can uninstall the updates for the Google Search and Google Voice Search applications.
4. If you can't do the above, go to the Play Store and see if you have the option there.
5. Reboot (if you achieved 2, 3 or 4).
6. Update Google Search and Google Voice Search from the Play Store (if you achieved 3 or 4, or if an update is available anyway).
7. Reboot (if you achieved 6).
8. Install English UK offline language files.
9. Reboot.
10. Use utter! with a connection.
11. Switch to aeroplane mode and give it a try.
12. Once it is working, offline recognition of other languages, such as English US, should start working too.
EDIT: Temporarily changing the device locale to English UK also seems to kick-start this for some.
Some users reported they still had to reboot a number of times before it began working, but they all got there eventually, often with no clear trigger. The key lies inside the Google Search APK, so it is not in the public domain or part of AOSP.
From what I can establish, Google tests the availability of a connection prior to deciding whether to use offline or online recognition. If a connection is available initially but is lost prior to the response, Google will supply a connection error; it won't fall back to offline. As a side note: if a request for the network-synthesised voice has been made, there is no error supplied if it fails. You get silence.
The Google Search update enabled no additional features in Google Now itself; in fact, if you try to use it with no internet connection, it will error. I mention this as I wondered whether the ability might be withdrawn as quietly as it appeared, and it therefore shouldn't be relied upon in production.
If you intend to start using the SpeechRecognizer class, be warned: there is a pretty major bug associated with it which requires your own implementation to handle.
Not being able to specifically request offline = true makes controlling this feature impossible without manipulating the data connection. Rubbish. You'll get hundreds of user emails asking why you haven't enabled something so simple!
EDIT: Since API level 23 a new parameter has been added, EXTRA_PREFER_OFFLINE, which the Google recognition service does appear to adhere to.
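As a rough illustration of using it (a sketch, assuming API 23+ and an existing SpeechRecognizer instance named recognizer; the extra is a preference the service may honour, not a guarantee):

Intent offlineIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
offlineIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M) {
    // A hint only: recognition may still go online if no offline model is installed
    offlineIntent.putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true);
}
recognizer.startListening(offlineIntent);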
Hope the above helps.
I would like to improve, with images, the guide that the answer https://stackoverflow.com/a/17674655/2987828 gives its users. It is the sentence "For those that it doesn't, this is the 'guide' I supply them with." that I want to improve.
The user should click on the four buttons highlighted in blue in these images:
Then the user can select any desired languages. When the download is done, they should disconnect from the network and then click on the "microphone" button of the keyboard.
It worked for me (Android 4.1.2): language recognition then worked out of the box, without rebooting. I can now dictate instructions to the shell of Terminal Emulator! And it is twice as fast offline as online, on an ASUS PadFone 2.
These images are licensed under CC BY-SA 3.0 with attribution required to stackoverflow.com/a/21329845/2987828; you may hence add these images anywhere along with this attribution.
(This is the standard policy for all images and text at stackoverflow.com.)
Simple and flexible offline recognition on Android is implemented by CMUSphinx, an open-source speech recognition toolkit. It works purely offline, is fast and configurable, and can, for example, listen continuously for a keyword, as sketched below.
You can find the latest code and a tutorial here.
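For a flavour of the keyword-spotting mode, here is a sketch based on the pocketsphinx-android tutorial (the model directory, dictionary and keyphrase names come from that demo and are assumptions here; these are the edu.cmu.pocketsphinx classes, not android.speech):

Assets assets = new Assets(context);
File assetsDir = assets.syncAssets(); // copies the bundled model files to storage

SpeechRecognizer recognizer = SpeechRecognizerSetup.defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
recognizer.addListener(listener); // your edu.cmu.pocketsphinx.RecognitionListener
recognizer.addKeyphraseSearch("wakeup", "oh mighty computer"); // demo keyphrase
recognizer.startListening("wakeup"); // onPartialResult fires when the phrase is heard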
Update in 2019: time moves fast and CMUSphinx is no longer that accurate. I recommend trying the Kaldi toolkit instead. The demo is here.
In short, I don't have the implementation, but the explanation.
Google did not make offline speech recognition available to third-party apps. Offline recognition is only accessible via the keyboard. Ben Randall (the developer of utter!) explains his workaround in an article at Android Police:
I had implemented my own keyboard and was switching between Google Voice Typing and the user's default keyboard with an invisible edit text field and transparent Activity to get the input. Dirty hack!

This was the only way to do it, as offline Voice Typing could only be triggered by an IME or a system application (that was my root hack). The other type of recognition API … didn't trigger it and just failed with a server error. … A lot of work wasted for me on the workaround! But at least I was ready for the implementation...
From Utter! Claims To Be The First Non-IME App To Utilize Offline Voice Recognition In Jelly Bean
I successfully implemented my speech service with offline capabilities by using onPartialResults when offline and onResults when online, roughly as sketched below.
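A minimal sketch of that split using the stock SpeechRecognizer callbacks (isNetworkAvailable() and handleResult() are hypothetical helpers, not library methods):

@Override
public void onPartialResults(Bundle partialResults) {
    ArrayList<String> texts = partialResults
            .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    // Offline engine: final text tends to arrive only through partials
    if (!isNetworkAvailable() && texts != null && !texts.isEmpty()) {
        handleResult(texts.get(0));
    }
}

@Override
public void onResults(Bundle results) {
    ArrayList<String> texts = results
            .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    // Online engine: rely on the normal final results
    if (isNetworkAvailable() && texts != null && !texts.isEmpty()) {
        handleResult(texts.get(0));
    }
}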
I was dealing with this, and I noticed that you need to install the offline package for your language. My language setting was "Español (Estados Unidos)", but there is no offline package for that language, so when I turned off all network connectivity I got an alert from RecognizerIntent saying it couldn't reach Google. I then changed the language to "English (US)" (for which I already had the offline package), launched the RecognizerIntent, and it just worked.
The key: the language setting must match an installed offline voice recognizer package (see the sketch below).
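If you also want to request the language from code, a hedged sketch: RecognizerIntent accepts EXTRA_LANGUAGE, but for offline use the matching package must already be installed (the "en-US" value here is just an example):

Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
// Must name a locale whose offline files are installed on the device
intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");
startActivityForResult(intent, REQUEST_SPEECH); // REQUEST_SPEECH: any request code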
It is apparently possible to install offline voice recognition manually, by downloading the files directly and placing them in the right locations. I guess this is just a way to bypass Google's hardware requirements.
However, personally I didn't have to reboot or anything, simply changing to UK and back again did it.
A working example is given below.
MyService.class
public class MyService extends Service implements SpeechDelegate, Speech.stopDueToDelay {

    public static SpeechDelegate delegate;

    @Override
    public int onStartCommand(Intent intent, int flags, int startId) {
        //TODO do something useful
        try {
            if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
                // Mute the system stream (e.g. the recognizer beep)
                ((AudioManager) Objects.requireNonNull(
                        getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }

        Speech.init(this);
        delegate = this;
        Speech.getInstance().setListener(this);

        if (Speech.getInstance().isListening()) {
            Speech.getInstance().stopListening();
        } else {
            System.setProperty("rx.unsafe-disable", "True");
            RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
                if (granted) { // Always true pre-M
                    try {
                        Speech.getInstance().stopTextToSpeech();
                        Speech.getInstance().startListening(null, this);
                    } catch (SpeechRecognitionNotAvailable exc) {
                        //showSpeechNotSupportedDialog();
                    } catch (GoogleVoiceTypingDisabledException exc) {
                        //showEnableGoogleVoiceTyping();
                    }
                } else {
                    Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
                }
            });
        }
        return Service.START_STICKY;
    }

    @Override
    public IBinder onBind(Intent intent) {
        //TODO for communication return IBinder implementation
        return null;
    }

    @Override
    public void onStartOfSpeech() {
    }

    @Override
    public void onSpeechRmsChanged(float value) {
    }

    @Override
    public void onSpeechPartialResults(List<String> results) {
        for (String partial : results) {
            Log.d("Result", partial + "");
        }
    }

    @Override
    public void onSpeechResult(String result) {
        Log.d("Result", result + "");
        if (!TextUtils.isEmpty(result)) {
            Toast.makeText(this, result, Toast.LENGTH_SHORT).show();
        }
    }

    @Override
    public void onSpecifiedCommandPronounced(String event) {
        try {
            if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
                ((AudioManager) Objects.requireNonNull(
                        getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Restart listening after the command keyword, mirroring onStartCommand
        if (Speech.getInstance().isListening()) {
            Speech.getInstance().stopListening();
        } else {
            RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
                if (granted) { // Always true pre-M
                    try {
                        Speech.getInstance().stopTextToSpeech();
                        Speech.getInstance().startListening(null, this);
                    } catch (SpeechRecognitionNotAvailable exc) {
                        //showSpeechNotSupportedDialog();
                    } catch (GoogleVoiceTypingDisabledException exc) {
                        //showEnableGoogleVoiceTyping();
                    }
                } else {
                    Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
                }
            });
        }
    }

    @Override
    public void onTaskRemoved(Intent rootIntent) {
        // Restarting the service if it is removed.
        PendingIntent service =
                PendingIntent.getService(getApplicationContext(), new Random().nextInt(),
                        new Intent(getApplicationContext(), MyService.class), PendingIntent.FLAG_ONE_SHOT);
        AlarmManager alarmManager = (AlarmManager) getSystemService(Context.ALARM_SERVICE);
        assert alarmManager != null;
        // Trigger one second from now (a bare 1000 would be a time already in the
        // past on the ELAPSED_REALTIME clock)
        alarmManager.set(AlarmManager.ELAPSED_REALTIME_WAKEUP,
                SystemClock.elapsedRealtime() + 1000, service);
        super.onTaskRemoved(rootIntent);
    }
}
For more details, see https://github.com/sachinvarma/Speech-Recognizer.
Hope this will help someone in the future.
I've searched far and wide for a solution to removing Google's voice recognition UI dialog when a user wants to perform a voice command, but I have been unable to find one. I am trying to implement an app which displays a menu; the user can either click on the options or say them out loud, which opens the new pages. So far I've been unable to implement this unless I use Google's RecognizerIntent, but I don't want the dialog box to pop up. Does anyone have any ideas, a solution, or a workaround? Thanks.
EDIT: As a compromise, maybe there is a way to move the dialog to the bottom of the screen while still being able to view my menu?
Does How can I use speech recognition without the annoying dialog in android phones help?
I'm pretty sure Nuance/Dragon charges for production or commercial applications that use their services. If this is just a demo, you may be fine with the developer account. Android speech services are free for all Android applications.
You can do this with Google's APIs. You've probably been looking at the documentation for the speech recognition intent; look instead at the RecognitionListener interface of the speech recognition APIs.
Here's some code to help you
public class SpeechRecognizerExample extends Activity implements RecognitionListener {

    private SpeechRecognizer recognizer;

    //This would go down in your onCreate
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        recognizer = SpeechRecognizer.createSpeechRecognizer(this);
        recognizer.setRecognitionListener(this);
    }

    //Then you'd need to start it when the user clicks or selects a text field or something
    private void startListening() {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        //intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "zh");
        intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, getPackageName());
        recognizer.startListening(intent);
    }

    //Then you'd need to implement the RecognitionListener functions - basically works just like a click listener
    @Override public void onResults(Bundle results) { /* recognized text arrives here */ }
    @Override public void onPartialResults(Bundle partialResults) { }
    @Override public void onError(int error) { }
    @Override public void onReadyForSpeech(Bundle params) { }
    @Override public void onBeginningOfSpeech() { }
    @Override public void onRmsChanged(float rmsdB) { }
    @Override public void onBufferReceived(byte[] buffer) { }
    @Override public void onEndOfSpeech() { }
    @Override public void onEvent(int eventType, Bundle params) { }
}
Here's the docs for a RecognitionListener:
http://developer.android.com/reference/android/speech/RecognitionListener.html