I am doing speech recognition using a third party cloud service on Android, and it works well with Android API SpeechRecognizer. Code below:
Intent recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
        RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);
// accept partial results if they come
recognizerIntent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
// a calling package is needed for it to work
if (!recognizerIntent.hasExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE)) {
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
            "com.example.speechrecognition");
}

recognizer = SpeechRecognizer.createSpeechRecognizer(context);
recognizer.setRecognitionListener(this);
recognizer.startListening(recognizerIntent);
At the same time, I want to record the audio with different audio settings, such as sample rate, channels, and audio format, and then continuously analyze the audio buffer. I use AudioRecord for this purpose. It works well, but only if I turn off speech recognition.
If I record audio and run speech recognition at the same time, I get this error:
E/AudioRecord: start() status -38
How can I implement this kind of feature? I also tried native audio (SLRecordItf), which doesn't work either.
As the comments state, only one microphone access is permitted at a time.
For the SpeechRecognizer, the attached RecognitionListener has an onBufferReceived(byte[] buffer) callback, but unfortunately Google's native recognition service does not supply any audio data to it, which is very frustrating.
Your only alternative is to use an external service, which won't be free. Google's new Cloud Speech API has an Android example.
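For the "continuously analyze this audio buffer" part of the question: once you do have raw PCM (e.g., from AudioRecord after stopping the recognizer), the analysis typically starts with something simple like a signal level. A minimal sketch in plain Java (no Android dependencies), assuming 16-bit little-endian mono PCM, which is what AudioRecord delivers with ENCODING_PCM_16BIT:

```java
// Sketch: compute the RMS level of a 16-bit little-endian PCM buffer,
// the kind of per-buffer analysis you might run on AudioRecord data.
public class PcmAnalyzer {
    public static double rms(byte[] pcm) {
        long sumSquares = 0;
        int samples = pcm.length / 2; // 2 bytes per 16-bit sample
        for (int i = 0; i < samples; i++) {
            // assemble a little-endian 16-bit signed sample
            short s = (short) ((pcm[2 * i] & 0xff) | (pcm[2 * i + 1] << 8));
            sumSquares += (long) s * s;
        }
        return samples == 0 ? 0.0 : Math.sqrt((double) sumSquares / samples);
    }
}
```

A threshold on this value is a common way to implement simple voice-activity detection on the captured buffers.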
I am trying to record audio from the microphone into a sound file using Expo-AV while also doing speech recognition with react-native-voice. It works as expected on iOS, but on Android I can only get one or the other working.
import Voice from '@react-native-voice/voice';
import { Audio } from 'expo-av';

Voice.onSpeechStart = startRecording; // Gets called from Voice.start()

async function startRecording() {
    // Calls expo-av recording methods to start a recording
}
...
Voice.start("en-us");
The RECORD_AUDIO permission is set. When I comment out the recording portion, the voice recognition events from react-native-voice start firing.
While on a call, does SpeechRecognizer work?
Has anyone tried it?
My goal is to listen to the user's voice, not the incoming voice. While researching I found this link:
can speech recognizer take input from incoming call voice (Speaker)?
which asks about using SpeechRecognizer for the incoming voice during a call.
I, on the other hand, am looking to use SpeechRecognizer while on a call, just for my user's voice.
The Internet says YES!
One answer I found in a comment from this question was to use this SDK.
I implemented SpeechRecognizer on Android Wear, but its UI looks the same as the 'Ok Google' UI, which confuses users into believing that they are talking to our app when they are in fact talking to the 'Ok Google' UI.
Is there a way to customize the SpeechRecognizer UI so that we can avoid this confusion?
Currently, I don't think so. When I try it I get the error message "SpeechRecognizer: no selected voice recognition service". Looking at Google Glass, based on this information it seems it's not available yet, but may become so. Hopefully this is also true for Android Wear.
Is it possible to have Android Voice Recognition (as a custom service) on Google Glass?
Sure, it's possible to use a custom UI. Create one, show it, and run the recognizer from code:
sr = SpeechRecognizer.createSpeechRecognizer(getApplicationContext());
sr.setRecognitionListener(new SpeechListener());

if (recognizerIntent == null) {
    recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
            getApplication().getPackageName());
    recognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
}
try {
    sr.startListening(recognizerIntent);
} catch (Exception e) {
    // at a minimum, log the failure rather than swallowing it silently
}
Is there any Android code to pick up the voice from the other side of a call?
Meaning, while I'm speaking with another person on my phone, can I capture the incoming voice (played through the speaker) and direct it to my application to record it, recognize it, or convert it to text?
My goal is to convert the voice in the call to text.
Thanks in advance.
Use a BroadcastReceiver to handle call state changes:
public void onReceive(Context context, Intent intent) {
    String state = intent.getStringExtra(TelephonyManager.EXTRA_STATE);
    if (state.equals(TelephonyManager.EXTRA_STATE_RINGING)) {
        // incoming call is ringing
    } else if (state.equals(TelephonyManager.EXTRA_STATE_OFFHOOK)) {
        // call answered: start recording here
    } else if (state.equals(TelephonyManager.EXTRA_STATE_IDLE)) {
        // call ended: stop recording
    }
}
To convert voice to text, follow this link.
You can record from AudioSource.VOICE_DOWNLINK. Keep in mind, though, that this might not work on every Android phone out there, and that voice-call audio is quite heavily compressed and therefore might not give you good results from a speech-to-text engine.
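As an untested sketch of that approach (Android-only code; VOICE_DOWNLINK requires the RECORD_AUDIO permission, is blocked or silently ignored on many devices, and manufacturer support varies widely):

```java
// Hypothetical sketch: capturing call downlink audio with AudioRecord.
// On many devices this fails to initialize or yields silence, so the
// recorder state must be checked before use.
int sampleRate = 8000; // voice calls are narrowband
int minBuf = AudioRecord.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
AudioRecord recorder = new AudioRecord(
        MediaRecorder.AudioSource.VOICE_DOWNLINK,
        sampleRate,
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        minBuf * 4);
if (recorder.getState() == AudioRecord.STATE_INITIALIZED) {
    recorder.startRecording();
    // read PCM via recorder.read(...) and feed it to your STT pipeline
}
```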
I've used the voice recognition feature on Android and I love it. It's one of my customers' most praised features. However, the format is somewhat restrictive: you have to call the recognizer intent, have it send the recording to Google for transcription, and wait for the text back.
Some of my ideas would require recording the audio within my app and then sending the clip to google for transcription.
Is there any way I can send an audio clip to be processed with speech to text?
I have a solution that works well for speech recognition and audio recording. Here is the link to a simple Android project I created to show how the solution works. I also put some screenshots inside the project to illustrate the app.
I'm going to briefly explain the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.
Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:
"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."
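To make the two-connection layout described in that quote concrete, here is a small plain-Java sketch that builds the paired "up" (POST, audio upload) and "down" (GET, results) URLs. The host, paths, and parameter names reflect how this full-duplex endpoint was commonly documented at the time; treat them as assumptions rather than a stable public contract:

```java
import java.util.Random;

// Sketch: the two HTTP connections are tied together by a shared
// random "pair" token; audio is POSTed to the "up" URL while results
// are streamed back from the "down" URL.
public class FullDuplexUrls {
    static final String BASE = "https://www.google.com/speech-api/full-duplex/v1";

    public static String[] buildPair(String apiKey, long seed) {
        String pair = Long.toHexString(new Random(seed).nextLong());
        String up = BASE + "/up?key=" + apiKey
                + "&pair=" + pair
                + "&lang=en-US&output=json";
        String down = BASE + "/down?pair=" + pair;
        return new String[] { up, down };
    }
}
```

The key point is simply that both requests must carry the same pair token so the server can associate the uploaded audio stream with the result stream.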
However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.
I implemented FLAC recording in that project by extracting and adapting some pieces of code and libraries from an open-source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.
Thus, it's possible to record FLAC audio, send it to the Google Speech API, get the text back, and play the sound that was just recorded.
The project I created has the basic principles to make it work and can be improved for specific situations. To make it work in a different scenario, you need a Google Speech API key, which is obtained by joining the Google chromium-dev group. I left one key in that project just to show it working, but I'll remove it eventually. If someone needs more information about it, let me know, because I'm not able to put more than 2 links in this post.
Unfortunately not at this time. The only interface currently supported by Android's voice recognition service is the RecognizerIntent, which doesn't allow you to provide your own sound data.
If this is something you'd like to see, file a feature request at http://b.android.com. This is also tangentially related to existing issue 4541.
As far as I know, there is still no way to directly send an audio clip to Google for transcription. However, Froyo (API level 8) introduced the SpeechRecognizer class, which provides direct access to the speech recognition service. So, for example, you can start playback of an audio clip and have your Activity start the speech recognizer listening in the background; it will deliver results after completion to a user-defined listener callback method.
The following sample code should be defined within an Activity, since SpeechRecognizer's methods must be run on the main application thread. You will also need to add the RECORD_AUDIO permission to your AndroidManifest.xml.
boolean available = SpeechRecognizer.isRecognitionAvailable(this);
if (available) {
    SpeechRecognizer sr = SpeechRecognizer.createSpeechRecognizer(this);
    sr.setRecognitionListener(new RecognitionListener() {
        @Override
        public void onResults(Bundle results) {
            // process results here
        }
        // define the other RecognitionListener overrides here
    });
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    // the following appears to be a requirement, but can be a "dummy" value
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, "com.dummy");
    // define any other intent extras you want

    // start playback of audio clip here

    // this will start the speech recognizer service in the background
    // without starting a separate activity
    sr.startListening(intent);
}
You can also define your own speech recognition service by extending RecognitionService, but that is beyond the scope of this answer :)