How can you read data, i.e. convert simple text strings to voice (speech) in Android?
Is there an API where I can do something like this:
TextToVoice speaker = new TextToVoice();
speaker.Speak("Hello World");
Using TTS is a little more complicated than you might expect, but it's easy to write a wrapper that gives you the API you want.
There are a number of issues you must overcome to get it to work nicely.
They are:
Always set the UtteranceId (or else OnUtteranceCompleted will not be called).
Set the OnUtteranceCompleted listener only after the speech system is properly initialized.
public class TextSpeakerDemo implements OnInitListener
{
private TextToSpeech tts;
private Activity activity;
private static final HashMap<String, String> DUMMY_PARAMS = new HashMap<String, String>();
static
{
DUMMY_PARAMS.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID, "theUtId");
}
private ReentrantLock waitForInitLock = new ReentrantLock();
public TextSpeakerDemo(Activity parentActivity)
{
activity = parentActivity;
tts = new TextToSpeech(activity, this);
//don't do speak until initing
waitForInitLock.lock();
}
public void onInit(int version)
{ //unlock it so that speech will happen
waitForInitLock.unlock();
}
public void say(WhatToSay say)
{
say(say.toString());
}
public void say(String say)
{
tts.speak(say, TextToSpeech.QUEUE_FLUSH, null);
}
public void say(String say, OnUtteranceCompletedListener whenTextDone)
{
if (waitForInitLock.isLocked())
{
try
{
waitForInitLock.tryLock(180, TimeUnit.SECONDS);
}
catch (InterruptedException e)
{
Log.e("speaker", "interrupted");
}
//unlock it here so that it is never locked again
waitForInitLock.unlock();
}
int result = tts.setOnUtteranceCompletedListener(whenTextDone);
if (result == TextToSpeech.ERROR)
{
Log.e("speaker", "failed to add utterance listener");
}
//note: here pass in the dummy params so onUtteranceCompleted gets called
tts.speak(say, TextToSpeech.QUEUE_FLUSH, DUMMY_PARAMS);
}
/**
* make sure to call this at the end
*/
public void done()
{
tts.shutdown();
}
}
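The wrapper above reuses one fixed utterance ID ("theUtId") and replaces the listener on every call. If several utterances are in flight, a per-utterance ID makes it possible to tell which one finished. The following is only a sketch in plain Java with no Android dependency; the class `UtteranceCallbacks` and its method names are hypothetical. The assumed wiring is to put the returned ID into the params map under `TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID` and to call `complete(utteranceId)` from `onUtteranceCompleted(String utteranceId)`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: maps each utterance ID to the callback to run when it finishes. */
class UtteranceCallbacks {
    private final Map<String, Runnable> pending = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    /** Stores the callback and returns a fresh ID to hand to the engine with the utterance. */
    String register(Runnable whenDone) {
        String id = "utt-" + counter.incrementAndGet();
        pending.put(id, whenDone);
        return id;
    }

    /** Call from the completion listener; runs the callback once and reports whether one was found. */
    boolean complete(String utteranceId) {
        Runnable callback = pending.remove(utteranceId);
        if (callback == null) {
            return false;
        }
        callback.run();
        return true;
    }
}
```

Because the callback is removed before it runs, a duplicate completion event for the same ID is ignored rather than firing the callback twice.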
Here you go: a tutorial on using the library. The big downside is that it requires an SD card to store the voices.
A good working example of TTS usage can be found in the book "Pro Android 2". Have a look at the source code for chapter 15.
There are third-party text-to-speech engines. Rumor has it that Donut contains a text-to-speech engine, suggesting it will be available in future versions of Android. Beyond that, though, there is nothing built into Android for text-to-speech.
Donut has this: see the android.speech.tts package.
In my android app I have a TTS using Google engine.
Have something like this:
tts=new TextToSpeech(MyClass.this, status -> {
if(status == TextToSpeech.SUCCESS){
tts.setLanguage(locale);
tts.setOnUtteranceProgressListener(new UtteranceProgressListener() {
@Override
public void onDone(String utteranceId) {
if (utteranceId.equals("***")) {
runOnUiThread(() -> {
Button view2 = findViewById(R.id.speech);
view2.setCompoundDrawablesWithIntrinsicBounds(R.drawable.play, 0, 0, 0);
});
}
}
@Override
public void onError(String utteranceId) {
}
@Override
public void onStart(String utteranceId) {
}
});
}
});
Basically I am using two languages, Slovak and English. Both work fine with Google TTS.
The problem is that Samsung devices have their own TTS engine set by default, so the app's text-to-speech does not work on those devices.
After users change their device settings to use Google TTS, it works.
But is there a way for my code to support both TTS engines?
I found that something like this might work:
TextToSpeech(Context context, TextToSpeech.OnInitListener listener, String engine)
e.g. using com.google.android.tts as the engine parameter.
However, in my code the call looks like new TextToSpeech(MyClass.this, status -> {... and it doesn't seem to accept an engine as a third parameter. I also still don't know how to detect when the Samsung engine is needed and switch engines accordingly.
It's worth trying to force the TTS engine by passing this third parameter. Change the very last line of the posted snippet from
});
to
}, "com.google.android.tts");
There are also two useful methods for you: getDefaultEngine() and getEngines(). At startup, create a throwaway TextToSpeech with the two-parameter constructor (and an empty listener) and check which options you have.
getAvailableLanguages() and isLanguageAvailable(Locale loc) may also be useful when the Google engine isn't present but the default engine still supports the languages you need.
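The selection step described above can be sketched in plain Java. This is only an illustration: `EngineChooser` and `choose` are hypothetical names, and the sketch assumes you have already collected the installed engine package names (for example from the `name` field of each entry returned by `getEngines()`) and the value of `getDefaultEngine()`.

```java
import java.util.List;

/** Hypothetical helper: picks which TTS engine package to request. */
class EngineChooser {
    /**
     * Returns the first preferred engine that is installed, otherwise the device default.
     * The result may be null if nothing matches and no default is known.
     */
    static String choose(List<String> installed, String deviceDefault, List<String> preferred) {
        for (String candidate : preferred) {
            if (installed.contains(candidate)) {
                return candidate;
            }
        }
        return deviceDefault;
    }
}
```

The returned name would then be passed as the third constructor argument. Falling back to the default engine instead of failing keeps speech working on devices where the preferred engine is not installed, provided the default supports the required languages.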
I have an application that occasionally speaks via the system's text-to-speech (TTS) engine, but if a background service (such as an audiobook or music stream) is playing at the same time, the two overlap.
I would like to pause the media, play my TTS, then unpause the media. I've looked, but can't find any solutions.
I believe that if I were to play actual audio from my app, it would pause the media until my playback was complete (if I understand what I've found correctly). But TTS doesn't seem to have the same effect. The speech is totally dynamic, so I can't just record all the options.
Using the latest Xamarin.Forms, I've looked into all the media NuGet packages I could find, and they all seem centered on controlling media from files.
My only idea so far (and I don't like it) is to play an empty audio file while the TTS is running. I would prefer a more elegant solution if one exists.
(I don't care about iOS at the moment, so if it's an android only solution, I'm okay with it. And if it's native (java/kotlin), I can convert/incorporate it.)
I agree with what rbonestell said: you can use DependencyService and AudioFocus to achieve it. For the point where you record the audio, create an interface in the PCL.
public interface IControl
{
void StopBackgroundMusic();
}
When you record the audio, you can call the DependencyService with the following code.
private void Button_Clicked(object sender, EventArgs e)
{
DependencyService.Get<IControl>().StopBackgroundMusic();
//record the audio
}
In the Android project, you can create a StopMusicService to implement it.
[assembly: Dependency(typeof(StopMusicService))]
namespace TTSDemo.Droid
{
public class StopMusicService : IControl
{
AudioManager audioMan;
AudioManager.IOnAudioFocusChangeListener listener;
public void StopBackgroundMusic()
{
audioMan = (AudioManager)Android.App.Application.Context.GetSystemService(Context.AudioService);
listener = new MyAudioListener(this);
var ret = audioMan.RequestAudioFocus(listener, Stream.Music, AudioFocus.Gain);
}
}
internal class MyAudioListener :Java.Lang.Object, AudioManager.IOnAudioFocusChangeListener
{
private StopMusicService stopMusicService;
public MyAudioListener(StopMusicService stopMusicService)
{
this.stopMusicService = stopMusicService;
}
public void OnAudioFocusChange([GeneratedEnum] AudioFocus focusChange)
{
// throw new NotImplementedException();
}
}
}
Thanks to Leon Lu - MSFT, I was able to go in the right direction. I took his implementation (which has some deprecated calls to the Android API), and updated it for what I needed.
I'll be doing a little more work making sure it's stable and functional. I'll also see if I can clean it up a little too. But here's what works on my first test:
[assembly: Dependency(typeof(MediaService))]
namespace ...Droid.Services
{
public class MediaService : IMediaService
{
public async Task PauseBackgroundMusicForTask(Func<Task> onFocusGranted)
{
var manager = (AudioManager)Android.App.Application.Context.GetSystemService(Context.AudioService);
var builder = new AudioFocusRequestClass.Builder(AudioFocus.GainTransientMayDuck);
var focusRequest = builder.Build();
var ret = manager.RequestAudioFocus(focusRequest);
if (ret == AudioFocusRequest.Granted)
{
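// awaiting the null result of ?.Invoke() would throw, so check the delegate first
if (onFocusGranted != null) await onFocusGranted();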
manager.AbandonAudioFocusRequest(focusRequest);
}
}
}
}
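One thing to watch in the code above is that focus is only abandoned if the task completes normally. Since a native Java solution was said to be acceptable, here is a minimal plain-Java sketch of the "request, run, always abandon" shape. `FocusGate` and `FocusScope` are hypothetical stand-ins for the platform audio focus calls, and the sketch assumes the task returns only once speech has actually finished.

```java
/** Hypothetical stand-in for the platform's audio focus calls. */
interface FocusGate {
    boolean request();   // true when focus was granted
    void abandon();      // gives focus back so other apps can resume
}

class FocusScope {
    /** Runs the task only while focus is held; returns false if focus was refused. */
    static boolean runWithFocus(FocusGate gate, Runnable task) {
        if (!gate.request()) {
            return false;
        }
        try {
            task.run();
        } finally {
            // Hand focus back even if the task throws, so background media is not left paused.
            gate.abandon();
        }
        return true;
    }
}
```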
I'd like to ask you for some help with Android TextToSpeech feature.
Basically, I'd like to develop a simple AI that asks a question, waits for an answer, then asks another question based on that answer, and so on until the user says a keyword that stops everything.
I know TextToSpeech has to be initialized before the speak method is used, and I'm trying to take this into account by using the onActivityResult method.
Below is some code:
Activity class:
public class MainActivity extends AppCompatActivity implements OnInitListener, Button.OnClickListener{
Button sayHello;
TextView textView;
private static final int CHECK_DATA = 0;
private static final Locale defaultLocale = Locale.UK; // British English
private static final String TAG = "TTS";
private TextToSpeech tts;
private boolean isInit = false;
sayIt Method: used to speak:
public void sayIt(String text, boolean flushQ){
if(isInit){
if(flushQ){
tts.speak(text, TextToSpeech.QUEUE_FLUSH, null, null);
} else {
tts.speak(text, TextToSpeech.QUEUE_ADD, null, null);
}
} else {
Log.i(TAG, "Failure: TTS instance not properly initialized");
}
}
TextToSpeech Listener:
@Override
public void onInit(int status){
if(status == TextToSpeech.SUCCESS){
isInit = true;
// Enable input text field and speak button now that we are initialized
sayHello.setEnabled(true);
// Set to a language locale after checking availability
Log.i(TAG, "available="+tts.isLanguageAvailable(Locale.UK));
tts.setLanguage(defaultLocale);
// Examples of voice controls. Set to defaults of 1.0.
tts.setPitch(1.0F);
tts.setSpeechRate(1.0F);
// Issue a greeting and instructions in the default language
tts.speak("Initialized!", TextToSpeech.QUEUE_FLUSH, null, Integer.toString(12));
} else {
isInit = false;
Log.i(TAG, "Failure: TTS instance not properly initialized");
}
}
Button Listener:
@Override
public void onClick(View v){
if(isInit)
sayIt("You clicked!", true);
}
onActivityResult Method:
// Create the TTS instance if TextToSpeech language data are installed on device. If not
// installed, attempt to install it on the device.
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
if (requestCode == CHECK_DATA) {
if (resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
// Success, so create the TTS instance. But can't use it to speak until
// the onInit(status) callback defined below runs, indicating initialization.
Log.i(TAG, "Success, let's talk");
tts = new TextToSpeech(this, this);
// Use static Locales method to list available locales on device
Locale[] locales = Locale.getAvailableLocales();
Log.i(TAG,"Locales Available on Device:");
for(int i=0; i<locales.length; i++){
String temp = "Locale "+i+": "+locales[i]+" Language="
+locales[i].getDisplayLanguage();
if(!locales[i].getDisplayCountry().isEmpty()) { // compare string content, not references
temp += " Country="+locales[i].getDisplayCountry();
}
Log.i(TAG, temp);
}
} else {
// missing data, so install it on the device
Log.i(TAG, "Missing Data; Install it");
Intent installIntent = new Intent();
installIntent.setAction(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
startActivity(installIntent);
}
}
}
And finally, the onCreate method:
@Override
public void onCreate(Bundle savedInstance){
super.onCreate(savedInstance);
setContentView(R.layout.activity_main);
sayHello = findViewById(R.id.sayBtn);
textView = findViewById(R.id.textView);
sayHello.setEnabled(false);
sayHello.setOnClickListener(this);
Intent checkIntent = new Intent();
checkIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
startActivityForResult(checkIntent, CHECK_DATA);
/* THIS SPEAK DOES NOT WORK! */
sayIt("Speech from method!", true);
}
The issue is: the Button is successfully enabled when the onInit method initialises TextToSpeech, and it successfully pronounces text.
My goal is to make the Activity speak from the onCreate method. At the moment it only works from the onInit and onClick listeners, but not in onCreate, even though I check for TTS initialization using onActivityResult.
Basically I want TextToSpeech to speak with no Buttons involved.
I know very similar questions have already been posted, but none solved my problem. Any ideas?
I hope I've been clear. Thank you!
UPDATE: The log shows that the error occurs in the else branch of the onInit method, at the Log.i(TAG, "Failure: TTS instance not properly initialized"); line.
SOLUTION:
The only thing to do here is to wait a short time so that TextToSpeech can finish initializing.
A good way to do this seems to be a delayed Handler, as follows:
final Handler handler = new Handler();
handler.postDelayed(new Runnable() {
@Override
public void run() {
//Waiting for RobotTextToSpeech initialization for 1500ms
rtts.speak("This speak will work!");
rtts.speak("This other speak will work too!");
}
}, 1500);
}
With this in place, TextToSpeech appears to work even from the onCreate method; we just have to wait a short time.
Hope this can help.
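A fixed delay is a guess about how long initialization takes, so it may be too short on a slow device. A different technique, sketched here in plain Java, is to buffer the text and flush it when initialization is reported. `PendingSpeechQueue` is a hypothetical helper and not part of the Android API. The assumed wiring is that the `speaker` callback calls `tts.speak(text, TextToSpeech.QUEUE_ADD, null, null)` and that `markReady()` is called from `onInit` when the status is `TextToSpeech.SUCCESS`.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical helper: holds utterances until the engine reports that it is ready. */
class PendingSpeechQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final Consumer<String> speaker;
    private boolean ready = false;

    PendingSpeechQueue(Consumer<String> speaker) {
        this.speaker = speaker;
    }

    /** Speaks immediately once ready; otherwise remembers the text for later. */
    synchronized void say(String text) {
        if (ready) {
            speaker.accept(text);
        } else {
            pending.add(text);
        }
    }

    /** Call once initialization has succeeded; flushes buffered text in the order it was queued. */
    synchronized void markReady() {
        ready = true;
        while (!pending.isEmpty()) {
            speaker.accept(pending.remove());
        }
    }
}
```

With this shape, a `say(...)` call made in onCreate is not lost; it is spoken as soon as the engine becomes ready, however long that takes.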
In Android, you create a TextToSpeech instance like this:
tts = new TextToSpeech(getApplicationContext(), new TextToSpeech.OnInitListener() {
@Override
public void onInit(int i) {
if (i == TextToSpeech.SUCCESS) {
begin();
}
else {
Log.i(TAG, "init failed");
}
}
}, "com.google.android.tts");
Notice that the desired speech engine is specified as the last argument.
There are multiple possible speech engines that can exist on a device (Samsung, PICO, Google, and more).
Question: How can we know whether or not this TextToSpeech instance was successful in assigning the specified Engine to itself?
I don't see any way of doing this in the documentation:
onInit() only carries SUCCESS or ERROR, and there seems to be no method to query the (private) "myEngine" variable of the TextToSpeech instance.
I'm making an app that takes commands from the user and writes them out in real time. What would be the best option for me: third-party software like Sphinx, or the built-in Android speech recognition?
Secondly, I want it to write in real time, so that it starts writing as I speak.
You should use the built-in Android speech recognition. Specifically, you will need to use the SpeechRecognizer API directly so that there is no popup dialog box.
Also, do not expect SpeechRecognizer to return anything within onPartialResults(). It rarely does.
You could try to use Sphinx, but it seems other developers have difficulty getting it to run on Android. That said, Sphinx will be your only option if you want your app to run without an internet connection.
Here is a snippet of the code you will need to use SpeechRecognizer:
public void recognizeDirectly(Intent recognizerIntent)
{
// SpeechRecognizer requires EXTRA_CALLING_PACKAGE, so add if it's not
// here
if (!recognizerIntent.hasExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE))
{
recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
"com.dummy");
}
SpeechRecognizer recognizer = getSpeechRecognizer();
recognizer.startListening(recognizerIntent);
}
@Override
public void onResults(Bundle results)
{
Log.d(TAG, "full results");
receiveResults(results);
}
@Override
public void onPartialResults(Bundle partialResults)
{
Log.d(TAG, "partial results");
receiveResults(partialResults);
}
/**
* common method to process any results bundle from {@link SpeechRecognizer}
*/
private void receiveResults(Bundle results)
{
if ((results != null)
&& results.containsKey(SpeechRecognizer.RESULTS_RECOGNITION))
{
List<String> heard =
results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
float[] scores =
results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES);
receiveWhatWasHeard(heard, scores);
}
}
@Override
public void onError(int errorCode)
{
recognitionFailure(errorCode);
}
/**
* stop the speech recognizer
*/
@Override
protected void onPause()
{
if (getSpeechRecognizer() != null)
{
getSpeechRecognizer().stopListening();
getSpeechRecognizer().cancel();
getSpeechRecognizer().destroy();
}
super.onPause();
}
/**
* lazy initialize the speech recognizer
*/
private SpeechRecognizer getSpeechRecognizer()
{
if (recognizer == null)
{
recognizer = SpeechRecognizer.createSpeechRecognizer(this);
recognizer.setRecognitionListener(this);
}
return recognizer;
}
// other unused methods from RecognitionListener...
@Override
public void onReadyForSpeech(Bundle params)
{
Log.d(TAG, "ready for speech " + params);
}
@Override
public void onEndOfSpeech()
{
}
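The snippet above hands the candidates and their scores to receiveWhatWasHeard(heard, scores), which is not shown. One plausible thing for such a method to do is pick the candidate with the highest confidence. The sketch below is plain Java; `BestHypothesis` is a hypothetical name, not part of the original answer. It assumes the scores array may be missing, in which case it falls back to the first candidate.

```java
import java.util.List;

/** Hypothetical helper: chooses one candidate from a recognition result. */
class BestHypothesis {
    /**
     * Returns the candidate with the highest confidence score.
     * Falls back to the first candidate when scores are missing or mismatched,
     * and returns null when nothing was heard.
     */
    static String pick(List<String> heard, float[] scores) {
        if (heard == null || heard.isEmpty()) {
            return null;
        }
        if (scores == null || scores.length != heard.size()) {
            return heard.get(0);
        }
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return heard.get(best);
    }
}
```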
gregm is right, but the main "write in real time" part of the question wasn't answered. You need to add an extra to indicate that you are interested in getting partial results back.
Adding this extra to the intent works for me:
intent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);
Warning: a partial result does not contain only the new text but also everything recognized before it, so you need to work out the difference yourself.
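A minimal way to implement that difference check, sketched in plain Java. `PartialDiff` is a hypothetical helper, and it assumes each partial result is the full text recognized so far. If the recognizer has revised earlier words, so that the new text no longer starts with the old text, the sketch returns the whole new string so the caller can redraw everything.

```java
/** Hypothetical helper: extracts what is new in a partial recognition result. */
class PartialDiff {
    /** Returns the part of current that follows previous, or all of current if earlier text was revised. */
    static String newText(String previous, String current) {
        if (previous != null && current.startsWith(previous)) {
            return current.substring(previous.length());
        }
        return current;
    }
}
```

The caller keeps the last partial string it received and passes it as `previous` on the next callback.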