CONTEXT: My application is sending sentences to whatever TTS engine the user has. Sentences are user-generated and may contain punctuation.
PROBLEM: Some users report that the punctuation is read aloud (TTS says "comma" etc) on SVOX, Loquendo and possibly others.
QUESTION:
Should I strip all punctuation?
Should I transform the punctuation using this kind of API?
Should I let the TTS engine deal with the punctuation?
The same user who sees the problem with Loquendo does not have it with another Android application called FBReader, so I guess the third option is not the right thing to do.
I had the same problem with one of my apps.
The input string was:
Next alarm in 10 minutes,it will be 2:45 pm
and the TTS engine would say:
Next alarm in 10 minutes comma it will be 2:45 pm.
The problem was fixed just by adding a space after the comma like this:
Next alarm in 10 minutes, it will be 2:45 pm
This is a stupid mistake, and maybe your problem is more complicated than that, but it worked for me. :)
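If the sentences are typed by users, the missing space could also be repaired in code before the text reaches the engine. Below is a minimal sketch in plain Java. The class and method names (`PunctuationSpacing`, `addMissingSpaces`) are made up for illustration. It only handles commas and semicolons, and it leaves digit groupings such as 1,000 alone. Whether this is enough for a particular engine is something you would have to test.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Android API.
final class PunctuationSpacing {
    // Matches a comma or semicolon that is directly followed by a non-space
    // character, except when it sits between two digits (e.g. "1,000").
    private static final Pattern MISSING_SPACE =
            Pattern.compile("(?<![0-9])[,;](?=\\S)|[,;](?=[^\\s0-9])");

    // Inserts a single space after each matched punctuation mark.
    static String addMissingSpaces(String text) {
        return MISSING_SPACE.matcher(text).replaceAll("$0 ");
    }
}
```

With this, `PunctuationSpacing.addMissingSpaces("Next alarm in 10 minutes,it will be 2:45 pm")` gives back the string with a space after the comma, which is the form that worked for me.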
So, you're worried about what back-alley-acquired text-to-speech engine the user might happen to have selected as their default... presumably because you don't want your app to look bad due to this engine's unknown/bad behavior. Understandable.
The (good) fact is, though, that the TTS's behavior is not actually your responsibility unless you decide to embed an engine in the app itself (Difficulty: Hard, Recommended? No).
Engines can and should be presumed to adhere to Android rules and behaviors dictated here... and presumed to supply their own sufficient set of configuration options in the Android system settings (home\settings\language&locale\TTS) which may or may not include pronunciation options. The user should also be presumed intelligent enough to install an engine that they are satisfied with.
It is a slippery slope to take on the job of anticipating and "correcting" for unknown and unwanted engine behaviors (at least in engines that you haven't tested yourself).
A SIMPLE AND GOOD OPTION (Difficulty: Easy):
Make a setting in your app: "ignore punctuation."
A BETTER OPTION (Difficulty: Medium):
Do the above, but only show the "ignore punctuation" setting-option if the engine you have detected on the user's device is prone to this issue.
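As a rough illustration of what an "ignore punctuation" setting could do to the text before it is handed to the engine, here is a minimal sketch in plain Java. The names `PunctuationFilter` and `prepareForSpeech` are hypothetical, and the set of characters removed is my own assumption. Punctuation between two digits (as in 2:45 or 1,000) is kept so that times and numbers still read correctly. Note that removing punctuation may also flatten the pauses an engine would otherwise make, so treat it as an opt-in.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any Android API.
final class PunctuationFilter {
    // Sentence punctuation, unless it sits between two digits ("2:45", "1,000").
    private static final Pattern SENTENCE_PUNCTUATION =
            Pattern.compile("(?<![0-9])[,;:.!?]|[,;:.!?](?![0-9])");
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    // Returns the text unchanged when the setting is off; otherwise replaces
    // sentence punctuation with spaces and collapses the resulting gaps.
    static String prepareForSpeech(String text, boolean ignorePunctuation) {
        if (!ignorePunctuation) {
            return text;
        }
        String stripped = SENTENCE_PUNCTUATION.matcher(text).replaceAll(" ");
        return WHITESPACE_RUN.matcher(stripped).replaceAll(" ").trim();
    }
}
```

The boolean would come from wherever you store the setting (for example SharedPreferences), and the result is what you would pass to the engine.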
Also, one thing to note is that there are many, many differences between engines (whether they use embedded voices vs online, response time, initialization time, reliability/adherence to Android specs, behavior across Android API levels, behavior across their own version history, the quality of voices, not to mention language capability)... differences that may be even more important to users than whether or not punctuation is pronounced.
You say "My application is sending sentences to whatever TTS engine the user has." Well... "That's yer problem right there." Why not give the user a choice on what engine to use?
And leads us to...
AN EVEN BETTER OPTION (Difficulty: Hard and Good! [in my humble opinion]):
Decide on some "known-good" engines your app will "support," starting with Google and Samsung. I would guess that fewer than 5% of devices out there these days have neither of those engines installed.
Study and test these engines as much as possible across all Android API levels that you plan to support... at least in as far as whether they pronounce punctuation or not.
Over time, test more engines if you like, and add them to your supported engines in subsequent app updates.
Run an algorithm when your app starts that detects which engines are installed, then use that info against your own list of supported engines:
import android.content.Context;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.content.pm.ResolveInfo;
import android.speech.tts.TextToSpeech;
import android.util.Log;
import java.util.ArrayList;
import java.util.List;

// Returns the package names of all installed TTS engines.
private ArrayList<String> whatEnginesAreInstalled(Context context) {
    final Intent ttsIntent = new Intent();
    ttsIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
    final PackageManager pm = context.getPackageManager();
    final List<ResolveInfo> list = pm.queryIntentActivities(ttsIntent, PackageManager.GET_META_DATA);
    ArrayList<String> installedEngineNames = new ArrayList<>();
    for (ResolveInfo r : list) {
        String engineName = r.activityInfo.applicationInfo.packageName;
        installedEngineNames.add(engineName);
        // just logging the version number out of interest
        String version = "null";
        try {
            version = pm.getPackageInfo(engineName,
                    PackageManager.GET_META_DATA).versionName;
        } catch (PackageManager.NameNotFoundException e) {
            Log.i("XXX", "could not read version of " + engineName);
        }
        Log.i("XXX", "we found an engine: " + engineName);
        Log.i("XXX", "version: " + version);
    }
    return installedEngineNames;
}
In your app's settings, present all engines that you've decided to support as options (even if not currently installed). This could be a simple group of RadioButtons with titles corresponding to the different engine names. If the user selects one that isn't installed, notify them of that and give them the option of installing it with an intent.
Save the user's selected engine name (String) in SharedPreferences, and use their selection as the last argument of the TextToSpeech constructor any time you need a TTS in your app.
If the user has some weird engine installed, present it as a choice also, even if it is unrecognized/unsupported, but inform them that they have selected an unknown/untested engine.
If the user selects an engine that is supported but is known to pronounce punctuation (bad), then upon selection of that engine, have an alert dialog pop up warning the user about that, explaining that they can turn this bad behavior off with the "ignore punctuation" setting referred to already.
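The decision logic described in the last few paragraphs can be sketched in plain Java. Everything here is illustrative: `EngineSupport` and `Status` are hypothetical names, the two "tested" package names are my assumption of what the Google and Samsung engines are called (verify them on real devices), and `com.example.chattyengine` is a placeholder for an engine you have tested and found to pronounce punctuation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of any Android API.
final class EngineSupport {
    enum Status { SUPPORTED, READS_PUNCTUATION, UNTESTED, NOT_INSTALLED }

    // Assumed package names of engines you have tested; verify before relying on them.
    static final Set<String> TESTED_ENGINES = new HashSet<>(Arrays.asList(
            "com.google.android.tts", "com.samsung.SMT"));

    // Placeholder entry: tested engines that were found to read punctuation aloud.
    static final Set<String> PUNCTUATION_READERS = new HashSet<>(Arrays.asList(
            "com.example.chattyengine"));

    // Decides what the settings screen should do for the engine the user picked.
    static Status classify(String selectedEngine, List<String> installedEngines) {
        if (!installedEngines.contains(selectedEngine)) {
            return Status.NOT_INSTALLED;     // offer to install it
        }
        if (PUNCTUATION_READERS.contains(selectedEngine)) {
            return Status.READS_PUNCTUATION; // warn, point to "ignore punctuation"
        }
        if (TESTED_ENGINES.contains(selectedEngine)) {
            return Status.SUPPORTED;
        }
        return Status.UNTESTED;              // allow, but tell the user it is untested
    }
}
```

The `installedEngines` argument would be the list returned by `whatEnginesAreInstalled()`, and the selected name is the string you saved in SharedPreferences.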
SIDE-NOTES:
Don't let the SVOX/PICO (emulator) engine get you too worried -- it has many flaws and is not even designed or guaranteed to run on Android above API ~20, but is still included on emulator images up to API ~24, resulting in "unpredictable results" that don't actually reflect reality. I have yet to see this engine on any real hardware device made within the last seven years or so.
Since you say that "sentences are user generated," I would be more worried about solving the problem of what language they are going to be typing in! I'll look out for a question on that! :)
How can I specify a custom wakeword name (e.g. "stack overflow" or "party time") in the spokestack-android configuration? I'm looking for something like:
SpeechPipeline pipeline = new SpeechPipeline.Builder()
    .setProperty("wakeword", "stack overflow")
    //...
    .build();
Update: You can train your own wakeword (without writing code, just providing audio samples) with a Maker subscription. When they're finished training, you can download and configure the custom wake word the same way you set up the default wake word.
Currently, Spokestack Android only supports wakeword detection via a binary classifier, so we only recognize "Spokestack". In theory, this could be done via Android's platform ASR, with the caveat that the user would constantly be interrupted by Google Assistant-style audible dings as the ASR request times out and gets restarted, so it'd only be useful for informal demos, not real apps.
That said, it's theoretically possible, so feel free to open an issue, and it might show up in a future version if we get enough demand for it.
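To make the ASR-based idea a little more concrete: once some recognizer hands you a transcript, spotting the phrase is just text matching. The sketch below is plain Java and does not use any Spokestack or Android API. `TranscriptWakeword` and `containsWakeword` are hypothetical names, and how you obtain the transcript is left open. It carries the same caveats as above, so it is only suitable for informal demos.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Spokestack or Android.
final class TranscriptWakeword {
    // True when the wakeword appears in the transcript as whole words,
    // ignoring case and punctuation.
    static boolean containsWakeword(String transcript, String wakeword) {
        String needle = normalize(wakeword);
        if (needle.isEmpty()) {
            return false;
        }
        String haystack = " " + normalize(transcript) + " ";
        return haystack.contains(" " + needle + " ");
    }

    // Lowercases and turns every run of non-letter, non-digit characters into one space.
    private static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{Nd}]+", " ").trim();
    }
}
```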
I need advice on a problem I am experiencing with this action I published recently:
https://assistant.google.com/services/a/uid/000000576929e1c4
The action supports 2 languages, English and Italian and its invocation name is "Blitzy rider" for both languages, which used to work fine at the time I published, both in simple invocations (just the action name) and composite invocations (action name + intent to perform).
For the past few days invocation has been terrible: in English it fails 50% of the time, and in Italian it fails 100% of the time. The voice recognition seems to be doing its best to avoid my action name and pick similar names, for example:
“Belizzi rider”
“Belize ride”
“Brizzy rider”
“Blitz rider”
I suspect that Google changed the voice recognition app (at I/O they said they wanted to move it from the cloud to the phone to speed up interaction). Anyway, in the current situation my action is unusable. It provides traffic information, and users need to invoke it quickly by voice from the car; they cannot correct the name with the keyboard, as that defeats the whole purpose of the action.
What do you suggest I do? Should I change the invocation name and use a different one for each of the 2 languages? That is kind of painful because the name is also a sort of brand, and it's quoted in the banner and logo.
I would also kindly ask you, if you can give me a few seconds of your time, to try these invocations from your assistant to see if they work:
“Talk to Blitzy rider”
“Ask Blitzy rider to read messages near Santa Monica” (or any other place of your choice)
And if you have an Italian assistant:
“Parla con Blitzy rider”
“Chiedi a Blitzy rider di leggere I messaggi vicino Milano”
On my phone it's a disaster.
Thank you very much for your help and advice.
I just tried to invoke your action three times, and "Blitzy Rider" (speech) was converted to "Glitzy Rider" twice and "Blake See Rider" once.
If I were in your situation, I would bite the bullet and rebrand.
I have a popular read-aloud app that is also often used by visually impaired and blind people. A very small number of them complain that when using the app or having it read aloud, it repeatedly says "Service at Voice" (my app's name is @Voice Aloud Reader). I tested this on several phones with different versions of Android and TalkBack enabled, but couldn't reproduce this problem.
The app is showing a notification with reading progress and buttons to pause/resume, FF and reverse etc. Of course all the reading aloud is done from a service, not activity, because a user may want to close my activity, or even turn off screen, and still listen. I would gladly post more technical details, but don't know which ones are relevant.
I tried searching for any combination of terms like "TalkBack saying 'service' repeatedly", but cannot find anything relevant. The users who contacted me about this could not find any setting in the TalkBack app to make it stop saying this, either. Could anyone shed some light on this issue?
I found the reason for my problem, part of it was my own app code, and part just confusing behavior of Android system and TalkBack on different devices. Here is what was happening:
The app, @Voice Aloud Reader, reads text loaded into it (web pages, docs, books) and highlights the sentence it reads aloud. On each change of sentence it updates progress, both on its own screen if visible, and in the notification. The notification update code is pretty old, from the Android 4 days. I did not know then how to update the content of a notification; it seemed to me that the only way, after using NotificationBuilder to change the content, was to call this again in my service:
startForeground(/* id: */ 1000, myNotifBuilder.build());
It worked well for years, also under TalkBack, no problems. Even today on at least 5 test devices I have with Android 5 to 9 and with emulators, TalkBack activated, it works correctly. But some users reported that upon reading each new sentence (progress update), TalkBack says "Service @Voice". I finally updated the code as follows, and my users report that the problem is solved:
if (newNotification) {
    startForeground(/* id: */ 1000, myNotifBuilder.build());
} else {
    NotificationManagerCompat.from(this).notify(1000, myNotifBuilder.build());
}
I doubt that this knowledge will help many people: notifications are documented better now, and there is a clear "Update notification" chapter in Google's developer documentation that explains how to do this correctly.
I bet it's announcing the app name on orientation changes each time the MainActivity is created.
I would like to know whether it is possible to write an Android app that, while running in the background, tracks user activity (such as which other apps the user used, which phone numbers the user dialed, the user's GPS location, etc.). I am not sure whether a single Android app can react to other applications. Does anyone know the answer? Thanks.
In the general case, no, you can't. And users would probably prefer it so.
That said, there are certain partial solutions. Sometimes the system is helpful enough to publish Intents reflecting user actions: for example when the user uninstalls an app -- with the caveat that you don't get that intent when your own app is the one being uninstalled.
Before Jelly Bean (4.1), apps could read the log that other applications write to and try to extract info from there, but it was a cumbersome, error-prone, thankless task. For example, the browser logs nothing when it navigates to a certain page. You may read the logs for a while with adb logcat to get a feeling for what was possible and what wasn't. Reading the log requires the relevant permission, which regular apps can no longer hold.
Thanks to @WebnetMobile for the heads up about logs and to @CommonsWare for the link, see the comments below.
Yes you can.
You can look here for instance about phone info:
Track a phone call duration
or
http://www.anddev.org/video-tut_-_querying_and_displaying_the_calllog-t169.html
There is a way to tell Android and the user which data you are accessing, so that they can decide whether to allow it.
I am not sure you can simply access any app's data, but in theory, if you know how to read its saved files, that might be possible.
For instance Runtime.getRuntime().exec("ls -l /proc"); will get you the "proc" root folder with lots of data you might need there. This might have been changed, I am not sure, and I also don't know what you need.
To get the running tasks, perhaps try:
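import android.app.ActivityManager;
import android.app.ActivityManager.RunningTaskInfo;
import android.content.Context;
import java.util.List;
public static List<RunningTaskInfo> getApplications(final Context context) {
    ActivityManager am = (ActivityManager) context.getSystemService(Context.ACTIVITY_SERVICE);
    return am.getRunningTasks(1); // deprecated since API 21; newer releases only report the caller's own tasks
}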
For this to work you should include this in your AndroidManifest.xml:
<uses-permission android:name="android.permission.GET_TASKS" />
See more about it: http://developer.android.com/reference/android/app/ActivityManager.html#getRunningAppProcesses%28%29
You certainly could but I think reporting that data back to you, unbeknownst to the user, via the internet, would be considered spyware and almost certainly illegal in most jurisdictions.
Fortunately, spying on users at that level should not be possible. Certain features can be achieved by abusing bugs in Android, which will be fixed sooner rather than later. I see absolutely no reason for you to know what number I am calling and where I've been lately. It's basically none of your business.
I see there are plenty of examples of how to call a number, and I also see that for an emergency number I can only have the app pop up the dialer. But all those examples hard-code "911" as the number to use. This works fine in the US, but Android phones are sold in other countries, so my app may be bought by someone outside the US, or someone who lives in the US may take their phone overseas. Is there a way for my app to realize that it is not in the US and must therefore use a different number to call emergency services, and to find out what that number is?
To sum up: is there a way to have the app bring up the dialer with the emergency number for the country it is in, without having to know that number at compile time?
According to the source for PhoneNumberUtils.isEmergencyNumber():
String numbers = SystemProperties.get("ril.ecclist");
if (TextUtils.isEmpty(numbers)) {
    // then read-only ecclist property since old RIL only uses this
    numbers = SystemProperties.get("ro.ril.ecclist");
}
numbers will be a comma-separated list.
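Once you have such a string, splitting it is straightforward. Here is a minimal sketch in plain Java. `EmergencyNumbers`, `parse` and `isListed` are hypothetical names, and any input such as "112,911" is just an example value, not what a particular device reports. Note also that `SystemProperties` is not part of the public SDK, so how a regular app would get hold of the list is a separate matter.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of any Android API.
final class EmergencyNumbers {
    // Splits a comma-separated list such as "112,911" into individual numbers,
    // ignoring empty entries and surrounding whitespace.
    static Set<String> parse(String ecclist) {
        Set<String> numbers = new LinkedHashSet<>();
        if (ecclist == null) {
            return numbers;
        }
        for (String entry : ecclist.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                numbers.add(trimmed);
            }
        }
        return numbers;
    }

    // True when the given number is one of the entries in the list.
    static boolean isListed(String number, String ecclist) {
        return parse(ecclist).contains(number);
    }
}
```

The first entry of the parsed set could then be used as the number to hand to the dialer, assuming the list is ordered by preference, which I have not verified.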