Does Android TTS support Speech Synthesis Markup Language?

Passing the following SSML (Speech Synthesis Markup Language) document to the com.svox.pico TextToSpeech engine resulted in a reading of the XML body but no control from the phoneme element or the emphasis element. This result (no apparent SSML control) is the same on a Nexus One running Android 2.2 as well as on the emulator running an AVD with SDK level 8.
String text = "<?xml version=\"1.0\"?>" +
"<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
"xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " +
"xsi:schemaLocation=\"http://www.w3.org/2001/10/synthesis " +
"http://www.w3.org/TR/speech-synthesis/synthesis.xsd\" " +
"xml:lang=\"en-US\">" +
"tomato " +
"<phoneme alphabet=\"ipa\" ph=\"t&#x259;mei&#x325;&#x27E;ou&#x325;\"> tomato </phoneme> " +
"That is a big car! " +
"That <emphasis> is </emphasis> a big car! " +
"That is a <emphasis> big </emphasis> car! " +
"That is a huge bank account! " +
"That <emphasis level=\"strong\"> is </emphasis> a huge bank account! " +
"That is a <emphasis level=\"strong\"> huge </emphasis> bank account!" +
"</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
Does any Android TTS engine support any of the SSML elements?

I've been experimenting with SSML, and it seems that the TTS engine automatically wraps its input in the root <speak> element. If you leave that element out, it works fine and you don't get a parser error.
Example:
String text = "Testing <phoneme alphabet=\"xsampa\" ph=\"&quot;{k.t#`\"/>.";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
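X-SAMPA uses the double quote as its primary-stress mark, which collides with the delimiter of the ph attribute. A minimal sketch of a helper that escapes the transcription before building the element follows. The class and method names (SsmlPhoneme, escapeAttribute, xsampa) are hypothetical and not part of the Android SDK, and the sketch assumes the engine's parser decodes the standard XML entities.

```java
// Hypothetical helper (not an Android API): builds a <phoneme> element for speak().
final class SsmlPhoneme {
    private SsmlPhoneme() {}

    /** Escapes the characters that cannot appear verbatim in a double-quoted XML attribute. */
    static String escapeAttribute(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns an empty <phoneme> element carrying an X-SAMPA transcription. */
    static String xsampa(String transcription) {
        return "<phoneme alphabet=\"xsampa\" ph=\"" + escapeAttribute(transcription) + "\"/>";
    }
}
```

The result can be concatenated into the text passed to mTts.speak(), for example "Testing " + SsmlPhoneme.xsampa(transcription) + ".".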

The answer seems to be "sort of". Not all the SSML tags are supported yet, but some test examples of the use of the <phoneme> tag are at https://android.googlesource.com/platform/external/svox/+/89292811b7fe82e5c14fa13942779763627e26db
Though the test examples produce the desired speech output, they also produce XML parser error messages in logcat. I've opened an issue about these seemingly incorrect error messages at the Android issue tracker (issue 11010).

It does appear that android.speech.tts at SDK level 23 supports a subset of SSML. Speech text can be wrapped in <speak> tags, and <say-as> is observed, while <break> is not. There is no documentation regarding SSML support.

Related

Store large JSON text in a variable in Android Studio (Kotlin)

I have a large JSON text of at least 2,500 lines that contains MCQs, like this:
[[
"Cellular respiration is defined as:",
" An intracellular, energy-producing process",
" An extracellular, energy-producing process",
" An intracellular, energy-requiring process",
" An extracellular, energy-requiring process",
],[
"The physiological term for eating and drinking is:",
"Ingestion.",
"Propulsion.",
"Absorption.",
"Digestion.",
],[
"Nerve impulses:",
"Can travel either way along a neurone.",
"Travel more quickly in unmyelinated neurones.",
"Travel by saltatory conduction in myelinated neurones.",
"Travel during the refractory period.",
],[
"Which of the following is referred to as internal environment?",
"Intracellular fluid",
"Plasma",
"Cytosol",
"Extracellular fluid",
],[
"The two tiny openings in the laryngopharynx communicate with:",
"The oropharynx.",
"The maxillary sinus.",
"The middle ear.",
"The ethmoid sinus.",
],[
"Pontine area works opposite to:",
"Medullary area",
"Cortex",
"Red nucleus",
"Vestibular area",
],[
"In nerve cells:",
"The cell membrane is polarised in the resting state.",
"Sodium (Na+) is the principal intracellular cation.",
"At rest, Na+ tends to diffuse out of the cells.",
"Depolarisation occurs when Na+ floods out of the cells."]]
I want to store this text in a string variable and then convert it into a JSON object. I have tried to do so like this:
val jsonString = "......large-text....."
But the moment I paste the text into jsonString, Android Studio freezes completely. How can I store this text in a variable?
If you don't need to edit this long text, you can store it as a .txt file in the assets folder and read it at runtime. See https://developer.android.com/reference/android/content/res/AssetManager for details.
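As a rough sketch of the asset approach (in Java rather than Kotlin), the helper below reads any InputStream into a String. On Android the stream would come from context.getAssets().open(...). The class name AssetText and the file name questions.json in the comment are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a whole stream (for example an opened asset) as UTF-8 text.
final class AssetText {
    private AssetText() {}

    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() returns -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}

// Assumed usage inside an Activity, with a file placed at assets/questions.json:
//   try (InputStream in = getAssets().open("questions.json")) {
//       String jsonString = AssetText.readAll(in);
//   }
```

Because the text never appears in a source file, the editor does not have to index or highlight a multi-thousand-line string literal.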

String objects with various Hyperlinks in Textview?

I have String objects that contain hyperlinks, which I'm trying to make clickable in a TextView. Here's an example:
String descrption = "\"Cardano is a decentralised platform" +
" that will allow complex programmable transfers of" +
" value in a secure and scalable fashion. It is one" +
" of the first blockchains to be built in the highly" +
" secure Haskell programming language. Cardano is developing" +
" a smart contract platform" +
" which seeks to deliver more advanced features than any protocol previously developed." +
" It is the first blockchain platform to evolve out of a scientific philosophy and a research-first driven approach." +
" The development team consists of a large global collective of expert engineers and researchers.\\r\\n\\r\\n" +
"The Cardano project is different from other blockchain projects as it openly addresses the need for regulatory oversight whilst maintaining" +
" consumer privacy and protections through an innovative software architecture.";
TextView textView = findViewById(R.id.description_textview);
Linkify.addLinks(textView, Linkify.WEB_URLS);
But this is how it comes out:
How can I get the hyperlinks to format correctly?
I've managed to get it working with the following code:
descrption = descrption.replaceAll("\\r\\n", "<p>");
Spanned spanned = Html.fromHtml(descrption);
textView.setText(spanned);
textView.setMovementMethod(LinkMovementMethod.getInstance());
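One detail worth checking: replaceAll() treats its first argument as a regular expression, so "\\r\\n" matches a real carriage return plus line feed, not the four literal characters backslash, r, backslash, n that the example string above contains. The sketch below handles both forms with the literal String.replace(); the class name LineBreaks is hypothetical, and which form the real data contains is an assumption.

```java
// Hypothetical helper: converts both literal "\r\n" text and real CR LF into <p>.
final class LineBreaks {
    private LineBreaks() {}

    static String toHtmlParagraphs(String text) {
        return text
                .replace("\\r\\n", "<p>")  // four literal characters: \ r \ n
                .replace("\r\n", "<p>");   // an actual CR LF pair
    }
}
```

The converted string can then be passed to Html.fromHtml() as in the snippet above.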

How to obtain Firmware version of Android programmatically?

The following screenshot of a generic tablet shows an example of Firmware version:
We have tried pretty much every field in android.os.Build, but none of them contains the firmware version.
Firmware is the operating software available on an Android device, and it is available in different versions designed by different manufacturers. Basically it's the device-specific part of the software. For example, you may have Android 4.2.2 running on your phone, but have a firmware number that looks completely different because it relates to more details than just Operating System. The firmware number consists of several elements, all of which are essential for the functioning of the phone:
PDA: Android operating system and your customizations.
Phone: the actual identifier of your device.
CSC (Country Exit Code): the languages and country-specific parameters.
Bootloader: the boot loader program that runs at startup and launches all of the device's processes.
Software Build Version is basically the Firmware Version.
See android.os.Build.VERSION. Its SDK and SDK_INT fields contain the API version. android.os.Build.VERSION_CODES contains the relevant constants.
String deviceSoftwareVersion=telephonyManager.getDeviceSoftwareVersion();
Refer to the Build.VERSION documentation.
To use it in code:
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.GINGERBREAD) {
// only for gingerbread and newer versions
}
This can also be checked in code:
System.out.println("button press on device name = "+android.os.Build.MODEL +" brand = "+android.os.Build.BRAND +" OS version = "+android.os.Build.VERSION.RELEASE + " SDK version = " +android.os.Build.VERSION.SDK_INT);
UPDATE
Try this one -
android.os.Build.FINGERPRINT
android.os.Build.BOARD
Build.HOST
Build.ID
Build.SERIAL
Check the following Code.
FINAL UPDATE
// String is immutable and concat() returns a new string, so collect the parts in a StringBuilder.
StringBuilder sb = new StringBuilder();
sb.append("VERSION.RELEASE {").append(Build.VERSION.RELEASE).append("}");
sb.append("\nVERSION.INCREMENTAL {").append(Build.VERSION.INCREMENTAL).append("}");
sb.append("\nVERSION.SDK {").append(Build.VERSION.SDK).append("}");
sb.append("\nBOARD {").append(Build.BOARD).append("}");
sb.append("\nBRAND {").append(Build.BRAND).append("}");
sb.append("\nDEVICE {").append(Build.DEVICE).append("}");
sb.append("\nFINGERPRINT {").append(Build.FINGERPRINT).append("}");
sb.append("\nHOST {").append(Build.HOST).append("}");
sb.append("\nID {").append(Build.ID).append("}");
((TextView) findViewById(R.id.textView1)).setText(sb.toString());
You can also use Build.DISPLAY:
A build ID string meant for displaying to the user
I used the following code to get the firmware version of my device:
// Requires: import java.lang.reflect.Method;
public String getFirmwareVersion() {
try {
// android.os.SystemProperties is a hidden class, so it is reached through reflection.
Class<?> cls = Class.forName("android.os.SystemProperties");
Method method = cls.getMethod("get", String.class);
method.setAccessible(true);
// get(String) is static, so no receiver object is needed.
return (String) method.invoke(null, FIRMWARE_VERSION_PROPERTY);
} catch (Exception e) {
// The class or property may be missing on some devices.
return null;
}
}
As FIRMWARE_VERSION_PROPERTY I used "ro.product.version".
I got this property name from the device manufacturer.
The property name can vary between devices and manufacturers, so it's advisable to contact the manufacturer's tech support to get the correct one.
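Because the property name differs between manufacturers, one option is to try a list of candidate names and take the first one that yields a value. The following is only a sketch: FirmwareLookup and PropertySource are hypothetical names, and every candidate name other than "ro.product.version" is an assumption that would need confirming for a given device.

```java
import java.util.List;

// Hypothetical helper: returns the first non-empty value among candidate property names.
final class FirmwareLookup {
    private FirmwareLookup() {}

    /** Abstraction over the property lookup, e.g. a reflection call like getFirmwareVersion(). */
    interface PropertySource {
        String get(String key);
    }

    static String firstNonEmpty(List<String> candidateKeys, PropertySource source) {
        for (String key : candidateKeys) {
            String value = source.get(key);
            if (value != null && !value.trim().isEmpty()) {
                return value.trim();
            }
        }
        return null; // none of the candidates is set on this device
    }
}
```

On a device, the PropertySource would wrap the reflective SystemProperties call; keeping it behind an interface lets the fallback logic be exercised with a plain map.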

EXTRA_AVAILABLE_VOICES always returns eng-GBR only. Why?

I am using the following snippet to log all available (and unavailable) voices currently on phone:
ArrayList<String> availableVoices = intent.getStringArrayListExtra(TextToSpeech.Engine.EXTRA_AVAILABLE_VOICES);
String availStr = "";
for (String lang : availableVoices)
availStr += (lang + ", ");
Log.i(String.valueOf(availableVoices.size()) + " available langs: ", availStr);
ArrayList<String> unavailableVoices = intent.getStringArrayListExtra(TextToSpeech.Engine.EXTRA_UNAVAILABLE_VOICES);
String unavailStr = "";
for (String lang : unavailableVoices)
unavailStr += (lang + ", ");
Log.w(String.valueOf(unavailableVoices.size()) + " unavailable langs: ", unavailStr);
The logged result is somewhat bewildering, since I know for certain that I have multiple languages installed, and I can even hear the TTS speaking in eng-USA, yet the log shows:
1 available langs: eng-GBR,
30 unavailable langs: ara-XXX, ces-CZE, dan-DNK, deu-DEU, ell-GRC,
eng-AUS, eng-GBR, eng-USA, spa-ESP, spa-MEX, fin-FIN, fra-CAN,
fra-FRA, hun-HUN, ita-ITA, jpn-JPN, kor-KOR, nld-NLD, nor-NOR,
pol-POL, por-BRA, por-PRT, rus-RUS, slk-SVK, swe-SWE, tur-TUR,
zho-HKG, zho-CHN, zho-TWN, tha-THA,
Why is this inconsistent behavior? (note that eng-GBR appears in both the available and unavailable lists...)
It turns out that as far as text-to-speech in Android 2.x goes, it's the wild west out there: any installed third-party TTS engine can modify the output of this EXTRA_AVAILABLE_VOICES function however it desires, regardless of whether it is checked or unchecked, or selected or unselected as the default.
I just tried uninstalling all TTS engines from my phone, leaving only the built-in Pico, and the result matches exactly what I expected:
6 available voices: deu-DEU, eng-GBR, eng-USA, spa-ESP, fra-FRA,
ita-ITA,
0 unavailable voices:
I wouldn't mind if the output of this function dynamically referred to the currently selected (i.e. default) TTS engine, but the fact is that once a third-party TTS engine is installed, this function's output doesn't make any sense, because it ignores any settings.
Also note that the name is misleading: it lists available languages, not voices!
I am posting this answer with the hope that it will help someone save the time & agony of discovering this the hard way.
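To spot this kind of inconsistency programmatically, the two lists can be compared for entries that appear in both, as eng-GBR does in the log above. This is a minimal sketch; the class name VoiceLists is hypothetical, and the null checks assume that either extra may be absent from the intent.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds languages reported as both available and unavailable.
final class VoiceLists {
    private VoiceLists() {}

    static List<String> inBoth(List<String> available, List<String> unavailable) {
        if (available == null || unavailable == null) {
            return new ArrayList<>();
        }
        // LinkedHashSet keeps the order of the "available" list and drops duplicates.
        Set<String> overlap = new LinkedHashSet<>(available);
        overlap.retainAll(unavailable);
        return new ArrayList<>(overlap);
    }
}
```

A non-empty result suggests the extras were filled in by more than one engine and should not be trusted as a description of the default engine.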

best practice for specifying pronunciation for Android TTS engine?

In general, I'm very impressed with Android's default text to speech engine (i.e., com.svox.pico). As expected, it mispronounces some words (as do I) and it therefore occasionally needs some pronunciation guidance. So I'm wondering about best practices for phonetically spelling out those words that the pico TTS engine mispronounces.
For example, the correct pronunciation of the bird Chachalaca is CHAH-chah-LAH-kah. Here is what the TTS engine produces:
mTts.speak("Chachalaca", TextToSpeech.QUEUE_ADD, null); // output: chuh-KAL-uh-KUH
mTts.speak("CHAH-chah-LAH-kah", TextToSpeech.QUEUE_ADD, null); // output: CHAH-chah-EL-AY-AYCH-dash-kuh
mTts.speak("CHAHchahLAHkah", TextToSpeech.QUEUE_ADD, null); // output: CHA-chah-LAH-ka
mTts.speak("CHAH chah LOCKah", TextToSpeech.QUEUE_ADD, null); // output: CHAH-chah-LAH-kah
Here are my questions.
Is there a standard phonetic spelling recognized by the Android TTS engine?
If not, are there some general rules for making custom pronunciation spellings that will make the spellings more likely to be correct in future TTS engines/versions?
It appears that the Android TTS engine ignores text case. What is the best way to specify emphasis?
By the way, this is what the TTS engine writes to logcat:
V/TtsService( 294): TTS processing: CHAH chah LOCKah
V/TtsService( 294): TtsService.setLanguage(eng, USA, )
I/SVOX Pico Engine( 294): Language already loaded (en-US == en-US)
I/SynthProxy( 294): setting speech rate to 100
I/SynthProxy( 294): setting pitch to 100
[UPDATE]
I tried passing an XML document to TextToSpeech.speak() as follows:
String text = "<?xml version=\"1.0\"?>" +
"<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" " +
"xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" " +
"xsi:schemaLocation=\"http://www.w3.org/2001/10/synthesis " +
"http://www.w3.org/TR/speech-synthesis/synthesis.xsd\" " +
"xml:lang=\"en-US\">" +
"That is a big car! " +
"That <emphasis>is</emphasis> a big car! " +
"That is a <emphasis>big</emphasis> car! " +
"That is a huge bank account! " +
"That <emphasis level=\"strong\">is</emphasis> a huge bank account! " +
"That is a <emphasis level=\"strong\">huge</emphasis> bank account!" +
"</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
As Android Eve suggested, the TTS engine read only the XML body (i.e., the comments about the big car and the huge bank account). I didn't realize the TTS engine was capable of parsing XML documents. However, I did not hear any emphasis in the TTS output.
[UPDATE 2]
I simplified the question to whether or not Android TTS supports Speech Synthesis Markup Language here.
JW answered my question at the tts-for-android group:
Hi Greg,
The Pico engine recognizes the <phoneme> tag with the XSAMPA alphabet.
There are no easy rules to derive a certain pronunciation from the orthography, but you can use intuitive spellings and trial and error. Capitalization and hyphens will introduce more problems than they solve. Using different spellings and introducing extra word boundaries (spaces) can work.
The emphasis tag and the exclamation mark will not change the synthesis result. Use , , and commands instead.
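The respelling advice above can be applied systematically by keeping a small table of problem words and substituting them before the text reaches speak(). The sketch below assumes a hypothetical Respeller class; the single entry in the usage note is the respelling from the question that produced the desired output.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: swaps known problem words for respellings before calling speak().
final class Respeller {
    private static final Pattern WORD = Pattern.compile("\\p{L}+");
    private final Map<String, String> respellings = new LinkedHashMap<>();

    /** Registers a respelling; lookups ignore case. */
    Respeller add(String word, String respelling) {
        respellings.put(word.toLowerCase(Locale.ROOT), respelling);
        return this;
    }

    /** Replaces every registered word in the text and leaves everything else untouched. */
    String apply(String text) {
        Matcher m = WORD.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String respelling = respellings.get(m.group().toLowerCase(Locale.ROOT));
            String replacement = respelling != null ? respelling : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

For example, new Respeller().add("Chachalaca", "CHAH chah LOCKah").apply(sentence) would be called on each sentence before it is queued with mTts.speak(). Since the spellings are found by trial and error against one engine, the table may need revisiting for other engines or versions.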
Some examples of the proper syntax for specifying the pronunciation using the SSML phoneme tag are in these tests of TextToSpeech.
Even with these simple test SSML documents, there are warning messages posted to logcat about the SSML document not being well-formed. So I opened an issue about these seemingly incorrect logcat messages to the Android issue tracker.
The syntax for specifying an x-SAMPA sequence to SVOX pico is
String text = "<speak xml:lang=\"en-US\"> <phoneme alphabet=\"xsampa\" ph=\"d_ZIn\"/>.</speak>";
mTts.speak(text, TextToSpeech.QUEUE_ADD, null);
Although more examples would be helpful, a good reference for x-SAMPA is at http://en.wikipedia.org/wiki/Xsampa. If I compile a couple dozen examples, I'll post them to that Wikipedia page.
One answer for all 3 questions: Look at the SSML specifications: http://www.w3.org/TR/speech-synthesis/
For example, to specify emphasis, you use the emphasis element, e.g.
<?xml version="1.0"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
xml:lang="en-US">
That is a <emphasis> big </emphasis> car!
That is a <emphasis level="strong"> huge </emphasis>
bank account!
</speak>
