Is it possible to integrate an external TTS engine with the Pepper robot?
I want to integrate a third-party speech engine with the Pepper robot. Please guide me on how to do this.
You can integrate an external TTS engine with Pepper, either offboard (like the services offered by IBM, MS Azure or Google) or onboard (ideally something in Java or Kotlin for Android Pepper, but anything is possible). If you have a specific technology in mind, please provide more details and we can give you a more precise answer.
Bear in mind that this may introduce latency in speech synthesis compared to the default text-to-speech engine.
Edit: sorry, I missed your Android tag. The APIs mentioned below only work on Pepper 2.5 (Choregraphe Pepper).
Alternatively, there are a number of different voices available on Pepper; perhaps one will suit your needs. Use the NAOqi API function ALTextToSpeech.getAvailableVoices to list the different voice options, then ALTextToSpeech.setVoice to set the voice to one of those options.
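A minimal sketch of those two calls with the NAOqi Python SDK. ALProxy, getAvailableVoices and setVoice are the NAOqi names; the helpers `choose_voice` and `connect_tts` are made up for illustration, and the code assumes a NAOqi 2.5 robot as noted in the edit above.

```python
def choose_voice(tts, preferred):
    """Select `preferred` if the robot offers it.

    `tts` is any object exposing getAvailableVoices() and setVoice(),
    for example an ALProxy to the ALTextToSpeech module.
    Returns the selected voice name, or None if it is not installed.
    """
    voices = list(tts.getAvailableVoices())
    if preferred in voices:
        tts.setVoice(preferred)
        return preferred
    return None


def connect_tts(host, port=9559):
    """Open a proxy to ALTextToSpeech (needs the NAOqi Python SDK)."""
    from naoqi import ALProxy  # imported lazily so the helper above works without the SDK
    return ALProxy("ALTextToSpeech", host, port)
```

With a robot reachable at a hypothetical address, usage would look like `choose_voice(connect_tts("pepper.local"), "some_voice")`, where the voice name is one of the strings returned by getAvailableVoices.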
I am currently working with a Pepper robot (academic version, with the QiSDK and NAOqi 2.9). Since I am using the academic version, I can't use the cloud-based automatic speech recognition service from SoftBank, which is not included, and therefore I can't use wildcards or other chatbot engines besides QiChat.
Does anybody know how I can implement my own speech recognition service for Pepper? I can't find out how to get access to the audio stream of Pepper's microphones.
I've read the documentation from SoftBank:
https://developer.softbankrobotics.com/pepper-qisdk
and
https://qisdk.softbankrobotics.com/sdk/doc/pepper-sdk/ch4_api/conversation/reference/basechatbot.html
And I've tried to create a SpeechRecognizer based on Android, which works, but it uses the tablet's microphone and not Pepper's.
Remote Speech Recognition is a service that you will need to buy on top if it was not included with your original Pepper offer!
Regards,
Jonas
I was also curious and contacted SoftBank support.
Summary:
With version 2.9 you have no access to the head microphones and can only access the tablet mic.
I am new to speech recognition and Android, and I have a use case where I need to build an Android app which takes commands (a limited set, fewer than 100) from users and executes some logic. I have googled a bit and found that the following can be done:
Use the Google Cloud Speech API
Use Android's inbuilt speech-to-text capability (is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech to text?
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Can someone please suggest which of the above might best suit my use case?
I have a limited set of commands and speed matters the most to me.
I am really confused, which is why I am asking this question. Thanks in advance.
Use the Google Cloud Speech API
Very expensive since you have to pay for every request.
Use Android's inbuilt speech-to-text capability (is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech to text?
The inbuilt API is OK to use. It is different from the cloud API and it is free. It does not work offline transparently for the user, though. The downside is that it is slow and you cannot configure the vocabulary, so it will decode all words instead of one particular set of commands and will often confuse the required commands with other words in noise.
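One way to soften the open-vocabulary problem is to post-process whatever text the recognizer returns and snap it to the nearest known command. This is only a sketch of that workaround, not something the inbuilt API offers: the command list, the helper name `match_command` and the 0.6 cutoff are all assumptions chosen for illustration, and the matching uses Python's standard difflib.

```python
import difflib

# Example command set; a real app would list its own (fewer than 100) commands.
COMMANDS = [
    "turn on the light",
    "turn off the light",
    "open the door",
    "play music",
]


def match_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to `transcript`, or None.

    The transcript is lower-cased and whitespace-normalized, then compared
    with each command by difflib's similarity ratio. Anything scoring
    below `cutoff` is rejected instead of being forced onto a command.
    """
    normalized = " ".join(transcript.lower().split())
    hits = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return hits[0] if hits else None
```

For example, `match_command("Turn on the lights")` maps to "turn on the light", while an unrelated phrase such as "what is the weather" falls below the cutoff and returns None. This does not make recognition faster; it only reduces the damage from misheard words.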
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Proper development is always an effort.
The following link gives me speech in Arabic by using the Google Translate server-side API. Some websites say that using this is illegal. Is this true or not? I ask because I want to add it to my Android application.
P.S.: Android OS does not support Arabic speech.
http://translate.google.com/translate_tts?tl=ar&q=%D9%85%D8%B1%D8%AD%D8%A8%D8%A7
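For reference, the link above is an ordinary query string: `tl` carries the language code and `q` carries the text, percent-encoded as UTF-8. A small sketch with Python's standard urllib shows what it contains; nothing here calls the service.

```python
from urllib.parse import urlsplit, parse_qs, quote

url = ("http://translate.google.com/translate_tts"
       "?tl=ar&q=%D9%85%D8%B1%D8%AD%D8%A8%D8%A7")

# parse_qs percent-decodes the values (UTF-8 by default).
params = parse_qs(urlsplit(url).query)
language = params["tl"][0]  # language code
text = params["q"][0]       # the Arabic text to be spoken

# quote() performs the reverse step: text back to its percent-encoded form.
encoded_again = quote(text)
```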
http://code.tutsplus.com/tutorials/use-text-to-speech-on-android-to-read-out-incoming-messages--cms-22524
Please Google "text to speech android tutorial"; you'll get many results.
Edit: Sorry, I misunderstood your question.
I believe it is. But if you are really concerned, please contact Google or ask on their forums; I don't think you'll get an answer to that here. Good luck!
As described in the Terms of Service of the Google Translate API:
1. Prohibitions
You will not knowingly use the API to create, train, or improve (directly or indirectly) a substantially similar product or service, including any other machine translation engine.
As I understand it, that means this is not allowed if what you are building is your own translation engine. For example, if you call your app "Hussamabd's great Translation Engine" and its real purpose is to translate words into other languages, then it is not allowed. BUT, there is another part in the API documentation:
Introduction
This document is intended for developers who want to write applications that can interact with the Google Translate API. Google Translate is a tool that automatically translates text from one language to another language (e.g. French to English). You can use the Google Translate API to programmatically translate text in your webpages or apps.
To me this means that if you create an app whose purpose is not to translate words, but which needs the translation for some other reason, for example to make your app available in every language, it will be OK.
Also, you have to pay fees for using this API. But to be really sure, you should contact Google or a lawyer, because I am not one and I can't give you any legally reliable statement!
I'm writing an Android application which needs to speak out a text (i.e. the TextToSpeech functionality in an eReader). I am trying to do this in the Papiamento language ("http://www.narin.com/papiamentu/"). Is this possible? If so, how could I do it? There are some TTS engines available. I used the eSpeak TTS engine, and I was able to configure the settings page to use it as the default engine. But how could I use that engine to do TTS in our application? Thanks.
Unfortunately Papiamento isn't supported at the OS level, nor by pretty much any third-party TTS engine. I wish it were, though; I saw a few people using our app when I was in Curaçao a few weeks ago :)
Currently, only eSpeak supports this language. It should be as easy as going to your Android settings, then General management or Language and input, then Text-to-speech, then Preferred TTS engine, and selecting eSpeak as the default TTS engine; speech should then automatically be synthesized using eSpeak.
I have a client who needs an Android app that can recognize spoken commands. From what I understand, the built-in voice-to-text functionality actually sends data to Google's servers, which then send back the text. This is a major problem, as the voice data is extremely sensitive (unless the data is encrypted when it is sent to and from Google, but I doubt it is).
There are two options that I can think of. The first is to convert speech to text on the Android device, though this seems like it would be an extremely expensive operation. The second is to have a local server convert the data for me (I could encrypt the voice data and the resulting text when they are sent back and forth). Is this something CMU Sphinx could pull off? It may be worth noting that I will also have access to an Asterisk server, which could possibly assist with this (I don't know).
In reality, there should only be ~200 words which will need to be recognized. I would prefer open-source/free software solutions, but I am also open to a commercial solution (perhaps FlexT9). Ideally, I can send the audio stream somewhere, get back a String which is the text, and then parse and do other things with the String.
I haven't done much Android or any speech recognition development in the past, so I'm hoping someone can at least point me in the right direction. Thanks!
CMUSphinx is an open source speech recognition toolkit you can use to build your application. It contains tools, libraries and data which will enable you to build a speech application. You can learn more about CMUSphinx on the website above.
On Android you have several options to use CMUSphinx:
Recognize audio on the device. For that you can compile the Pocketsphinx engine for Android. For details see this blog post.
Recognize audio on a server. As a server you can use either Pocketsphinx or Sphinx4. You can send audio in compressed FLAC format, or extract speech recognition features on the device and send the feature stream to the server.
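The client side of the server option could be sketched roughly as follows. Everything about the protocol is an assumption made for illustration: the endpoint URL, the `audio/flac` content type and the plain-text reply are not something Pocketsphinx or Sphinx4 is claimed to provide here, so you would have to write the HTTP wrapper around the recognizer yourself. The helper names are hypothetical; only Python's standard urllib is used.

```python
import urllib.request


def build_recognition_request(flac_bytes, url):
    """Wrap FLAC-encoded audio in an HTTP POST request.

    `url` points at your own recognition server (hypothetical endpoint).
    """
    return urllib.request.Request(
        url,
        data=flac_bytes,
        headers={"Content-Type": "audio/flac"},  # assumed content type
        method="POST",
    )


def recognize(flac_bytes, url, timeout=10):
    """Send the audio and return the reply body as text.

    Assumes the server answers with the recognized text as UTF-8.
    """
    request = build_recognition_request(flac_bytes, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

If the audio is sensitive, as in the question, pointing `url` at an `https://` endpoint on the local server keeps both the audio and the returned text encrypted in transit.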
CMUSphinx provides several acoustic models which will enable you to recognize audio in several languages, such as English, French, Mandarin, German, Dutch and Russian.
You can also improve the recognition result with adaptation tools.
If you have any questions on CMUSphinx you are welcome to ask in our community forums.
Closed source, but free, are the Microsoft speech engines. For some background, see "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?". For more background you can try https://stackoverflow.com/a/4217638/90236
The complete SDK for the Microsoft Server Speech Platform 11 is available at http://www.microsoft.com/download/en/details.aspx?id=27226. The speech engine is a free download.