using voice/speech recognition to carry out instructions in my app - android

Hello, I would like some guidance on using voice/speech recognition to carry out instructions in my game.
I'm fairly new to game development and am using libGDX to build my game. I know speech recognition APIs exist, and I would like to know how they work and how I can integrate one into my game. I want the user to be able to say "jump" and have the player jump. The speech commands I want to use are very basic, e.g. "shoot" should make the player shoot a bullet.
If anyone has experience with speech/voice recognition APIs, I would like to know whether there is a simple way to make the API I end up using carry out a specific action upon hearing a specific keyword such as "jump".
Any answer will be helpful, because my knowledge of APIs and speech recognition is very limited.

A service is required for this. A service is basically a class that runs in the background, and hence it will be able to satisfy the needs described above.
Here is a useful thread that uses a voice recognition service:
Android Speech Recognition Continuous Service
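
Whichever recognizer ends up producing the text, the "say jump and the player jumps" part can be kept separate from the speech API. The following is only a minimal sketch in plain Java, assuming the recognizer hands back the heard phrase as a String. The class name VoiceCommandDispatcher and its methods are hypothetical and not part of libGDX or Android.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Maps recognized keywords to game actions. Hypothetical helper, not a library class. */
class VoiceCommandDispatcher {
    private final Map<String, Runnable> actions = new HashMap<>();

    /** Registers the action to run when the given keyword is heard. */
    void register(String keyword, Runnable action) {
        actions.put(keyword.toLowerCase(Locale.ROOT), action);
    }

    /**
     * Scans the recognized phrase word by word and runs the first
     * registered action it finds. Returns the matched keyword, or
     * null if no word in the phrase is a registered keyword.
     */
    String dispatch(String recognizedText) {
        if (recognizedText == null) {
            return null;
        }
        String[] words = recognizedText.toLowerCase(Locale.ROOT).trim().split("\\s+");
        for (String word : words) {
            Runnable action = actions.get(word);
            if (action != null) {
                action.run();
                return word;
            }
        }
        return null;
    }
}
```

In a game you would register something like "jump" and "shoot" once at start-up and call dispatch with each recognition result. In libGDX, recognition callbacks will likely arrive on a different thread than the render loop, so the registered actions would probably need to be forwarded with Gdx.app.postRunnable.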

Related

Add cloud speech recognition to Pepper QiSDK

I am currently working with a Pepper robot (academic version, with the QiSDK and NAOqi 2.9). Since I am using the academic version, I can't use SoftBank's cloud-based automatic speech recognition service, which is not included, and therefore I can't use wildcards or chatbot engines other than QiChat.
Does anybody know how I can implement my own speech recognition service for Pepper? I can't find where to get access to the audio stream of Pepper's microphones.
I've read the documentation from SoftBank:
https://developer.softbankrobotics.com/pepper-qisdk
and
https://qisdk.softbankrobotics.com/sdk/doc/pepper-sdk/ch4_api/conversation/reference/basechatbot.html
I've also tried to create a SpeechRecognizer based on Android, which works but uses the tablet's microphone and not Pepper's.

Remote Speech Recognition is a service that you will need to buy on top if it was not included with your original Pepper offer!
Regards,
Jonas

I was also curious and contacted SoftBank support.
Summary:
With version 2.9 you have no access to the head microphones and can only access the tablet mic.

Suggestion for choosing speech to text apis

I am new to speech recognition and Android, and I have a use case where I need to build an Android app that takes commands (a limited set, fewer than 100) from users and executes some logic. I have googled a bit and found the following options:
Use the Google Cloud Speech API
Use Android's built-in speech-to-text capability (Is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech-to-text?
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Can someone please suggest which of the above might best suit my use case?
I have a limited set of commands, and speed matters the most to me.
I am really confused, hence this question. Thanks in advance.

Use the Google Cloud Speech API
Very expensive, since you have to pay for every request.
Use Android's built-in speech-to-text capability (Is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech-to-text?
The built-in API is OK to use. It is different from the Cloud API, and it is free. It does not work offline transparently for the user, though. On the downside, it is slow and you cannot configure the vocabulary, so it will decode all words instead of a particular set of commands and, in noise, will often confuse the required commands with other words.
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Proper development is always an effort.
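
Since the built-in recognizer cannot be restricted to a vocabulary, one possible workaround is to filter its output afterwards. On Android the recognizer can return several candidate transcriptions (for example through SpeechRecognizer.RESULTS_RECOGNITION in onResults), and each candidate can be compared against the known commands. The sketch below is an assumption about how that could be done, not something proposed in this thread: it uses plain Levenshtein edit distance, and the class CommandMatcher and its parameters are hypothetical.

```java
import java.util.List;
import java.util.Locale;

/** Picks the known command closest to what was heard. Hypothetical helper. */
class CommandMatcher {

    /** Levenshtein edit distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Compares every hypothesis with every command (commands are assumed
     * to be lower case) and returns the closest command, or null if no
     * pair is within maxDistance edits.
     */
    static String bestCommand(List<String> hypotheses, List<String> commands, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String hypothesis : hypotheses) {
            String heard = hypothesis.toLowerCase(Locale.ROOT).trim();
            for (String command : commands) {
                int distance = editDistance(heard, command);
                if (distance < bestDistance) {
                    bestDistance = distance;
                    best = command;
                }
            }
        }
        return best;
    }
}
```

With fewer than 100 short commands this brute-force comparison is cheap. The maxDistance threshold is a tuning knob: too low and slightly misheard commands are dropped, too high and unrelated words start triggering commands.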

Google speech recognition library or API

Google has recently made great progress with their speech recognition software, which is used in several open source products, e.g. Chromium Web Speech and Android Handsfree texting. I would like to use their speech recognition as part of my server stack, however I can't find much about it.
Is the text recognition software available as a library or package? Or alternatively, can I call chromium from another program to transcribe some audio file to text?

The Web Speech APIs are designed to be used only in the context of either Chrome or Android. A lot of work goes on in the client, so there is no public server-to-server API that would just take an audio file and process it.
If you search GitHub you will find tools such as https://gist.github.com/alotaiba/1730160, but I am pretty certain that this method of access is not supported or endorsed, and there is no guarantee it will keep working.

The method previously mentioned at https://gist.github.com/alotaiba/1730160 does work for me. I use it on a daily basis in my home automation programs. I use a Python script to capture audio and decide whether it is useful audio or just noise; it then sends the short audio snippet to Google and gets the text back, all in under a second! I have successfully integrated it into my programs, and if you google around you will find even more people who have as well!
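
The answer above does not say how the script tells useful audio from noise. One common and very simple approach is an energy threshold: compute the RMS level of each short frame of samples and only forward frames that are loud enough. The sketch below shows that idea in Java (the answer itself used Python) and assumes 16-bit PCM samples. The class NoiseGate and the threshold value are hypothetical and would need tuning for a real microphone.

```java
/** Energy-based check for whether an audio frame is worth sending. Hypothetical helper. */
class NoiseGate {

    /** Root-mean-square level of a frame of 16-bit PCM samples. */
    static double rms(short[] samples) {
        if (samples.length == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (short sample : samples) {
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples.length);
    }

    /** Treats the frame as speech when its RMS level reaches the threshold. */
    static boolean isSpeech(short[] frame, double threshold) {
        return rms(frame) >= threshold;
    }
}
```

A fixed threshold is crude. It will let loud background noise through and may cut off quiet speech, so this should be read as a starting point rather than a robust voice activity detector.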

Accessing continuous voice recognition in Android 4.0

I am trying to get continuous voice input to work in my Android application. I tried using the built-in SpeechRecognizer Intent but it waits for the user to finish speaking before processing the words. This is not sufficient for me. I need the device to process the words while the user is still speaking.
I read that this is supported in Ice Cream Sandwich now. However, I did not find any API that allows me to access this feature. Does anyone know how this works now?
Thanks for your help!

I guess you heard about the new voice typing feature of Android 4.0. Take a look at this article.
You have to use an external library for it. However, the article says the library is designed for IME developers, and as far as I can see the result of the voice recognition will appear in a registered IME through InputMethodService. You can also check the source of the library, because it is a project on Google Code.

how voice recognition in android works?

I want to know how voice recognition in Android works. Which library does it use for voice recognition? Does it perform the recognition on the device itself, or does it send all the audio to Google's servers and receive text in response?
Thanks,
Sunny.

The 4 Feets.com answer is now very misleading, as the link contains quite a bit of speculative information that turned out to be inaccurate.
Please check out the VoiceRecognition.java demo in ApiDemos and the RecognizerIntent reference. Android speech recognition requires an internet connection, as the data is sent off to Google and you receive a list of possible text transcriptions back.

Have a look at 4 Feets.com.
They have a nice overview with a little example regarding voice recognition in SDK 1.5.

