I have been developing an Android app that uses speech recognition, but the target device has no Google app installed, so I am using the Vosk API instead. For better accuracy I need a larger model, which takes up a lot of space in assets. How can I use a Vosk model without bundling it in assets, or use it directly from an online server?
Edit:
I have seen that Vosk offers a Kaldi WebSocket server. Can this help me use Vosk from an online server (https://github.com/just-ai/aimybox-android-sdk/tree/master/kaldi-speechkit#online-mode)? That page explains how to use the WebSocket and gives an example, but I do not understand how to write the WebSocket file.
Any help with this is appreciated!
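For what it is worth, the client side of the online mode can be sketched as follows. This is a rough sketch, assuming the server speaks the protocol used by the vosk-server project: a JSON config message, then raw 16-bit mono PCM chunks, then an eof message, with a JSON reply per chunk. The names build_messages and transcribe, the default address ws://localhost:2700 and the 0.2 s chunk size are illustrative assumptions, and the third-party websockets package is required for the network part.

```python
import json
import wave


def build_messages(pcm: bytes, sample_rate: int, chunk_seconds: float = 0.2):
    """Return the ordered messages a client sends for one utterance.

    Assumed protocol: a JSON config string, binary 16-bit mono PCM chunks,
    then a JSON eof string.
    """
    chunk_bytes = int(sample_rate * chunk_seconds) * 2  # 2 bytes per sample
    messages = [json.dumps({"config": {"sample_rate": sample_rate}})]
    messages += [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]
    messages.append(json.dumps({"eof": 1}))
    return messages


async def transcribe(wav_path: str, uri: str = "ws://localhost:2700") -> str:
    """Stream a 16-bit mono WAV file to the server and join the final texts."""
    import websockets  # third-party: pip install websockets

    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        pcm = wf.readframes(wf.getnframes())

    texts = []
    async with websockets.connect(uri) as ws:
        for message in build_messages(pcm, rate):
            await ws.send(message)
            # Assumption: the server replies to audio chunks and to eof,
            # but not to the config message.
            if isinstance(message, bytes) or '"eof"' in message:
                reply = json.loads(await ws.recv())
                if reply.get("text"):  # partial results carry "partial" instead
                    texts.append(reply["text"])
    return " ".join(texts)
```

With a server running, asyncio.run(transcribe("command.wav")) would return the recognized text; command.wav is a placeholder file name. An Android client would send the same sequence of messages from its own WebSocket library, so the large model stays on the server rather than in assets.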
Related
I am new to speech recognition and Android, and I have a use case where I need to build an Android app that takes commands (a limited set, fewer than 100) from users and executes some logic. I have googled a bit and found the following options:
Use google cloud speech api
Use Android inbuilt speech to text capability (Is it different from google cloud speech api? If so how?). Also what are the pros and cons of using offline mode of android speech to text?
Use open source speech recognition libraries like Kaldi, CMU Sphinx(it looked like they need a lot of effort in collecting and training the data)
Can someone please suggest which of the above would best suit my use case?
I have a limited set of commands and speed matters the most to me.
I am really confused, hence this question. Thanks in advance.
Use google cloud speech api
Very expensive since you have to pay for every request.
Use Android inbuilt speech to text capability (Is it different from google cloud speech api? If so how?). Also what are the pros and cons of using offline mode of android speech to text?
The inbuilt API is OK to use. It is different from the cloud API and it is free. However, it does not work offline transparently for the user. On the downside, it is slow and you cannot configure the vocabulary, so it will decode all words instead of a particular set of commands, and in noise it will often confuse the required commands with other words.
Use open source speech recognition libraries like Kaldi, CMU Sphinx(it looked like they need a lot of effort in collecting and training the data)
Proper development is always an effort.
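Because the inbuilt recognizer cannot be restricted to a vocabulary, one possible workaround is to snap its free-form transcript to the closest known command afterwards. Below is a minimal sketch using Python's standard difflib; COMMANDS and match_command are hypothetical names, and the 0.75 cutoff is an arbitrary starting point that would need tuning on real transcripts.

```python
import difflib

# Hypothetical command set; a real app would list its own commands here.
COMMANDS = ["turn on the light", "turn off the light", "play music", "stop"]


def match_command(transcript, commands=COMMANDS, cutoff=0.75):
    """Return the known command closest to the transcript, or None."""
    # Normalize case and whitespace before comparing.
    cleaned = " ".join(transcript.lower().split())
    matches = difflib.get_close_matches(cleaned, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

For example, match_command("Play musik") maps to "play music", while an unrelated sentence such as "what is the weather" is rejected. This only repairs near misses; it does not make the recognizer itself faster or more robust to noise.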
I need to build a chatbot that does not rely on any online services.
I am using:
Python ChatterBot to build conversation dialogues.
Android's offline Google speech recognition to convert speech to text and vice versa.
I want to train the model on my PC and use the generated database.sqlite3 file on android.
The complete flow of the process is as follows:
The pretrained model generates database.sqlite3, which is placed on the Android device.
Voice -> Text -> Local Android server which runs the Python script using database.sqlite3 and generates a response (text) -> Text to Voice
Now I have the problem of running Python on Android with the full environment the script needs. Kindly help me out with this.
I have searched around and found that a local server can be set up on Android using NanoHTTPD/AndroidSync. Now I want to use this server to run the Python script.
If you have any better alternative to any of the steps above, kindly suggest.
In my experience, getting Python running on Android isn't the best way to accomplish this. I'd recommend splitting your project up into two parts:
1. A web application hosted somewhere
You can create a regular web application using a Python framework like Django or Flask. This application can provide a RESTful API that allows other applications to exchange information with your chat bot.
ChatterBot has built-in support for Django, and numerous examples of the two being used together are available. You can also take a look at the "How do I deploy my chat bot to the web?" section for a brief overview and some tips on how to get started.
2. The Android app
The app can access Android's native speech recognition technologies to interpret verbal information before it sends the recognized text to your chat bot API server.
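To make the split concrete, the API server from part 1 could look roughly like the sketch below. It uses Python's standard http.server instead of Django or Flask purely to stay self-contained. The /chat path, the JSON field names and the function names are assumptions, and reply_to is a hypothetical stand-in for the real chat bot call (with ChatterBot, something like get_response on a ChatBot instance).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def reply_to(text: str) -> str:
    """Hypothetical stand-in for the chat bot; replace with the real call."""
    return f"You said: {text}"


def handle_chat(raw: bytes):
    """Turn a raw JSON request body into (HTTP status, response payload)."""
    try:
        text = json.loads(raw)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'text' field"}
    return 200, {"reply": reply_to(str(text))}


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        if self.path == "/chat":
            status, payload = handle_chat(raw)
        else:
            status, payload = 404, {"error": "unknown path"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To run the server (blocks until interrupted):
# HTTPServer(("0.0.0.0", 8000), ChatHandler).serve_forever()
```

The Android app would then POST the recognized text as {"text": "..."} to /chat and read the reply field aloud with text-to-speech.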
Google has recently made great progress with their speech recognition software, which is used in several open source products, e.g. Chromium Web Speech and Android Handsfree texting. I would like to use their speech recognition as part of my server stack, however I can't find much about it.
Is the text recognition software available as a library or package? Or alternatively, can I call chromium from another program to transcribe some audio file to text?
The Web Speech APIs are designed to be used only in the context of either Chrome or Android. A lot of work goes on in the client, so there is no public server-to-server API that would just take an audio file and process it.
If you search GitHub you will find tools such as https://gist.github.com/alotaiba/1730160, but I am pretty certain that this method of access is not supported or endorsed, and there is no guarantee that it will keep working.
The method previously mentioned at https://gist.github.com/alotaiba/1730160 does work for me. I use it on a daily basis in my home automation programs. I use a Python script to capture audio and determine whether it is useful audio or just noise, then it sends the short audio snippet to Google, which returns the text, all in under a second! I have successfully integrated it into my programs, and if you google around you will find even more people who have done so as well!
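The script itself is not shown here, so the following is only a guess at how the "useful audio or just noise" step might be done: a simple energy gate that keeps frames whose RMS level reaches a threshold. The functions rms and keep_speech are hypothetical, the input is assumed to be 16-bit little-endian mono PCM, and the threshold of 500 is an arbitrary value that depends on the microphone.

```python
import array
import math
import sys


def rms(pcm: bytes) -> float:
    """Root-mean-square level of 16-bit little-endian mono PCM."""
    samples = array.array("h")
    samples.frombytes(pcm[: len(pcm) - len(pcm) % 2])  # drop a trailing odd byte
    if sys.byteorder == "big":
        samples.byteswap()  # input is little-endian, array uses native order
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def keep_speech(frames, threshold=500.0):
    """Keep only the frames whose level reaches the threshold."""
    return [frame for frame in frames if rms(frame) >= threshold]
```

Only the frames that pass the gate would then be sent to the recognition service, which keeps requests short.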
I am trying to develop an Android application that streams video from an Android phone to the web (similar to Qik). I have looked at the RED5, MAMMOTH and RTMPD servers.
Which web server should I use? Which is best supported on Android? Is there any other way to do this?
If a tutorial or sample code is available, please point me to it.
Thanks
For testing purposes, I developed a small C# application that acts as the server. It captures the streamed data and displays it as video.
For the actual server implementation, I used GStreamer, which is also available for Android.
Google has speech recognition services available for use from mobile phones (Android has it built in, iPhone users can use the Google application) - http://www.google.com/mobile/. We've found one article where someone tried to reverse engineer the service at http://waxy.org/2008/11/deconstructing_google_mobiles_voice_search_on_the_iphone/.
We want to better understand what is happening over the network when we use Android's RecognizerIntent. Does anyone have any experience using this service over the web or know of other articles that may explain its workings?
I read this presentation a few weeks ago: http://www.abelski.com/courses/android/speechinput.pdf
The following link is a 3-mile-high review of the Google Voice Server:
http://www.google.co.jp/events/developerday/2010/tokyo/pdf/tt1-gruenstein.pdf
Answer: just move your .apk file to your Android phone and it will work. The error only occurs because we are trying to run it in the emulator.