Google speech recognition offline use - android

I just read an article saying that Google has created speech recognition that works offline on Android.
Is there a way to use it in a project that is not based on Android?
I'm sure I'm not the first one to think about using it in non-Android projects, but I couldn't find anyone who has done it and published it on the web.

Related

How to use Vosk models from WebSocket online server?

I have been developing an Android app that uses the speech recognition service, but the Android device has no Google app installed. For that reason I'm using the Vosk API for speech recognition, but for better accuracy I need to use a larger model, which takes a lot of space in assets. So how can I access the Vosk model without bundling it in the assets, for example by using it directly from an online server?
Edit:
I have seen the Kaldi WebSocket support in Vosk. Can this help me use Vosk from an online server (https://github.com/just-ai/aimybox-android-sdk/tree/master/kaldi-speechkit#online-mode)? That page explains how to use the WebSocket and gives an example, but I don't understand how to create the WebSocket file.
Any help with this would be appreciated!
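For the WebSocket part, a rough client-side sketch might look like the one below. It is not taken from the aimybox or Vosk projects. It assumes a Vosk WebSocket server that accepts a JSON config message, then binary chunks of 16-bit mono PCM, then an "eof" message, and answers with JSON text frames; please verify those message shapes against the server version you run. The class and method names are made up for illustration. It uses the JDK's java.net.http client to stay self-contained; Android does not ship that package, so in an app a WebSocket library would take its place.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Vosk or aimybox.
public class VoskSocketSketch {

    // Split raw PCM audio into fixed-size chunks; the last one may be shorter.
    public static List<byte[]> chunk(byte[] pcm, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < pcm.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, pcm.length);
            chunks.add(Arrays.copyOfRange(pcm, start, end));
        }
        return chunks;
    }

    // Assumption: the server expects this JSON shape to learn the sample rate.
    public static String configMessage(int sampleRate) {
        return "{ \"config\" : { \"sample_rate\" : " + sampleRate + " } }";
    }

    // Streams the audio and returns every JSON reply the server sent.
    public static List<String> transcribe(URI server, byte[] pcm, int sampleRate)
            throws Exception {
        List<String> replies = new CopyOnWriteArrayList<>();
        CompletableFuture<Void> closed = new CompletableFuture<>();

        WebSocket.Listener listener = new WebSocket.Listener() {
            private final StringBuilder buffer = new StringBuilder();

            @Override
            public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
                buffer.append(data);              // a message may arrive in parts
                if (last) {
                    replies.add(buffer.toString());
                    buffer.setLength(0);
                }
                ws.request(1);                    // ask for the next message
                return null;
            }

            @Override
            public CompletionStage<?> onClose(WebSocket ws, int status, String reason) {
                closed.complete(null);
                return null;
            }

            @Override
            public void onError(WebSocket ws, Throwable error) {
                closed.completeExceptionally(error);
            }
        };

        WebSocket ws = HttpClient.newHttpClient().newWebSocketBuilder()
                .buildAsync(server, listener).join();
        ws.sendText(configMessage(sampleRate), true).join();
        // About 0.2 s of 16-bit mono audio per chunk (2 bytes per sample).
        for (byte[] c : chunk(pcm, sampleRate * 2 / 5)) {
            ws.sendBinary(ByteBuffer.wrap(c), true).join();
        }
        // Assumption: this message asks the server for the final result.
        ws.sendText("{\"eof\" : 1}", true).join();
        closed.get(30, TimeUnit.SECONDS);         // wait for the server to close
        return replies;
    }
}
```

A call would look something like VoskSocketSketch.transcribe(URI.create("ws://localhost:2700"), pcmBytes, 16000), where the address is a placeholder for wherever your server runs. With this setup the large model lives only on the server, so nothing has to be bundled in the app's assets.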

Suggestion for choosing speech to text apis

I am new to speech recognition and Android, and I have a use case where I need to build an Android app that takes commands (a limited set, fewer than 100) from users and executes some logic. I have googled a bit and found the following options:
Use the Google Cloud Speech API
Use Android's inbuilt speech-to-text capability (Is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech to text?
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Can someone please suggest which of the above might best suit my use case?
I have a limited set of commands, and speed matters the most to me.
I am really confused, hence this question. Thanks in advance.
Use the Google Cloud Speech API
Very expensive since you have to pay for every request.
Use Android's inbuilt speech-to-text capability (Is it different from the Google Cloud Speech API? If so, how?). Also, what are the pros and cons of using the offline mode of Android speech to text?
The inbuilt API is OK to use. It is different from the cloud API and it is free. It does not work offline transparently for the user, though. The downside is that it is slow and you cannot configure the vocabulary, so it will decode all words instead of a particular set of commands, and it will often confuse the required commands with other words in noise.
Use open source speech recognition libraries like Kaldi or CMU Sphinx (it looked like they need a lot of effort in collecting and training the data)
Proper development is always an effort.
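On the vocabulary point above: since the inbuilt recognizer cannot be restricted to your commands, one possible workaround is to post-filter whatever it returns against your own command list. The sketch below is only an illustration of that idea, not something the inbuilt API provides. It picks the command with the smallest edit distance to any hypothesis and rejects everything that is too far off. The class name, method names and the distance threshold are all hypothetical; on Android the hypotheses would typically be the candidate strings the recognizer hands back (for example the list under SpeechRecognizer.RESULTS_RECOGNITION).

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: maps free-form recognizer output onto a closed command set.
public class CommandMatcher {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the closest command, or empty if nothing is within maxDistance.
    public static Optional<String> match(List<String> hypotheses,
                                         List<String> commands,
                                         int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String hypothesis : hypotheses) {
            String heard = hypothesis.trim().toLowerCase(Locale.ROOT);
            for (String command : commands) {
                int d = editDistance(heard, command.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = command;
                }
            }
        }
        return bestDistance <= maxDistance ? Optional.ofNullable(best) : Optional.empty();
    }
}
```

This does nothing for recognition speed, and a recognizer with a real grammar or limited vocabulary (which the open source toolkits allow) should still do better in noise. It only reduces the chance of acting on a misheard word.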

Offline Image To Text Recognition (OCR) in android

How do I build an Android native SDK for image-to-text recognition? (I have had good results with some web service APIs, but this time I want to make the app without any Internet connection, APIs, or web services. Just an offline OCR app.)
So my questions are:
How do I crop each word contained in the image?
How do I compare the cropped text with letters and other characters?
You said you didn't want to use an API; however, I suggest you use the recently released OCR API by Google:
https://developers.google.com/vision/text-overview
Just add the following line to your dependencies:
compile 'com.google.android.gms:play-services-vision:9.2.0'
Note: On first use it has to download some files from a Google server before it can work. Make sure to check .isOperational() before relying on it. Afterwards you can use it without an internet connection.
I guess you can use the Tesseract OCR tool, an open source alternative by Google. Integrating it into Android is simple via Tesseract Android Tools.
Have a look at the tess-two project on GitHub. It's very easy to use and gives good OCR results.
You can use ML Kit for image-to-text recognition:
https://firebase.google.com/docs/ml-kit/android/recognize-text

Google speech recognition library or API

Google has recently made great progress with their speech recognition software, which is used in several open source products, e.g. Chromium Web Speech and Android hands-free texting. I would like to use their speech recognition as part of my server stack, but I can't find much about it.
Is the speech recognition software available as a library or package? Or alternatively, can I call Chromium from another program to transcribe an audio file to text?
The Web Speech APIs are designed to be used only in the context of either Chrome or Android. A lot of the work goes on in the client, so there is no public server-to-server API that would just take an audio file and process it.
If you search GitHub you will find tools such as https://gist.github.com/alotaiba/1730160, but I am pretty certain that this method of access is not supported or endorsed, and it is not guaranteed to keep working.
The method previously mentioned at https://gist.github.com/alotaiba/1730160 does work for me. I use it on a daily basis in my home automation programs. I use a Python script to capture audio and determine whether it is useful audio or just noise; it then sends the short audio snippet to Google and gets the text back, all in under a second! I have successfully integrated it into my programs, and if you google around you will find even more people who have as well!
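The answer above does not say how the script tells useful audio from noise. One common and very simple approach is an energy gate: compute the RMS level of each short frame and only forward frames above a threshold. The sketch below shows that idea in Java for 16-bit little-endian mono PCM. It is not the poster's script, and the class name, method names and threshold are hypothetical and would need tuning for a real microphone.

```java
// Hypothetical energy gate: a frame counts as "speech" when it is loud enough.
public class EnergyGate {

    // Root-mean-square amplitude of 16-bit little-endian mono PCM.
    public static double rms(byte[] pcm) {
        int samples = pcm.length / 2;
        if (samples == 0) {
            return 0.0;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int low = pcm[2 * i] & 0xFF;     // low byte, treated as unsigned
            int high = pcm[2 * i + 1];       // high byte keeps the sign
            int sample = (high << 8) | low;
            sumSquares += (double) sample * sample;
        }
        return Math.sqrt(sumSquares / samples);
    }

    // Assumption: the threshold is chosen by measuring the room's noise floor.
    public static boolean looksLikeSpeech(byte[] frame, double threshold) {
        return rms(frame) >= threshold;
    }
}
```

A plain level check like this will also pass loud non-speech sounds, so it only cuts down on how many snippets get sent for recognition.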

Using Android speech recognition API

Is there any way I can make an app that uses Jelly Bean's offline speech recognition API and lets me choose the sensitivity?
What I actually want to build is an app that tests a user's spoken English and scores it.
If anyone can provide a link to an app that already does this, that would be helpful too, because in that case I wouldn't need to build an app myself.
If you want offline speech recognition on Android, use Pocketsphinx:
http://cmusphinx.sourceforge.net/2011/05/building-pocketsphinx-on-android/
