I am looking for a (fast) speech-to-text API but would prefer to stay within AWS, as long as I can do so without sacrificing recognition quality.
In looking at the Alexa Voice API and tutorials, they seem to focus on the Echo. Does it also work on iOS/Android, and is it still as responsive?
The Alexa Voice Service allows you to embed Alexa - the digital assistant - in your mobile app. I don't believe there is any way to just use the voice recognition - that's certainly not what it is for.
But a few months ago they broke out some of the services used by Alexa into public services available via AWS (see here). Of these, note that Lex 'provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and NLU...'.
So that is what you are after.
As to how good it is, all I can say is that it is designed to do what you are after and (given the emphasis that Amazon is putting on Alexa, and Alexa's dependence on these services) I'm sure it is state-of-the-art, and I would expect it to be improved frequently.
Amazon Transcribe is an automatic speech recognition (ASR) service that makes it easy for developers to add speech-to-text capability.
See:
https://aws.amazon.com/transcribe/
This question is to help the "Hard of hearing community" so that they can READ the phone/mobile call because they can not hear it.
Android 11 provides an API "AudioPlaybackCaptureConfiguration". This API gives apps the ability to copy the audio being played by other apps.
Google also implemented the same on Pixel phones, as shown here: https://www.youtube.com/watch?v=7hb3p8LZIq8 . But it has a few limitations:
It supports only English. How can support for regional languages be enabled?
The current implementation converts voice to text using an on-device engine, i.e. the voice is not sent to a Google server (all the processing happens offline on the phone itself), so accuracy is also low.
After seeing a lot of posts here, it seems developers are running into problems when implementing the same thing (capturing the caller's voice and then transcribing it) because of restrictions imposed by Google.
How to record internal audio on Android devices or record MediaPlayer Audio Stream?
Is there any way to capture the caller's voice (https://developer.android.com/guide/topics/media/playback-capture#allowing_playback_capture)? As in the YouTube video I shared above, Google must be capturing the caller's voice, with its offline engine processing that voice and converting it to text. So can we capture the caller's voice in some way and then send it to a server API or to the Google Live Transcribe app (or whatever it is) for better accuracy, so that the converted text is displayed on the screen (in the user's choice of language)?
I am also a developer, though not a mobile one, so some terminology may be wrong. Please excuse it and provide your suggestions.
Can we modify the Android source code itself according to our requirements and remove that limitation, so that we can achieve what we want even if it requires building a custom Android OS?
I'm developing an application that requires converting speech to text, and for that I'm using the Google Speech Recognition API for Android. But I have a question: in offline mode (I don't want to use it online), is there any usage limit when it is used simply to capture voice? No commands like "Ok Google", voice only. I searched all over the Internet but I'm not 100% sure.
Can you help me? Thank you all.
I am currently working on an app that needs to record audio within the app and then send the clip to Google for transcription.
Is there any way I can send an audio clip to be processed with speech to text?
Or is there any other way to convert that recording to text?
Google's Voice To Text API is not publicly available at the moment, and there has been no announcement of when it might become available. On Android you can use the system voice recognition feature, but it will only transcribe what it records by itself, and you won't be able to feed it an audio file for processing.
For now, you either need to use other services like AT&T's, IBM's Watson, or Dragon Dictation (all are online), or consider including CMU Sphinx in your app if you absolutely need an offline solution.
I want to introduce a new feature into my app: permanent voice recognition.
First of all I followed these posts:
Voice recognition
Speech recognition in Android
Offline Speech Recognition In Android (JellyBean)
and more others, plus other posts from different websites.
Problem:
What I'm actually trying to do is have permanent voice recognition without displaying Google's voice activity. For example: when I start the application, voice recognition should start and listen. When the recognizer matches certain words, my app will perform different actions accordingly. I do not want to press a button every time I want to do voice recognition, and I also do not want anything to appear on the screen to talk to. Can I do that?
Any suggestions are welcome. Thank you! :)
Android can use voice recognition without any GUI. You can use the SpeechRecognizer class to do this. But Google doesn't allow you to use its voice tools for long-running recognition: after 5-7 seconds of silence, recognition stops.
If you only need a limited command vocabulary, you can use offline continuous recognition such as PocketSphinx.
For long time recognition you can use:
intent.putExtra("android.speech.extra.DICTATION_MODE", true);
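The "match some words, then act" part of the question can be sketched in plain Java, independent of which recognizer produces the text. This is only a minimal sketch under assumptions: the class name CommandMatcher, the phrases, and the action names are all hypothetical, and the transcript is assumed to arrive as an ordinary String (for example, from whatever results your recognizer hands back). It matches whole words only, so "stop" does not fire on "nonstop".

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: maps a small command vocabulary to action names.
final class CommandMatcher {
    // Insertion order is kept, so the first registered phrase wins on ties.
    private final Map<String, String> commands = new LinkedHashMap<>();

    // Registers a phrase (one or more words) and the action it triggers.
    CommandMatcher register(String phrase, String action) {
        commands.put(normalize(phrase), action);
        return this;
    }

    // Returns the action of the first registered phrase that occurs
    // in the transcript as whole words, or empty if none matches.
    Optional<String> match(String transcript) {
        String padded = " " + normalize(transcript) + " ";
        for (Map.Entry<String, String> entry : commands.entrySet()) {
            if (padded.contains(" " + entry.getKey() + " ")) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty();
    }

    // Lower-cases and replaces every run of non-letter, non-digit
    // characters with a single space.
    private static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT)
                   .replaceAll("[^\\p{L}\\p{Nd}]+", " ")
                   .trim();
    }
}
```

A caller would build the matcher once, e.g. new CommandMatcher().register("lights on", "LIGHTS_ON"), and call match(...) on each transcript the recognizer delivers, acting only when the returned Optional is present.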
I am aware of Android's voice recognition capabilities, but I have seen some applications, Vlingo for example, that are able to detect a given sentence. How is this possible? As far as I know, you can't set up Android's API to do that.