Converting audio file into text file using pocketsphinx

Good day ma'am/sirs! I'm new to Android app development and I'm really in need of help. I'm developing a speech-to-text app, though it's not like the usual STT apps available in app stores. I'm using pocketsphinx for offline speech recognition and conversion, with Android Studio as my IDE.
My app has three main features:
Record - where the user can record his/her speech. The recording is saved to the device's storage.
Library - where the user can see his/her recorded speech and the converted audio-to-text files. This is also where the convert feature lives: the user can convert a recording into a text file.
Edit - where the user can edit his/her audio/text files. Only cut, delete, and modify (text only) are available.
My main problem is: is it actually possible to convert recorded speech into text using pocketsphinx? To make this clearer: I've tried the pocketsphinx demos, and in them, when you speak into your device, it converts what you said right away. My idea is different: you record your speech and convert it into text whenever you want. I'm not sure whether this is possible. If yes, could someone explain how? If not, could someone explain other ways to achieve this? Thanks in advance!

Related

Handwriting recognition in android studio

I need to recognize some text written on a Post-it. The text has no meaning; it is a sequence of letters written in block capitals.
From what I have read, this is a two-step problem: first locating the handwriting, then recognizing it. There is Google OCR, but those models only recognize English-language phrases.
Here is an example image: [image]
For each Post-it, I would like the text to be recognized: "769213" for the first Post-it and "ALHSFP" for the second.

The Cloud Vision API can detect more than just "English language phrases". Try Google Translate, which can translate directly from camera input; Lens might use the same technology. You could also use TensorFlow Lite on Android. Keras is needed for training models (it won't run on Android).
For example:
https://cloud.google.com/vision/docs/handwriting
https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
https://keras.io/examples/vision/handwriting_recognition/
The main difference is that the Cloud Vision API requires a network connection, while TensorFlow Lite runs on the device and doesn't.

Android voice modulator

I want to create an Android app that takes the user's voice as input and modulates it in real time.
Are there any libraries or functions to capture the speech/voice and change its pitch, etc.?
Any built-in functions?
Thank you.

How to make voice recognition check against a local database in android?

Do you remember how, on old cellphones, you could record a voice shortcut to call a person?
I am trying to make an Android app with that function. The user records a word or sound they want to control the application with, and the voice recognizer only checks whether the sound it hears matches the sound previously recorded.
Does anyone know how to do this, or know of a guide? I've been searching for months without finding a satisfying solution.
Thanks

You need to convert both the reference sounds and the recorded sound to features. To do that, split the sound into frames and extract the FFT, or compute mel-cepstrum coefficients (MFCC) directly. You can use any MFCC library out there for that.
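As a rough illustration of the framing step, here is a minimal sketch in plain Java. It assumes 16-bit PCM samples already loaded into a `short[]`, and it computes only one simple feature per frame (the log energy of a Hamming-windowed frame) as a stand-in for real MFCCs. The class name `FrameFeatures`, the method `logEnergies`, and the frame sizes are hypothetical choices for this example, not part of any library.

```java
// Hypothetical helper; names and parameters are illustrative only.
final class FrameFeatures {

    // Splits PCM audio into overlapping frames, applies a Hamming window
    // to each frame, and returns the log energy of every frame.
    // This is a simplified stand-in for MFCC extraction.
    static double[] logEnergies(short[] pcm, int frameSize, int hop) {
        if (frameSize < 2 || hop < 1 || pcm.length < frameSize) {
            return new double[0];
        }
        int frames = 1 + (pcm.length - frameSize) / hop;
        double[] out = new double[frames];
        for (int f = 0; f < frames; f++) {
            double energy = 0.0;
            for (int i = 0; i < frameSize; i++) {
                // Hamming window coefficient for position i in the frame.
                double w = 0.54 - 0.46 * Math.cos(2 * Math.PI * i / (frameSize - 1));
                // Scale the 16-bit sample to roughly [-1, 1) before windowing.
                double s = w * pcm[f * hop + i] / 32768.0;
                energy += s * s;
            }
            // Small constant avoids log(0) on silent frames.
            out[f] = Math.log(energy + 1e-10);
        }
        return out;
    }
}
```

For 16 kHz audio, a frame of 400 samples with a hop of 160 samples corresponds to the common 25 ms window with a 10 ms step. A real MFCC library would replace the energy computation with an FFT, a mel filterbank, and a DCT, and return a vector per frame instead of a single number.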
Once you have the features, you can compare them with the DTW algorithm. You can find some details here:
http://en.wikipedia.org/wiki/Dynamic_time_warping
DTW returns a distance between the two feature sequences; compare it against a threshold to select the right person to call.
A similar question is:
Simplest algorithm of measuring how similar of two short audio
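To make the DTW comparison more concrete, here is a minimal sketch in plain Java. It assumes each recording has already been converted to a sequence of feature vectors (`double[][]`, one row per frame) and uses Euclidean distance between frames. The class `DtwSketch`, its methods, and the threshold value are hypothetical and only for illustration; a suitable threshold would have to be tuned on real recordings.

```java
import java.util.Arrays;

// Hypothetical helper; names and API are illustrative only.
final class DtwSketch {

    // Euclidean distance between two feature vectors of equal length.
    static double frameDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Classic O(n*m) dynamic time warping: cost[i][j] holds the cheapest
    // alignment of the first i frames of x with the first j frames of y.
    static double distance(double[][] x, double[][] y) {
        int n = x.length, m = y.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = frameDistance(x[i - 1], y[j - 1]) + best;
            }
        }
        return cost[n][m];
    }

    // Returns the index of the closest template whose DTW distance is
    // below the threshold, or -1 if no template is close enough.
    static int bestMatch(double[][] query, double[][][] templates, double threshold) {
        int best = -1;
        double bestDist = threshold;
        for (int k = 0; k < templates.length; k++) {
            double d = distance(query, templates[k]);
            if (d < bestDist) {
                bestDist = d;
                best = k;
            }
        }
        return best;
    }
}
```

In a voice-dialing app, each stored contact would have one template sequence, and `bestMatch` would be called with the features of the newly recorded command. Because DTW allows frames to be stretched or compressed in time, the same word spoken a little slower or faster still yields a small distance.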

Implementing voice recognition in android

I have used the code provided in this link for speech recognition. In the emulator it says the recognizer is not present, so I installed it on a phone. When I click the speak button, it works. But when I say a name such as "rajesh", it shows some possible verbs and other words, not the name. I want to use the input to select a contact from the address book in order to make a call, so please tell me how to proceed in this direction. One more thing: every time, I have to develop the code in Eclipse, install it on the phone, and then check the output. Is there any alternative for editing and checking the app code on the phone from Eclipse?
Please provide any useful links. I want to develop a calling app for blind users; if voice recognition does not work, what else could be done to take input from the user?

Names are hard for speech recognition. There are more possible names in the world than words in any dictionary, so recognising an arbitrary name is hard, though common names are easier.
Anyway, if you want to recognise a customized list of words/names, you might want to look at Dragon Mobile from Nuance. Here is a copy-and-paste from another similar question I answered:
If you use 3rd party Android recognition from Nuance (The people behind DragonDictate), it supports a "grammar mode" where you can somewhat restrict the phrases that will be recognised during recognition.
Importantly, if you add unusual names into a Custom Vocabulary, they SHOULD become recognizable (Complex pronunciation issues aside).
You can find information if you dig through:
http://dragonmobile.nuancemobiledeveloper.com ,
looking for 'Custom Vocabularies'. Grammar mode is essentially a special mode of custom vocabularies.
At the time of writing, there was a document here that makes some mention of grammar mode:
http://dragonmobile.nuancemobiledeveloper.com/downloads/custom_vocabulary/Guide_to_Custom_Vocabularies_v1.5.pdf - It only really becomes clear when you try to progress in their provisioning web GUI.
You have to set up an account, and jump through other hoops, but there is a free tier. This is the only potential way I have found to constrain a recognition vocabulary.
Well, short of running up PocketSphinx, but that is still described as a 'Research' 'PreAlpha'.
No, I don't work for Nuance. Not sure anyone does. They may have all been eaten by zombies. You would guess as much reading their support forums. They never reply.

Voice recognition and localization?

Does anyone have experience with Java voice recognition and localization?
I'm thinking of building an Android application with some basic voice recognition options, but I want to implement localization for it based on a translation tool, maybe Google Translate, so that users can update their "dictionary" with new languages from a remote dictionary. This project is in its first phase and I'm still brainstorming. Does anyone have experience with this, or is something like that even possible?

Why not just use Android's built-in speech recognition? It's REALLY easy (you just set up an Intent then catch it when it returns) and the results are surprisingly good.
android.speech
I'm not sure exactly what you're trying to do, but this will allow you to specify the language to recognize.
