I created an Android game using GL, and I want to add a story mode to it. I don't want to narrate the story or voice the characters myself, and I don't want to get someone else to speak instead. Is there a program that lets me generate voices using text-to-speech or something similar? I don't want the robotic voice that people usually use for videos and the like. I want to actually design the voice, write the text it will read, and record the result. Does something like that exist?
Android does have a built-in TTS engine:
http://developer.android.com/resources/articles/tts.html
However, the results probably won't be up to scratch for narration -- machine TTS is still kind of weak, especially if you're trying to convey any emotion. It isn't likely to work as a substitute for real voice actors.
This question is meant to help the hard-of-hearing community, so that they can READ a phone call that they cannot hear.
Android 10 (API level 29) and later provide the "AudioPlaybackCaptureConfiguration" API, which gives apps the ability to copy the audio being played by other apps.
Google has implemented the same thing on Pixel phones, as shown here: https://www.youtube.com/watch?v=7hb3p8LZIq8 . However, it has a few limitations:
It supports only English. How can support for regional languages be enabled?
The current implementation converts voice to text using an on-device engine, i.e. the audio is not sent to a Google server (all the processing happens offline on the phone itself), so accuracy is also low.
After reading a lot of posts here, it seems developers run into problems when they try to capture the caller's voice and transcribe it, because of restrictions imposed by Google.
How to record internal audio on Android devices or record MediaPlayer Audio Stream?
Is there any way to capture the caller's voice (https://developer.android.com/guide/topics/media/playback-capture#allowing_playback_capture)? As in the YouTube video I shared above, Google must be capturing the caller's voice, with its offline engine processing that audio and converting it to text. So can we capture the caller's voice in some way and send it to a server API or to the Google Live Transcribe app (or whatever it is) for better accuracy, and then display the converted text on the screen in the language the user chooses?
I am also a developer, though not a mobile one, so some of my terminology may be wrong. Please excuse that and share your suggestions.
Can we modify the Android source code itself to remove that limitation and achieve what we want, even if it requires building a custom Android OS?
I installed Pocketsphinx in my app and the keyword function is working great. I followed this tutorial for my setup: https://cmusphinx.github.io/wiki/tutorialandroid/ . The problem is that it hijacks the microphone, so you can no longer use it for voice-to-text or anything else while it is waiting for the keyword. I know that with Ok Google you can still use voice-to-text. Could someone point me in the direction of how I can use voice command keywords and still be able to use the microphone for other things?
I am working on my Glass app demo, and I used droidAtScreen to project the screencast from MyGlass for the presentation. The problem is that I cannot demonstrate the voice responses that the Glass gives to user input. My backup plan is to record a video for the demonstration and insert the voice output manually. Does anyone know of a better way to cast both screen and audio for a Google Glass app demo? Thanks for the help.
Have you tried Android Screen Monitor? I always use ASM.jar for any demonstration and it works fine with both audio and video demonstrations.
The link to download ASM.jar is here.
A detailed description is here. If you're using Droid@Screen, then you probably already know how to run Android Screen Monitor (ASM.jar), but here is a link for reference that explains the process in detail.
This is how I solved the problem initially.
Use Screencast-O-Matic to record a video of the screencast on my laptop. The screencast is done using DroidAtScreen with the highest possible frame rate option checked; it has a better frame rate than the ASM screencast. My voice was captured during the recording session (so, in other words, choose a quiet place!).
To simulate the text-to-speech engine voice, I used the SitePal demo site with the voice Julie (US). It's the closest voice I could find to the Google Glass speech engine. To record the voice, I used Audacity and exported it to a .wav audio file. The key is to play the video, find the exact time where the audio belongs, and insert the audio file there using any standard movie maker software.
UPDATE
Just finished the demo presentation at IT Expo. To my surprise, the simplest solution worked the best.
Create the video demo (under 2 minutes) as described above, but insist on asking someone in the audience to try it out.
Ask that person to say what he/she heard from the Glass app in response to the action (e.g. "The item is saved").
I am working on an Android project for controlling an Arduino robot using speech recognition. I want an offline speech recognition unit that recognises only a few words, so I thought of implementing audio fingerprinting for the purpose. Is there any way I can use this to recognise a few simple words?
What you need to implement is more related to audio recognition/classification. You will not get what you want using audio fingerprinting.
Let's say you have 5 words. You need to record these words (as many times as possible, and pronounced by different people if possible). Then you need to extract audio features (such as MFCCs) from these recordings and train a classifier (such as an SVM) with 5 classes (one for each word).
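To make the record / extract features / train / classify steps concrete, here is a rough, standard-library-only Python sketch. It is not MFCC plus SVM: it swaps in log band energies (computed with the Goertzel algorithm) as the features and a nearest-centroid rule as the classifier, and it uses noisy synthetic tones as stand-ins for recorded words. The sample rate, band list, word names, and all function names are assumptions made up for illustration.

```python
import math
import random

SAMPLE_RATE = 8000                  # Hz; assumed for this sketch
BANDS_HZ = [300, 600, 1200, 2400]   # hypothetical analysis bands


def band_energy(samples, freq_hz):
    """Goertzel algorithm: average signal power near freq_hz."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2
    return max(power, 0.0) / len(samples)


def extract_features(samples):
    """Log band energies: a crude stand-in for MFCCs."""
    return [math.log(band_energy(samples, f) + 1e-12) for f in BANDS_HZ]


def train(recordings_by_word):
    """Average the feature vectors of each word into one centroid per class."""
    centroids = {}
    for word, recordings in recordings_by_word.items():
        vectors = [extract_features(r) for r in recordings]
        centroids[word] = [sum(col) / len(col) for col in zip(*vectors)]
    return centroids


def classify(centroids, samples):
    """Return the word whose centroid is closest to the sample's features."""
    features = extract_features(samples)
    return min(centroids, key=lambda w: math.dist(features, centroids[w]))


def fake_recording(freq_hz, rng, duration_s=0.1):
    """Synthetic stand-in for a recorded word: a tone plus uniform noise."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE)
            + rng.uniform(-0.1, 0.1) for i in range(n)]


rng = random.Random(0)
WORD_TONES = {"left": 300, "right": 1200, "stop": 2400}  # hypothetical words
training_set = {word: [fake_recording(hz, rng) for _ in range(10)]
                for word, hz in WORD_TONES.items()}
model = train(training_set)
print(classify(model, fake_recording(1200, rng)))
```

With real speech you would replace `fake_recording` with microphone recordings, and most likely replace the features and classifier with proper MFCCs and an SVM from an audio or machine-learning library; the overall shape of the pipeline stays the same.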
I have created an Android app that creates and stores a single tone, then plays it back using Android's AudioTrack class. Here's the issue: on my phone I can only play tones up to a frequency of about 11 kHz, while on a virtual phone run from my PC (exact same code) I can get frequencies up to about 14 kHz. What could cause this cutoff?
Using a tone generator app from the market, my phone can produce up to 20kHz signals, so I know it is not a hardware issue.
Thanks.
It might help if you provide some of the code for how you're generating the tone.
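Without the code this is only a guess, but one thing worth checking is the sample rate the tone is generated and played at. A sampled signal can only represent frequencies below half its sample rate, and half of 22050 Hz is 11025 Hz, which is suspiciously close to the 11 kHz ceiling on the phone. The Python sketch below just illustrates that arithmetic; the function names are made up for this example and it does not use any Android API. It would not by itself explain the 14 kHz figure on the emulator.

```python
import math


def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency actually present after sampling; tones above the limit fold back."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


def make_tone_pcm16(tone_hz, sample_rate_hz, duration_s):
    """16-bit PCM samples of a sine tone, like a buffer handed to an audio API."""
    count = int(sample_rate_hz * duration_s)
    step = 2.0 * math.pi * tone_hz / sample_rate_hz
    return [int(32767 * math.sin(step * i)) for i in range(count)]


# Compare a requested 14 kHz tone at two common sample rates.
for rate in (22050, 44100):
    print(rate, nyquist_limit(rate), apparent_frequency(14000, rate))
```

At 22050 Hz the limit is 11025 Hz and a requested 14 kHz tone folds back to 8050 Hz, while at 44100 Hz the same tone comes out at 14 kHz as intended. If the sample rate passed to AudioTrack (or used when computing the samples) turns out to be 22050 Hz, raising it to 44100 Hz would be the first thing to try.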
For audio stuff, you should sign up at http://music.columbia.edu/mailman/listinfo/andraudio and then ask there. That list has a great community of Android developers, all dedicated to audio development.
Also, self-promotion, I run a forum website (relatively new and needing updates) and I plan to add an Android Audio forum on it once I get enough interested folks. If you're interested, sign up here