I want to develop an Android app.
One feature of the app is recognizing two people's voices.
It will work like this: when the app opens, two people will talk in front of it.
The app will detect the two people speaking (call them A and B) and calculate what percentage of the time each of them spoke.
So, say, after 1 minute the app will report that A talked 80% of the time and B talked 20%.
So what I need is a way to differentiate two people's voices.
I have tried SpeechRecognizer and android.speech.tts, but I can't get it working.
Is it possible on Android to differentiate two people's voices?
Thanks in advance for helping.
SpeechRecognizer and TTS will not help you, as they are designed to recognize and synthesize speech, not to identify who is speaking. You have to use DSP techniques to recognize the speaker. Due to the complexity, I don't think you can achieve this on the device itself. You can record your audio (using something like AudioRecord on Android) and then send it to a server. On the server side you can run a speaker recognition program. ALIZE is a fairly popular open-source tool for this.
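Once a speaker recognition step has labeled who spoke when, the percentage part of the question is simple arithmetic. Here is a minimal sketch in Python, assuming the server hands back (speaker, start, end) tuples in seconds. That format and the name talk_share are made up for illustration and are not ALIZE's actual output.

```python
from collections import defaultdict

def talk_share(segments):
    """Return {speaker: percent of the total speaking time}.

    `segments` is a list of (speaker_label, start_sec, end_sec) tuples,
    the ASSUMED output of a speaker recognition/diarization step.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        totals[speaker] += max(0.0, end - start)  # ignore malformed spans
    spoken = sum(totals.values())
    if spoken == 0:
        return {}
    return {spk: 100.0 * t / spoken for spk, t in totals.items()}

# One minute of conversation: A speaks for 48 s, B for 12 s.
segments = [("A", 0.0, 30.0), ("B", 30.0, 42.0), ("A", 42.0, 60.0)]
print(talk_share(segments))  # {'A': 80.0, 'B': 20.0}
```

Silence is left out of the denominator here, so the shares always add up to 100%. Dividing by the recording length instead would give each person's share of the whole minute.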
Skip the first two paragraphs if you're not interested in why I'm asking this question.
Here is the situation: I'm using a Moto Z Play with the Projector Modification. The mod is really cool and allows me to literally project my phone screen onto the wall. I've been writing a personal assistant program that helps me with my daily life, i.e. sorting Gmail messages, reminding me of calendar events, keeping track of anything I want it to remember and reminding me of those things when I've asked it to, and much more. It's basically a personal secretary.
One new feature I just added was a habit tracker. I created a small graphical interface on my phone using Tasker that emails my "assistant", which then records the habit and creates a really cool graph that shows my past habit record, and also uses a neural network to predict the next day's habits. The only problem is that the graph got really intricate really fast. I want to show a month's worth of habits (16 total habits), creating what can be up to a 16 x 31 floating point graph with labels. My laptop screen is just not big enough to display all of that without it being a mess! I really want to display the graph from my projector mod; the entire wall will definitely be big enough to show all that data.
OK, now my question (thanks for hanging in there, I know that was a lot):
Is there any way that I can display an image on my phone from a Python program without creating a standalone app? Even if my phone needs to be plugged into my computer to stream the data through a cable.
I would use a framework like Kivy to create a standalone app, but then it wouldn't be hooked up to my assistant, completely defeating the purpose.
I'm not looking for anything similar to a notification, I really want to draw over the entire screen of my phone. This is something I did with Processing (Java library) a while back, but now I'm using Python because it's more machine learning friendly.
I've looked into a lot of services but nothing seems to be able to do this. Remember that I don't need to send anything back from my phone, simply display an image on the screen until the desktop-side program tells it to stop.
This is not my area of expertise, but if I needed to do something like that I would turn the Python app into a web service using Django and open the URL on my phone. I don't know if that helps.
Regardless of "how" or "what", the answer is that you will always need some software running on the Android device to capture the stream of data (images) and display it on the screen.
The point is, you don't have to write this software yourself. The obvious example that comes to mind is to use any DLNA-compatible software, VLC for example: have your Python program generate an H.264 stream and point VLC at it. Another way would be to run an HTTP service from your Python program and simply load it in the browser.
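As a rough sketch of the HTTP route, using only Python's standard library: the assistant serves a tiny page that shows one image full screen and reloads itself every few seconds, and the phone's browser just opens that URL. The names build_page and make_server, the /graph.png path, port 8000 and the 5-second refresh are all assumptions for illustration, and the image is assumed to be a PNG.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def build_page(image_url="/graph.png", refresh_seconds=5):
    """HTML page that shows one image full screen and reloads itself."""
    return (
        "<!doctype html><html><head>"
        f'<meta http-equiv="refresh" content="{refresh_seconds}">'
        "<style>html,body{margin:0;background:#000}"
        "img{width:100vw;height:100vh;object-fit:contain}</style>"
        f'</head><body><img src="{image_url}"></body></html>'
    )

def make_server(image_path, host="0.0.0.0", port=8000):
    """Create (but do not start) a server for the page and the image file."""
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path.startswith("/graph.png"):
                # Re-read the file on every request so the phone always
                # gets the latest graph the assistant has written.
                with open(image_path, "rb") as f:
                    body = f.read()
                content_type = "image/png"
            else:
                body = build_page().encode("utf-8")
                content_type = "text/html; charset=utf-8"
            self.send_response(200)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.send_header("Cache-Control", "no-store")
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the assistant's console quiet

    return ThreadingHTTPServer((host, port), Handler)
```

Calling make_server("graph.png").serve_forever() on the desktop and opening http://<desktop-ip>:8000/ in the phone's browser (both on the same network) should show whatever the assistant last wrote to graph.png. Putting the browser in full-screen mode gets close to drawing over the entire screen without writing an app.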
Hope it helps.
Does anyone have any idea how to create an Android application that works as a microphone? Like speaking into the Android device and having it amplify the voice out?
Yes, I agree.
However, I have to do this as it is my final year project, assigned by my supervisor. So I have to do it by hook or by crook. ):
I have already created the application to amplify the voice out of my Android device when I speak into it. But there is echo, a very high-frequency noise, and it is sensitive to background sound.
Do you guys have any solution to this?
You really mean a megaphone, as in a self-contained voice amplification device.
Sure, technically it's possible, but there are several reasons to not bother. Most importantly, the amplifiers and speakers on handheld devices cannot match the volume you can already achieve with your voice. Also, you would have to work out the feedback challenges - definitely solvable (phase shifts, minor delays, etc.) but effort nonetheless.
Bottom line: I don't think it's worth doing because even if you make it work, someone standing next to you will be able to shout louder than your handheld device can amplify your voice. Not trying to be negative here, just realistic.
I want to make an Android application that allows the user to change their voice during a phone call. For example: if you are a man, you can change your voice to a woman's or a robot's when talking on the phone. It is like a funny prank.
I have worked with Android's API and googled for some days but still have no idea. Someone told me it is impossible, but I see an app on Google Play that can do it:
https://play.google.com/store/apps/details?id=com.gridmob.android.funnycall
So I think there are some ways to do that.
I thought about recording and playing back using AudioTrack, but I have 2 more problems:
1. I cannot mute my original voice in the call so that the phone only plays my sound after processing.
2. Recording and processing will add a long delay (too slow for real time).
Can anyone share a solution for this?
The app you linked isn't changing voices on the phone: it uses SIP (or similar) to place a call through the authors' servers and the voice changing happens there. That's why you only get a small number of free minutes of use before you have to pay them.
Yes, it uses a SIP server to do this processing. There are two reasons you cannot create an app that does this on the phone itself. First, sound processing for phone calls is locked. You can't unlock it because it's engineered in hardware, not software. A PC can do this because it uses a standard sound card whose frequencies software can modify. Second, phone manufacturers are required to design their phones to a standard format. There are laws that force these companies to make it impossible to do any voice morphing. It is against the law to impersonate someone you are not over any telephone network.
Hard way
You capture the input voice, use speech recognition (speech-to-text) to detect the words, then use text-to-speech with your desired voice as output.
Less hard way
Sound processing: Changing frequencies, amplitude etc.
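To make "changing frequencies, amplitude" a bit more concrete, here is a deliberately naive sketch in plain Python. It assumes the audio is already available as a list of float samples in the range -1.0 to 1.0, and the function names are made up for illustration. It changes pitch by reading the samples faster or slower, which also changes the duration; real voice changers typically use something like a phase vocoder or PSOLA to avoid that.

```python
import math

def resample(samples, factor):
    """Naive pitch change: read the input `factor` times faster.

    factor > 1 raises the pitch (and shortens the clip),
    factor < 1 lowers it (and lengthens the clip).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    out = []
    last = len(samples) - 1
    k = 0
    while k * factor < last:
        pos = k * factor
        i = int(pos)
        frac = pos - i
        # Linear interpolation between the two neighbouring samples.
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        k += 1
    return out

def scale_amplitude(samples, gain):
    """Multiply every sample by `gain`, clipping to the range [-1, 1]."""
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

# One second of a 100 Hz tone sampled at 8000 Hz.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(8000)]
higher = resample(tone, 2.0)  # half as long; plays as 200 Hz at 8000 Hz
```

This only covers the signal processing. It does not solve the separate problem of getting the processed audio into a phone call.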
I am building an application, "Voice Calculator", which takes voice input and displays a result based on that input.
I don't want to use Google's servers for voice recognition. Is there any way I can achieve my goal?
I want to take input such as "two plus three multiply four hundred twenty two minus one hundred", so I would like to record and compare every word,
so that it can be converted into text and used to perform the calculation.
Can anyone guide me on how to achieve this? I am done designing the calculator and its functionality.
I hope I was able to explain my question. Looking for help. Thank you.
I have used the Google API for voice recognition. Although I wanted an offline version, I had to rely on Google's voice recognition.
Have a look at the Voice Recognition for Android example.
Granted, not many devices support it yet, but Jelly Bean will allow you to download Google's voice controls to the device for offline use.
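Whichever recognizer produces the text, the spoken words still have to be turned into numbers and operators before the calculator can use them. Here is a small Python sketch of that step. The vocabulary, the function names and the choice to give multiply and divide the usual precedence over plus and minus are all assumptions; a calculator that works strictly left to right would give a different answer.

```python
UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * (i + 2) for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}
OPS = {"plus": "+", "minus": "-", "multiply": "*", "multiplied": "*",
       "times": "*", "divide": "/", "divided": "/"}

def words_to_number(words):
    """Convert e.g. ['four', 'hundred', 'twenty', 'two'] to 422."""
    total, current = 0, 0
    for w in words:
        if w in UNITS:
            current += UNITS[w]
        elif w in TENS:
            current += TENS[w]
        elif w == "hundred":
            current = max(current, 1) * 100
        elif w == "thousand":
            total += max(current, 1) * 1000
            current = 0
        elif w != "and":
            raise ValueError("unknown word: " + w)
    return total + current

def tokenize(text):
    """Split recognized text into alternating numbers and operators."""
    tokens, number_words = [], []
    for w in text.lower().split():
        if w in OPS:
            tokens += [words_to_number(number_words), OPS[w]]
            number_words = []
        elif w != "by":  # skip the filler in "multiplied by"
            number_words.append(w)
    tokens.append(words_to_number(number_words))
    return tokens

def calculate(text):
    tokens = tokenize(text)
    # First pass: fold * and / so they bind tighter than + and -.
    folded = [tokens[0]]
    for op, value in zip(tokens[1::2], tokens[2::2]):
        if op == "*":
            folded[-1] *= value
        elif op == "/":
            folded[-1] /= value
        else:
            folded += [op, value]
    # Second pass: apply + and - from left to right.
    result = folded[0]
    for op, value in zip(folded[1::2], folded[2::2]):
        result = result + value if op == "+" else result - value
    return result

print(calculate("two plus three multiply four hundred twenty two minus one hundred"))  # 1168
```

The example sentence from the question evaluates as 2 + 3 * 422 - 100 = 1168 under these precedence rules.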
I am writing an application that will behave similarly to the existing voice recognition but will send the sound data to a proprietary web service to perform the speech recognition part. I am using the standard MediaRecorder (with AMR-NB encoding), which seems to be perfect for speech recognition. The only data it provides is the amplitude, via the getMaxAmplitude() method.
I am trying to detect when the person starts to talk so that when the person stops talking for about 2 seconds I can proceed to send the sound data to the web service. Right now I am using a threshold for the amplitude: if it goes over a value (e.g. 1500), I assume the person is speaking. My concern is that the amplitude levels may vary by device (e.g. Nexus One vs. Droid), so I am looking for a more standard approach to this that can be derived from the amplitude values.
P.S.
I looked at graphing-amplitude but it doesn't provide a way to do it with just the amplitude.
Well, this might not be of much help, but how about starting by measuring the background noise level captured by the device's microphone, and applying the threshold dynamically based on that? That way it would adapt to different devices' microphones and also to the environment the user is in at a given time.
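The dynamic-threshold idea could look roughly like the sketch below, written in Python to keep it short. It assumes the app polls getMaxAmplitude() at a fixed interval and collects the values in a list. The function names, the 100 ms poll interval and the factor of 3 above the noise floor are illustrative guesses that would need tuning on real devices.

```python
def calibrate_noise_floor(ambient_amplitudes):
    """Average amplitude of a short sample taken while nobody is talking."""
    return sum(ambient_amplitudes) / len(ambient_amplitudes)

def find_utterance(amplitudes, noise_floor, poll_ms=100,
                   factor=3.0, silence_ms=2000):
    """Return (first_loud_index, last_loud_index) once speech has been
    followed by `silence_ms` of quiet, or None if that has not happened yet.
    """
    threshold = noise_floor * factor  # relative to this device and room
    start = None
    last_loud = None
    for i, amp in enumerate(amplitudes):
        if amp > threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and (i - last_loud) * poll_ms >= silence_ms:
            return start, last_loud
    return None

# Ambient readings taken before the user starts speaking.
floor = calibrate_noise_floor([100, 120, 80, 100])  # 100.0
readings = [100] * 5 + [900, 1200, 250, 1000] + [110] * 25
print(find_utterance(readings, floor))  # (5, 8)
```

The short dip to 250 in the middle does not end the utterance because it lasts far less than 2 seconds. Only the long quiet run after index 8 does.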
1500 is too low a number. Measuring the change in amplitude will work better.
However, it will still result in missed detections.
I fear the only way to solve this problem is to figure out how to recognize a simple word or tone rather than simply detect noise.
There are now multiple VAD (voice activity detection) libraries designed for Android. One of these is:
https://github.com/gkonovalov/android-vad
Most smartphones come with a proximity sensor. Android has an API for using these sensors. This would be adequate for the job you described. When the user moves the phone near their ear, you can have the app start recording. It should be easy enough.
Sensor class for Android