Is it possible to read an audio stream during (GSM) phone call? I would like to write an encoding application, and I do not want to go with SIP&VoIP. Thank you.
This will be phone and OS dependent and there are several apps that claim they record audio (Total Recall, Record my call on Android) but they generally seem to record via the microphone meaning the far end sound is poor.
I don't believe either the apple or android api's support access to the raw voice stream today.
Something to be aware of also is that it is not always legal to do this without informing the other party (i.e. the person on the other end of the call that you are planning to 'capture' the voice stream somehow) in many places - this may not be relevant for your particular plans but worth mentioning anyway.
If you have the option of doing the work in the network or on a PABX then you can create a basic (if not very efficient) solution by simply creating a three way (or conference) call.
Related
This question might seem to be a repetition of the questions such as following:
How to play an audio file on a voice call in android
Background Audio for a Call in Progress - Possible?
The answers of these questions suggests that it is not possible to play a pre-recorded audio on a voice call in android. I want to know why it is not possible? What is the limitation (hardware/software)? Is it really a limitation or done purposely? Can we alter the source code of android to make it possible?
I think this is a limitation, imposed for security reasons and restricted at the OS level.
Let's analyze the security threat, first of all. If you were able to play custom audio files to the callee, a whole world of cons opens up: you could trick customer supports, you could pretend to be someone else, you could give unauthorized purchase confirmations, and so on. For this reason, neither Android nor iOS allows this functionality.
On Android, you won't be able to do so in a programmatic way, simply because the current APIs won't allow you to do so. It is stated in the official documentation as well, as pointed out here. If you dig into the source code, you can probably enable this feature by accessing the microphone output during a phone call, but that would require running your custom version of Android. A good starting point would be the AudioTrack source, available here.
EDIT: a good example of an audio mod involves enabling the Nexus 5 earpiece as a second loudspeaker (requires root). Can be found here.
After a thorough research, what I have come to know is that there are more than one limitations/hurdles to make it possible. These limitations/hurdles are at three different levels.
First limitation is at API level, because there is no high-level API to play sound files in the conversation audio during a call as mentioned in Android official documentation.
Second limitation is at Radio Interface Layer (RIL). RIL passes on complete control of the call to Radio Daemon (rild) of the Linux library which then further passes the control to the vendor RIL. That means we cannot manipulate voice call in android source code.
Even if we are able to remove these two limitations, we may still not be able to play audio file to an ongoing voice call. Because there is a third limitation. Every vendor has their own library of RIL that communicates with Radio Daemon (rild). This requires that vendor RIL to be open source which is not actually. Hardware vendors do not usually make their device drivers code available.
Detail discussion on this topic is present at this link.
This is software related due to the prioritization of audio routing in Android.
Take a look into the CallManager where you can dig into the method setAudioMode(). After the audio mode was set to MODE_IN_COMMUNICATION the following code is called
audioManager.requestAudioFocusForCall(AudioManager.STREAM_VOICE_CALL,
AudioManager.AUDIOFOCUS_GAIN_TRANSIENT);
From this point on the telephony service has the highest priority and won't let any other audio play in parallel.
Note: You can play back the audio data only to the standard output device. Currently, that is the mobile device speaker or a Bluetooth headset. You cannot play sound files in the conversation audio during a call.
See official link
http://developer.android.com/guide/topics/media/mediaplayer.html
By implementing the AudioManager.OnAudioFocusChangeListener you can get the state of the audiomanager. so by this if any music is playing in the background you can get the AudioManager states(playing and pausing is completely in developer hands) similarly......
Some of the native music players in android device where handling this, they restrict the music when call is in TelephonyManager.EXTRA_STATE_OFFHOOK.so this scenario is also completely in developer hand (whether to handle or not) if he is not handling both will play parallel y
I'm writing an android app that lets user record his voice through microphone & save it in storage & link it to a specific content (like a Contact). Later, user call that voice again & the app should compare it with saved audio files & find the one that matches the voice.
I searched a lot & found some libraries that do this online, like EchoPrint that generates fingerprint from recorded audio & sends it to opensource server & returns the result. But I need to do this offline.
Has anybody know such library?
If you are aiming to compare an old recording of a user with a new call as it comes in, audio fingerprinting solutions like Dejavu in Python on a server or Echoprint in C++ won't help you. They are for doing recognition and retrieval on recorded audio segments plus noise.
They cannot deal with the variabilites in human voice. See an explanation here.
If that's the case, what you are referring to is speaker recognition, which is much harder and involves quite a bit of machine learning. It would be tough to do this for a large corpus of users (especially offline on a phone), but for determining between a couple users, it might be doable.
Below is a good Library. Which is Easy to use. But you need to convert your Audio Files to Wave Format prior to this.
https://code.google.com/p/musicg/
Android's SpeechRecognizer apparently doesn't allow to record the input on which you're doing speech recognition into an audio file.
That is, either you record voice using a MediaRecorder (or AudioRecord for that matter) or you do Speech Recognition with a SpeechRecognizer, in which case the audio isn't recorded into a file (at least not one you can access); but you can't do both at the same time.
The question of how to achieve recording audio and doing speech recognition at the same time in Android has been asked several times, and the most popular "solution" is to record a flac file and use Google's unofficial Speech API which allows you to send a flac file via a POST request and obtain a json response with the transcription.
http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/ (outdated Android version)
https://github.com/katchsvartanian/voiceRecognition/tree/master/VoiceRecognition
http://mikepultz.com/2013/07/google-speech-api-full-duplex-php-version/
That works pretty well but has a huge limitation which is it can't be used with files longer than about 10-15 seconds (the exact limit is not clear and may depend on file size or perhaps the amount of words). This makes it not suitable for my needs.
Also, slicing the audio file into smaller files is NOT a possible solution; even forgetting about the difficulties in properly splitting the file at the right positions (not in the middle of a word), many consecutive requests to the abovementioned web service api will randomly result in empty responses (Google says there's a usage limit of 50 requests per day, but as usual they don't disclose the details of the real usage limits which clearly restrict bursts of requests).
So, all this would seem to indicate that getting a transcription of speech while at the same time recording the input into an audio file in Android is IMPOSSIBLE.
HOWEVER, the Google Keep Android app does exactly that.
It allows you to speak, transcrbes what you said into text, and saves both the text and the audio recording (well it's not clear where it stores it, but you can replay it).
And it has no length limitation.
So the question is: DOES ANYBODY HAVE AN IDEA OF HOW GOOGLE KEEP DOES IT?
I would look at the source code but it doesn't seem to be available, is it?
I sniffed the packets Google Keep sends and receives while doing speech recognition, and it definitely does NOT use the speech api mentioned above. All the traffic is TLS and (from the outside) it looks pretty much the same as when you're using SpeechRecognizer.
So does perhaps a way exist to kind of "split" (i.e. duplicate, or multiplex) the microphone input stream into two streams, and feed one of them to a SpeechRecognizer and the other to a MediaRecorder?
Google Keep launches RecognizerIntent with certain undocumented extras and expects the resulting intent to contain the URI of the recorded audio. If RecognizerIntent is serviced by Google Voice Search then it all works out and Keep gets the audio.
See record/save audio from voice recognition intent for more information and a code sample that calls the recognizer in the same way as Keep (probably) does.
Note that this behavior is not part of Android. It's simply the current undocumented way of how two closed-source Google apps communicate with each other.
It uses onPartialResults(Bundle)
This event returns text recognized from recorded speech while it's still recording
It's also available on Xamarin
I'm currently searching for options on how to manipulate audio on android. The goal is to process audio from the microphone in real time during a phone call. The best solution would be to do this on a native call. But rebuilding a telephone app (no VOIP) would be fine too. Are there any ways to achieve this with Android APIs (also undocumented)?
If not, which steps would be necessary to get things running?
On iOS there are some apps which manipulate voice but create a VOIP connection. I heard that on Android you can "clone" the telephone app and eventually feed it with your own audio stream? Aren't there apps which add noises during a call? What kind of APIs are involved?
The best solution would be to do this on a native call.
This is not possible. You have no access to the in-call audio stream, except perhaps in speakerphone mode.
But rebuilding a telephone app (no VOIP) would be fine too.
The last official word from Google (2010), the entire OS has no access to the in-call audio stream, as it is all handled at a lower level. Even if newer versions of Android do have access to the in-call audio stream, "rebuilding a telephone app" is only possible if you are creating custom firmware.
As a drawback i could imagine to record the Downlink Stream using the MediaRecorder API and write it back to hardware using AudioTracks write() method. By this i could manipulate incoming voice. But still I think this wont work during phone calls. And I do not see a way to choose different hardware destinations.
Is there a way to write an app which records phone calls in android via android-sdk?
so that you can store phonecalls as .wav or something else?
I like to write something like a dictaphone.
Do you think, there is a chance to make this possible?
No. Android does not allow you to record the audio stream due to legal terms.
However due to buggy hardware implementations it is possible, on some devices, to record the audio stream.