Android: Audio Matching (Audio Fingerprinting) - android

I'm writing an android app that plays an audio file and records what the phone is outputting simultaneously. When the recording is done, it would compare the recording against the original audio played and return whether they match and to what certainty.
I searched a lot and I found some libraries for audio fingerprinting, but they're mostly for music identification purposes.
Are there any libraries out there that I could use for this purpose? Would it make sense to write a custom algorithm for this?

You could compare the sound waves sample by sample (as numbers), then compute the maximal, minimal, and average difference, etc.
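A minimal sketch of that sample-by-sample comparison follows. It assumes both signals have already been decoded to 16-bit PCM at the same sample rate and are time-aligned; the class and method names (`SampleDiff`, `compare`) are hypothetical, and a real recording would need alignment and level matching before these numbers mean much.

```java
/** Summary statistics of the per-sample absolute difference between two signals. */
final class SampleDiff {
    final double min;
    final double max;
    final double mean;

    SampleDiff(double min, double max, double mean) {
        this.min = min;
        this.max = max;
        this.mean = mean;
    }

    /** Compares the overlapping part of two already-aligned sample arrays. */
    static SampleDiff compare(short[] original, short[] recorded) {
        int n = Math.min(original.length, recorded.length);
        if (n == 0) {
            throw new IllegalArgumentException("no samples to compare");
        }
        double min = Double.MAX_VALUE;
        double max = 0;
        double sum = 0;
        for (int i = 0; i < n; i++) {
            // Absolute difference of the two samples at the same position.
            double d = Math.abs((double) original[i] - recorded[i]);
            min = Math.min(min, d);
            max = Math.max(max, d);
            sum += d;
        }
        return new SampleDiff(min, max, sum / n);
    }
}
```

A small mean difference relative to the signal's amplitude suggests a close match; what counts as "small" is something you would have to calibrate on real recordings.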

Related

Real time audio analysis Android

I've got a rather complicated problem that I need to solve at work. It's pretty far out of my remit of "Android App Developer" - I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions for this, such as hardcoding the BPM into the app, in a map against the audio resources URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly. It assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement for streaming external audio from streaming service apps such as Spotify, a static analysis solution is pointless as the solution wouldn't work for the streaming service case, and the streaming service case solution will work for both cases.
Therefore, I have come to the conclusion that I somehow need to analyze the streamed audio on the fly, perhaps with FFT or peak detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there) and is about the basic starting point of such a solution:
How might I go about getting A) the raw bytes of streamed audio for both the local file case and the external streaming service app case and B) how might I process these bytes into a data structure representing the audio stream in a way amenable to running audio analysis algorithms on it.
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I've no idea how to even formulate a more coherent question.
Any help would be greatly appreciated!
I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming a local file, or when streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes to "something useful" I'd look at using the audio format to tell you how to read the incoming bytes. The format should tell you how many channels (mono or stereo), the encoding (e.g., signed PCM is common, might be normalized floats), the number of bits per value (16 is common) and the order of the bytes (big-endian vs little endian).
I know that there are posts that explain how to convert the raw audio bytes to PCM values based on this info, including some on Stack Overflow. They should be reachable via search. I think signed normalized floats are the most common data representation used for processing audio signals.
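As a rough illustration of that conversion, the sketch below decodes 16-bit signed PCM bytes into normalized floats and splits interleaved channels. It assumes the format is already known to be 16-bit signed PCM; the class and method names (`PcmDecoder`, `toFloats`, `deinterleave`) are hypothetical, and other encodings (8-bit, 24-bit, float PCM) would need their own handling.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmDecoder {
    /** Converts 16-bit signed PCM bytes to floats in [-1.0, 1.0). */
    static float[] toFloats(byte[] pcm, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(pcm).order(order);
        float[] out = new float[pcm.length / 2]; // two bytes per sample
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getShort() / 32768f;
        }
        return out;
    }

    /** Splits interleaved samples (L R L R ...) into one array per channel. */
    static float[][] deinterleave(float[] samples, int channels) {
        int frames = samples.length / channels;
        float[][] out = new float[channels][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channels; c++) {
                out[c][f] = samples[f * channels + c];
            }
        }
        return out;
    }
}
```

The byte order is passed in explicitly because it comes from the audio format rather than from the platform; a trailing odd byte, if any, is ignored.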

What is the amplification factor required for android's media recorder to match the output of iOS's AVAudioRecorder?

I have a cross-platform (iOS and Android) app where I record audio clips and then send them to the server for some machine learning operations. In my iOS app, I use AVAudioRecorder for recording the audio. In the Android app, I use MediaRecorder. On the device I initially use the m4a format because of size constraints. Once a clip reaches the server, I convert it to wav format before using it in the ML operations.
My problem is that on iOS, AVAudioRecorder by default amplifies the raw audio data before we developers get access to it, whereas on Android, MediaRecorder applies no default amplification to the raw data. In other words, on iOS I never get the raw audio stream from the microphone, whereas on Android I only ever get the raw audio stream. The distinction is clearly visible if you record the same audio source on an iPhone and an Android phone side by side, then import both recordings into Audacity for visual comparison. I have attached a sample screenshot below.
In the image, the first track is the Android recording and the second track is from the iOS recording. When I hear both the audio through headphones I can vaguely distinguish them but when I visualize the data points, you can clearly see the difference in the image. These distinctions are bad for ML operations.
Clearly in the iPhone, there is a certain amplification factor involved which I would like to implement in the Android also.
Is anyone aware of the amplification factor? OR are there any other possible alternatives?
It's quite possible that the difference is the effect of Automatic Gain Control (AGC).
You can disable this in your app's AVAudioSession by setting its mode to AVAudioSessionModeMeasurement which you do once in your application - usually at startup. This disables a great deal of input signal processing.
Reading your problem description, you might be better off enabling AGC on Android.
If neither of these yields results, you might want to gain scale both signals so they are just below clipping.
import AVFoundation
let audioSession = AVAudioSession.sharedInstance()
// setMode throws; .measurement is the Swift 4.2+ spelling of AVAudioSessionModeMeasurement
try audioSession.setMode(.measurement)
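For the last suggestion (scaling both signals so they sit just below clipping), a minimal peak-normalization sketch might look like the following. It assumes the samples have already been converted to normalized floats; the name `PeakNormalizer` and the `targetPeak` parameter are hypothetical, and this could run on the server after the conversion to wav.

```java
final class PeakNormalizer {
    /** Scales samples so the largest absolute value equals targetPeak (for example 0.99). */
    static float[] normalize(float[] samples, float targetPeak) {
        float peak = 0f;
        for (float s : samples) {
            peak = Math.max(peak, Math.abs(s));
        }
        float[] out = new float[samples.length];
        if (peak == 0f) {
            return out; // pure silence: nothing to scale
        }
        float gain = targetPeak / peak;
        for (int i = 0; i < samples.length; i++) {
            out[i] = samples[i] * gain;
        }
        return out;
    }
}
```

Applying the same target peak to both the iOS and Android recordings brings them to a comparable level, although it does not undo any dynamic processing that AGC has already applied.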

How do I read time samples of audio files in Android?

I want to write an app on Android to record snoring sounds of a sleeper and analyze it afterwards (i.e., not in real-time) for signs of a medical condition called obstructive sleep apnea.
The Android devices I've experimented with have voice recorders that produce a file format called .3ga. I want to programmatically read in the audio file and look at the amplitude for each individual time-sample. Then I can analyze that for patterns. Would this be easier if I converted this to a different format, e.g., MP3, and if so how can I do that programmatically?
I did a Google search on this and most of the hits seemed to be related to audio recording or playback which are unrelated to what I'm trying to do. I haven't coded anything yet because I don't know how to get started.
You are looking to do sample-based analysis on a raw audio signal, but the formats you mention are compressed. You will need to either deal with raw samples directly, or decompress the audio and then analyze.
Since you said you can do this work after-the-fact, why not upload to a server and analyze there?

Audio analysis on Android phone

I want to develop an android app that takes in the audio signal and generates the time and frequency domain values.
I have the audio in a buffer obtained from the Android microphone. I want to use these buffer values to generate a frequency domain graph. Is there some code that would help me find the FFT of an audio signal?
I have seen Moonblink's Audalyzer code and there are some missing components in the code. I could find lots of missing modules. I want to find a better logic that would take in audio and perform some calculations on it.
I found these for you using duckduckgo:
Android audio FFT to retrieve specific frequency magnitude using audiorecord
http://www.digiphd.com/android-java-reconstruction-fast-fourier-transform-real-signal-libgdx-fft/
This should help
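For orientation, here is a small sketch of what such code computes. It is a plain discrete Fourier transform (O(n²)), not a fast FFT, so it is only suitable for short windows; an FFT library gives the same magnitudes much faster. The names `Dft`, `magnitudes`, and `binFrequency` are hypothetical, and the input is assumed to be one window of samples already converted to doubles.

```java
final class Dft {
    /** Returns the magnitudes of bins 0..n/2 of the DFT of a real signal. */
    static double[] magnitudes(double[] x) {
        int n = x.length;
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k < mag.length; k++) {
            double re = 0;
            double im = 0;
            for (int t = 0; t < n; t++) {
                double angle = 2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im -= x[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Center frequency in Hz of bin k for an n-sample window. */
    static double binFrequency(int k, int n, double sampleRate) {
        return k * sampleRate / n;
    }
}
```

Plotting `magnitudes(window)` against `binFrequency(k, window.length, sampleRate)` gives the frequency domain graph; the time domain graph is simply the samples themselves.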

Android audio and voice processing

I am new to Android and am currently writing an Android voice recording application. I want to know which format is best for saving an audio file on Android (i.e., RAW-AMR, 3gp, or mp4), so that the playback sounds loud on the device.
Is there any alternative way to increase audio sound through voice processing in android.
Thanks in advance.
Question: Which bear is best? Answer: Black Bear
Seriously though, you would need to state your criteria for the audio file for us to make a codec recommendation. Does it need to be portable? Best compression? Highest fidelity?
The codec that you choose has no effect on the loudness of audio played on the device, so this should not factor into your criteria.
Is there an alternative way to increase audio?
Yes, if you are recording audio from the microphone then you can amplify the audio data before you save it to a file.
Let an audio sample from the microphone be represented by the function:
f(t)
Amplification is achieved by multiplying the audio sample by some factor A
A * f(t)
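A minimal sketch of that multiplication on 16-bit PCM samples is shown below. The names `Amplifier` and `amplify` are hypothetical; the clamping step is an addition here, included because a product outside the 16-bit range would otherwise wrap around and produce loud distortion.

```java
final class Amplifier {
    /** Multiplies each 16-bit sample by gain, clamping to the valid short range. */
    static short[] amplify(short[] samples, double gain) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            long v = Math.round(samples[i] * gain);
            // Saturate instead of overflowing when the product exceeds 16 bits.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v));
        }
        return out;
    }
}
```

Clamped samples are still distorted (clipped), so the factor A should be chosen small enough that the loudest parts of the recording stay within range.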
You can use the AGC (Automatic Gain Control) module from WebRTC to increase the sound level.
I haven't found a simple Java API yet, but you can use the C++ API via JNI.
Have a look here: WebRTC AGC (Automatic Gain Control).
I want to know which format is best for saving audio file in android.
To save voice audio on Android (or any other platform), take a look at Opus. It's a free, state-of-the-art audio codec that also supports voice mode.
