I'm trying to get PCM data with the AudioRecord class. The audio source is the headset jack, which is connected to an external device that sends a waveform into my app (I hope that makes sense). A picture of the different devices' waves is at http://i.stack.imgur.com/5iZnY.png
Looking at the picture: with wave 1 and wave 2 I get the right result, because I can measure the points of one cycle. But on a Sony XL36H the wave I receive is not close to the real wave, even though the device is actually sending a signal close to wave 1.
My question is: what causes this, and how can I get a wave close to wave 1? I suspect Sony applies some processing in the lower audio layers; if so, should I use the NDK to avoid that?
Should I use the NDK to avoid that?
No, you will get the same results with the NDK.
AudioRecord already provides access to the raw PCM data. The difference between the devices occurs because they use different audio modules. The modules have different hardware features (low-pass filters, sensitivity), and you cannot disable them in software. The reason is that these features reduce noise.
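As an illustration, here is a minimal sketch of reading raw PCM with AudioRecord (the 44.1 kHz mono 16-bit format and buffer sizing are assumptions, not values from the question, and the RECORD_AUDIO permission is required):

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Reads one buffer of raw PCM samples from the current input (the headset mic when plugged in).
public static short[] readOnePcmBuffer() {
    int sampleRate = 44100; // assumed sample rate
    int minBuf = AudioRecord.getMinBufferSize(sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

    AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
            sampleRate, AudioFormat.CHANNEL_IN_MONO,
            AudioFormat.ENCODING_PCM_16BIT, minBuf * 2);

    short[] pcm = new short[minBuf / 2]; // minBuf is in bytes; each sample is 2 bytes
    recorder.startRecording();
    int read = recorder.read(pcm, 0, pcm.length); // raw samples, the same data the NDK path sees
    recorder.stop();
    recorder.release();
    return pcm; // the first 'read' entries are valid
}

Whatever filtering the vendor's audio module applies happens before this point, which is why the NDK gives the same result.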
I've got a rather complicated problem that I need to solve at work. It's pretty far out of my remit of "Android App Developer" - I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions, such as hardcoding the BPM into the app in a map keyed by the audio resource's URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly, and it assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement to stream external audio from streaming-service apps such as Spotify, a static-analysis solution is pointless: it wouldn't work for the streaming-service case, whereas a solution for the streaming-service case would work for both cases.
Therefore, I have come to the conclusion that I somehow need to analyze the streamed audio on the fly, perhaps with FFT or peak-detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there) and is about the basic starting point of such a solution:
How might I A) get the raw bytes of streamed audio, for both the local-file case and the external streaming-service case, and B) process those bytes into a data structure that represents the audio stream in a way amenable to running audio analysis algorithms on it?
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I've no idea how to even formulate a more coherent one.
Any help would be greatly appreciated!
I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming a local file, or when streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes to "something useful" I'd look at the audio format to tell you how to read the incoming bytes. The format should tell you how many channels there are (mono or stereo), the encoding (signed PCM is common; it might be normalized floats), the number of bits per value (16 is common), and the byte order (big-endian vs. little-endian).
I know there are posts that explain how to convert the raw audio bytes to PCM values based on this info, including some on Stack Overflow; they should be reachable via search. I think signed normalized floats are the most common data representation used for processing audio signals.
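For example, a minimal sketch of that conversion, assuming 16-bit signed little-endian PCM (the common case mentioned above; in a real app the channel count, bit depth and byte order must come from the stream's reported format):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Convert raw 16-bit little-endian signed PCM bytes into normalized floats in roughly [-1.0, 1.0).
static float[] pcm16leToFloats(byte[] raw) {
    ByteBuffer bb = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
    float[] out = new float[raw.length / 2];
    for (int i = 0; i < out.length; i++) {
        out[i] = bb.getShort() / 32768f; // full-scale 16-bit value maps to about ±1.0
    }
    return out;
}

An array of normalized floats like this is what most BPM and FFT analysis code expects as input.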
I've been struggling for days trying to obtain a raw audio stream from the microphone. I have tried different approaches: the low-level JNI way with the Oboe library (both the AAudio and OpenSL ES implementations) and Android's AudioRecord Java class.
The problem I am facing is that I cannot get amplitudes near ±1.0, even though I am certain I am saturating the microphone input with a calibrated pure tone of sufficiently high amplitude.
I think the problem is that I am not able to effectively disable the signal pre-processing done by the Android OS (automatic gain control or noise cancelling).
AutomaticGainControl.create(id).setEnabled(false)
(not working!)
Also, it seems that it is not possible to disable any microphone other than the one "selected" (which I did by calling setPreferredDevice on the AudioRecord instance). Audio sources I have tried: UNPROCESSED, MIC, VOICE_RECOGNITION.
Is there any way of doing this, or am I missing something?
Thank you
Which audio source are you using for your recording? VOICE_RECOGNITION and UNPROCESSED are mandated to have no pre-processing enabled by default (see https://source.android.com/compatibility/10/android-10-cdd#5_11_capture_for_unprocessed) and would therefore let you check your signal path.
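For instance, here is a minimal sketch, assuming API level 24+; the property check, the AutomaticGainControl usage and the chosen sample rate and float encoding are illustrative, not a confirmed recipe:

import android.content.Context;
import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioRecord;
import android.media.MediaRecorder;
import android.media.audiofx.AutomaticGainControl;

// Builds a recorder on the least-processed capture path available (needs RECORD_AUDIO permission).
static AudioRecord createUnprocessedRecorder(Context ctx) {
    AudioManager am = (AudioManager) ctx.getSystemService(Context.AUDIO_SERVICE);
    boolean unprocessedSupported = "true".equals(
            am.getProperty(AudioManager.PROPERTY_SUPPORT_AUDIO_SOURCE_UNPROCESSED));

    int source = unprocessedSupported
            ? MediaRecorder.AudioSource.UNPROCESSED
            : MediaRecorder.AudioSource.VOICE_RECOGNITION; // also mandated to skip pre-processing by default

    int sampleRate = 48000; // assumed
    int minBuf = AudioRecord.getMinBufferSize(sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_FLOAT);

    AudioRecord recorder = new AudioRecord(source, sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_FLOAT, minBuf * 2);

    // If an AGC effect is attached to this session anyway, try to switch it off.
    if (AutomaticGainControl.isAvailable()) {
        AutomaticGainControl agc = AutomaticGainControl.create(recorder.getAudioSessionId());
        if (agc != null) {
            agc.setEnabled(false);
        }
    }
    return recorder;
}

Checking PROPERTY_SUPPORT_AUDIO_SOURCE_UNPROCESSED first matters because not every device offers a truly unprocessed path.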
I have a cross-platform (iOS and Android) app in which I record audio clips and then send them to a server for some machine-learning operations. In the iOS app I use AVAudioRecorder for recording the audio; in the Android app I use MediaRecorder. On the mobile side I initially use the m4a format because of size constraints. On the server I convert it to wav format before using it in the ML operations.
My problem is that on iOS, AVAudioRecorder by default applies some amplification to the raw audio data before the developer gets access to it, whereas on Android, MediaRecorder applies no such default amplification. In other words, on iOS I never get the truly raw audio stream from the microphone, while on Android I always get only the raw audio stream. The distinction is clearly visible if you record the same audio on an iPhone and an Android phone side by side from a common source and then import the recordings into Audacity for a visual comparison. I have attached a sample screenshot below.
In the image, the first track is the Android recording and the second track is the iOS recording. When I listen to both through headphones I can only vaguely distinguish them, but when I visualize the data points the difference is clear. These differences are bad for the ML operations.
Clearly on the iPhone there is a certain amplification factor involved, which I would like to replicate on Android as well.
Is anyone aware of what the amplification factor is? Or are there any other possible alternatives?
It's quite possible that the difference is the effect of Automatic Gain Control.
You can disable this in your app's AVAudioSession by setting its mode to AVAudioSessionModeMeasurement, which you do once in your application, usually at startup. This disables a great deal of input signal processing.
Reading your problem description, you might be better off enabling AGC on Android.
If neither of these yields results, you might want to gain scale both signals so they are just below clipping.
let audioSession = AVAudioSession.sharedInstance()
try audioSession.setMode(.measurement) // AVAudioSessionModeMeasurement on older SDKs
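For the gain-scaling suggestion above, here is a minimal sketch on the Android side, assuming the recording is already in memory as 16-bit PCM samples (the 0.95 headroom is an arbitrary choice):

// Scale 16-bit PCM samples so the loudest peak sits just below full scale.
static void normalizeToPeak(short[] pcm, double headroom) { // e.g. headroom = 0.95
    int peak = 1; // avoid dividing by zero on pure silence
    for (short s : pcm) {
        peak = Math.max(peak, Math.abs((int) s));
    }
    double gain = headroom * Short.MAX_VALUE / peak;
    for (int i = 0; i < pcm.length; i++) {
        int v = (int) Math.round(pcm[i] * gain);
        pcm[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, v)); // clamp rounding extremes
    }
}

Applying the same normalization to both the iOS and Android recordings before the ML step removes the level difference without having to guess Apple's gain factor.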
I'm writing an Android app that plays an audio file and simultaneously records what the phone is outputting. When the recording is done, it should compare the recording against the original audio that was played and return whether they match, and with what certainty.
I searched a lot and I found some libraries for audio fingerprinting, but they're mostly for music identification purposes.
Are there any libraries out there that I could use for this purpose? Would it make sense to write a custom algorithm for this?
You could compare the sound waves sample by sample (as numbers), then compute the maximal, minimal, and average difference, etc.
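A minimal sketch of that idea, assuming both signals are already decoded to float samples at the same sample rate and aligned in time (time alignment is the hard part and is not shown here):

// Compare two aligned signals sample by sample and report simple difference statistics.
static double[] differenceStats(float[] original, float[] recorded) {
    int n = Math.min(original.length, recorded.length);
    if (n == 0) return new double[] { 0, 0, 0 };
    double min = Double.MAX_VALUE, max = 0, sum = 0;
    for (int i = 0; i < n; i++) {
        double d = Math.abs(original[i] - recorded[i]);
        min = Math.min(min, d);
        max = Math.max(max, d);
        sum += d;
    }
    return new double[] { min, max, sum / n }; // {minimal, maximal, average} absolute difference
}

You would then pick thresholds for these statistics (or a correlation score) to decide what counts as a match and with what certainty.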
I want to develop an Android app that takes in an audio signal and generates its time-domain and frequency-domain values.
I have the audio in a buffer obtained from the Android mic. I want to use these buffer values to generate a frequency-domain graph. Is there some code that would help me compute the FFT of an audio signal?
I have seen Moonblink's Audalyzer code, but it has some missing components; I could find lots of missing modules. I want better logic that takes in audio and performs some calculations on it.
I found these for you using DuckDuckGo:
Android audio FFT to retrieve specific frequency magnitude using audiorecord
http://www.digiphd.com/android-java-reconstruction-fast-fourier-transform-real-signal-libgdx-fft/
This should help.
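As an illustration of the frequency-domain step, here is a minimal sketch of a naive DFT over one short PCM frame (a real app should use a proper FFT implementation such as the ones linked above; the frame size and sample rate are assumptions):

// Naive DFT: magnitude of each frequency bin for a short mono PCM frame.
// O(n^2), so only suitable for small frames; use a real FFT library for anything larger.
static double[] magnitudeSpectrum(short[] frame) {
    int n = frame.length;
    double[] mags = new double[n / 2]; // bins up to the Nyquist frequency
    for (int k = 0; k < mags.length; k++) {
        double re = 0, im = 0;
        for (int t = 0; t < n; t++) {
            double angle = 2 * Math.PI * k * t / n;
            re += frame[t] * Math.cos(angle);
            im -= frame[t] * Math.sin(angle);
        }
        mags[k] = Math.sqrt(re * re + im * im);
    }
    return mags;
}
// Bin k corresponds to frequency k * sampleRate / n Hz.

Plotting the input frame against time gives the time-domain view, and plotting the magnitudes against bin frequency gives the frequency-domain graph the question asks about.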