Android Mixing Audio Input

I've read a lot of questions on stackoverflow and other pages about this topic, but didn't find a real up-to-date solution:
In an Android-App I've got two audio files (local file system), which are encoded in mp3, ogg or wav.
I just want to play them in exact sync, be able to seek, and control the volume of each track individually. This isn't possible with MediaPlayer because of the well-known latency issues in Android.
So I think having two audio player instances (of whatever library) will always result in bad latency, so that doesn't seem to be the solution.
In my opinion the only solution is to mix the audio inputs into a single stream that can be played by one player. I read a lot about Android's AudioTrack and buffers and the OpenSL ES implementation, but always ended up at the same point: buffers only support raw PCM audio data. OK, so I have to decode the mp3/ogg myself?
My question now is: is there any library that can help me to a) do exactly what I want with a simple API, or b) decode mp3/ogg to memory so I can use that data with AudioTrack or OpenSL?
Whether it's native or Java is unimportant, it just has to work.
The minimum API level is 15 (Android 4.0.3, the most current version at the time of writing this question).
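
For illustration, the mixing step itself is small once both files have been decoded to PCM. The following is a minimal sketch, assuming both tracks are already decoded to signed 16-bit samples with the same sample rate and channel layout. The class and method names (TwoTrackMixer, mix) are hypothetical and not part of any Android API; decoding and playback are left out.

```java
/** Mixes two decoded 16-bit PCM tracks with a separate gain for each. */
final class TwoTrackMixer {

    /**
     * Mixes trackA and trackB sample by sample. Both arrays are assumed to use
     * the same sample rate and channel layout. Gains are linear (1.0 = unchanged).
     * The result is as long as the longer track; the shorter one is padded
     * with silence.
     */
    static short[] mix(short[] trackA, float gainA, short[] trackB, float gainB) {
        int length = Math.max(trackA.length, trackB.length);
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            float a = i < trackA.length ? trackA[i] * gainA : 0f;
            float b = i < trackB.length ? trackB[i] * gainB : 0f;
            int sum = Math.round(a + b);
            // Clamp to the 16-bit range so loud passages clip instead of wrapping around.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }
}
```

Because both tracks end up in one buffer that is written to a single player, they cannot drift apart, and per-track volume becomes a multiplication before the sum.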

Related

Real time audio analysis Android

I've got a rather complicated problem that I need to solve at work. It's pretty far out of my remit of "Android App Developer" - I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions for this, such as hardcoding the BPM into the app, in a map against the audio resources URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly. It assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement for streaming external audio from streaming service apps such as Spotify, a static analysis solution is pointless: it wouldn't work for the streaming service case, whereas a solution for the streaming service case would work for both cases.
Therefore, I have come to the conclusion that I somehow need to analyze the streamed audio on the fly, perhaps with FFT or peak detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there) but about the basic starting point of such a solution:
How might I go about A) getting the raw bytes of streamed audio for both the local file case and the external streaming service app case, and B) processing these bytes into a data structure representing the audio stream in a way amenable to running audio analysis algorithms on it?
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I've no idea how to even formulate a more coherent question.
Any help would be greatly appreciated!

I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming a local file, or when streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes to "something useful" I'd look at using the audio format to tell you how to read the incoming bytes. The format should tell you how many channels there are (mono or stereo), the encoding (e.g., signed PCM is common, but it might be normalized floats), the number of bits per value (16 is common) and the order of the bytes (big-endian vs. little-endian).
I know that there are posts that explain how to convert the raw audio bytes to PCM values based on this info, including some on Stack Overflow. They should be reachable via search. I think signed normalized floats are the most common data representation used for processing audio signals.
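
As a rough sketch of that conversion, assuming the incoming bytes are signed 16-bit PCM (the common case mentioned above) and that the byte order is known from the audio format: the class and method names (PcmBytes, toNormalizedFloats, stereoToMono) are made up for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Helpers for turning raw PCM bytes into floats suitable for analysis. */
final class PcmBytes {

    /** Converts signed 16-bit PCM bytes into floats in the range [-1.0, 1.0). */
    static float[] toNormalizedFloats(byte[] pcm, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.wrap(pcm).order(order);
        float[] samples = new float[pcm.length / 2]; // a trailing odd byte is ignored
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buffer.getShort() / 32768f;
        }
        return samples;
    }

    /** Averages interleaved stereo (L, R, L, R, ...) down to mono. */
    static float[] stereoToMono(float[] interleaved) {
        float[] mono = new float[interleaved.length / 2];
        for (int i = 0; i < mono.length; i++) {
            mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2f;
        }
        return mono;
    }
}
```

The resulting float array is the kind of structure an FFT or peak detector would typically take as input; other bit depths or float-encoded PCM would need a different read step.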

OpenSL - Poor Seeking with Audio Player Object

I'm writing an Android application that needs to be able to seek to specific points in a large mp3 audio file (~90minutes) with good accuracy.
Currently, I'm using an OpenSL approach with an audio player object with a URI data source that specifies the mp3 file and MIME information.
To test this out, I use the SLSeekITF interface on the player to seek to specific points (specified in milliseconds). However, I find that the seeking performance is poor and inconsistent. Often the audio is 1-10 seconds off from where it should be. Sometimes ahead, sometimes behind. Performance is a little bit better using shorter mp3 files, but nowhere near close enough.
The Seek modes ("accurate" and "fast") don't seem to make any difference on SLSeekITF.
On other platforms, I can get the seek position to be very accurate (< 50 ms, which is barely noticeable), so I know this is possible.
- Does anyone know how to get better accuracy out of the OpenSL audio player?
- Are there known issues with this implementation?
- Are there other mp3 decoders available that offer better performance?
Thanks

I also posted this question on the Google NDK Group:
https://groups.google.com/forum/#!topic/android-ndk/rzVr3A0DjBs
While I never got an official answer from anyone at Google, the feedback I received seems to indicate that Media Player and/or playing audio from a URI with OpenSL ES is known to be buggy.
I ended up solving this problem by using a 3rd party mp3 decoder with seeking ability and a Buffer Queue Audio Player object in OpenSL ES to play back the audio samples.
Hardly easy to do, but it works.
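
One reason this route can seek precisely is that, on the PCM side, a position in milliseconds maps directly to a sample frame. The sketch below only shows that arithmetic; it does not show how a particular mp3 decoder locates the matching compressed frame, and the names (PcmSeek, frameAt, byteOffsetAt) are invented for illustration.

```java
/** Maps a playback position in milliseconds to a location in PCM data. */
final class PcmSeek {

    /** Returns the index of the sample frame at positionMs, rounded down. */
    static long frameAt(long positionMs, int sampleRateHz) {
        return positionMs * sampleRateHz / 1000L;
    }

    /** Returns the byte offset of that frame in interleaved PCM data. */
    static long byteOffsetAt(long positionMs, int sampleRateHz, int channels, int bytesPerSample) {
        return frameAt(positionMs, sampleRateHz) * channels * bytesPerSample;
    }
}
```

At 44100 Hz one frame lasts about 0.023 ms, so the rounding here is far below the 50 ms target mentioned in the question.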

Streaming multiple OGG simultaneously in Android

I need to be able to play two or more (let's say, up to 5) short ogg files simultaneously. And by simultaneously I mean in perfect synchrony. I am able to load them to SoundPool and play, but this sometimes creates a noticeable difference in playback start time, which I want to get rid of.
From my understanding this can be avoided by mixing the PCM data into one buffer and playing that. But OGG files are not PCM and need to be decoded efficiently before playing, and latency must be very low, ideally playing as soon as the user presses the button. So I figured I need a way to stream OGG into PCM, and as I receive buffers I would mix them and feed them to AudioTrack. My requirement is Android 2.3.3+, so I cannot use any of the new codecs provided in Jelly Bean.
Also, although the OGGs themselves are small, there are a lot of them. So keeping them all decoded in memory (SoundPool or some pre-decoding) may cause problems too.
Can someone give me a tip on where to dig? Can OpenSL ES do that for me? Or should I think about integrating ffmpeg? And is it even possible to stream simultaneous files with low latency?
Thanks

You can play sounds using AssetPlayers, but yes, this sometimes creates a noticeable difference in playback start time.
So I recommend decoding the ogg using Ogg Vorbis (like here) and then using the resulting PCM buffer with a BufferPlayer.
By the way, check out these OpenSL ES wrappers:
https://github.com/Suvitruf/Android-ndk/tree/master/OpenSLES

How to decode MP3 in Android within app?

I'm currently working on an app that lets the user choose an MP3 audio file. The file is then processed by my app.
For this processing, the application would need to decode audio files to get the raw PCM output.
To decode MP3, I have two options:
1) Use the Android system to decode MP3 and get the PCM data.
2) Decode the MP3 myself on the phone, WITHOUT paying MP3 licensing fees.
My question is whether #1 is technically possible, and for #2, whether the MP3 license on the phone covers an app as well.

To my knowledge, there is no Android-provided way to decode MP3s.
I've used JLayer in the past, and can recommend it for MP3 processing. Using the NDK with a C++ library might be faster, but if you're looking to keep it in Java, that's what I'd use. It's still faster than real time: roughly 30 seconds to decode all frames of an average-bitrate 3-minute MP3. That's on a Galaxy S (1 GHz), so newer phones should be faster.
As far as licensing goes, I can't help you there. JLayer itself is LGPL, but the world of MP3 licensing is murkier than used motor oil. After a few days of searching for a concrete answer, I just gave up and did it. The world at large seems divided on who even holds the license in the first place.

The Android system can decode MP3 files now; see here, which describes the media codec, container, and network protocol support provided by the Android platform.
MediaCodec is a very powerful framework for encoding and decoding media files.

Option 1 is definitely not possible (unless you want to target ICS+ devices and are willing to write native C code to decode MP3s with OpenSL). Geobits' recommendation of JLayer is a good one. For the most part, dealing with JLayer is a breeze. Here's a good blog post that will help: http://mindtherobot.com/blog/624/android-audio-play-an-mp3-file-on-an-audiotrack/

Sound effect mixing with OpenSL on Android

I'm currently implementing a sound effect mixing on Android via OpenSL. I have an initial implementation going, but I've encountered some issues.
My implementation is as follows:
1) For each sound effect I create several AudioPlayer objects (one for each simultaneous sound) that use an SLDataLocator_AndroidFD data source that in turn refers to an OGG file. For example, if I have a gun firing sound (let's call it gun.ogg) that is played in rapid succession, I use around 5 AudioPlayer objects that refer to the same gun.ogg audio source and also the same outputmix object.
2) When I need to play that sound effect, I search through all the AudioPlayer objects I created and find one that isn't currently in the SL_PLAYSTATE_PLAYING state and use it to play the effect.
3) Before playing a clip, I seek to the start of it using SLPlayItf::SetPosition.
This is working out alright so far, but there is some crackling noise that occurs when playing sounds in rapid succession. I read on the Android NDK newsgroup that OpenSL on Android has problems with switching data sources. Has anyone come across this issue?
I'm also wondering if anyone else has seen or come up with a sound mixing approach for OpenSL on Android. If so, does your approach differ from mine? Any advice on the crackling noise?
I've scoured the internet for OpenSL documentation and example code, but there isn't much out there with regards to mixing (only loading which I've figured out already). Any help would be greatly appreciated.

This is probably not the best approach (creating many instances of audio players). Unfortunately the Android version (2.3) of OpenSL ES doesn't support SLDynamicSourceItf, which would be similar to OpenAL's source binding interface. One approach would be to create multiple stream players. You would then search for a stream player that isn't currently playing and start streaming your sound effect to it from memory. It's not ideal but it's doable.
You should probably not use the ogg format for sound effects either. You're better off with WAV (PCM), as it won't need to be decoded.
Ogg is fine for streaming background music.
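
The "find a player that isn't busy and feed it from memory" idea can be sketched in plain Java. Note that this is a swapped-in variant, not the OpenSL ES approach itself: instead of several stream players it keeps a fixed pool of voices and mixes them in software into one output buffer, which could then be handed to a single player. It assumes the effects are already decoded to 16-bit PCM; VoicePool and its methods are hypothetical names.

```java
/** A fixed pool of voices that plays decoded PCM clips and mixes them in software. */
final class VoicePool {
    private final short[][] clips;  // clip assigned to each voice, or null if idle
    private final int[] positions;  // next sample to read for each voice

    VoicePool(int voiceCount) {
        clips = new short[voiceCount][];
        positions = new int[voiceCount];
    }

    /** Starts the clip on the first idle voice. Returns its index, or -1 if none is free. */
    int play(short[] clip) {
        if (clip.length == 0) return -1; // nothing to play
        for (int v = 0; v < clips.length; v++) {
            if (clips[v] == null) {
                clips[v] = clip;
                positions[v] = 0;
                return v;
            }
        }
        return -1;
    }

    /** Fills out with the mix of all active voices and frees voices that finish. */
    void render(short[] out) {
        for (int i = 0; i < out.length; i++) {
            int sum = 0;
            for (int v = 0; v < clips.length; v++) {
                if (clips[v] == null) continue;
                sum += clips[v][positions[v]++];
                if (positions[v] == clips[v].length) clips[v] = null; // voice is idle again
            }
            // Clamp so that several loud voices clip instead of wrapping around.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
    }
}
```

The same clip array can be shared by several voices, so a rapidly repeated effect such as gun.ogg would need to be decoded only once.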
