I'm very new to OpenSL ES. I'm currently experimenting with its recording and playback features on Android. Right now I have a recording function which stores data in a buffer queue, and I can then play the buffer queue back. Would anyone be able to explain how I can correctly manipulate the data in the buffer queue so the playback sounds different from the recording?
My current configuration:
sampleFormat.pcmFormat_ = static_cast<uint16_t>(engine.bitsPerSample_);
//the buffer
uint8_t *buf_;
Is there any type of conversion or decoding I need to do to the data in the buffer before manipulating it?
I would really appreciate some help.
Your question is broad; what I can do is tell you how you are supposed to use it, and how you could manipulate audio data obtained from recording.
1) Once you set up your OpenSL ES engine, recorder and player properly (there are many examples out there), you have given OpenSL ES a buffer into which it reads PCM data from the mic, a buffer from which it reads the data you provide for the playback sink, and two callback functions that are called upon completion. Once the process of reading data has finished (after some time that depends on your settings: sample rate, buffer size, etc.), the record callback is called from a thread created by OpenSL ES. Depending on the device and configuration, this may be a high-priority thread, usually called a fast track, so inside the callback you are not working on your own thread but on OpenSL ES' thread, and you have to be careful not to do blocking operations there. Now, if you want to play the audio back as fast as possible, do your audio signal processing inside the callback; if response time is not that important to you, you may instead use the callback as a signal for your own thread to start reading and processing the audio data in the buffer as you wish. In both cases, to play the audio back you must enqueue the data (processed or unprocessed) for playback (playback likewise calls the player callback upon finishing).
2) Now, if you want to process audio, you need to apply filters. There are many kinds of audio signal filters; for real-time playback you should look for dynamic filters (some filters require a lot of data before they can start processing and may be bad for real time, while others are optimized to work on small chunks of data and adapt their output dynamically). You would then chain filters in a certain order to obtain what you want. The audio world is huge and you need to read quite a lot to start understanding audio processing. Audio performance is another matter and depends directly on the device you have (hardware and software).
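For instance, here's a minimal sketch of a dynamic filter: a one-pole low-pass applied in place to 16-bit mono PCM samples (the class name and coefficient value are just illustrative). It only needs the previous output sample as state, so it can run on small chunks inside a real-time callback:

// One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])
public final class OnePoleLowPass {
    private float previous = 0f; // filter state carried across chunks
    private final float alpha;   // smoothing factor in (0, 1]; smaller = darker sound

    public OnePoleLowPass(float alpha) {
        this.alpha = alpha;
    }

    public void process(short[] pcm, int length) {
        for (int i = 0; i < length; i++) {
            previous += alpha * (pcm[i] - previous);
            pcm[i] = (short) previous;
        }
    }
}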
3) How you interpret the bytes in the buffer depends on your processor, for instance its endianness: some processors work with little endian and some with big endian, so you may receive your data in big-endian format. There is no compression, so the PCM data is ready for processing. (If you would like to create a WAV from it, you only need to prepend a WAV header and place the PCM data in the header's data chunk; if you want another format such as MP3, you also need to run your data through the compression algorithm for that format and wrap the result in the proper header.)
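For example, if the recorder hands you big-endian bytes and you need native shorts, a small hypothetical helper like this (using java.nio) does the reinterpretation:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Interpret raw bytes as 16-bit PCM samples, honoring the source endianness.
public static short[] toSamples(byte[] raw, boolean bigEndian) {
    ByteBuffer bb = ByteBuffer.wrap(raw)
            .order(bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN);
    short[] samples = new short[raw.length / 2];
    bb.asShortBuffer().get(samples);
    return samples;
}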
Also, to play data back through OpenSL ES you need uncompressed audio, so you can't play an MP3 directly; you need to decode it into PCM data first.
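On Android, one way to do that decoding is MediaExtractor plus MediaCodec. Here's a rough sketch (the file path is hypothetical, and error handling is trimmed for brevity):

import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import java.nio.ByteBuffer;

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/path/to/input.mp3"); // hypothetical input file
MediaFormat format = extractor.getTrackFormat(0);
extractor.selectTrack(0);

MediaCodec codec = MediaCodec.createDecoderByType(format.getString(MediaFormat.KEY_MIME));
codec.configure(format, null, null, 0);
codec.start();

MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
boolean inputDone = false;
while (true) {
    if (!inputDone) {
        int inIndex = codec.dequeueInputBuffer(10_000);
        if (inIndex >= 0) {
            ByteBuffer in = codec.getInputBuffer(inIndex);
            int size = extractor.readSampleData(in, 0);
            if (size < 0) { // no more compressed data
                codec.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                inputDone = true;
            } else {
                codec.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                extractor.advance();
            }
        }
    }
    int outIndex = codec.dequeueOutputBuffer(info, 10_000);
    if (outIndex >= 0) {
        ByteBuffer out = codec.getOutputBuffer(outIndex);
        byte[] pcm = new byte[info.size];
        out.get(pcm); // raw PCM, ready to enqueue into your OpenSL ES player
        codec.releaseOutputBuffer(outIndex, false);
        if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) break;
    }
}
codec.stop();
codec.release();
extractor.release();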
This is the basic functioning of OpenSL ES; I hope that answers your question. If something is unclear, let me know.
PS: Android says audio manipulation is easier now with the newer AAudio library, which promises to accomplish the same tasks as OpenSL ES with a third of its complexity (some people have encountered latency issues, but I bet they are being fixed as you read this).
Related
I've got a rather complicated problem that I need to solve at work. It's pretty far outside my remit as an "Android app developer"; I would class it as a very specialized audio engineering problem.
I am tasked with developing an application, which needs to be able to stream either a local audio file or audio from streaming service apps such as, but not limited to, Spotify, to another device over Bluetooth.
In addition, the app needs to be able to estimate the BPM of the streamed audio (it is assumed all audio will be musical) and use this BPM value to control the playback speed of a lighting sequence.
This question is about how to estimate the BPM of the streamed music.
For the case where the audio file is local, I can think of some solutions, such as hardcoding the BPM into the app in a map keyed by the audio resource's URL.
I have also investigated and experimented with a "static" library (aubio) that can estimate BPM from an audio file, but not on the fly, and it assumes .wav format. This won't be sufficient for what we are trying to achieve here.
However, given the requirement to stream external audio from streaming service apps such as Spotify, a static analysis solution is pointless: it wouldn't work for the streaming-service case, whereas a solution for the streaming-service case would work for both cases.
Therefore, I have come to the conclusion that somehow, I need to on the fly analyze the streamed audio, perhaps with FFT or peak detection algorithms.
This question isn't about the actual BPM estimation algorithm itself (or the implementation details of how I would get there); it is about the basic starting point of such a solution:
How might I go about A) getting the raw bytes of streamed audio, for both the local-file case and the external streaming-service app case, and B) processing these bytes into a data structure that represents the audio stream in a way amenable to running audio analysis algorithms on it?
I realize this is a very open-ended, quite vague question, but this is so far out of my comfort zone that I've no idea how to even formulate a more coherent one.
Any help would be greatly appreciated!
I'd start by creating some separate, more tightly defined questions for the different pieces. For example, ask how to get access to the raw bytes when streaming a local file, or when streaming URL-sourced audio. Android has some nice support for streaming, including the ability to stream PCM, so I'd be pretty surprised if getting a hook for access to the byte stream were not possible.
Once you have a hooking point, to convert the bytes to "something useful" I'd look at using the audio format to tell you how to read the incoming bytes. The format should tell you how many channels (mono or stereo), the encoding (e.g., signed PCM is common, might be normalized floats), the number of bits per value (16 is common) and the order of the bytes (big-endian vs little endian).
I know that there are posts that explain how to convert the raw audio bytes to PCM values based on this info, including some on Stack Overflow; they should be reachable via search. I think signed normalized floats are the most common data representation used for processing audio signals.
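As a starting point, here's a sketch of that conversion, assuming what the format most commonly reports (16-bit signed little-endian PCM); adapt the byte order and channel handling to whatever your format object actually says:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Convert 16-bit signed little-endian PCM bytes into normalized floats
// in [-1.0, 1.0), the representation most analysis algorithms expect.
public static float[] toNormalizedFloats(byte[] raw, int length) {
    ByteBuffer bb = ByteBuffer.wrap(raw, 0, length).order(ByteOrder.LITTLE_ENDIAN);
    float[] out = new float[length / 2];
    for (int i = 0; i < out.length; i++) {
        out[i] = bb.getShort() / 32768f; // scale signed 16-bit into [-1, 1)
    }
    return out;
}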
I am trying to play raw sound data using the AudioTrack class in Android. I am using the write method, but I noticed that there is a latency between when the write method returns and when the sound is actually played. To keep it simple, let us use the AudioRecord class as in the following pseudocode:
// init AudioTrack (MODE_STREAM) and call audioTrack.play()
// init AudioRecord and call audioRecord.startRecording()
while (true) {
    byte[] buffer = new byte[1000];
    int read = audioRecord.read(buffer, 0, 1000);
    audioTrack.write(buffer, 0, read);
}
I expect a latency of read / sampleRate seconds, but the sound is actually played after an extra delay of about 0.5 seconds. I really need the audio to be played with minimum latency, so does anyone have an explanation of what is going on, and is there any available solution, or should I accept this as a hardware issue?
I'm assuming your goal is to come up with some interactive audio solution (that is, where sound is played in response to some user action), because in this scenario low latency really matters.
On Android, to achieve the lowest latency you need to use the OpenSL ES API, which is available to native (C++) code via the NDK. The only Java-side mechanism that can achieve low latency is the SoundPool class, but it has limitations in what kinds of sounds you can play.
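For illustration, a minimal SoundPool sketch (API 21+ builder; the resource R.raw.click and the context variable are placeholders):

import android.media.AudioAttributes;
import android.media.SoundPool;

AudioAttributes attrs = new AudioAttributes.Builder()
        .setUsage(AudioAttributes.USAGE_GAME)
        .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
        .build();
SoundPool pool = new SoundPool.Builder()
        .setMaxStreams(4)
        .setAudioAttributes(attrs)
        .build();
// load() is asynchronous, so wait for the completion callback before playing.
int soundId = pool.load(context, R.raw.click, 1);
pool.setOnLoadCompleteListener((sp, id, status) ->
        sp.play(id, 1f, 1f, 1, 0, 1f)); // leftVol, rightVol, priority, loop, rate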
For more information, see the page on high-performance audio, and also check out this SO answer: Low-latency audio playback on Android
I am trying to stream audio through a server. I have set up everything, and it's working fine for recording and playing back static audio, but when I try to stream audio there is a delay on the playback side.
I did a Google search, but couldn't find the proper way of doing this. I am using the AudioRecord and AudioTrack Android media APIs for sending and receiving audio data. Can anybody tell me how to handle this delay?
I have added my code on Google Groups to give a clearer picture.
I tried it this way: holding 5 chunks of audio data from the server in a buffer, playing back once 5 chunks have accumulated, then fetching the next 5 chunks, and so on up to 1024 bytes of data (the data is written to the AudioTrack and its play method is called). This too has a delay. Any other solutions?
If you're really trying to do this unbuffered, make sure whatever playback tool you're using is trying to play it back without a buffer; even so, you will be hard-pressed not to have a delay. Nothing on TV, radio, etc. is really 'live': there is always some kind of delay. With internet streams, you're sending a large amount of data constantly. Even besides the time it takes to travel, all this data has to be kept in a particular order, and nobody wants choppy playback while the end user's computer attempts it. I've had Flash players for major networks keep massive cache files on my computer while handling playback, yet their players do not skip or wait to buffer. (If you load something up and notice a few hundred MB of extra memory being used, maybe even more during playback, that's what that is.)
You might be able to get away with a very small buffer (the standard in the past used to be 30-60 seconds, and a lot of players still default to this) using VLC. I have been able to set its buffer very low, but only on incredibly low-quality streams/videos. The big problem, I'd guess, is that your playback end is setting the buffer: if your playback is set to a 60-second buffer, it doesn't matter what you do server-side; the client will wait until it has that much of a chunk and then begin playback.
I wrote an iPhone app some time ago that creates sound programmatically. It uses an AudioQueue to generate sound. With the AudioQueue, I can register for a callback whenever the system needs sound, and respond by filling a buffer with raw audio data. The buffers are small, so the sound can respond to user input with reasonably low latency.
I'd like to do a similar app on Android, but I'm not sure how. The MediaPlayer and SoundPool classes seem to be for playing canned media from files, which is not what I need. The JetPlayer appears to be some sort of MIDI playback engine.
Is there an equivalent to AudioQueue in the Android Java API? Do I have to use native code to accomplish what I want?
Thanks.
With the AudioQueue, I can register for a callback whenever the system needs sound, and respond by filling a buffer with raw audio data.
The closest analogy to this in Android is AudioTrack. Rather than the callback (pull) mechanism you are using, AudioTrack is more of a push model, where you keep writing to the track (presumably in a background thread) using blocking calls.
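A minimal sketch of that push model, assuming you just want a continuous 440 Hz tone (all parameter choices here are illustrative):

import android.media.AudioFormat;
import android.media.AudioManager;
import android.media.AudioTrack;

int sampleRate = 44100;
int minBuf = AudioTrack.getMinBufferSize(sampleRate,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT);
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate,
        AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT,
        minBuf, AudioTrack.MODE_STREAM);
track.play();

new Thread(() -> {
    short[] chunk = new short[minBuf / 2];
    double phase = 0, step = 2 * Math.PI * 440.0 / sampleRate;
    while (!Thread.currentThread().isInterrupted()) {
        for (int i = 0; i < chunk.length; i++) {
            chunk[i] = (short) (Math.sin(phase) * 8000); // modest amplitude
            phase += step;
        }
        track.write(chunk, 0, chunk.length); // blocks until there is room
    }
}).start();

Because write() blocks once the track's internal buffer is full, the synthesis loop is naturally paced by playback, which is about as close as the Java API gets to the AudioQueue pull model without dropping to native code.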
I'm looking for a way to programmatically save an array of shorts as PCM data. I know that this should be possible, but I haven't found a very easy way to do this on Android.
Essentially, I'm taking voltage data, and I want to save it in PCM format. My function looks something like this:
public void audifySignal(short[] signal) {
// Create a WAV file from the incoming signal
}
Any suggestions would be awesome, or even references. It seems like the audio APIs built into Android are geared more toward recording directly from the mic than toward lower-level signal-processing work (at least when it comes to saving raw data to a file). I'd also like to avoid having to manually write the PCM file headers and whatnot...
Thanks!
Sam, I dunno about Android-specific libraries, but I'll go ahead and say this:
Raw PCM data is pretty straightforward; it's generally just sequential sample data. Maybe you need to understand the WAV format in order to understand what PCM is and how it works.
WAV is fairly widely used as a container for uncompressed audio. Gaining an understanding of how the WAV file contains the data will cast a fair bit of light on how raw digital audio works in general.
This page helped me a fair bit:
http://www.sonicspot.com/guide/wavefiles.html
Interestingly, you can more or less fire ANY data at a sound card and it'll play it. It'll probably sound crazy to us humans, as the sound card doesn't care whether it comes out garbled or not.
Whether it sounds pleasing to the ear will depend on whether you've provided the correct sample size, number of channels, frequency, and PCM data that conforms to all of the former.
You see, you can't "detect" the sample size, the number of channels or the correct frequency from the raw PCM data itself. You have to store this crucial data ALONG with the PCM data so that other pieces of software can let the sound card know how to handle your PCM data.
That's where the WAV container format comes in.
There are other formats, but WAV is pretty commonplace and therefore a good place to start; see the sketch below.
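To make that concrete, here's a rough sketch of saving a short[] (like the audifySignal input above) as a 16-bit mono WAV; the header fields follow the canonical RIFF/WAVE layout described on that page:

import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public static void writeWav(String path, short[] signal, int sampleRate) throws IOException {
    int dataSize = signal.length * 2; // 2 bytes per 16-bit sample
    try (DataOutputStream out = new DataOutputStream(new FileOutputStream(path))) {
        out.writeBytes("RIFF");
        out.write(intLE(36 + dataSize));  // total chunk size
        out.writeBytes("WAVEfmt ");
        out.write(intLE(16));             // fmt sub-chunk size
        out.write(shortLE((short) 1));    // audio format: 1 = PCM
        out.write(shortLE((short) 1));    // channels: mono
        out.write(intLE(sampleRate));
        out.write(intLE(sampleRate * 2)); // byte rate = rate * channels * 2
        out.write(shortLE((short) 2));    // block align
        out.write(shortLE((short) 16));   // bits per sample
        out.writeBytes("data");
        out.write(intLE(dataSize));
        for (short s : signal) out.write(shortLE(s)); // samples, little-endian
    }
}

private static byte[] intLE(int v) {
    return new byte[] { (byte) v, (byte) (v >> 8), (byte) (v >> 16), (byte) (v >> 24) };
}

private static byte[] shortLE(short v) {
    return new byte[] { (byte) v, (byte) (v >> 8) };
}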
Cheers
Tristen
You can use Android's AudioTrack to write raw PCM data that you want played, but it does not generate the WAV file for you.