I use oboe to play sounds in my NDK library, and I use OpenSL with the Android extensions to decode wav files into PCM. The decoded signed 16-bit PCM samples are stored in memory (in a std::forward_list<int16_t>) and then fed into the oboe stream via a callback. The sound I hear from my phone matches the original wav file in volume, but not in quality: it bursts and crackles.
My guess is that I am sending the PCM to the audio stream in the wrong order or format (sampling rate?). How can I use OpenSL decoding with an oboe audio stream?
To decode files to PCM, I use AndroidSimpleBufferQueue as a sink, and AndroidFD with AAssetManager as a source:
// Loading asset
AAsset* asset = AAssetManager_open(manager, path, AASSET_MODE_UNKNOWN);
off_t start, length;
int fd = AAsset_openFileDescriptor(asset, &start, &length);
AAsset_close(asset);
// Creating audio source
SLDataLocator_AndroidFD loc_fd = { SL_DATALOCATOR_ANDROIDFD, fd, start, length };
SLDataFormat_MIME format_mime = { SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED };
SLDataSource audio_source = { &loc_fd, &format_mime };
// Creating audio sink
SLDataLocator_AndroidSimpleBufferQueue loc_bq = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 1 };
SLDataFormat_PCM pcm = {
    .formatType = SL_DATAFORMAT_PCM,
    .numChannels = 2,
    .samplesPerSec = SL_SAMPLINGRATE_44_1,
    .bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16,
    .containerSize = SL_PCMSAMPLEFORMAT_FIXED_16,
    .channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT,
    .endianness = SL_BYTEORDER_LITTLEENDIAN
};
SLDataSink sink = { &loc_bq, &pcm };
Then I register a callback, enqueue buffers, and move the PCM from the buffer into storage until decoding is done (a rough sketch of that callback is shown after the note below).
NOTE: the wav audio file is also 2-channel signed 16-bit 44.1 kHz PCM.
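The decode callback itself is roughly the following sketch (the decoder_state / staging names are illustrative rather than my exact code, and the storage is shown as a std::vector here; in my real code it was the std::forward_list mentioned above):

#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr size_t kBufSamples = 8192;   // size of the staging buffer, in samples

// Illustrative decoder state -- not my exact member names.
struct decoder_state {
    int16_t staging[kBufSamples];      // buffer enqueued on the decoder's buffer queue
    std::vector<int16_t> pcm_data;     // in-memory PCM storage
};

// Buffer-queue callback: each call means `staging` holds a freshly decoded block.
void decode_callback(SLAndroidSimpleBufferQueueItf bq, void *context) {
    auto *st = static_cast<decoder_state *>(context);
    // move the decoded block into storage
    st->pcm_data.insert(st->pcm_data.end(), st->staging, st->staging + kBufSamples);
    // hand the staging buffer back so the decoder can fill the next block
    // (a real callback also has to detect end-of-stream and stop re-enqueueing)
    (*bq)->Enqueue(bq, st->staging, sizeof(st->staging));
}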
My oboe stream configuration is the same:
AudioStreamBuilder builder;
builder.setChannelCount(2);
builder.setSampleRate(44100);
builder.setCallback(this);
builder.setFormat(AudioFormat::I16);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
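The stream is then opened and started roughly along these lines (a sketch of the standard oboe calls rather than my exact code, assuming the same using-declarations as above):

AudioStream *stream = nullptr;
Result result = builder.openStream(&stream);
if (result != Result::OK) {
    // handle the error, e.g. log convertToText(result)
}
stream->requestStart();   // onAudioReady() is now called for every burst of frames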
Audio rendering works like this:
// Oboe stream callback
DataCallbackResult audio_engine::onAudioReady(AudioStream* self, void* audio_data, int32_t num_frames) {
    auto stream = static_cast<int16_t*>(audio_data);
    sound->render(stream, num_frames);
    return DataCallbackResult::Continue;
}

// Sound::render method
void sound::render(int16_t* audio_data, int32_t num_frames) {
    auto iter = pcm_data.begin();
    std::advance(iter, cur_frame);
    const int32_t rem_size = std::min(num_frames, size - cur_frame);
    for (int32_t i = 0; i < rem_size; ++i, std::next(iter), ++cur_frame) {
        audio_data[i] += *iter;
    }
}
It looks like your render() method is confusing samples and frames.
A frame is a set of simultaneous samples.
In a stereo stream, each frame has TWO samples.
I think your iterator works on a sample basis. In other words next(iter) will advance to the next sample, not the next frame. Try this (untested) code.
void sound::render(int16_t* audio_data, int32_t num_frames) {
    auto iter = pcm_data.begin();
    const int samples_per_frame = 2; // stereo
    std::advance(iter, cur_sample);
    const int32_t num_samples = std::min(num_frames * samples_per_frame,
                                         total_samples - cur_sample);
    for (int32_t i = 0; i < num_samples; ++i, ++iter, ++cur_sample) {
        audio_data[i] += *iter;
    }
}
In short: I was experiencing underruns because I used std::forward_list to store the PCM. When retrieving PCM through iterators like this, the container's iterator needs to satisfy LegacyRandomAccessIterator (e.g. std::vector).
I had assumed that the linear complexity of std::advance and std::next made no difference in my sound::render method. However, when I tried raw pointers and pointer arithmetic (constant complexity) together with the debugging approach suggested in the comments (extracting the PCM from the WAV with Audacity and loading that asset with AAssetManager directly into memory), I realized that the amount of "corruption" in the output sound was directly proportional to the position argument of std::advance(iter, position) in the render method.
So, since the amount of sound corruption was directly proportional to the cost of std::advance (and std::next), I had to make that cost constant -- by using std::vector as the container. Combined with the answer from @philburk, this is the working result:
class sound {
private:
    const int samples_per_frame = 2; // stereo
    std::vector<int16_t> pcm_data;
    ...

public:
    void render(int16_t* audio_data, int32_t num_frames) {
        auto iter = std::next(pcm_data.begin(), cur_sample);
        const int32_t s = std::min(num_frames * samples_per_frame,
                                   total_samples - cur_sample);
        for (int32_t i = 0; i < s; ++i, std::advance(iter, 1), ++cur_sample) {
            audio_data[i] += *iter;
        }
    }
};
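Since std::vector is randomly accessible, the same render method can also be written without iterators at all, indexing into the buffer directly. A sketch of that variant, using the same members as above:

void render(int16_t* audio_data, int32_t num_frames) {
    const int32_t wanted    = num_frames * samples_per_frame;
    const int32_t remaining = total_samples - cur_sample;
    const int32_t count     = std::min(wanted, remaining);
    for (int32_t i = 0; i < count; ++i) {
        audio_data[i] += pcm_data[cur_sample + i];   // mix this sound into the output buffer
    }
    cur_sample += count;
}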
I'm working on an Android app that plays back audio. To minimize latency I'm using C++ via JNI and playing the audio with the C++ library oboe.
Currently, before playback, the app has to decode the given file (e.g. an mp3) and then play back the decoded raw audio stream. This leads to a noticeable wait before playback starts when the file is bigger.
So I would like to do the decoding beforehand, save it, and when playback is requested just play the decoded data from the saved file.
I have next to no knowledge of how to do proper file I/O in C++ and have a hard time wrapping my head around it. It is possible that my problem can be solved just with the right library; I'm not sure.
So currently I am saving my file like this:
bool Converter::doConversion(const std::string& fullPath, const std::string& name) {
    // here I'm setting up the extractor and necessary inputs. Omitted since not relevant

    // this is where the decoder is called to decode a file to raw audio
    constexpr int kMaxCompressionRatio{12};
    const long maximumDataSizeInBytes = kMaxCompressionRatio * (size) * sizeof(int16_t);
    auto decodedData = new uint8_t[maximumDataSizeInBytes];

    int64_t bytesDecoded = NDKExtractor::decode(*extractor, decodedData);
    auto numSamples = bytesDecoded / sizeof(int16_t);

    auto outputBuffer = std::make_unique<float[]>(numSamples);

    // This block is necessary to get the correct format for oboe.
    // The NDK decoder can only decode to int16, we need to convert to floats
    oboe::convertPcm16ToFloat(
            reinterpret_cast<int16_t *>(decodedData),
            outputBuffer.get(),
            bytesDecoded / sizeof(int16_t));

    // This is how I currently save my outputBuffer to a file. This produces a file on the disc.
    std::string outputSuffix = ".pcm";
    std::string outputName = std::string(mFolder) + name + outputSuffix;
    std::ofstream outfile(outputName.c_str(), std::ios::out | std::ios::binary);
    outfile.write(reinterpret_cast<const char *>(&outputBuffer), sizeof outputBuffer);

    return true;
}
So I believe I take my float array, convert it to a char array, and save it. I am not certain this is correct, but that is my best understanding of it.
There is a file afterwards, anyway.
Edit: As I found out when analyzing my saved file, I only store 8 bytes.
Now how do I load this file again and restore the contents of my outputBuffer?
Currently I have this bit, which is clearly incomplete:
StorageDataSource *StorageDataSource::openPCM(const char *fileName, AudioProperties targetProperties) {
    long bufferSize;
    char *buffer;

    std::ifstream stream(fileName, std::ios::in | std::ios::binary);
    stream.seekg(0, std::ios::beg);
    bufferSize = stream.tellg();
    buffer = new char[bufferSize];
    stream.read(buffer, bufferSize);
    stream.close();
If this is correct, what do I have to do to restore the data as the original type? If I am doing it wrong, how does it work the right way?
I figured out how to do it thanks to @Michael's comments.
This is how I save my data now:
bool Converter::doConversion(const std::string& fullPath, const std::string& name) {
    // here I'm setting up the extractor and necessary inputs. Omitted since not relevant

    // this is where the decoder is called to decode a file to raw audio
    constexpr int kMaxCompressionRatio{12};
    const long maximumDataSizeInBytes = kMaxCompressionRatio * (size) * sizeof(int16_t);
    auto decodedData = new uint8_t[maximumDataSizeInBytes];

    int64_t bytesDecoded = NDKExtractor::decode(*extractor, decodedData);
    auto numSamples = bytesDecoded / sizeof(int16_t);

    // converting to float has moved to the reading function, so now I save decodedData directly.
    std::string outputSuffix = ".pcm";
    std::string outputName = std::string(mFolder) + name + outputSuffix;
    std::ofstream outfile(outputName.c_str(), std::ios::out | std::ios::binary);
    outfile.write((char*) decodedData, numSamples * sizeof(int16_t));

    return true;
}
And this is how I read the stored file again:
long bufferSize;
char * inputBuffer;
std::ifstream stream;
stream.open(fileName, std::ifstream::in | std::ifstream::binary);
if (!stream.is_open()) {
    // handle error
}
stream.seekg (0, std::ios::end); // seek to the end
bufferSize = stream.tellg(); // get size info, will be 0 without seeking to the end
stream.seekg (0, std::ios::beg); // seek to beginning
inputBuffer = new char [bufferSize];
stream.read(inputBuffer, bufferSize); // the actual reading into the buffer. would be null without seeking back to the beginning
stream.close();
// done reading the file.
auto numSamples = bufferSize / sizeof(int16_t); // calculate my number of samples, so the audio is correctly interpreted
auto outputBuffer = std::make_unique<float[]>(numSamples);
// the decoding bit now happens after the file is open. This avoids confusion
// The NDK decoder can only decode to int16, we need to convert to floats
oboe::convertPcm16ToFloat(
reinterpret_cast<int16_t *>(inputBuffer),
outputBuffer.get(),
bufferSize / sizeof(int16_t));
// here I continue working with my outputBuffer
The important bits of C++ understanding I was missing were:
a) the size of a pointer is not the same as the size of the data it points to, and
b) how seeking in a stream works -- I needed to put the needle back to the start before I would find any data in my buffer.
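Point (a) is easy to see in a small sketch: with a heap buffer, sizeof measures the handle rather than the payload, so the byte count has to come from the element count (illustrative helper, not my actual code):

#include <cstddef>
#include <fstream>

void savePcmFloats(const char *path, const float *data, size_t numSamples) {
    std::ofstream out(path, std::ios::out | std::ios::binary);
    // sizeof(data) would be the size of the pointer itself (typically 8 bytes),
    // so the payload size is computed from the element count instead.
    out.write(reinterpret_cast<const char *>(data), numSamples * sizeof(float));
}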
I'm working on a native Android project and trying to use OpenSL to play some audio effects. Working from the native-audio sample project that VisualGDB provides, I've written the code posted below.
Near the end, you can see I have commented out a line that enqueues the contents of a variable called hello as the buffer. hello comes from the sample project and contains about 700 lines of character bytes like this:
"\x02\x00\x01\x00\xff\xff\x09\x00\x0c\x00\x10\x00\x07\x00\x07\x00"
which make up an audio clip of someone saying "hello". When enqueueing that byte data, my code works fine and I hear "hello" when I run the application. When I read from the wav file to play the asset I want, however, I only hear static. The size of the data buffer is the same as the size of the file, so it appears to be read in properly. The static plays for the duration of the wav file (or very close to it).
I really know nothing about data formats or audio programming. I've tried tweaking the format_pcm variables with different enum values, but had no success. Using a tool called GSpot that I found online, I know the following about the audio file I'm trying to play:
File Size: 557 KB (570,503 bytes) (this is the same size as the data buffer AAsset_read returns)
Codec: PCM Audio
Sample rate: 48000 Hz
Bit rate: 1152 kb/s
Channels: 1
Any help or direction would be greatly appreciated.
SLDataLocator_AndroidSimpleBufferQueue loc_bufq = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 1 };
SLDataFormat_PCM format_pcm;
format_pcm.formatType = SL_DATAFORMAT_PCM;
format_pcm.numChannels = 1;
format_pcm.samplesPerSec = SL_SAMPLINGRATE_48;// SL_SAMPLINGRATE_8;
format_pcm.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_8; // SL_PCMSAMPLEFORMAT_FIXED_16;
format_pcm.containerSize = 16;
format_pcm.channelMask = SL_SPEAKER_FRONT_CENTER;
format_pcm.endianness = SL_BYTEORDER_LITTLEENDIAN;
SLDataSource audioSrc = { &loc_bufq, &format_pcm };
// configure audio sink
SLDataLocator_OutputMix loc_outmix = { SL_DATALOCATOR_OUTPUTMIX, manager->GetOutputMixObject() };
SLDataSink audioSnk = { &loc_outmix, NULL };
//create audio player
const SLInterfaceID ids[3] = { SL_IID_BUFFERQUEUE, SL_IID_EFFECTSEND, SL_IID_VOLUME };
const SLboolean req[3] = { SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE };
SLEngineItf engineEngine = manager->GetEngine();
result = (*engineEngine)->CreateAudioPlayer(engineEngine, &bqPlayerObject, &audioSrc, &audioSnk,
3, ids, req);
// realize the player
result = (*bqPlayerObject)->Realize(bqPlayerObject, SL_BOOLEAN_FALSE);
// get the play interface
result = (*bqPlayerObject)->GetInterface(bqPlayerObject, SL_IID_PLAY, &bqPlayerPlay);
// get the buffer queue interface
result = (*bqPlayerObject)->GetInterface(bqPlayerObject, SL_IID_BUFFERQUEUE,
&bqPlayerBufferQueue);
// register callback on the buffer queue
result = (*bqPlayerBufferQueue)->RegisterCallback(bqPlayerBufferQueue, bqPlayerCallback, NULL);
// get the effect send interface
result = (*bqPlayerObject)->GetInterface(bqPlayerObject, SL_IID_EFFECTSEND,
&bqPlayerEffectSend);
// get the volume interface
result = (*bqPlayerObject)->GetInterface(bqPlayerObject, SL_IID_VOLUME, &bqPlayerVolume);
// set the player's state to playing
result = (*bqPlayerPlay)->SetPlayState(bqPlayerPlay, SL_PLAYSTATE_PLAYING);
uint8* pOutBytes = nullptr;
uint32 outSize = 0;
result = MyFileManager::GetInstance()->OpenFile(m_strAbsolutePath, (void**)&pOutBytes, &outSize, true);
const char* filename = m_strAbsolutePath->GetUTF8String();
result = (*bqPlayerBufferQueue)->Enqueue(bqPlayerBufferQueue, pOutBytes, outSize);
// result = (*bqPlayerBufferQueue)->Enqueue(bqPlayerBufferQueue, hello, sizeof(hello));
if (SL_RESULT_SUCCESS != result) {
return JNI_FALSE;
}
Several things were to blame. The format of the wave files I was testing with was not what the specification described -- there seemed to be a lot of empty data after the first chunk of header data. Also, the buffer passed to the queue needs to be a char* of just the wav data, not the header; I'd wrongly assumed the queue parsed the header out.
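A minimal sketch of that last point -- scanning the in-memory file for the "data" chunk and enqueueing only its payload (illustrative helper, not my exact code; it assumes a well-formed, little-endian RIFF/WAVE file):

#include <cstdint>
#include <cstring>

// Returns a pointer to the PCM payload inside an in-memory WAV file (and its size
// through outSize), or nullptr if no "data" chunk is found.
const uint8_t* findWavData(const uint8_t* file, uint32_t fileSize, uint32_t* outSize) {
    uint32_t pos = 12;                                 // skip the "RIFF" <size> "WAVE" header
    while (pos + 8 <= fileSize) {
        uint32_t chunkSize;
        std::memcpy(&chunkSize, file + pos + 4, sizeof(chunkSize));
        if (std::memcmp(file + pos, "data", 4) == 0) {
            *outSize = chunkSize;
            return file + pos + 8;                     // payload starts after the chunk header
        }
        pos += 8 + chunkSize + (chunkSize & 1);        // chunks are padded to even sizes
    }
    return nullptr;
}

// usage with the buffer read in the code above:
// uint32_t pcmSize = 0;
// const uint8_t* pcm = findWavData(pOutBytes, outSize, &pcmSize);
// if (pcm != nullptr)
//     result = (*bqPlayerBufferQueue)->Enqueue(bqPlayerBufferQueue, pcm, pcmSize);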
I'm trying to set up OpenSL AudioPlayer to use memory I've allocated to playback a wav file. I want to do this so I can have multiple AudioPlayers that share the same data and conserve memory.
I've tried to give openSL the entire file and tell it that it is a WAVE with format_mime
SLDataLocator_Address loc_fd = {SL_DATALOCATOR_ADDRESS, data, size};
SLDataFormat_MIME format_mime = { SL_DATAFORMAT_MIME, (SLchar*)"audio/x-wav",SL_CONTAINERTYPE_WAV};
SLDataSource audioSrc = { &loc_fd, &format_mime };
// configure audio sink
SLDataLocator_OutputMix loc_outmix = { SL_DATALOCATOR_OUTPUTMIX,outputMixObject };
SLDataSink audioSnk = { &loc_outmix, 0 };
// create audio player
const SLInterfaceID ids[2] = { SL_IID_SEEK, SL_IID_PLAYBACKRATE };
const SLboolean req[2] = { SL_BOOLEAN_FALSE, SL_BOOLEAN_FALSE };
result = (*engineEngine)->CreateAudioPlayer(engineEngine,&uriPlayerObject[cntSOUND],&audioSrc, &audioSnk, 0, ids, req);
and I have parsed the WAVE data myself and loaded format_pcm
SLDataFormat_PCM format_pcm;
format_pcm.formatType = SL_DATAFORMAT_PCM;
char* wavParser = isWAVE(data);
if (wavParser == NULL)
{
    Log("NOT A WAVE!");
    return -1;
}
char* fmtChunk = getChunk("fmt ", data, size);
parsefmtChunk(fmtChunk, &format_pcm);
char* dataChunk = getChunk("data",data, size);
dataChunk += 4;
unsigned int dataSize = *((unsigned int*)dataChunk);
dataChunk += 4;
format_pcm.channelMask = 0;
format_pcm.containerSize = 16;
format_pcm.endianness = SL_BYTEORDER_LITTLEENDIAN;
loc_fd.pAddress = dataChunk;
loc_fd.length = dataSize;
The parsefmtChunk function is
void parsefmtChunk(char* fmtchunk, SLDataFormat_PCM* pcm)
{
    char* data = fmtchunk + 8;

    unsigned short audioFormat = *((unsigned short*)data);
    if (audioFormat != 1)
    {
        Log("Not PCM!");
        Log("Reached Line:%d in File %s", __LINE__, __FILE__);
        return;
    }
    data += 2;

    pcm->numChannels = *((unsigned short*)data);
    data += 2;

    pcm->samplesPerSec = *((unsigned int*)data);
    data += 4;

    //Byte Rate
    data += 4;

    //Block Align
    data += 2;

    //BitsPerSample
    pcm->bitsPerSample = *((unsigned short*)data);
}
(Are Byte Rate and Block Align supposed to be used somehow to fill out the pcm struct?)
but whenever I create the audio player I get SL_RESULT_CONTENT_UNSUPPORTED.
This is what I log from my parsefmtChunk function:
Channels:2
samplesPerSec:44100
bitsPerSample:16
From android-ndk-r8b/docs/opensles/index.html, under "PCM data format":
"The PCM data format can be used with buffer queues only."
So SLDataFormat_PCM CANNOT be used with SLDataLocator_Address like I assumed.
I can do what I want with a Buffer Queue instead by using just one big queue like so
bufferqueueitf->Enqueue(bufferqueueitf,dataChunk,dataSize);
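One more detail worth checking when filling SLDataFormat_PCM from a parsed fmt chunk: samplesPerSec is expressed in milliHertz (SL_SAMPLINGRATE_44_1 is 44100000), so the rate read from the file in Hz has to be multiplied by 1000; note that parsefmtChunk above copies the value over unscaled. A sketch, where the wav* variables stand for the values parsed from the file (they are placeholders, not variables from the code above):

SLDataFormat_PCM format_pcm = {};
format_pcm.formatType    = SL_DATAFORMAT_PCM;
format_pcm.numChannels   = wavChannels;              // e.g. 2
format_pcm.samplesPerSec = wavSampleRateHz * 1000;   // Hz -> milliHz (44100 -> 44100000)
format_pcm.bitsPerSample = wavBitsPerSample;         // e.g. 16 (== SL_PCMSAMPLEFORMAT_FIXED_16)
format_pcm.containerSize = wavBitsPerSample;
format_pcm.channelMask   = (wavChannels == 2)
        ? (SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT)
        : SL_SPEAKER_FRONT_CENTER;
format_pcm.endianness    = SL_BYTEORDER_LITTLEENDIAN;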
Have you tried this?
SLDataFormat_MIME format_mime = {SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED};
The Android implementation of OpenSL ES isn't totally compliant and http://mobilepearls.com/labs/native-android-api/ndk/docs/opensles/ recommends the following:
The Android implementation of OpenSL ES requires that mimeType be initialized to either NULL or a valid UTF-8 string, and that containerType be initialized to a valid value. In the absence of other considerations, such as portability to other implementations, or content format which cannot be identified by header, we recommend that you set the mimeType to NULL and containerType to SL_CONTAINERTYPE_UNSPECIFIED.
Also, make sure you're giving it a valid URI.
I'm trying to record from the microphone, add some effects, and then save this to a file.
I've started with the native-audio example included in the Android NDK.
I've managed to add some reverb and play it back, but I haven't found any examples or help on how to accomplish the saving part.
Any and all help is welcome.
OpenSL is not a framework for file formats and access. If you want a raw PCM file, simply open it for writing and put all the buffers from the OpenSL callback into the file. But if you want encoded audio, you need your own codec and format handler; you can use the ffmpeg libraries, or the built-in stagefright.
Update: write the playback buffers to a local raw PCM file.
We start with native-audio-jni.c
#include <stdio.h>
FILE* rawFile = NULL;
int bClosing = 0;
...
void bqPlayerCallback(SLAndroidSimpleBufferQueueItf bq, void *context)
{
    assert(bq == bqPlayerBufferQueue);
    assert(NULL == context);
    // for streaming playback, replace this test by logic to find and fill the next buffer
    if (--nextCount > 0 && NULL != nextBuffer && 0 != nextSize) {
        SLresult result;
        // enqueue another buffer
        result = (*bqPlayerBufferQueue)->Enqueue(bqPlayerBufferQueue, nextBuffer, nextSize);
        // the most likely other result is SL_RESULT_BUFFER_INSUFFICIENT,
        // which for this code example would indicate a programming error
        assert(SL_RESULT_SUCCESS == result);
        (void)result;
        // AlexC: here we write:
        if (rawFile) {
            fwrite(nextBuffer, nextSize, 1, rawFile);
        }
    }
    if (bClosing && rawFile) { // it is important to do this in a callback, to be on the correct thread
        fclose(rawFile);
        rawFile = NULL;
    }
    // AlexC: end of changes
}
...
void Java_com_example_nativeaudio_NativeAudio_startRecording(JNIEnv* env, jclass clazz)
{
    bClosing = 0;
    rawFile = fopen("/sdcard/rawFile.pcm", "wb");
    ...
}

void Java_com_example_nativeaudio_NativeAudio_shutdown(JNIEnv* env, jclass clazz)
{
    bClosing = 1;
    ...
}
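The file written this way is headerless PCM, so most players will not open it directly. If you want a directly playable file, one option is to reserve 44 bytes at the start and write a minimal canonical WAV header into them when closing. A sketch (it assumes 16-bit little-endian samples; channels and sampleRate must match how the playback buffers were produced):

#include <stdint.h>
#include <stdio.h>

/* Sketch: writes a minimal 44-byte canonical WAV header. Call it once with a
 * placeholder size right after fopen(), then again (after fseek back to the
 * start) with the real dataBytes just before fclose(). Assumes a little-endian
 * host, which Android always is. */
static void writeWavHeader(FILE* f, uint32_t dataBytes, uint16_t channels, uint32_t sampleRate)
{
    uint16_t bitsPerSample = 16;
    uint16_t audioFormat   = 1;                                   /* 1 == PCM */
    uint32_t fmtChunkSize  = 16;
    uint32_t byteRate      = sampleRate * channels * (bitsPerSample / 8);
    uint16_t blockAlign    = (uint16_t)(channels * (bitsPerSample / 8));
    uint32_t riffSize      = 36 + dataBytes;

    fwrite("RIFF", 1, 4, f);  fwrite(&riffSize, 4, 1, f);  fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f);  fwrite(&fmtChunkSize, 4, 1, f);
    fwrite(&audioFormat, 2, 1, f);   fwrite(&channels, 2, 1, f);
    fwrite(&sampleRate, 4, 1, f);    fwrite(&byteRate, 4, 1, f);
    fwrite(&blockAlign, 2, 1, f);    fwrite(&bitsPerSample, 2, 1, f);
    fwrite("data", 1, 4, f);  fwrite(&dataBytes, 4, 1, f);
}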
Pass the raw vector from C to Java and encode it to MP3 with MediaRecorder. I don't know if you can set the audio source from a raw vector, but maybe...
I use the android.provider.MediaStore.Audio.Media.EXTERNAL_CONTENT_URI intent to load music files from the SD Card.
Intent tmpIntent1 = new Intent(Intent.ACTION_PICK, android.provider.MediaStore.Audio.Media.EXTERNAL_CONTENT_URI);
startActivityForResult(tmpIntent1, 0);
and in onActivityResult
Uri mediaPath = Uri.parse(data.getData().toString());
MediaPlayer mp = MediaPlayer.create(this, mediaPath);
mp.start();
Now MediaPlayer plays the audio in stereo. Is there any way to convert the selected music/audio file or the output from stereo to mono in the app itself?
I looked up the APIs for SoundPool and AudioTrack, but didn't find a way to convert an mp3 file's audio to mono.
Apps like PowerAMP have those Stereo <-> Mono switches that, when pressed, immediately convert the output audio to a mono signal and back again. How do they do it?
Are you loading .wav files, i.e. PCM data? If so, you can simply read each channel's sample, superpose them, and divide by the number of channels to get a mono signal.
If you store your stereo signal as interleaved signed shorts, the code to calculate the resulting mono signal might look like this:
short[] stereoSamples; // get them from somewhere (interleaved left/right, including the wav header)
// length of the .wav-file header: 44 bytes == 22 shorts
final int HEADER_LENGTH = 22;
// output array, which will contain the mono signal
short[] monoSamples = new short[(stereoSamples.length - HEADER_LENGTH) / 2];
for (int i = 0; i < monoSamples.length; i++) {
    // skip the header, then superpose the samples of the left and right channel
    int left  = stereoSamples[HEADER_LENGTH + 2 * i];
    int right = stereoSamples[HEADER_LENGTH + 2 * i + 1];
    monoSamples[i] = (short) ((left + right) / 2);
}
I hope I was able to help you.
Best regards,
G_J
First of all, get a decoder to convert your audio file to a short array (short[]).
Then split the interleaved short array into its two channel streams.
Then average the split streams:
short[] stereo = sampleBuffer.getBuffer(); // get the raw interleaved S16 PCM buffer here
short[] monoCh1 = new short[stereo.length / 2];
short[] monoCh2 = new short[stereo.length / 2];
short[] monoAvg = new short[stereo.length / 2];
for (int i = 0; i < stereo.length; i += 2)
{
    monoCh1[i / 2] = stereo[i];                                   // left channel
    monoCh2[i / 2] = stereo[i + 1];                               // right channel
    monoAvg[i / 2] = (short) ((stereo[i] + stereo[i + 1]) / 2);   // average of both channels
}
Now you have:
PCM mono stream from channel 1 (Left channel) in monoCh1
PCM mono stream from channel 2 (Right channel) in monoCh2
PCM mono stream from averaging both channels in monoAvg