I use an ACTION_PICK intent with android.provider.MediaStore.Audio.Media.EXTERNAL_CONTENT_URI to load music files from the SD card.
Intent tmpIntent1 = new Intent(Intent.ACTION_PICK, android.provider.MediaStore.Audio.Media.EXTERNAL_CONTENT_URI);
startActivityForResult(tmpIntent1, 0);
and in onActivityResult
Uri mediaPath = data.getData();
MediaPlayer mp = MediaPlayer.create(this, mediaPath);
mp.start();
Now MediaPlayer plays the audio in stereo. Is there any way to convert the selected music/audio file or the output from stereo to mono in the app itself?
I looked up the APIs for SoundPool and AudioTrack, but didn't find a way to convert mp3 audio to mono.
Apps like PowerAMP have those Stereo <-> Mono switches that, when pressed, immediately convert the output audio to a mono signal and back again. How do they do it?
Do you load .wav files, i.e. PCM data? If so, you can simply read each sample of each channel, sum them and divide by the number of channels to get a mono signal.
If you store your stereo signal in the form of interleaved signed shorts, the code to calculate the resulting mono signal might look like this:
short[] stereoSamples; //get them from somewhere (interleaved left/right, header included)
//the 44-byte .wav header occupies the first 22 shorts of the array
final int HEADER_LENGTH = 22;
//output array, which will contain the mono signal
short[] monoSamples = new short[(stereoSamples.length - HEADER_LENGTH) / 2];
for (int i = 0; i < monoSamples.length; i++) {
    //skip the header and superpose the samples of the left and right channel
    int left  = stereoSamples[HEADER_LENGTH + i * 2];
    int right = stereoSamples[HEADER_LENGTH + i * 2 + 1];
    monoSamples[i] = (short) ((left + right) / 2);
}
I hope I was able to help you.
Best regards,
G_J
First of all, get a decoder to convert your audio file into a short array (short[]).
Then split the interleaved stereo buffer into its two channel streams.
Then average the two streams to get the mono signal.
short[] stereo = sampleBuffer.getBuffer(); //get the raw interleaved S16 PCM buffer here
short[] monoCh1 = new short[stereo.length / 2]; //left channel
short[] monoCh2 = new short[stereo.length / 2]; //right channel
short[] monoAvg = new short[stereo.length / 2]; //averaged mono signal
for( int i=0 ; i < stereo.length ; i+=2 )
{
    monoCh1[i/2] = stereo[i];     //even indices hold the left channel
    monoCh2[i/2] = stereo[i+1];   //odd indices hold the right channel
    monoAvg[i/2] = (short) ( ( stereo[i] + stereo[i+1] ) / 2 ); //average of both channels
}
Now you have:
PCM mono stream from channel 1 (Left channel) in monoCh1
PCM mono stream from channel 2 (Right channel) in monoCh2
PCM mono stream from averaging both channels in monoAvg
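If you want to check the result by ear, a minimal playback sketch (assuming the source was 44.1 kHz signed 16-bit PCM) could feed monoAvg into a mono AudioTrack:
//play the averaged mono buffer through a mono AudioTrack (assumes 44.1 kHz, 16-bit PCM)
int minSize = AudioTrack.getMinBufferSize( 44100, AudioFormat.CHANNEL_OUT_MONO, AudioFormat.ENCODING_PCM_16BIT );
AudioTrack track = new AudioTrack( AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_OUT_MONO,
        AudioFormat.ENCODING_PCM_16BIT, minSize, AudioTrack.MODE_STREAM );
track.play();
track.write( monoAvg, 0, monoAvg.length ); //short[] overload of write(); blocks while streaming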
I have an Android app where some raw audio bytes are stored in a variable.
If I use an AudioTrack to play this audio data, it only works if I use AudioTrack.MODE_STREAM:
byte[] recordedAudioAsBytes;
public void playButtonPressed(View v) {
// this verifies that audio data exists as expected
for (int i=0; i<recordedAudioAsBytes.length; i++) {
Log.i("ABC", "byte[" + i + "] = " + recordedAudioAsBytes[i]);
}
// STREAM MODE ACTUALLY WORKS!!
/*
AudioTrack player = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLERATE, CHANNELS,
ENCODING, MY_CHOSEN_BUFFER_SIZE, AudioTrack.MODE_STREAM);
player.play();
player.write(recordedAudioAsBytes, 0, recordedAudioAsBytes.length);
*/
// STATIC MODE DOES NOT WORK
AudioTrack player = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLERATE, PLAYBACK_CHANNELS,
ENCODING, MY_CHOSEN_BUFFER_SIZE, AudioTrack.MODE_STATIC);
player.write(recordedAudioAsBytes, 0, recordedAudioAsBytes.length);
player.play();
}
If I use AudioTrack.MODE_STATIC, the output is glitchy -- it just makes a nasty pop and sounds very short, with hardly anything audible.
So why is that? Does MODE_STATIC require that the audio data have a header?
That's all I can think of.
If you'd like to see all the code, check this question.
It seems to me that you are using the same MY_CHOSEN_BUFFER_SIZE for 'streaming' and 'static' mode!? This might explain why it sounds short...
In order to use AudioTrack's static mode you have to pass the size of your byte array (bigger will also work) as the buffer size. The audio will be treated as one big chunk of data.
See: AudioTrack.Builder
setBufferSizeInBytes()... "If using the AudioTrack in static mode (see AudioTrack#MODE_STATIC), this is the maximum size of the sound that will be played by this instance."
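So a static-mode version would size the track from the data itself; a minimal sketch, reusing the question's SAMPLERATE, PLAYBACK_CHANNELS and ENCODING constants:
// Static mode: the buffer must be able to hold the whole clip, so use the data length
AudioTrack player = new AudioTrack(AudioManager.STREAM_MUSIC, SAMPLERATE, PLAYBACK_CHANNELS,
        ENCODING, recordedAudioAsBytes.length, AudioTrack.MODE_STATIC);
player.write(recordedAudioAsBytes, 0, recordedAudioAsBytes.length);
player.play();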
I use oboe to play sounds in my ndk library, and I use OpenSL with the Android extensions to decode wav files into PCM. The decoded signed 16-bit PCM is stored in memory (std::forward_list<int16_t>) and then sent into the oboe stream via a callback. The sound I hear from my phone is similar to the original wav file in volume level, but not in quality -- it bursts and crackles.
I am guessing that I am sending the PCM to the audio stream in the wrong order or format (sampling rate?). How can I use OpenSL decoding with an oboe audio stream?
To decode files to PCM, I use AndroidSimpleBufferQueue as a sink, and AndroidFD with AAssetManager as a source:
// Loading asset
AAsset* asset = AAssetManager_open(manager, path, AASSET_MODE_UNKNOWN);
off_t start, length;
int fd = AAsset_openFileDescriptor(asset, &start, &length);
AAsset_close(asset);
// Creating audio source
SLDataLocator_AndroidFD loc_fd = { SL_DATALOCATOR_ANDROIDFD, fd, start, length };
SLDataFormat_MIME format_mime = { SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED };
SLDataSource audio_source = { &loc_fd, &format_mime };
// Creating audio sink
SLDataLocator_AndroidSimpleBufferQueue loc_bq = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 1 };
SLDataFormat_PCM pcm = {
.formatType = SL_DATAFORMAT_PCM,
.numChannels = 2,
.samplesPerSec = SL_SAMPLINGRATE_44_1,
.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16,
.containerSize = SL_PCMSAMPLEFORMAT_FIXED_16,
.channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT,
.endianness = SL_BYTEORDER_LITTLEENDIAN
};
SLDataSink sink = { &loc_bq, &pcm };
And then I register a callback, enqueue buffers and move the PCM from the buffer into storage until it's done.
NOTE: the wav audio file is also 2-channel signed 16-bit 44.1 kHz PCM
My oboe stream configuration is the same:
AudioStreamBuilder builder;
builder.setChannelCount(2);
builder.setSampleRate(44100);
builder.setCallback(this);
builder.setFormat(AudioFormat::I16);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
Audio rendering works like this:
// Oboe stream callback
DataCallbackResult audio_engine::onAudioReady(AudioStream* self, void* audio_data, int32_t num_frames) {
    auto stream = static_cast<int16_t*>(audio_data);
    sound->render(stream, num_frames);
    return DataCallbackResult::Continue;
}
// Sound::render method
void sound::render(int16_t* audio_data, int32_t num_frames) {
auto iter = pcm_data.begin();
std::advance(iter, cur_frame);
const int32_t rem_size = std::min(num_frames, size - cur_frame);
for(int32_t i = 0; i < rem_size; ++i, std::next(iter), ++cur_frame) {
audio_data[i] += *iter;
}
}
It looks like your render() method is confusing samples and frames.
A frame is a set of simultaneous samples.
In a stereo stream, each frame has TWO samples.
I think your iterator works on a sample basis. In other words next(iter) will advance to the next sample, not the next frame. Try this (untested) code.
void sound::render(int16_t* audio_data, int32_t num_frames) {
auto iter = pcm_data.begin();
const int samples_per_frame = 2; // stereo
std::advance(iter, cur_sample);
const int32_t num_samples = std::min(num_frames * samples_per_frame,
total_samples - cur_sample);
for(int32_t i = 0; i < num_samples; ++i, ++iter, ++cur_sample) { // ++iter actually advances the iterator; std::next(iter) would not
audio_data[i] += *iter;
}
}
In short: essentially, I was experiencing an underrun because I used std::forward_list to store the PCM. In such a case (using iterators to retrieve PCM), one has to use a container whose iterator satisfies LegacyRandomAccessIterator (e.g. std::vector).
I was sure that the linear complexity of std::advance and std::next made no difference in my sound::render method. However, when I tried raw pointers and pointer arithmetic (thus constant complexity), together with the debugging methods suggested in the comments (extracting the PCM from the WAV with Audacity, then loading that asset with AAssetManager directly into memory), I realized that the amount of "corruption" in the output sound was directly proportional to the position argument of std::advance(iter, position) in the render method.
So, since the amount of sound corruption was directly proportional to the complexity of std::advance (and std::next), I had to make the complexity constant -- by using std::vector as the container. Using the answer from @philburk, I got this as a working result:
class sound {
private:
const int samples_per_frame = 2; // stereo
std::vector<int16_t> pcm_data;
...
public:
void render(int16_t* audio_data, int32_t num_frames) {
auto iter = std::next(pcm_data.begin(), cur_sample);
const int32_t s = std::min(num_frames * samples_per_frame,
total_samples - cur_sample);
for(int32_t i = 0; i < s; ++i, std::advance(iter, 1), ++cur_sample) {
audio_data[i] += *iter;
}
}
};
I'm developing a Xamarin Android application where video and audio are recorded while some music is playing. I need to merge all these streams together.
I'm trying to figure out what I'm missing when recording audio into a byte[] (or a ByteBuffer, which I also tried) with the audioRecorder.Read() function. The output WAV file seems right (it is clearly playable at a 44100 Hz sample rate), but a delay appears after a couple of seconds and tends to get bigger and bigger.
When using shorts, I don't have any delay in the MIC-recorded audio. The big issue using shorts is that no matter what I do, I can't get a sample rate higher than 8000 Hz (this isn't the current issue, although if someone knows how to fix it I'll take it :) ).
The final merged file is an mp4 with AAC audio, merged using ffmpeg, but I don't think this is the issue.
Could it be related to 8000 Hz (using short) vs 44100 Hz (using byte)? Or am I adding extra data when using byte[], since I don't check how many bytes were actually read?
Here are the parts involved in the issue:
//output file initialization
mDataOutputStream = new FileOutputStream(new Java.IO.File(mRawFilePath));
public void Run()
{
...
short[] shortBuf = new short[bufferSize / 2];
//byte[] byteBuf = new byte[bufferSize];
while(isRecording) {
//using shorts
audioRecorder.Read(shortBuf, 0, shortBuf.Length);
WriteShortsToFile(shortBuf);
//using byte[]
//audioRecorder.Read(byteBuf, 0, byteBuf.Length);
//WriteBytesToFile(byteBuf);
}
...
}
public void WriteShortsToFile(short[] shorts)
{
for (int i = 0; i < shorts.Length; i++)
{
mDataOutputStream.WriteByte(shorts[i] & 0xFF);
mDataOutputStream.WriteByte((shorts[i] >> 8) & 0xFF);
}
}
public void WriteBytesToFile(byte[] buf)
{
mDataOutputStream.Write(buf);
}
Finally got it working as it should.
I changed the allocation size of the short array to bufferSize (instead of bufferSize / 2).
//constructor
android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);
/////////////
//thread run() method
int N = AudioRecord.getMinBufferSize(8000,AudioFormat.CHANNEL_IN_MONO,AudioFormat.ENCODING_PCM_16BIT);
AudioRecord recorder = new AudioRecord(AudioSource.MIC, 8000, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, N*10);
recorder.startRecording();
try {
while(!stopped)
{
try {
//if not paused upload audio
if (uploadAudio == true) {
short[][] buffers = new short[256][160];
int ix = 0;
//allocate buffer for audio data
short[] buffer = buffers[ix++ % buffers.length];
//write audio data to track
N = recorder.read(buffer,0,buffer.length);
//create bytes big enough to hold audio data
byte[] bytes2 = new byte[buffer.length * 2];
//convert audio data from short[][] to byte[]
ByteBuffer.wrap(bytes2).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(buffer);
//encode audio data for ulaw
read(bytes2, 0, bytes2.length);
See here for ulaw encoder code. I'm using the read, maxAbsPcm and encode methods.
//send audio data
//os.write(bytes2,0,bytes2.length);
}
} finally {
}
}
os.close();
}
catch(Throwable x)
{
Log.w("AudioWorker", "Error reading voice AudioWorker", x);
}
finally
{
recorder.stop();
recorder.release();
}
///////////
So this works OK. The audio is sent in the proper format to the server and played at the opposite end. However the audio skips often. Example: saying 1, 2, 3, 4 will play back with the 4 cut off.
I believe it to be a performance issue, because I have timed some of these methods: when they take essentially no time everything works, but they quite often take a couple of seconds, with the byte conversion and the ulaw encoding taking the most.
Any idea how I can optimize this code to get better performance? Or maybe a way to deal with the lag (possibly build a cache)?
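One way to build such a cache, sketched below under a few assumptions (the stopped flag and recorder variable from the code above, plus java.util.concurrent and java.nio imports): keep the recording loop doing nothing but reads, push the raw buffers into a queue, and do the conversion, ulaw encoding and sending on a separate worker thread.
final BlockingQueue<short[]> cache = new LinkedBlockingQueue<short[]>();

// recording thread: only read from the AudioRecord and queue the raw buffer
new Thread(new Runnable() {
    public void run() {
        while (!stopped) {
            short[] buffer = new short[160];
            int read = recorder.read(buffer, 0, buffer.length);
            if (read > 0) cache.offer(buffer);
        }
    }
}).start();

// worker thread: convert, encode and send without blocking the recorder
new Thread(new Runnable() {
    public void run() {
        try {
            while (!stopped || !cache.isEmpty()) {
                short[] buffer = cache.poll(20, TimeUnit.MILLISECONDS);
                if (buffer == null) continue;
                byte[] bytes2 = new byte[buffer.length * 2];
                ByteBuffer.wrap(bytes2).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().put(buffer);
                // ulaw-encode bytes2 and write it to the socket here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}).start();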
Does anyone know of any useful links for learning audio DSP for Android?
Or a sound library?
I'm trying to make a basic mixer for playing wav files but realised I don't know enough about DSP, and I can't find anything at all for Android.
I have a wav file loaded into a byte array and an AudioTrack on a short loop.
How can I feed the data in?
I expect this post will be ignored but it's worth a try.
FileInputStream is = new FileInputStream(filePath);
BufferedInputStream bis = new BufferedInputStream(is);
DataInputStream dis = new DataInputStream(bis);
int i = 0;
while (dis.available() > 0) {
byteData[i] = dis.readByte(); //byteData
i++;
}
final int minSize = AudioTrack.getMinBufferSize( 44100, AudioFormat.CHANNEL_CONFIGURATION_STEREO, AudioFormat.ENCODING_PCM_16BIT );
track = new AudioTrack( AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_CONFIGURATION_STEREO, AudioFormat.ENCODING_PCM_16BIT,
minSize, AudioTrack.MODE_STREAM);
track.play();
bRun=true;
new Thread(new Runnable() {
public void run() {
track.write(byteData, 0, minSize);
}
}).start();
I'll give this a shot just because I was in your position a few months ago...
If you already have the wav file's audio samples in a byte array, you simply need to pass the samples to the AudioTrack object (look up the write() methods).
To mix audio together you simply add the samples from each track. For example, add the first sample from track 1 to the first sample from track 2, add the second sample from track 1 to the second sample from track 2, and so on. The end result would ideally be a third array containing the added samples, which you pass to the write() method of your AudioTrack instance.
You must be mindful of clipping here. If your data type is short, then the maximum value allowed is 32767. A simple way to ensure that your added samples do not exceed this limit is to perform the addition, store the result in a variable whose data type is larger than a short (e.g. int) and evaluate the result. If it's greater than 32767, make it equal to 32767 and cast it back to a short.
int result = track1[i] + track2[i];
if(result > 32767) {
result = 32767;
}
else if(result < -32768) {
result = -32768;
}
mixedAudio[i] = (short)result;
Notice how the snippet above also tests for the minimum range of a short.
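Putting the pieces together, a complete mixing loop along these lines could look like the sketch below (assuming two equally long short[] tracks, track1 and track2, and an AudioTrack called track configured for 16-bit PCM):
// Mix two equally long 16-bit PCM tracks sample by sample, clip, then play the result
short[] mixedAudio = new short[track1.length];
for (int i = 0; i < mixedAudio.length; i++) {
    int result = track1[i] + track2[i];    // sum the corresponding samples
    if (result > 32767) result = 32767;    // clip to the positive short limit
    if (result < -32768) result = -32768;  // clip to the negative short limit
    mixedAudio[i] = (short) result;
}
track.write(mixedAudio, 0, mixedAudio.length); // short[] overload of AudioTrack.write()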
Apologies for the lack of formatting here, I'm on my mobile phone on a train :-)
Good luck.