I'd like to capture the outgoing audio from a game and record it into an audio file as it's played. Is this possible within the framework in OpenSL? Like by connecting the OutputMix to an AudioRecorder, or something?
You could register a callback to the queue and obtain the output buffer before / after it is enqueued into the buffer queue for output. You could have a wavBuffer (a short array the length of the buffer size) that is written into on each enqueueing of a new buffer. The contents of this buffer are then written to a file.
outBuffer = p->outputBuffer[p->currentOutputBuffer]; // obtain float buffer
for ( int i = 0; i < bufferSize; ++i )
wavBuffer = ( short ) outBuffer[ i ] * 32768; // convert float to short
// now append contents of wavBuffer into a file
The basic OpenSL setup for the queue callback is explained in some detail on this page
And a very basic means of creating a WAV file in C++ can be found here note that you must have a pretty definitive idea of the actual size of the total WAV file as it's part of its header.
Related
I have multiple audio files converted into byte array for each, and i have start time and end time of each audio file (these audio files are converted from video so i have its time stamps).
Now i want to merge these byte array of audio files to create one single audio mp3 file. I tried to simply merge these two array and it successfully creates a single mp3 file but it doesn't have any pause in between.
I want to add silent audio between each file based on the difference of start time and end time of each file. To achieve that, i added 0 bytes for each second difference between each array but it doesn't add sufficient pause.
Is there any way to determine how many bytes it will require to add for each second?
Every audio has 8000 sample Rate Hertz.
Following is the code i am using to add 0 bytes.
ArrayList<Byte> buffer = previousTranslation.getAudioBuffer();
// to get gape in seconds between two audio files from a video
int diff = (int)differenceBetweenTwoCalendar(
previousTranslation.getSentenceEndTime(),
translation.getSentenceStartTime());
// Add 0 bytes for each second between two audio buffers(bytes array)
for(int j = 0; j < diff; j++)
buffer.add((byte) 0);
// Add seconds part of audio buffer byte array
buffer.addAll(translation.getAudioBuffer();
I have a server that encodes real-time voice into mono or stereo mp3 thanks to libmp3lame and sends it chunk by chunk through a WebSocket.
I'm trying to make an Android App that receives those mp3 chunks and play them with the most appropriate Audio player Android have. I went with AudioTrack since it seems pretty easy to add chunks to the player as well as "stream" oriented. (Since what I'm doing is sending to the track some byte array and not a full song that is locally stocked in the Android phone).
Since AudioTrack does not support compressed audio format (such as MP3), I have to decode those chunks into PCM to play them afterward. I'm using the famous JLayer to do this real-time decoding. Thanks to that, I can play each sample into my AudioTrack and hear what the server is sending.
My problem is that the received/player audio is badly hashed. (I can understand whatever the speaker is saying perfectly, but the quality is bad, like if the speaker had a "robotic voice").
Here is the code I'm using to receive/decode/play those byte[].
public void addSample(byte[] data) throws BitstreamException, DecoderException, IOException {
// JLayer decoder
Decoder decoder = new Decoder();
// Input Stream with the byte[] voice data
InputStream bis = new ByteArrayInputStream(data);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
Bitstream bits = new Bitstream(bis);
// Decoding MP3 data into PCM in a PCM BUFFER
SampleBuffer pcmBuffer = (SampleBuffer) decoder.decodeFrame(bits.readFrame(), bits);
// Sending the PCMBuffer data into Audio Track to play it
mTrack.write(pcmBuffer.getBuffer(), 0, pcmBuffer.getBufferLength());
bits.closeFrame();
}
And here is my AudioTrack initialization
mTrack= new AudioTrack.Builder()
.setAudioAttributes(new AudioAttributes.Builder()
.setUsage(AudioAttributes.USAGE_MEDIA)
.setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
.build())
.setAudioFormat(new AudioFormat.Builder()
.setEncoding(AudioFormat.ENCODING_PCM_16BIT)
.setSampleRate(48000)
.setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
.build())
.setBufferSizeInBytes(AudioTrack.getMinBufferSize(48000, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT))
.build();
mTrack.play();
So to understand what was happening I tried to lag each data contained in the pcmBuffer. It seems like a huge part of those data where 0 at the very beginning of the buffer (I'd say 1/5 of the buffer is 0, all of them located at the beginning). So then I took an oscilloscope and tried to get the signal my Android phone was receiving. Here is the result:
As you can see, each frame is present, but as some "blank" or 0 data values. Those 0 in the beginning of each frame makes the signal hashed and pretty annoying to listen.
I have no idea whether this comes from the MP3 signal itself, the way I'm playing it, AudioTrack, JLayer, or the way I'm decoding it. So if anyone has an idea it would be really awesome.
EDIT :
Found out something interesting. By decoding each frame header I can have access to a lot of information such as the time in ms for each frame. I logged it :
System.out.println(bits.readFrame().ms_per_frame());
I found out that each of my frames are 24ms. When I look back at the oscilloscope, I can see that each frame actually take 24ms, but the beginning/end of each frame is filled with 0. So first of all, is it a decoding problem ? If it is not, how can I have a clear signal without small breakup in each frame ?
I've been printing all the data that each frame is sending me, each frame starts with a looot of zeros. How am I supposed to have a clear signal if each frame have some kind of audio void ?
If I print the MP3 data that I'm receiving each frame (96 bits), I have the first four bytes (probably the header?) that always have the same value :
"-1, -5, 20, -60"
Then I have a fifth bit that is always equal to 0, and sometimes a sixth bit that is also equal to 0. Should I be removing those ?
I have followed this example to convert raw audio data coming from AudioRecord to mp3, and it happened successfully, if I store this data in a file the mp3 file and play with music player then it is audible.
Now my question is instead of storing mp3 data to a file i need to play it with AudioTrack, the data is coming from the Red5 media server as live stream, but the problem is AudioTrack can only play PCM data, so i can only hear noise from my data.
Now i am using JLayer to my require task.
My code is as follows.
int readresult = recorder.read(audioData, 0, recorderBufSize);
int encResult = SimpleLame.encode(audioData,audioData, readresult, mp3buffer);
and this mp3buffer data is sent to other user by Red5 stream.
data received at other user is in form of stream, so for playing it the code is
Bitstream bitstream = new Bitstream(data.read());
Decoder decoder = new Decoder();
Header frameHeader = bitstream.readFrame();
SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
short[] pcm = output.getBuffer();
player.write(pcm, 0, pcm.length);
But my code freezes at bitstream.readFrame after 2-3 seconds, also no sound is produced before that.
Any guess what will be the problem? Any suggestion is appreciated.
Note: I don't need to store the mp3 data, so i cant use MediaPlayer, as it requires a file or filedescriptor.
just a tip, but try to
output.close();
bitstream.closeFrame();
after yours write code. I'm processing MP3 same as you do, but I'm closing buffers after usage and I have no problem.
Second tip - do it in Thread or any other Background process. As you mentioned these deaf 2 seconds, media player may wait until you process whole stream because you are loading it in same thread.
Try both tips (and you should anyway). In first, problem could be in internal buffers; In second you probably fulfill Media's input buffer and you locked app (same thread, full buffer cannot receive your input and code to play it and release same buffer is not invoked because writing locks it...)
Also, if you don't doing it now, check for 'frameHeader == null' due to file end.
Good luck.
You need to loop through the frames like this:
While (frameHeader = bitstream.readFrame()){
SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
short[] pcm = output.getBuffer();
player.write(pcm, 0, pcm.length);
bitstream.close();
}
And make sure you are not running them on main thread.(This is probably the reason of freezing.)
I am reading the Android documents about MediaCodec and other online tutorials/examples. As I understand it, the way to use the MediaCodec is like this (decoder example in pseudo code):
//-------- prepare audio decoder, format, buffers, and files --------
MediaExtractor extractor;
MediaCodec codec;
ByteBuffer[] codecInputBuffers;
ByteBuffer[] codecOutputBuffers;
extractor = new MediaExtractor();
extractor.setDataSource();
MediaFormat format = extractor.getTrackFormat(0);
//---------------- start decoding ----------------
codec = MediaCodec.createDecoderByType(mime);
codec.configure(format, null /* surface */, null /* crypto */, 0 /* flags */);
codec.start();
codecInputBuffers = codec.getInputBuffers();
codecOutputBuffers = codec.getOutputBuffers();
extractor.selectTrack(0);
//---------------- decoder loop ----------------
while (MP3_file_not_EOS) {
//-------- grasp control of input buffer from codec --------
codec.dequeueInputBuffer();
//---- fill input buffer with data from MP3 file ----
extractor.readSampleData();
//-------- release input buffer so codec can have it --------
codec.queueInputBuffer();
//-------- grasp control of output buffer from codec --------
codec.dequeueOutputBuffer();
//-- copy PCM samples from output buffer into another buffer --
short[] PCMoutBuffer = copy_of(OutputBuffer);
//-------- release output buffer so codec can have it --------
codec.releaseOutputBuffer();
//-------- write PCMoutBuffer into a file, or play it -------
}
//---------------- stop decoding ----------------
codec.stop();
codec.release();
Is this the right way to use the MediaCodec? If not, please enlighten me with the right approach. If this is the right way, how do I measure the performance of the MediaCodec? Is it the time difference between when codec.dequeueOutputBuffer() returns and when codec.queueInputBuffer() returns? I'd like an accuracy/precision of microseconds. Your ideas and thoughts are appreciated.
(merging comments and expanding slightly)
You can't simply time how long a single buffer submission takes, because the codec might want to queue up more than one buffer before doing anything. You will need to measure it in aggregate, timing the duration of the entire file decode with System.nanoTime(). If you turn the copy_of operation into a no-op and just discard the decoded data, you'll keep the output side (writing the decoded data to disk) out of the calculation.
Excluding the I/O from the input side is more difficult. As noted in the MediaCodec docs, the encoded input/output "is not a stream of bytes, it's a stream of access units". So you'd have to populate any necessary codec-specific-data keys in MediaFormat, and then identify individual frames of input so you can properly feed the codec.
An easier but less accurate approach would be to conduct a separate pass in which you time how long it takes to read the input data, and then subtract that from the total time. In your sample code, you would keep the operations on extractor (like readSampleData), but do nothing with codec (maybe dequeue one buffer and just re-use it every time). That way you only measure the MediaExtractor overhead. The trick here is to run it twice, immediately before the full test, and ignore the results from the first -- the first pass "warms up" the disk cache.
If you're interested in performance differences between devices, it may be the case that the difference in input I/O time, especially from a "warm" cache, is similar enough and small enough that you can just disregard it and not go through all the extra gymnastics.
I have an android application that records AUDIO in raw format
how can i extract a sample of the recording?
for example if the raw file has 3 minutes of audio recorded, i would like to extract 20 seconds of the contents from an arbitrary start position
is this possible?
If the file contains interleaved PCM data with no header and you know the properties of the audio data (sample rate, number of channels, etc) the problem can be solved with basic math:
The number of bytes of audio data per second is sampleRate * bytesPerSample * numChannels.
The starting offset in bytes would be then be bytesPerSecond * offsetInSeconds, and the size of the chunk to read (in bytes) would be bytesPerSecond * lengthInSeconds.