I have an Android application that records audio in raw format.
How can I extract a sample of the recording?
For example, if the raw file has 3 minutes of audio recorded, I would like to extract 20 seconds of the contents from an arbitrary start position.
Is this possible?
If the file contains interleaved PCM data with no header and you know the properties of the audio data (sample rate, number of channels, etc.), the problem can be solved with basic math:
The number of bytes of audio data per second is sampleRate * bytesPerSample * numChannels.
The starting offset in bytes would then be bytesPerSecond * offsetInSeconds, and the size of the chunk to read (in bytes) would be bytesPerSecond * lengthInSeconds.
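For example, here is a minimal sketch (assuming 16-bit mono PCM at 44100 Hz and hypothetical file names; adjust the constants to match your recording) that copies a 20-second segment starting at an arbitrary offset:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

public class RawPcmExtractor {
    public static void main(String[] args) throws IOException {
        // Assumed properties of the raw recording -- change these to match yours
        int sampleRate = 44100;
        int bytesPerSample = 2;   // 16-bit PCM
        int numChannels = 1;      // mono
        int bytesPerSecond = sampleRate * bytesPerSample * numChannels;

        int offsetInSeconds = 60; // arbitrary start position
        int lengthInSeconds = 20; // length of the extracted sample

        try (RandomAccessFile in = new RandomAccessFile("recording.raw", "r");
             FileOutputStream out = new FileOutputStream("sample.raw")) {
            // Skip to the start of the requested segment
            in.seek((long) bytesPerSecond * offsetInSeconds);
            byte[] buffer = new byte[bytesPerSecond];
            for (int s = 0; s < lengthInSeconds; s++) {
                int read = in.read(buffer);
                if (read <= 0) break;          // reached end of file
                out.write(buffer, 0, read);
            }
        }
    }
}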
Related
I have multiple audio files, each converted into a byte array, and I have the start time and end time of each audio file (these audio files are converted from video, so I have their timestamps).
Now I want to merge these byte arrays of audio files to create one single MP3 file. I tried simply merging the arrays, and it successfully creates a single MP3 file, but it doesn't have any pause in between.
I want to add silent audio between each file based on the difference between the end time of one file and the start time of the next. To achieve that, I added 0 bytes for each second of difference between the arrays, but it doesn't add a sufficient pause.
Is there any way to determine how many bytes are required for each second?
Every audio file has a sample rate of 8000 Hz.
Following is the code I am using to add the 0 bytes.
ArrayList<Byte> buffer = previousTranslation.getAudioBuffer();

// Gap in seconds between the two audio files from the video
int diff = (int) differenceBetweenTwoCalendar(
        previousTranslation.getSentenceEndTime(),
        translation.getSentenceStartTime());

// Add 0 bytes for each second between the two audio buffers (byte arrays)
for (int j = 0; j < diff; j++)
    buffer.add((byte) 0);

// Append the second audio buffer's bytes
buffer.addAll(translation.getAudioBuffer());
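For reference, the per-second byte count follows the same formula mentioned earlier (sampleRate * bytesPerSample * numChannels), but only for raw PCM data. A minimal sketch under that assumption (16-bit mono PCM at 8000 Hz is assumed here; note that encoded MP3 data cannot be silenced by inserting plain zero bytes like this):

int sampleRate = 8000;
int bytesPerSample = 2;   // assuming 16-bit samples
int numChannels = 1;      // assuming mono
int bytesPerSecond = sampleRate * bytesPerSample * numChannels; // 16000 bytes of silence per second

// Add one second's worth of zero bytes for every second of gap
for (int j = 0; j < diff * bytesPerSecond; j++)
    buffer.add((byte) 0);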
I have a server that encodes real-time voice into mono or stereo MP3 with libmp3lame and sends it chunk by chunk over a WebSocket.
I'm trying to make an Android app that receives those MP3 chunks and plays them with the most appropriate audio player Android has. I went with AudioTrack since it seems easy to feed chunks to the player and it is "stream" oriented (I'm sending the track byte arrays, not a full song stored locally on the phone).
Since AudioTrack does not support compressed audio formats (such as MP3), I have to decode those chunks into PCM to play them afterwards. I'm using the well-known JLayer to do this real-time decoding. Thanks to that, I can play each sample through my AudioTrack and hear what the server is sending.
My problem is that the received/played audio is badly choppy. (I can understand everything the speaker is saying, but the quality is bad, as if the speaker had a "robotic voice".)
Here is the code I'm using to receive/decode/play those byte[]:
public void addSample(byte[] data) throws BitstreamException, DecoderException, IOException {
    // JLayer decoder
    Decoder decoder = new Decoder();
    // Input stream wrapping the byte[] voice data
    InputStream bis = new ByteArrayInputStream(data);
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    Bitstream bits = new Bitstream(bis);
    // Decode the MP3 data into a PCM buffer
    SampleBuffer pcmBuffer = (SampleBuffer) decoder.decodeFrame(bits.readFrame(), bits);
    // Write the PCM buffer to the AudioTrack to play it
    mTrack.write(pcmBuffer.getBuffer(), 0, pcmBuffer.getBufferLength());
    bits.closeFrame();
}
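As a side note, one thing that might be worth ruling out is whether a single WebSocket chunk can carry more than one MP3 frame; the addSample above decodes only the first frame of each chunk. A rough sketch of a per-chunk decode loop with JLayer, assuming mDecoder is a Decoder created once as a field (that name is made up here):

// Hypothetical variant: decode every frame contained in the chunk
public void addSampleAllFrames(byte[] data) throws BitstreamException, DecoderException {
    Bitstream bits = new Bitstream(new ByteArrayInputStream(data));
    Header frameHeader;
    // readFrame() returns null once the chunk contains no further frames
    while ((frameHeader = bits.readFrame()) != null) {
        SampleBuffer pcmBuffer = (SampleBuffer) mDecoder.decodeFrame(frameHeader, bits);
        mTrack.write(pcmBuffer.getBuffer(), 0, pcmBuffer.getBufferLength());
        bits.closeFrame();
    }
}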
And here is my AudioTrack initialization:
mTrack = new AudioTrack.Builder()
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                .build())
        .setAudioFormat(new AudioFormat.Builder()
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .setSampleRate(48000)
                .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
                .build())
        .setBufferSizeInBytes(AudioTrack.getMinBufferSize(48000,
                AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT))
        .build();
mTrack.play();
So to understand what was happening, I tried to log every value contained in the pcmBuffer. It turns out that a huge part of that data was 0 at the very beginning of the buffer (I'd say 1/5 of the buffer is 0, all of it located at the beginning). I then took an oscilloscope and looked at the signal my Android phone was receiving. Here is the result:
As you can see, each frame is present, but it contains some "blank" or 0 data values. Those 0s at the beginning of each frame make the signal choppy and pretty annoying to listen to.
I have no idea whether this comes from the MP3 signal itself, the way I'm playing it, AudioTrack, JLayer, or the way I'm decoding it. So if anyone has an idea, it would be really awesome.
EDIT:
I found out something interesting. By decoding each frame header, I have access to a lot of information, such as the duration in ms of each frame. I logged it:
System.out.println(bits.readFrame().ms_per_frame());
I found out that each of my frames is 24 ms long. Looking back at the oscilloscope, I can see that each frame does take 24 ms, but the beginning/end of each frame is filled with 0s. So first of all, is it a decoding problem? If it is not, how can I get a clean signal without a small breakup in each frame?
I've been printing all the data that each frame gives me, and each frame starts with a lot of zeros. How am I supposed to get a clean signal if each frame has some kind of audio void?
If I print the MP3 data I'm receiving for each frame (96 bytes), the first four bytes (probably the header?) always have the same value:
"-1, -5, 20, -60"
Then I have a fifth byte that is always equal to 0, and sometimes a sixth byte that is also equal to 0. Should I be removing those?
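For what it's worth, printing those signed Java bytes as unsigned hex makes them easier to compare against the MP3 frame header layout. A small sketch of the conversion, using the values quoted above:

// Convert the signed byte values Java prints into unsigned hex
byte[] frameStart = { -1, -5, 20, -60 };
for (byte b : frameStart) {
    System.out.printf("0x%02X ", b & 0xFF); // prints: 0xFF 0xFB 0x14 0xC4
}
// 0xFF 0xFB matches the MP3 frame sync pattern plus version/layer bits,
// so those leading bytes look like the frame header rather than audio samples.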
In the Android system, AudioRecord can capture a sound signal into an array, and the code is:
byte[] buffer = new byte[BUFFER_SIZE];
int r = mAudioRecord.read(buffer, 0, BUFFER_SIZE);
We need to confirm what exactly the type of the data is. Is it the pressure of the sound, the voltage of the sound, or the intensity of the sound? In other words, should the unit of the data be pascals (Pa), volts (V), or decibels (dB)?
Thanks a lot!
The basics are explained at this link: http://developer.android.com/reference/android/media/AudioRecord.html#read(byte[], int, int)
"""Reads audio data from the audio hardware for recording into a byte array. The format specified in the AudioRecord constructor should be ENCODING_PCM_8BIT to correspond to the data in the array.
ENCODING_PCM_8BIT
Audio data format: PCM 8 bit per sample."""
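To make the format concrete: the values in the array are plain PCM sample amplitudes (dimensionless quantization levels), not pascals, volts, or decibels. A minimal sketch, assuming ENCODING_PCM_8BIT as in the quoted documentation (8-bit PCM is conventionally unsigned, with 128 as the zero level), that turns the samples into a relative level in dBFS:

byte[] buffer = new byte[BUFFER_SIZE];
int read = mAudioRecord.read(buffer, 0, BUFFER_SIZE);

double peak = 0;
for (int i = 0; i < read; i++) {
    // 8-bit PCM is unsigned: 0..255 with 128 as the zero level
    int amplitude = (buffer[i] & 0xFF) - 128;          // quantization level, no physical unit
    peak = Math.max(peak, Math.abs(amplitude / 128.0)); // normalize to the -1.0..1.0 range
}
// Relative level in dB full-scale (dBFS), not sound pressure level
double dbfs = 20 * Math.log10(Math.max(peak, 1e-6));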
I'd like to capture the outgoing audio from a game and record it into an audio file as it's played. Is this possible within the OpenSL ES framework? Like by connecting the OutputMix to an AudioRecorder, or something?
You could register a callback on the buffer queue and grab the output buffer before/after it is enqueued for output. You could keep a wavBuffer (a short array the length of the buffer size) that is written to each time a new buffer is enqueued, and then write the contents of that buffer to a file.
outBuffer = p->outputBuffer[p->currentOutputBuffer]; // obtain the float output buffer
for ( int i = 0; i < bufferSize; ++i )
    wavBuffer[ i ] = ( short )( outBuffer[ i ] * 32767 ); // convert each float sample to a 16-bit short
// now append the contents of wavBuffer to the file
The basic OpenSL setup for the queue callback is explained in some detail on this page.
And a very basic means of creating a WAV file in C++ can be found here; note that you must have a pretty definitive idea of the actual total size of the WAV file, as it is part of its header.
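As a rough illustration of that header point, here is a minimal sketch (in Java, since the field sizes are the same in any language) of the canonical 44-byte WAV header for 16-bit PCM. The RIFF chunk size and the data chunk size both depend on the total number of audio bytes, which is why you need the final size up front (or have to patch those fields afterwards):

import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class WavHeaderWriter {
    // Writes the canonical 44-byte WAV header for 16-bit PCM; all multi-byte fields are little-endian
    static void writeHeader(OutputStream out, int sampleRate, int channels, int dataSize) throws IOException {
        int byteRate = sampleRate * channels * 2;              // bytes per second for 16-bit samples
        ByteBuffer header = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        header.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        header.putInt(36 + dataSize);                          // RIFF chunk size depends on the total data size
        header.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        header.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        header.putInt(16);                                     // fmt chunk size for plain PCM
        header.putShort((short) 1);                            // audio format 1 = PCM
        header.putShort((short) channels);
        header.putInt(sampleRate);
        header.putInt(byteRate);
        header.putShort((short) (channels * 2));               // block align (bytes per frame)
        header.putShort((short) 16);                           // bits per sample
        header.put("data".getBytes(StandardCharsets.US_ASCII));
        header.putInt(dataSize);                               // data chunk size: the other size-dependent field
        out.write(header.array());
    }
}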
I'm porting a Java project to Android.
As you know, Android does not have the javax.sound package.
I need to calculate frameLength.
My sound file size is 283 KB, the frame size is 4, the frame rate is 44100, and the sample size is 16 bits.
The frame length was 69632 when I used pure Java.
Do you know an equation to get this?
Thank you.
For the raw PCM data, you basically have 4 bytes per frame.
((283 KB) * (1024 bytes per KB)) / (4 bytes per frame) ==> 72448
But a WAV file can include a lot besides the raw PCM, e.g., song title or artist info.
Here's some more info about the WAV format. You might have to load the file as raw bytes to parse the header, but the header has the frame size (and the size of the audio data) in a predictable location.
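As an illustration of that approach, a minimal sketch (the file name is hypothetical, and the field offsets assume the canonical 44-byte PCM header; real files can carry extra chunks before "data") that reads the data chunk size and divides by the frame size:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FrameLength {
    public static void main(String[] args) throws IOException {
        try (RandomAccessFile wav = new RandomAccessFile("sound.wav", "r")) {
            byte[] header = new byte[44];
            wav.readFully(header);                            // canonical PCM header is 44 bytes
            ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
            short numChannels = buf.getShort(22);             // e.g. 2
            short bitsPerSample = buf.getShort(34);           // e.g. 16
            int dataSize = buf.getInt(40);                    // size of the "data" chunk in bytes
            int frameSize = numChannels * bitsPerSample / 8;  // 4 bytes per frame in this case
            long frameLength = dataSize / frameSize;          // number of frames
            System.out.println(frameLength);
        }
    }
}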
Maybe someone else with Android experience has already concocted a method.
Maybe Google should treat Java as an intact entity and properly license and implement it.