I am trying to build a call recording app for Android. I use the loudspeaker to record both uplink and downlink audio. The only problem is that the recorded volume is too low. I have already raised the device volume to the maximum with AudioManager, and it cannot go beyond that.
I first used MediaRecorder, but since it has limited functionality and produces compressed audio, I switched to AudioRecord. I still haven't figured out how to increase the volume. I have looked at projects on GitHub without success, and I have searched Stack Overflow for the last two weeks without finding anything.
I am quite sure that it is possible, since many other apps do it. Automatic Call Recorder, for instance, does.
I understand that I have to do something with the audio buffer, but I am not sure exactly what. Can you guide me on that?
Update:
I am sorry, I forgot to mention that I am already using gain. My code is almost identical to RehearsalAssistant (in fact, I derived it from there). The gain doesn't work beyond 10 dB, and that doesn't increase the volume much. What I want is to be able to listen to the audio without putting my ear to the speaker, which my code does not yet achieve.
I've asked a similar question about how volume and loudness work on SoundDesign SE here. The answer mentions that gain and loudness are related, but that gain doesn't set the actual loudness level. I am not sure how these things work, but I am determined to get a loud output.
You obviously have the AudioRecord part running, so I will skip the choice of sampleRate and inputSource. The main point is that you need to manipulate each sample of the recorded data in your recording loop to increase the volume. Like so:
int minRecBufBytes = AudioRecord.getMinBufferSize( sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT );
// ...
audioRecord = new AudioRecord( inputSource, sampleRate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, minRecBufBytes );
// Setup the recording buffer, size, and pointer (in this case quadruple buffering)
int recBufferByteSize = minRecBufBytes*2;
byte[] recBuffer = new byte[recBufferByteSize];
int frameByteSize = minRecBufBytes/2;
int sampleBytes = frameByteSize;
int recBufferBytePtr = 0;
audioRecord.startRecording();
// Do the following in the loop you prefer, e.g.
while ( continueRecording ) {
int reallySampledBytes = audioRecord.read( recBuffer, recBufferBytePtr, sampleBytes );
int i = 0;
while ( i < reallySampledBytes ) {
float sample = (float)( recBuffer[recBufferBytePtr+i ] & 0xFF
| recBuffer[recBufferBytePtr+i+1] << 8 );
// THIS is the point where the work is done:
// Increase level by about 6dB:
sample *= 2;
// Or increase level by 20dB:
// sample *= 10;
// Or if you prefer any dB value, then calculate the gain factor outside the loop
// float gainFactor = (float)Math.pow( 10., dB / 20. ); // dB to gain factor
// sample *= gainFactor;
// Avoid 16-bit-integer overflow when writing back the manipulated data:
if ( sample >= 32767f ) {
recBuffer[recBufferBytePtr+i ] = (byte)0xFF;
recBuffer[recBufferBytePtr+i+1] = 0x7F;
} else if ( sample <= -32768f ) {
recBuffer[recBufferBytePtr+i ] = 0x00;
recBuffer[recBufferBytePtr+i+1] = (byte)0x80;
} else {
int s = Math.round( sample ); // Round to nearest (also correct for negative values); dithering would be more appropriate
recBuffer[recBufferBytePtr+i ] = (byte)(s & 0xFF);
recBuffer[recBufferBytePtr+i+1] = (byte)(s >> 8 & 0xFF);
}
i += 2;
}
// Do other stuff like saving the part of buffer to a file
// if ( reallySampledBytes > 0 ) { ... save recBuffer+recBufferBytePtr, length: reallySampledBytes
// Then move the recording pointer to the next position in the recording buffer
recBufferBytePtr += reallySampledBytes;
// Wrap around at the end of the recording buffer, e.g. like so:
if ( recBufferBytePtr >= recBufferByteSize ) {
recBufferBytePtr = 0;
sampleBytes = frameByteSize;
} else {
sampleBytes = recBufferByteSize - recBufferBytePtr;
if ( sampleBytes > frameByteSize )
sampleBytes = frameByteSize;
}
}
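As a minimal sketch of the same idea on 16-bit samples held in a short[] (rather than the raw byte buffer above), the dB-to-gain conversion and the clipping step can be isolated in a small helper. The class and method names (PcmGain, dbToGain, applyGain) are hypothetical and not part of any Android API.

```java
final class PcmGain {
    /** Converts a level change in dB to a linear amplitude factor. */
    static double dbToGain(double dB) {
        return Math.pow(10.0, dB / 20.0);
    }

    /** Scales the first 'length' samples in place and clips them to the 16-bit range. */
    static void applyGain(short[] samples, int length, double dB) {
        double gain = dbToGain(dB);
        for (int i = 0; i < length; i++) {
            long scaled = Math.round(samples[i] * gain);
            if (scaled > Short.MAX_VALUE) {
                scaled = Short.MAX_VALUE;      // clip instead of wrapping around
            } else if (scaled < Short.MIN_VALUE) {
                scaled = Short.MIN_VALUE;
            }
            samples[i] = (short) scaled;
        }
    }
}
```

With 20 dB the factor is 10, so a sample of 1000 becomes 10000 while a sample of 20000 is clipped to 32767. Clipping is audible as distortion, which is one reason large gains sound harsh.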
Thanks to Hartmut and beworker for the solution. Hartmut's code worked up to around 12-14 dB. I also merged in the volume code from the sonic library, but that added too much noise and distortion, so I kept the volume at 1.5-2.0 and tried to increase the gain instead. I got a decent sound volume that doesn't sound too loud on the phone but is loud enough when played on a PC. It looks like that is as far as I can go.
I am posting my final code for increasing the loudness. Be aware that increasing mVolume adds a lot of noise. Try increasing gain instead.
private AudioRecord.OnRecordPositionUpdateListener updateListener = new AudioRecord.OnRecordPositionUpdateListener() {
@Override
public void onPeriodicNotification(AudioRecord recorder) {
aRecorder.read(bBuffer, bBuffer.capacity()); // Fill buffer
if (getState() != State.RECORDING)
return;
try {
if (bSamples == 16) {
shBuffer.rewind();
int bLength = shBuffer.capacity(); // Faster than accessing buffer.capacity each time
for (int i = 0; i < bLength; i++) { // 16bit sample size
short curSample = (short) (shBuffer.get(i) * gain);
if (curSample > cAmplitude) { // Check amplitude
cAmplitude = curSample;
}
if(mVolume != 1.0f) {
// Adjust output volume.
int fixedPointVolume = (int)(mVolume*4096.0f);
int value = (curSample*fixedPointVolume) >> 12;
if(value > 32767) {
value = 32767;
} else if(value < -32767) {
value = -32767;
}
curSample = (short)value;
/*scaleSamples(outputBuffer, originalNumOutputSamples, numOutputSamples - originalNumOutputSamples,
mVolume, nChannels);*/
}
shBuffer.put(curSample);
}
} else { // 8bit sample size
int bLength = bBuffer.capacity(); // Faster than accessing buffer.capacity each time
bBuffer.rewind();
for (int i = 0; i < bLength; i++) {
byte curSample = (byte) (bBuffer.get(i) * gain);
if (curSample > cAmplitude) { // Check amplitude
cAmplitude = curSample;
}
bBuffer.put(curSample);
}
}
bBuffer.rewind();
fChannel.write(bBuffer); // Write buffer to file
payloadSize += bBuffer.capacity();
} catch (IOException e) {
e.printStackTrace();
Log.e(NoobAudioRecorder.class.getName(), "Error occurred in updateListener, recording is aborted");
stop();
}
}
@Override
public void onMarkerReached(AudioRecord recorder) {
// NOT USED
}
};
Simply use the MPEG_4 format.
To increase the call recording volume use AudioManager as follows:
int deviceCallVol;
AudioManager audioManager;
Start Recording:
audioManager = (AudioManager)context.getSystemService(Context.AUDIO_SERVICE);
//get the current volume set
deviceCallVol = audioManager.getStreamVolume(AudioManager.STREAM_VOICE_CALL);
//set volume to maximum
audioManager.setStreamVolume(AudioManager.STREAM_VOICE_CALL, audioManager.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL), 0);
recorder.setAudioSource(MediaRecorder.AudioSource.VOICE_CALL);
recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
recorder.setAudioEncodingBitRate(32000); // the value is in bits per second
recorder.setAudioSamplingRate(44100);
Stop Recording:
//revert volume to initial state
audioManager.setStreamVolume(AudioManager.STREAM_VOICE_CALL, deviceCallVol, 0);
In my app I use an open source sonic library. Its main purpose is to speed up / slow down speech, but besides this it allows to increase loudness too. I apply it to playback, but it must work for recording similarly. Just pass your samples through it before compressing them. It has a Java interface too. Hope this helps.
I am encoding raw data on Android using ffmpeg libraries. The native code reads the audio data from an external device and encodes it into AAC format in an mp4 container. I am finding that the audio data is successfully encoded (I can play it with Groove Music, my default Windows audio player). But the metadata, as reported by ffprobe, has an incorrect duration of 0.05 secs - it's actually several seconds long. Also the bitrate is reported wrongly as around 65kbps even though I specified 192kbps.
I've tried recordings of various durations but the result is always similar - the (very small) duration and bitrate. I've tried various other audio players such as Quicktime but they play only the first 0.05 secs or so of the audio.
I've removed error-checking from the following. The actual code checks every call and no problems are reported.
Initialisation:
void AudioWriter::initialise( const char *filePath )
{
AVCodecID avCodecID = AVCodecID::AV_CODEC_ID_AAC;
int bitRate = 192000;
const char *containerFormat = "mp4";
int sampleRate = 48000;
int nChannels = 2;
mAvCodec = avcodec_find_encoder(avCodecID);
mAvCodecContext = avcodec_alloc_context3(mAvCodec);
mAvCodecContext->codec_id = avCodecID;
mAvCodecContext->codec_type = AVMEDIA_TYPE_AUDIO;
mAvCodecContext->sample_fmt = AV_SAMPLE_FMT_FLTP;
mAvCodecContext->bit_rate = bitRate;
mAvCodecContext->sample_rate = sampleRate;
mAvCodecContext->channels = nChannels;
mAvCodecContext->channel_layout = AV_CH_LAYOUT_STEREO;
avcodec_open2( mAvCodecContext, mAvCodec, nullptr );
mAvFormatContext = avformat_alloc_context();
avformat_alloc_output_context2(&mAvFormatContext, nullptr, containerFormat, nullptr);
mAvFormatContext->audio_codec = mAvCodec;
mAvFormatContext->audio_codec_id = avCodecID;
mAvOutputStream = avformat_new_stream(mAvFormatContext, mAvCodec);
avcodec_parameters_from_context(mAvOutputStream->codecpar, mAvCodecContext);
if (!(mAvFormatContext->oformat->flags & AVFMT_NOFILE))
{
avio_open(&mAvFormatContext->pb, filePath, AVIO_FLAG_WRITE);
}
if ( mAvFormatContext->oformat->flags & AVFMT_GLOBALHEADER )
{
mAvCodecContext->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
}
avformat_write_header(mAvFormatContext, NULL);
mAvAudioFrame = av_frame_alloc();
mAvAudioFrame->nb_samples = mAvCodecContext->frame_size;
mAvAudioFrame->format = mAvCodecContext->sample_fmt;
mAvAudioFrame->channel_layout = mAvCodecContext->channel_layout;
av_samples_get_buffer_size(NULL, mAvCodecContext->channels, mAvCodecContext->frame_size,
mAvCodecContext->sample_fmt, 0);
av_frame_get_buffer(mAvAudioFrame, 0);
av_frame_make_writable(mAvAudioFrame);
mAvPacket = av_packet_alloc();
}
Encoding:
// SoundRecording is a custom class with the raw samples to be encoded
bool AudioWriter::encodeToContainer( SoundRecording *soundRecording )
{
int ret;
int frameCount = mAvCodecContext->frame_size;
int nChannels = mAvCodecContext->channels;
float *buf = new float[frameCount*nChannels];
while ( soundRecording->hasReadableData() )
{
//Populate the frame
int samplesRead = soundRecording->read( buf, frameCount*nChannels );
// Planar data
int nFrames = samplesRead/nChannels;
for ( int i = 0; i < nFrames; ++i )
{
for (int c = 0; c < nChannels; ++c )
{
samples[c][i] = buf[nChannels*i +c];
}
}
// Fill a gap at the end with silence
if ( samplesRead < frameCount*nChannels )
{
for ( int i = samplesRead; i < frameCount*nChannels; ++i )
{
for (int c = 0; c < nChannels; ++c )
{
samples[c][i] = 0.0;
}
}
}
encodeFrame( mAvAudioFrame );
}
finish();
}
bool AudioWriter::encodeFrame( AVFrame *frame )
{
//send the frame for encoding
int ret;
if ( frame != nullptr )
{
frame->pts = mAudFrameCounter++;
}
ret = avcodec_send_frame(mAvCodecContext, frame );
while (ret >= 0)
{
ret = avcodec_receive_packet(mAvCodecContext, mAvPacket);
if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF )
{
break;
}
else
if (ret < 0) {
return false;
}
av_packet_rescale_ts(mAvPacket, mAvCodecContext->time_base, mAvOutputStream->time_base);
mAvPacket->stream_index = mAvOutputStream->index;
av_interleaved_write_frame(mAvFormatContext, mAvPacket);
av_packet_unref(mAvPacket);
}
return true;
}
void AudioWriter::finish()
{
// Flush by sending a null frame
encodeFrame( nullptr );
av_write_trailer(mAvFormatContext);
}
Since the resultant file contains the recorded music, the code to manipulate the audio data seems to be correct (unless I am overwriting other memory somehow).
The inaccurate duration and bitrate suggest that information concerning time is not being properly managed. I set the pts of the frames using a simple increasing integer. I'm unclear what the code that sets the timestamp and stream index achieves - and whether it's even necessary: I copied it from supposedly working code but I've seen other code without it.
Can anyone see what I'm doing wrong?
The timestamps need to be correct. Set the time_base to 1/sample_rate and increment the timestamp by 1024 each frame. Note: 1024 is AAC-specific. If you change codecs, you need to change the frame size.
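To make the arithmetic concrete, here is a small Java sketch (not FFmpeg code) of how such timestamps map to seconds, assuming a time base of 1/sample_rate and 1024 samples per AAC frame as described above. The class and method names are made up for illustration.

```java
final class AacTimestamps {
    /** Samples per frame for AAC, as stated above; other codecs differ. */
    static final int SAMPLES_PER_AAC_FRAME = 1024;

    /** PTS of the n-th frame (0-based), expressed in ticks of 1/sampleRate. */
    static long ptsForFrame(long frameIndex) {
        return frameIndex * SAMPLES_PER_AAC_FRAME;
    }

    /** Converts a PTS expressed in ticks of 1/sampleRate to seconds. */
    static double ptsToSeconds(long pts, int sampleRate) {
        return (double) pts / sampleRate;
    }
}
```

At 48000 Hz, frame 375 starts at 375 * 1024 / 48000 = 8.0 seconds. If the PTS advanced by only 1 per frame in the same time base, that frame would be stamped at 375 / 48000 s, which is under 0.01 seconds. This is consistent with a reported duration that is far too short.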
I'm working on adding a live broadcasting feature to an Android app. I do so through RTMP and make use of the DailyMotion Android SDK, which in turn makes use of Kickflip.
Everything works perfectly, except for the playback of the audio on the website (which makes use of Flash). The audio does work in VLC, so it seems to be an issue with Flash being unable to decode the AAC audio.
For the audio I instantiate an encoder with the "audio/mp4a-latm" mime type. The Android developer docs state the following about this mime type: "audio/mp4a-latm" - AAC audio (note, this is raw AAC packets, not packaged in LATM!). I expect that my problem lies here, but yet I have not been able to find a solution for it.
Pretty much all my research, including this SO question about the matter pointed me in the direction of adding an ADTS header to the audio byte array. That results in the following code in the writeSampleData method:
boolean isHeader = false;
if ((bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0) {
isHeader = true;
} else {
pts = bufferInfo.presentationTimeUs - mFirstPts;
}
if (mFirstPts != -1 && pts >= 0) {
pts /= 1000;
byte data[] = new byte[bufferInfo.size + 7];
addADTStoPacket(data, bufferInfo.size + 7);
encodedData.position(bufferInfo.offset);
encodedData.get(data, 7, bufferInfo.size);
addDataPacket(new AudioPacket(data, isHeader, pts, mAudioFirstByte));
}
The addADTStoPacket method is identical to the one in the above mentioned SO post, but I will show it here regardless:
private void addADTStoPacket(byte[] packet, int packetLen) {
int profile = 2; //AAC LC
//39=MediaCodecInfo.CodecProfileLevel.AACObjectELD;
int freqIdx = 4; //44.1KHz
int chanCfg = 1; // 1 = mono (single channel element); 2 would be a CPE (stereo)
// fill in ADTS data
packet[0] = (byte)0xFF;
packet[1] = (byte)0xF9;
packet[2] = (byte)(((profile-1)<<6) + (freqIdx<<2) +(chanCfg>>2));
packet[3] = (byte)(((chanCfg&3)<<6) + (packetLen>>11));
packet[4] = (byte)((packetLen&0x7FF) >> 3);
packet[5] = (byte)(((packetLen&7)<<5) + 0x1F);
packet[6] = (byte)0xFC;
}
The variables in the above method match the settings I have configured in the application, so I'm pretty sure that's fine.
The data is written to the output stream in the following method of the AudioPacket class:
@Override
public void writePayload(OutputStream outputStream) throws IOException {
outputStream.write(mFirstByte);
outputStream.write(mIsAudioSpecificConfic ? 0 : 1);
outputStream.write(mData);
}
Am I missing something here? I could present more code if necessary, but I think this covers the most related parts. Thanks in advance and I really hope someone is able to help, I've been stuck for a couple of days now...
I would like to produce an MP4 file by multiplexing audio from the microphone (by overriding didGetAudioData) and video from the camera (by overriding onPreviewFrame). However, I have run into an audio/video synchronization problem: the video runs ahead of the audio. I wonder whether the problem is related to incompatible configurations or to presentationTimeUs. Could someone guide me on how to fix it? My code is below.
Video configuration
formatVideo = MediaFormat.createVideoFormat(MIME_TYPE_VIDEO, 640, 360);
formatVideo.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
formatVideo.setInteger(MediaFormat.KEY_BIT_RATE, 2000000);
formatVideo.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
formatVideo.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5);
The video presentation timestamp is computed as follows:
if(generateIndex == 0) {
videoAbsolutePtsUs = 132;
StartVideoAbsolutePtsUs = System.nanoTime() / 1000L;
}else {
CurrentVideoAbsolutePtsUs = System.nanoTime() / 1000L;
videoAbsolutePtsUs =132+ CurrentVideoAbsolutePtsUs-StartVideoAbsolutePtsUs;
}
generateIndex++;
Audio configuration
format = MediaFormat.createAudioFormat(MIME_TYPE, 48000 /*sample rate*/, 1 /*channel count, not a channel mask*/);
format.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC);
format.setInteger(MediaFormat.KEY_SAMPLE_RATE,48000);
format.setInteger(MediaFormat.KEY_CHANNEL_COUNT,1);
format.setInteger(MediaFormat.KEY_BIT_RATE,64000);
The audio presentation timestamp is computed as follows:
if(generateIndex == 0) {
audioAbsolutePtsUs = 132;
StartAudioAbsolutePtsUs = System.nanoTime() / 1000L;
}else {
CurrentAudioAbsolutePtsUs = System.nanoTime() / 1000L;
audioAbsolutePtsUs =CurrentAudioAbsolutePtsUs - StartAudioAbsolutePtsUs;
}
generateIndex++;
audioAbsolutePtsUs = getJitterFreePTS(audioAbsolutePtsUs, audioInputLength / 2);
long startPTS = 0;
long totalSamplesNum = 0;
private long getJitterFreePTS(long bufferPts, long bufferSamplesNum) {
long correctedPts = 0;
long bufferDuration = (1000000 * bufferSamplesNum) / 48000;
bufferPts -= bufferDuration; // accounts for the delay of acquiring the audio buffer
if (totalSamplesNum == 0) {
// reset
startPTS = bufferPts;
totalSamplesNum = 0;
}
correctedPts = startPTS + (1000000 * totalSamplesNum) / 48000;
if(bufferPts - correctedPts >= 2*bufferDuration) {
// reset
startPTS = bufferPts;
totalSamplesNum = 0;
correctedPts = startPTS;
}
totalSamplesNum += bufferSamplesNum;
return correctedPts;
}
Was my issue caused by applying the jitter function to the audio only? If so, how could I apply a jitter function to the video? I also tried to find the correct audio and video presentation timestamps in https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java, but EncodeDecodeTest only provides video timestamps. That is why my implementation uses System.nanoTime() for both audio and video. If I want to use the video timestamps from EncodeDecodeTest, how do I construct compatible audio timestamps? Thanks for your help!
Below is how I queue a YUV frame to the video MediaCodec, for reference. The audio part is identical except for the different presentation timestamp.
int videoInputBufferIndex;
int videoInputLength;
long videoAbsolutePtsUs;
long StartVideoAbsolutePtsUs, CurrentVideoAbsolutePtsUs;
int put_v =0;
int get_v =0;
int generateIndex = 0;
public void setByteBufferVideo(byte[] buffer, boolean isUsingFrontCamera, boolean Input_endOfStream){
if(Build.VERSION.SDK_INT >=18){
try{
endOfStream = Input_endOfStream;
if(!Input_endOfStream){
ByteBuffer[] inputBuffers = mVideoCodec.getInputBuffers();
videoInputBufferIndex = mVideoCodec.dequeueInputBuffer(-1);
if (VERBOSE) {
Log.w(TAG,"[put_v]:"+(put_v)+"; videoInputBufferIndex = "+videoInputBufferIndex+"; endOfStream = "+endOfStream);
}
if(videoInputBufferIndex>=0) {
ByteBuffer inputBuffer = inputBuffers[videoInputBufferIndex];
inputBuffer.clear();
inputBuffer.put(mNV21Convertor.convert(buffer));
videoInputLength = buffer.length;
if(generateIndex == 0) {
videoAbsolutePtsUs = 132;
StartVideoAbsolutePtsUs = System.nanoTime() / 1000L;
}else {
CurrentVideoAbsolutePtsUs = System.nanoTime() / 1000L;
videoAbsolutePtsUs =132+ CurrentVideoAbsolutePtsUs - StartVideoAbsolutePtsUs;
}
generateIndex++;
if (VERBOSE) {
Log.w(TAG, "[put_v]:"+(put_v)+"; videoAbsolutePtsUs = " + videoAbsolutePtsUs + "; CurrentVideoAbsolutePtsUs = "+CurrentVideoAbsolutePtsUs);
}
if (videoInputLength == AudioRecord.ERROR_INVALID_OPERATION) {
Log.w(TAG, "[put_v]ERROR_INVALID_OPERATION");
} else if (videoInputLength == AudioRecord.ERROR_BAD_VALUE) {
Log.w(TAG, "[put_v]ERROR_ERROR_BAD_VALUE");
}
if (endOfStream) {
Log.w(TAG, "[put_v]:"+(put_v++)+"; [get] receive endOfStream");
mVideoCodec.queueInputBuffer(videoInputBufferIndex, 0, videoInputLength, videoAbsolutePtsUs, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
} else {
Log.w(TAG, "[put_v]:"+(put_v++)+"; receive videoInputLength :" + videoInputLength);
mVideoCodec.queueInputBuffer(videoInputBufferIndex, 0, videoInputLength, videoAbsolutePtsUs, 0);
}
}
}
}catch (Exception x) {
x.printStackTrace();
}
}
}
I solved this in my application by setting the PTS of all video and audio frames against a shared "sync clock" (the "sync" also means it is thread-safe) that starts when the first video frame (which has a PTS of 0) is available. If audio recording starts before video, the audio data is discarded (it doesn't go into the encoder) until video starts; if it starts later, the first audio PTS is relative to the start of the video.
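A minimal sketch of such a shared clock, assuming the caller passes in readings from System.nanoTime(). The class SyncClock and its methods are hypothetical names, not part of the Android API, and this is not the answerer's actual implementation. Audio buffers that arrive before the first video frame get -1 and should be discarded.

```java
final class SyncClock {
    private boolean started = false; // becomes true when the first video frame arrives
    private long startNs;

    /** Returns the video PTS in microseconds; the first call starts the clock at 0. */
    synchronized long videoPtsUs(long nowNs) {
        if (!started) {
            started = true;
            startNs = nowNs;
        }
        return (nowNs - startNs) / 1000L;
    }

    /** Returns the audio PTS in microseconds, or -1 if video has not started yet. */
    synchronized long audioPtsUs(long nowNs) {
        if (!started) {
            return -1L; // caller drops this audio buffer
        }
        return (nowNs - startNs) / 1000L;
    }
}
```

Both encoder threads would share one instance, so audio and video timestamps are measured from the same origin.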
Of course, you are free to allow audio to start first, but players will usually skip or wait for the first video frame anyway. Also be aware that encoded audio frames can arrive out of order, and MediaMuxer will fail with an error sooner or later. My solution was to queue them all: sort them by PTS when a new one comes in, then write everything that is more than 500 ms older than the newest one to MediaMuxer, but only frames with a PTS higher than the last written frame. Ideally, this means data is written smoothly to MediaMuxer with a 500 ms delay. In the worst case, you lose a few audio frames.
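The queueing rule described above can be sketched roughly as follows. To keep the example self-contained, it tracks only timestamps; a real version would carry the encoded buffer alongside each PTS and pass it to MediaMuxer. PtsReorderQueue and offer are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class PtsReorderQueue {
    private static final long DELAY_US = 500_000L; // 500 ms reordering window

    private final PriorityQueue<Long> pending = new PriorityQueue<>(); // smallest PTS first
    private long newestPtsUs = Long.MIN_VALUE;
    private long lastWrittenPtsUs = Long.MIN_VALUE;

    /** Adds a frame and returns, in PTS order, the frames that may be written now. */
    List<Long> offer(long ptsUs) {
        pending.add(ptsUs);
        newestPtsUs = Math.max(newestPtsUs, ptsUs);

        List<Long> ready = new ArrayList<>();
        // Release everything more than 500 ms older than the newest frame.
        while (!pending.isEmpty() && newestPtsUs - pending.peek() > DELAY_US) {
            long pts = pending.poll();
            if (pts > lastWrittenPtsUs) {
                ready.add(pts);
                lastWrittenPtsUs = pts;
            }
            // Otherwise the frame arrived too late and is dropped.
        }
        return ready;
    }
}
```

A frame that shows up after a later frame has already been written is silently dropped, which corresponds to the worst case mentioned above.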
I'm working on an Android app and I would like to play some short sounds (about 2 s). I tried SoundPool, but it doesn't really suit me since it can't check whether a sound is already playing. So I decided to use AudioTrack.
It works quite well, BUT most of the time there is a "click" when a sound begins to play.
I checked my audio files and they are clean.
I use AudioTrack in stream mode. I saw that static mode is better for short sounds, but after a lot of searching I still don't understand how to make it work.
I also read that the clicking noise can be caused by the header of the WAV file, so maybe the sound would disappear if I skipped this header with the setPlaybackHeadPosition(int positionInFrames) function (which is supposed to work only in static mode).
Here is my code (the problem is the clicking noise at the beginning):
int minBufferSize = AudioTrack.getMinBufferSize(44100, AudioFormat.CHANNEL_CONFIGURATION_MONO,
AudioFormat.ENCODING_PCM_16BIT);
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC, 44100, AudioFormat.CHANNEL_CONFIGURATION_MONO,
AudioFormat.ENCODING_PCM_16BIT, minBufferSize, AudioTrack.MODE_STREAM);
audioTrack.play();
int i = 0;
int bufferSize = 2048; //don't really know which value to put
audioTrack.setPlaybackRate(88200);
byte [] buffer = new byte[bufferSize];
//there we open the wav file >
InputStream inputStream = getResources().openRawResource(R.raw.abordage);
try {
while((i = inputStream.read(buffer)) != -1)
audioTrack.write(buffer, 0, i);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
try {
inputStream.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Does anyone have a solution to avoid that noise? I tried this, which works sometimes but not every time. Could someone show me how to use AudioTrack in MODE_STATIC?
Thank you
I found that Scott Stensland's reasoning was fitting my issue (thanks!).
I eliminated the pop by running a dead simple linear fade-in filter over the beginning of the sample array. The filter makes sample values start from 0 and slowly increase in amplitude to their original value. By always starting at a value of 0 at the zero cross over point the pop never occurs.
A similar fade-out filter was applied at the end of the sample array. The filter duration can easily be adjusted.
import android.util.Log;
public class FadeInFadeOutFilter
{
private static final String TAG = FadeInFadeOutFilter.class.getSimpleName();
private final int filterDurationInSamples;
public FadeInFadeOutFilter ( int filterDurationInSamples )
{
this.filterDurationInSamples = filterDurationInSamples;
}
public void filter ( short[] audioShortArray )
{
filter(audioShortArray, audioShortArray.length);
}
public void filter ( short[] audioShortArray, int audioShortArraySize )
{
if ( audioShortArraySize/2 <= filterDurationInSamples ) {
Log.w(TAG, "filtering audioShortArray with fewer samples than filterDurationInSamples; untested, pops or even crashes may occur. audioShortArraySize="+audioShortArraySize+", filterDurationInSamples="+filterDurationInSamples);
}
final int I = Math.min(filterDurationInSamples, audioShortArraySize/2);
// Perform fade-in and fade-out simultaneously in one loop.
final int fadeOutOffset = audioShortArraySize - I; // Use I so the fade-out ends at the last sample even for short arrays
for ( int i = 0 ; i < I ; i++ ) {
// Fade-in beginning.
final double fadeInAmplification = (double)i/I; // Linear ramp-up 0..1.
audioShortArray[i] = (short)(fadeInAmplification * audioShortArray[i]);
// Fade-out end.
final double fadeOutAmplification = 1 - fadeInAmplification; // Linear ramp-down 1..0.
final int j = i + fadeOutOffset;
audioShortArray[j] = (short)(fadeOutAmplification * audioShortArray[j]);
}
}
}
In my case, it was the WAV header.
And reading past its first 44 bytes before writing the audio data...
byte[] buf44 = new byte[44];
int read = inputStream.read(buf44, 0, 44);
...solved it.
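One caveat with the snippet above: a single InputStream.read call may return fewer bytes than requested. A slightly more defensive sketch loops until the header has been consumed. It assumes the canonical 44-byte PCM WAV header; files with extra chunks have longer headers and would need real parsing. WavHeaderSkipper is a hypothetical helper name.

```java
import java.io.IOException;
import java.io.InputStream;

final class WavHeaderSkipper {
    /** Size of the canonical PCM WAV header (an assumption, see above). */
    static final int CANONICAL_HEADER_BYTES = 44;

    /** Reads and discards the header; returns the number of bytes actually consumed. */
    static int skipHeader(InputStream in) throws IOException {
        byte[] header = new byte[CANONICAL_HEADER_BYTES];
        int total = 0;
        while (total < header.length) {
            int n = in.read(header, total, header.length - total);
            if (n < 0) {
                break; // stream ended before a full header was read
            }
            total += n;
        }
        return total;
    }
}
```

After the call, the stream is positioned at the first audio sample, so the usual read-and-write loop feeds only PCM data to the AudioTrack.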
A common cause of an audio "pop" is the rendering process not starting or stopping the sound at the zero crossing (assuming a range of -1 to +1, the crossing is at 0). Transducers such as speakers or earbuds are at rest when there is no sound input, which maps to this zero level. If an audio rendering process fails to start from or stop at zero, the transducer is being asked to do the impossible, namely to go instantaneously from its resting state to some non-zero position in its movement range (or vice versa if you get a "pop" at the end).
Finally, after a lot of experimentation, I made it work without the click noise. Here is my code (unfortunately, I can't read the size of the InputStream directly, since the getChannel().size() method only works with FileInputStream):
try{
long totalAudioLen = 0;
InputStream inputStream = getResources().openRawResource(R.raw.abordage); // open the file
totalAudioLen = inputStream.available();
byte[] rawBytes = new byte[(int)totalAudioLen];
AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC,
44100,
AudioFormat.CHANNEL_CONFIGURATION_MONO,
AudioFormat.ENCODING_PCM_16BIT,
(int)totalAudioLen,
AudioTrack.MODE_STATIC);
int offset = 0;
int numRead = 0;
track.setPlaybackHeadPosition(100); // IMPORTANT to skip the click
while (offset < rawBytes.length
&& (numRead=inputStream.read(rawBytes, offset, rawBytes.length-offset)) >= 0) {
offset += numRead;
} // read() may return fewer bytes than requested, so loop until the whole file is in rawBytes
track.write(rawBytes, 0, (int)totalAudioLen); // copy the data into the track's static buffer
track.play(); // launch the play
track.setPlaybackRate(88200);
inputStream.close();
}
catch (FileNotFoundException e) {
Log.e(TAG, "Error loading audio to bytes", e);
} catch (IOException e) {
Log.e(TAG, "Error loading audio to bytes", e);
} catch (IllegalArgumentException e) {
Log.e(TAG, "Error loading audio to bytes", e);
}
So the solution for skipping the clicking noise is to use MODE_STATIC and the setPlaybackHeadPosition function to skip the beginning of the audio file (which is probably the header, though I am not sure).
I hope this code helps someone. I spent too much time trying to find a static-mode code sample without finding a way to load a raw resource.
Edit: After testing this solution on various devices, it appears that they still produce the clicking noise.
For "setPlaybackHeadPosition" to work, you have to play and pause first. It doesn't work if your track is stopped or not started. Trust me. This is dumb. But it works:
track.play();
track.pause();
track.setPlaybackHeadPosition(100);
// then continue with track.write, track.play, etc.
I'm trying to build a music analytics app for the Android platform.
The app uses MediaRecorder.AudioSource.MIC to record music from the microphone and encodes it as 16-bit PCM at 11025 Hz, but the recorded audio samples are of very low quality. Is there any way to improve the quality and reduce the noise?
mRecordInstance = new AudioRecord(MediaRecorder.AudioSource.MIC,FREQUENCY, CHANNEL,ENCODING, minBufferSize);
mRecordInstance.startRecording();
do
{
samplesIn += mRecordInstance.read(audioData, samplesIn, bufferSize - samplesIn);
if(mRecordInstance.getRecordingState() == AudioRecord.RECORDSTATE_STOPPED)
break;
}
while (samplesIn < bufferSize);
Thanks in Advance
The solution above didn't work for me.
So I searched around and found this article.
Long story short, I used MediaRecorder.AudioSource.VOICE_RECOGNITION instead of AudioSource.MIC, which gave me really good results and reduced the background noise considerably.
The great thing about this solution is that it can be used with both the AudioRecord and MediaRecorder classes.
The best combination of sample rate and buffer size is very device dependent, so your results will vary with the hardware. I use this utility to figure out the best combination for devices running Android 4.2 and above:
public static DeviceValues getDeviceValues(Context context) {
try {
AudioManager am = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);
try {
Method getProperty = AudioManager.class.getMethod("getProperty", String.class);
Field bufferSizeField = AudioManager.class.getField("PROPERTY_OUTPUT_FRAMES_PER_BUFFER");
Field sampleRateField = AudioManager.class.getField("PROPERTY_OUTPUT_SAMPLE_RATE");
int bufferSize = Integer.valueOf((String)getProperty.invoke(am, (String)bufferSizeField.get(am)));
int sampleRate = Integer.valueOf((String)getProperty.invoke(am, (String)sampleRateField.get(am)));
return new DeviceValues(sampleRate, bufferSize);
} catch(NoSuchMethodException e) {
return selectBestValue(getValidSampleRates(context));
}
} catch(Exception e) {
return new DeviceValues(DEFAULT_SAMPLE_RATE, DEFAULT_BUFFER_SIZE);
}
}
This uses reflection to check whether the getProperty method is available, because this method was introduced in API level 17. If you are developing for a specific device type, you might want to experiment with various buffer sizes and sample rates. The defaults that I use as a fallback are:
private static final int DEFAULT_SAMPLE_RATE = 22050;
private static final int DEFAULT_BUFFER_SIZE = 1024;
Additionally, I check the various sample rates by seeing whether getMinBufferSize returns a reasonable value:
private static List<DeviceValues> getValidSampleRates(Context context) {
List<DeviceValues> available = new ArrayList<DeviceValues>();
for (int rate : new int[] {8000, 11025, 16000, 22050, 32000, 44100, 48000, 96000}) { // add the rates you wish to check against
int bufferSize = AudioRecord.getMinBufferSize(rate, AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
if (bufferSize > 0 && bufferSize < 2048) {
available.add(new DeviceValues(rate, bufferSize * 2));
}
}
return available;
}
This relies on getMinBufferSize returning a non-positive value (an error code such as AudioRecord.ERROR_BAD_VALUE) when the sample rate is not supported on the device. You should experiment with these values for your particular use case.
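The DeviceValues class and the selectBestValue method used above are not shown in this answer. The following is only a guess at what they might look like: a plain value holder, plus a selector that assumes "best" means the highest supported sample rate and falls back to the defaults when the list is empty. Both the selection rule and the class name DeviceValueSelector are assumptions.

```java
import java.util.List;

final class DeviceValues {
    final int sampleRate;
    final int bufferSize;

    DeviceValues(int sampleRate, int bufferSize) {
        this.sampleRate = sampleRate;
        this.bufferSize = bufferSize;
    }
}

final class DeviceValueSelector {
    static final int DEFAULT_SAMPLE_RATE = 22050;
    static final int DEFAULT_BUFFER_SIZE = 1024;

    /** Picks the candidate with the highest sample rate (assumed criterion). */
    static DeviceValues selectBestValue(List<DeviceValues> candidates) {
        DeviceValues best = null;
        for (DeviceValues candidate : candidates) {
            if (best == null || candidate.sampleRate > best.sampleRate) {
                best = candidate;
            }
        }
        // Fall back to the defaults if no sample rate passed the check.
        return best != null ? best : new DeviceValues(DEFAULT_SAMPLE_RATE, DEFAULT_BUFFER_SIZE);
    }
}
```

Depending on the use case, a different rule (for example, the smallest buffer for low latency) may be more appropriate.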
Though this is an old question, the following solution may be helpful.
We can use MediaRecorder to record audio with ease.
private void startRecording() {
MediaRecorder recorder = new MediaRecorder();
recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
recorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);
recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AAC);
recorder.setAudioEncodingBitRate(96000);
recorder.setAudioSamplingRate(44100);
recorder.setOutputFile(".../audioName.m4a");
try {
recorder.prepare();
} catch (IOException e) {
Log.e(LOG_TAG, "prepare() failed");
}
recorder.start();
}
Note:
MediaRecorder.AudioEncoder.AAC is used because MediaRecorder.AudioEncoder.AMR_NB encoding is no longer supported on iOS. Reference
The audio encoding bit rate should be set to either 96000 or 128000, depending on the sound clarity required.