Is there a way to determine the audio format of an audio file in Android? In plain Java I do it like this:
File file = new File(...);
AudioInputStream stream = AudioSystem.getAudioInputStream(file);
AudioFormat format = stream.getFormat();
android.media.AudioTrack has the following methods for accessing information about audio data:
getChannelCount() to determine the number of channels,
getChannelConfiguration() to determine whether you are dealing with mono or stereo content,
getSampleRate() to find out the sampling frequency, and
getAudioFormat() to determine whether the sample width is 8-bit or 16-bit.
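For example (a minimal sketch with a hypothetical configuration; as the comment below notes, these getters simply report the values the AudioTrack was configured with):

int sampleRate = 44100;
int channelConfig = AudioFormat.CHANNEL_OUT_STEREO;
int encoding = AudioFormat.ENCODING_PCM_16BIT;

AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate, channelConfig, encoding,
        AudioTrack.getMinBufferSize(sampleRate, channelConfig, encoding), AudioTrack.MODE_STREAM);

Log.d("AudioInfo", "channels=" + track.getChannelCount()          // 2
        + " channelConfig=" + track.getChannelConfiguration()     // CHANNEL_OUT_STEREO
        + " sampleRate=" + track.getSampleRate()                  // 44100 Hz
        + " encoding=" + track.getAudioFormat());                 // ENCODING_PCM_16BIT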
The AudioTrack.getXXX methods you list merely return the values supplied to the constructor. This doesn't solve the original poster's issue.
I have an RTMP stream I want to play in my app using the ExoPlayer library. My setup for that is as follows:
TrackSelector trackSelector = new DefaultTrackSelector();
RtmpDataSourceFactory rtmpDataSourceFactory = new RtmpDataSourceFactory(bandwidthMeter);
ExtractorsFactory extractorsFactory = new DefaultExtractorsFactory();
factory = new ExtractorMediaSource.Factory(rtmpDataSourceFactory);
factory.setExtractorsFactory(extractorsFactory);
createSource();

mPlayer = ExoPlayerFactory.newSimpleInstance(mActivity, trackSelector, new DefaultLoadControl(
        new DefaultAllocator(true, C.DEFAULT_BUFFER_SEGMENT_SIZE),
        1000,  // min buffer
        3000,  // max buffer
        1000,  // playback
        2000,  // playback after rebuffer
        DefaultLoadControl.DEFAULT_TARGET_BUFFER_BYTES,
        true
));

vwExoPlayer.setPlayer(mPlayer);
mPlayer.addListener(mVideoStreamHandler);
mPlayer.addVideoListener(new VideoListener() {
    @Override
    public void onVideoSizeChanged(int width, int height, int unappliedRotationDegrees, float pixelWidthHeightRatio) {
        Log.d("hasil", "onVideoSizeChanged: w:" + width + ", h:" + height);
        String res = width + "x" + height;
        resolution.setText(res);
    }

    @Override
    public void onRenderedFirstFrame() {
    }
});
Where createSource() is as follows:
private void createSource() {
    mMediaSource180 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_180));
    mMediaSource360 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_360));
    mMediaSource720 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_720));
    mMediaSourceAudio = factory.createMediaSource(Uri.parse(API.GAME_AUDIO_STREAM_URL));
}
My current problem is that only the first three ExtractorMediaSources work fine in ExoPlayer. The mMediaSourceAudio refuses to play in ExoPlayer, but works just fine in VLC Media Player for Android.
Right now my suspicion is that the format is AAC-LTP, or some other AAC variant that requires a codec available in VLC but not in stock Android. However, I do not have access to the encoding process, so I don't know for sure.
If this isn't the case, what is it?
EDIT:
I've been debugging the BandwidthMeter and added a MediaSourceEventListener. With the normal video sources, onDownstreamFormatChanged() gets called, but not with the audio stream source.
In addition, the BandwidthMeter behaves as expected: bytes are downloaded continuously throughout the stream, with more bytes arriving when the video stream comes in. It is only with the audio-only stream that mPlayer.getBufferedPosition() always returns 0. Also, when I use the audio stream source, no OMX code is called, i.e. no decoders are set up.
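For anyone reproducing this debugging step, here is a rough sketch of attaching such a listener (an assumption-heavy sketch: it presumes an ExoPlayer 2.x release that has MediaSource.addEventListener(Handler, MediaSourceEventListener) and DefaultMediaSourceEventListener; the callback signatures differ between releases):

// Sketch only: log downstream format changes for the audio source.
mMediaSourceAudio.addEventListener(new Handler(Looper.getMainLooper()),
        new DefaultMediaSourceEventListener() {
            @Override
            public void onDownstreamFormatChanged(int windowIndex,
                    MediaSource.MediaPeriodId mediaPeriodId, MediaLoadData mediaLoadData) {
                Log.d("hasil", "onDownstreamFormatChanged: " + mediaLoadData.trackFormat);
            }
        });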
Am I seeing a malformed audio stream, or do I need to change my ExoPlayer settings?
EDIT 2:
Further debugging reveals that the same FlvExtractor is used for all the video streams and for the audio stream, even though the video streams have an avc video track and an mp4a-latm audio track. Is this normal?
Turns out it's because the stream was recognized as having two tracks/SampleQueues: one audio track, and one track with a null format. That null track was supposed to be the video track, which, according to the stream's flvHeader flags, was supposed to exist.
For now, I work around this by creating a custom MediaSource using a custom MediaPeriod. That custom MediaPeriod separates the video and audio tracks of the SampleQueues, then uses the audio-only SampleQueue[] instead of the source SampleQueue[] when I want to play the audio-only stream.
This gives me another point of concern, though: is there something one can do to alter the 'has audio track (flag & 0x04) and has video track (flag & 0x01)' flags in the RTMP stream?
Thanks for the comments; I'm new to ExoPlayer, but your comments helped me debug the problem and find multiple workarounds.
I tried using a custom MediaSource and custom MediaPeriod to address this audio issue. I observed that, for a video+audio Wowza stream, the video format data arrives after the audio data, so maybeFinishPrepare() has to wait for both the video and audio format tag data before invoking onPrepared() in case the video tag data is received first. If the audio data is received first, it does not wait and calls onPrepared() right away.
With the above changes, I was able to play both audio-only and video+audio Wowza streams, where the RTMP tag headers arrived in the order video tag data first, followed by audio data.
I wasn't able to use the same patch with an SRS server to play both audio-only and video+audio streams, because the SRS server sends the tag data in the order audio first, then video.
So I debugged further in FlvExtractor. In readFlvHeader, I overrode the hasAudio and hasVideo variables, deriving them from the first few tag headers (5 or 6) instead. I called peekFully on the input 6 times in a loop; in each iteration, after fetching tagType and tagDataSize, I used tagDataSize for input.advancePeekPosition() and tagType to identify whether the tag data carries audio or video format data. After peeking at the first 6 consecutive tag headers, I had the actual values of hasAudio and hasVideo and could ignore flvHeaders.flags, which were previously used to set these variables.
The custom FlvExtractor workaround looked cleaner than the custom MediaSource/MediaPeriod one, because we create only as many tracks as necessary, since we set proper hasVideo/hasAudio values.
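A rough sketch of the peeking idea described above (not a drop-in FlvExtractor patch; the tag-type values and the 11-byte tag header layout come from the FLV spec, and the snippet assumes a pre-2.12 ExoPlayer ExtractorInput positioned right after the FLV header and the first PreviousTagSize field):

private static final int TAG_TYPE_AUDIO = 8; // per the FLV spec
private static final int TAG_TYPE_VIDEO = 9;

// Peek the first few FLV tag headers to decide whether the stream really
// contains audio and/or video, instead of trusting flvHeader.flags.
private void probeTracks(ExtractorInput input) throws IOException, InterruptedException {
    boolean hasAudio = false;
    boolean hasVideo = false;
    byte[] tagHeader = new byte[11]; // TagType(1) + DataSize(3) + Timestamp(4) + StreamID(3)
    for (int i = 0; i < 6 && !(hasAudio && hasVideo); i++) {
        if (!input.peekFully(tagHeader, 0, tagHeader.length, /* allowEndOfInput= */ true)) {
            break;
        }
        int tagType = tagHeader[0] & 0x1F;
        int dataSize = ((tagHeader[1] & 0xFF) << 16) | ((tagHeader[2] & 0xFF) << 8) | (tagHeader[3] & 0xFF);
        if (tagType == TAG_TYPE_AUDIO) {
            hasAudio = true;
        } else if (tagType == TAG_TYPE_VIDEO) {
            hasVideo = true;
        }
        // Skip the tag body plus the trailing 4-byte PreviousTagSize field.
        input.advancePeekPosition(dataSize + 4);
    }
    input.resetPeekPosition();
    // ... create only the tracks for which hasAudio / hasVideo is true.
}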
I have a server that encodes real-time voice into mono or stereo mp3 thanks to libmp3lame and sends it chunk by chunk through a WebSocket.
I'm trying to make an Android app that receives those MP3 chunks and plays them with the most appropriate audio player Android has. I went with AudioTrack, since it seems easy to feed chunks to the player and it is stream-oriented (what I'm doing is writing byte arrays to the track, not playing a full song stored locally on the phone).
Since AudioTrack does not support compressed audio formats (such as MP3), I have to decode those chunks into PCM to play them. I'm using the well-known JLayer library to do this real-time decoding. Thanks to that, I can write each sample to my AudioTrack and hear what the server is sending.
My problem is that the received/played audio is badly chopped up. (I can understand what the speaker is saying perfectly, but the quality is bad, as if the speaker had a "robotic voice".)
Here is the code I'm using to receive/decode/play those byte[].
public void addSample(byte[] data) throws BitstreamException, DecoderException, IOException {
    // JLayer decoder
    Decoder decoder = new Decoder();
    // Input stream wrapping the byte[] voice data
    InputStream bis = new ByteArrayInputStream(data);
    Bitstream bits = new Bitstream(bis);
    // Decode the MP3 data into PCM samples held in a SampleBuffer
    SampleBuffer pcmBuffer = (SampleBuffer) decoder.decodeFrame(bits.readFrame(), bits);
    // Write the PCM data to the AudioTrack to play it
    mTrack.write(pcmBuffer.getBuffer(), 0, pcmBuffer.getBufferLength());
    bits.closeFrame();
}
And here is my AudioTrack initialization
mTrack = new AudioTrack.Builder()
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
                .build())
        .setAudioFormat(new AudioFormat.Builder()
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .setSampleRate(48000)
                .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
                .build())
        .setBufferSizeInBytes(AudioTrack.getMinBufferSize(48000, AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT))
        .build();

mTrack.play();
So, to understand what was happening, I tried logging every value contained in the pcmBuffer. It turns out a huge part of that data is 0 at the very beginning of the buffer (I'd say 1/5 of the buffer is 0, all of it located at the beginning). I then took an oscilloscope and looked at the signal my Android phone was receiving. Here is the result:
As you can see, each frame is present, but it also contains "blank" or 0 data values. Those 0s at the beginning of each frame make the signal choppy and pretty annoying to listen to.
I have no idea whether this comes from the MP3 signal itself, the way I'm playing it, AudioTrack, JLayer, or the way I'm decoding it. So if anyone has an idea, it would be really awesome.
EDIT:
I found out something interesting. By decoding each frame header I can access a lot of information, such as the duration of each frame in ms. I logged it:
System.out.println(bits.readFrame().ms_per_frame());
I found out that each of my frames is 24 ms long. Looking back at the oscilloscope, I can see that each frame does indeed take 24 ms, but the beginning/end of each frame is filled with 0s. So first of all, is it a decoding problem? If it is not, how can I get a clean signal without a small break in each frame?
I've been printing all the data that each frame gives me, and each frame starts with a lot of zeros. How am I supposed to get a clean signal if each frame contains some kind of audio void?
If I print the MP3 data that I'm receiving for each frame (96 bits), the first four bytes (probably the header?) always have the same values:
"-1, -5, 20, -60"
Then I have a fifth byte that is always equal to 0, and sometimes a sixth byte that is also equal to 0. Should I be removing those?
In the Android system, AudioRecord can capture a sound signal into an array, and the code is:
byte[] buffer = new byte[BUFFER_SIZE];
int r = mAudioRecord.read(buffer, 0, BUFFER_SIZE);
We need to confirm what exactly the type of the data is: is it the pressure of the sound, the voltage of the sound, or the intensity of the sound? In other words, should the unit of the data be pascal (Pa), volt (V), or decibel (dB)?
Thanks a lot!
The basics are explained in the documentation: http://developer.android.com/reference/android/media/AudioRecord.html#read(byte[], int, int)
"""Reads audio data from the audio hardware for recording into a byte array. The format specified in the AudioRecord constructor should be ENCODING_PCM_8BIT to correspond to the data in the array.
ENCODING_PCM_8BIT
Audio data format: PCM 8 bit per sample."""
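As a side note on the units: those values are raw PCM sample amplitudes relative to the digital full scale of the capture chain; they are not calibrated to pascals, volts, or dB SPL. A minimal sketch (assuming the 16-bit encoding, which every device must support, and the short[] overload of read()) of turning samples into a relative level:

short[] buffer = new short[BUFFER_SIZE];
int read = mAudioRecord.read(buffer, 0, BUFFER_SIZE);

float peak = 0f;
for (int i = 0; i < read; i++) {
    // Dimensionless amplitude, expressed as a fraction of digital full scale.
    float amplitude = Math.abs(buffer[i] / 32768f);
    if (amplitude > peak) {
        peak = amplitude;
    }
}
// Level in dBFS (decibels relative to full scale) -- not calibrated dB SPL, Pa or V.
double dbfs = 20 * Math.log10(Math.max(peak, 1e-9));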
I have followed this example to convert raw audio data coming from AudioRecord to MP3, and the conversion works: if I store the MP3 data in a file and play it with a music player, it is audible.
Now my question is: instead of storing the MP3 data in a file, I need to play it with AudioTrack. The data is coming from a Red5 media server as a live stream, but the problem is that AudioTrack can only play PCM data, so all I hear from my data is noise.
I am now using JLayer for this task.
My code is as follows.
int readresult = recorder.read(audioData, 0, recorderBufSize);
int encResult = SimpleLame.encode(audioData, audioData, readresult, mp3buffer);
and this mp3buffer data is sent to the other user via a Red5 stream.
The data received by the other user comes in as a stream, so the code for playing it is:
Bitstream bitstream = new Bitstream(data.read());
Decoder decoder = new Decoder();
Header frameHeader = bitstream.readFrame();
SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
short[] pcm = output.getBuffer();
player.write(pcm, 0, pcm.length);
But my code freezes at bitstream.readFrame() after 2-3 seconds, and no sound is produced before that.
Any guess what the problem might be? Any suggestion is appreciated.
Note: I don't need to store the MP3 data, so I can't use MediaPlayer, as it requires a file or file descriptor.
Just a tip, but try to call
output.close();
bitstream.closeFrame();
after your write code. I'm processing MP3s the same way you do, but I close the buffers after use and I have no problem.
Second tip: do it in a Thread or some other background process. Since you mentioned those silent 2 seconds, the media player may be waiting until you have processed the whole stream because you are loading it on the same thread.
Try both tips (you should anyway). For the first, the problem could be in the internal buffers; for the second, you have probably filled the player's input buffer and locked the app (on the same thread, a full buffer cannot receive your input, and the code that plays it and releases that buffer is never invoked because the write blocks...).
Also, if you aren't doing it already, check for frameHeader == null to detect the end of the file.
Good luck.
You need to loop through the frames like this:
Header frameHeader;
while ((frameHeader = bitstream.readFrame()) != null) {
    SampleBuffer output = (SampleBuffer) decoder.decodeFrame(frameHeader, bitstream);
    short[] pcm = output.getBuffer();
    player.write(pcm, 0, pcm.length);
    bitstream.closeFrame();
}
And make sure you are not running this on the main thread. (That is probably the reason for the freezing.)
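A minimal sketch of keeping that loop off the main thread (decodeAndPlay() is a hypothetical method standing in for the read/decode/write loop above):

// Run the blocking decode/playback loop off the UI thread.
new Thread(new Runnable() {
    @Override
    public void run() {
        try {
            decodeAndPlay(); // the readFrame/decodeFrame/write loop shown above
        } catch (Exception e) {
            Log.e("Mp3Playback", "decoding failed", e);
        }
    }
}, "mp3-playback").start();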
I am currently developing an application that needs to record audio, encode it as AAC, stream it, and do the same in reverse: receive a stream, decode AAC, and play the audio.
I successfully recorded AAC (wrapped in an MP4 container) using MediaRecorder, and successfully up-streamed audio using the AudioRecord class. But I need to be able to encode the audio as I stream it, and none of these classes seem to help me do that.
I researched a bit and found that most people who have this problem end up using a native library like ffmpeg.
But I was wondering: since Android already includes Stagefright, which has native code that can do encoding and decoding (for example, AAC encoding and AAC decoding), is there a way to use this native code in my application? How can I do that?
It would be great if I only needed to implement some JNI classes with their native code. Plus, since it is an Android library, there would be no licensing problems whatsoever (correct me if I'm wrong).
Yes, you can use libstagefright; it's very powerful.
Since stagefright is not exposed through the NDK, you will have to do extra work.
There are two ways:
(1) Build your project using the full Android source tree. This way takes a few days to set up; once ready, it's very easy, and you can take full advantage of stagefright.
(2) Just copy the include files into your project; they are inside this folder:
android-4.0.4_r1.1/frameworks/base/include/media/stagefright
Then you will have to export the library functions by dynamically loading libstagefright.so, and you can link it into your JNI project.
Encoding/decoding with stagefright is very straightforward; a few hundred lines of code will do.
I used stagefright to capture screenshots to create a video, which will be available in our Android VNC server, to be released soon.
The following is a snippet. I think it's better than using ffmpeg to encode a movie. You can add an audio source as well.
class ImageSource : public MediaSource {
public:
    ImageSource(int width, int height, int colorFormat)
        : mWidth(width),
          mHeight(height),
          mColorFormat(colorFormat) {
    }

    virtual status_t read(
            MediaBuffer **buffer, const MediaSource::ReadOptions *options) {
        // here you can fill the buffer with your pixels
        return OK;
    }

    ...
};
int width = 720;
int height = 480;

sp<MediaSource> img_source = new ImageSource(width, height, colorFormat);

sp<MetaData> enc_meta = new MetaData;
// enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_H263);
// enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_MPEG4);
enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_AVC);
enc_meta->setInt32(kKeyWidth, width);
enc_meta->setInt32(kKeyHeight, height);
enc_meta->setInt32(kKeySampleRate, kFramerate);
enc_meta->setInt32(kKeyBitRate, kVideoBitRate);
enc_meta->setInt32(kKeyStride, width);
enc_meta->setInt32(kKeySliceHeight, height);
enc_meta->setInt32(kKeyIFramesInterval, kIFramesIntervalSec);
enc_meta->setInt32(kKeyColorFormat, colorFormat);

// An OMXClient connection is needed before creating the codec.
OMXClient client;
CHECK_EQ(client.connect(), (status_t)OK);

sp<MediaSource> encoder =
    OMXCodec::Create(
        client.interface(), enc_meta, true /* createEncoder */, img_source);

sp<MPEG4Writer> writer = new MPEG4Writer("/sdcard/screenshot.mp4");
writer->addSource(encoder);

// you can add an audio source here if you want to encode audio as well
//
// sp<MediaSource> audioEncoder =
//     OMXCodec::Create(client.interface(), encMetaAudio, true, audioSource);
// writer->addSource(audioEncoder);

writer->setMaxFileDuration(kDurationUs);
CHECK_EQ(OK, writer->start());

while (!writer->reachedEOS()) {
    fprintf(stderr, ".");
    usleep(100000);
}

status_t err = writer->stop();