MediaCodec audio/video muxing issues on Android - android

I am transcoding videos based on the example given by Google (https://android.googlesource.com/platform/cts/+/master/tests/tests/media/src/android/media/cts/ExtractDecodeEditEncodeMuxTest.java)
Basically, transcoding of MP4 files works, but on some phones I get weird results. If, for example, I transcode a video with audio on an HTC One, the code doesn't report any errors, but the resulting file cannot be played back on the phone. With a 10-second video it jumps to almost the last second and you only hear some crackling noise. If you play the video with VLC, the audio track is completely muted.
I did not alter the code in terms of encoding/decoding and the same code gives correct results on a Nexus 5 or MotoX for example.
Does anybody have an idea why it might fail on that specific device?
Best regards and thank you,
Florian

I made it work on Android 4.4.2 devices with the following changes:
Set the AAC profile to AACObjectLC instead of AACObjectHE:
private static final int OUTPUT_AUDIO_AAC_PROFILE = MediaCodecInfo.CodecProfileLevel.AACObjectLC;
When creating the output audio format, use the sample rate and channel count of the input format instead of fixed values:
MediaFormat outputAudioFormat = MediaFormat.createAudioFormat(OUTPUT_AUDIO_MIME_TYPE,
inputFormat.getInteger(MediaFormat.KEY_SAMPLE_RATE),
inputFormat.getInteger(MediaFormat.KEY_CHANNEL_COUNT));
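For context, the two changes above might fit together roughly like this when configuring the audio encoder (a hedged sketch: the MIME type constant and bit-rate value here are assumptions, not taken verbatim from the original test code):

private static final String OUTPUT_AUDIO_MIME_TYPE = "audio/mp4a-latm"; // AAC
private static final int OUTPUT_AUDIO_AAC_PROFILE = MediaCodecInfo.CodecProfileLevel.AACObjectLC;
private static final int OUTPUT_AUDIO_BIT_RATE = 128 * 1024; // assumption: 128 kbps

MediaFormat outputAudioFormat = MediaFormat.createAudioFormat(OUTPUT_AUDIO_MIME_TYPE,
        inputFormat.getInteger(MediaFormat.KEY_SAMPLE_RATE),
        inputFormat.getInteger(MediaFormat.KEY_CHANNEL_COUNT));
outputAudioFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, OUTPUT_AUDIO_AAC_PROFILE);
outputAudioFormat.setInteger(MediaFormat.KEY_BIT_RATE, OUTPUT_AUDIO_BIT_RATE);

MediaCodec audioEncoder = MediaCodec.createEncoderByType(OUTPUT_AUDIO_MIME_TYPE);
audioEncoder.configure(outputAudioFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
audioEncoder.start();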
Put a check just before muxing the audio track to keep the presentation timestamps monotonically increasing (to avoid the "timestampUs X < lastTimestampUs X for Audio track" error):
if (audioPresentationTimeUsLast == 0) { // audioPresentationTimeUsLast is defined at the beginning of the method
    audioPresentationTimeUsLast = audioEncoderOutputBufferInfo.presentationTimeUs;
} else {
    if (audioPresentationTimeUsLast > audioEncoderOutputBufferInfo.presentationTimeUs) {
        // Never let a timestamp go backwards; nudge it just past the last one
        audioEncoderOutputBufferInfo.presentationTimeUs = audioPresentationTimeUsLast + 1;
    }
    audioPresentationTimeUsLast = audioEncoderOutputBufferInfo.presentationTimeUs;
}
// Write the encoded data
if (audioEncoderOutputBufferInfo.size != 0) {
    muxer.writeSampleData(outputAudioTrack, encoderOutputBuffer, audioEncoderOutputBufferInfo);
}
Hope this helps...

If the original CTS tests fail, you need to go to the device vendor and ask for fixes.

Related

Android oboe c++ Some sounds distorted on playback

I'm using the Android oboe library for high performance audio in a music game.
In the assets folder I have 2 .raw files (both 48000 Hz 16-bit PCM and about 60 kB each):
std_kit_sn.raw
std_kit_ht.raw
These are loaded into memory as SoundRecordings and added to a Mixer. kSampleRateHz is 48000:
stdSN = SoundRecording::loadFromAssets(mAssetManager, "std_kit_sn.raw");
stdHT = SoundRecording::loadFromAssets(mAssetManager, "std_kit_ht.raw");
mMixer.addTrack(stdSN);
mMixer.addTrack(stdHT);
// Create a builder
AudioStreamBuilder builder;
builder.setFormat(AudioFormat::I16);
builder.setChannelCount(1);
builder.setSampleRate(kSampleRateHz);
builder.setCallback(this);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
LOGD("After creating a builder");
// Open stream
Result result = builder.openStream(&mAudioStream);
if (result != Result::OK){
LOGE("Failed to open stream. Error: %s", convertToText(result));
}
LOGD("After openstream");
// Reduce stream latency by setting the buffer size to a multiple of the burst size
mAudioStream->setBufferSizeInFrames(mAudioStream->getFramesPerBurst() * 2);
// Start the stream
result = mAudioStream->requestStart();
if (result != Result::OK){
LOGE("Failed to start stream. Error: %s", convertToText(result));
}
LOGD("After starting stream");
They are triggered to play at the required times with standard code (as per the Google tutorials):
stdSN->setPlaying(true);
stdHT->setPlaying(true); //Nasty Sound
The audio callback is standard (as per Google tutorials):
DataCallbackResult SoundFunctions::onAudioReady(AudioStream *mAudioStream, void *audioData, int32_t numFrames) {
// Play the stream
mMixer.renderAudio(static_cast<int16_t*>(audioData), numFrames);
return DataCallbackResult::Continue;
}
std_kit_sn.raw plays fine, but std_kit_ht.raw has a nasty distortion. Both play with low latency. Why does one play fine while the other is badly distorted?
I loaded your sample project and I believe the distortion you hear is caused by clipping/wraparound during mixing of sounds.
The Mixer object from the sample is a summing mixer. It just adds the values of each track together and outputs the sum.
You need to add some code to reduce the volume of each track to avoid exceeding the limits of an int16_t (although you're welcome to file a bug on the oboe project and I'll try to add this in an upcoming version). If you exceed this limit you'll get wraparound which is causing the distortion.
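To illustrate the headroom idea (not Oboe's actual API, just a minimal Java-style sketch; the 0.5 gain value is an arbitrary assumption): scale each track down before summing, then clamp the result to the int16 range instead of letting it wrap:

// Sum two 16-bit tracks with a per-track gain, then clamp to avoid wraparound
static void mixInto(short[] out, short[] trackA, short[] trackB, float gain /* e.g. 0.5f */) {
    for (int i = 0; i < out.length; i++) {
        int mixed = (int) (trackA[i] * gain) + (int) (trackB[i] * gain);
        // Clamp to the int16 range instead of letting the sum wrap around
        if (mixed > Short.MAX_VALUE) mixed = Short.MAX_VALUE;
        if (mixed < Short.MIN_VALUE) mixed = Short.MIN_VALUE;
        out[i] = (short) mixed;
    }
}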
Additionally, your app is hardcoded to run at 22050 frames/sec. This will result in sub-optimal latency across most mobile devices because the stream is forced to upsample to the audio device's native frame rate. A better approach would be to leave the sample rate undefined when opening the stream - this will give you the optimal frame rate for the current audio device - then use a resampler on your source files to supply audio at this frame rate.
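If you go the resampling route, here is a very rough illustration of the idea using naive linear interpolation for a mono track (a real app would normally use a proper resampler library; all names here are assumptions):

// Naive linear-interpolation resampler: converts mono input at inRate to output at outRate
static short[] resample(short[] input, int inRate, int outRate) {
    int outLength = (int) ((long) input.length * outRate / inRate);
    short[] output = new short[outLength];
    for (int i = 0; i < outLength; i++) {
        double srcPos = (double) i * inRate / outRate;
        int idx = (int) srcPos;
        double frac = srcPos - idx;
        short a = input[idx];
        short b = input[Math.min(idx + 1, input.length - 1)];
        output[i] = (short) Math.round(a + frac * (b - a)); // interpolate between neighbouring samples
    }
    return output;
}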

audio latency issues

In the application I want to create, I face some technical obstacles. I have two music tracks in the application. For example, a user imports a music background as the first track. The second track is a voice recorded by the user to the rhythm of the first track, played through the device's speaker (or headphones). At this moment we face latency: after recording and playing back in the app, the user hears a loss of synchronisation between the tracks, caused by the microphone and speaker latencies.
Firstly, I try to detect the delay by filtering the input sound. I use Android's AudioRecord class and its read() method, which fills my short array with audio data.
I found that the initial values of this array are zeros, so I decided to cut them out before writing them into the output stream.
So I consider those zeros a "warm-up" latency of the microphone. Is this approach correct? This operation gives some results, but it doesn't resolve the problem; at this stage I'm still far from a solution.
But the worse case is the delay between starting the speakers and the music actually playing. This delay I cannot filter or detect. I tried to create a calibration feature which measures it: I play a "beep" sound through the speakers, and the moment I start playing it I also begin measuring time. Then I start recording and listen for this sound being picked up by the microphone. When I recognise the sound in the app, I stop measuring. I repeat this process several times, and the final value is the average of those results. That is how I try to measure the latency of the device. Now that I have this value, I can simply shift the second track backwards to achieve synchronisation of both recordings (I will lose some initial milliseconds of the recording, but I'm skipping that case for now; there are ways to fix it).
I thought that this approach would resolve the problem, but it turned out not to be as simple as I thought. I found two issues here:
1. Delay while playing two tracks simultaneously
2. Randomness in the device's audio latency.
The first: I play two tracks using the AudioTrack class and call play() like this:
val firstTrack = //creating a track
val secondTrack = //creating a track
firstTrack.play()
secondTrack.play()
This code causes a delay at the point of starting the tracks. Forget about recording latency for now; I cannot even play two tracks simultaneously without a delay. I tested this with an external audio file (not recorded in my app): I start the same audio file twice using the code above, and I can hear a delay. I also tried it with the MediaPlayer class, with the same results. In that case I even tried starting the tracks from the OnPreparedListener callback:
val firstTrack = //AudioPlayer
val secondTrack = //AudioPlayer
secondTrack.setOnPreparedListener {
    firstTrack.start()
    secondTrack.start()
}
And it doesn’t help.
I know that there is one more class provided by Android called SoundPool. According to the documentation, it can be better at playing tracks simultaneously, but I can't use it because it only supports small audio files, and that is a limitation I can't accept.
How can I resolve this problem? How can I start playing two tracks precisely at the same time?
The second: Audio latency is not deterministic - sometimes it is smaller, and sometimes it’s huge, and it’s out of my hands. So measuring device latency can help but again - it cannot resolve the problem.
To sum up: is there any solution which can give me the exact latency per device (or per app session), or any other trigger that detects the actual delay, so I can provide the best synchronisation when playing back two tracks at the same time?
Thank you in advance!
Synchronising audio for karaoke apps is tough. The main issue you seem to be facing is variable latency in the output stream.
This is almost certainly caused by "warm up" latency: the time it takes from hitting "play" on your backing track to the first frame of audio data being rendered by the audio device (e.g. headphones). This can have large variance and is difficult to measure.
The first (and easiest) thing to try is to use MODE_STREAM when constructing your AudioTrack and prime it with bufferSizeInBytes of data prior to calling play (more here). This should result in lower, more consistent "warm up" latency.
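A minimal sketch of that priming idea, assuming a 48 kHz stereo 16-bit PCM backing track (the sample rate, channel configuration and the helper that supplies the audio data are assumptions):

int sampleRate = 48000; // assumption
int channelConfig = AudioFormat.CHANNEL_OUT_STEREO;
int encoding = AudioFormat.ENCODING_PCM_16BIT;
int bufferSizeInBytes = AudioTrack.getMinBufferSize(sampleRate, channelConfig, encoding);

AudioTrack track = new AudioTrack(AudioManager.STREAM_MUSIC, sampleRate, channelConfig,
        encoding, bufferSizeInBytes, AudioTrack.MODE_STREAM);

// Prime the track: write one buffer's worth of audio before starting playback
byte[] firstChunk = readFirstChunkOfBackingTrack(bufferSizeInBytes); // hypothetical helper
track.write(firstChunk, 0, firstChunk.length);

// Now start playback; the primed data begins playing immediately
track.play();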
A better way is to use the Android NDK to have a continuously running audio stream which is just outputting silence until the moment you hit play, then start sending audio frames immediately. The only latency you have here is the continuous output latency.
If you decide to go down this route I recommend taking a look at the Oboe library (full disclosure: I am one of the authors).
To answer one of your specific questions...
Is there a way to calculate the latency of the audio output stream programatically?
Yes. The easiest way to explain this is with a code sample (this is C++ for the AAudio API but the principle is the same using Java AudioTrack):
// Get the index and time that a known audio frame was presented for playing
int64_t existingFrameIndex;
int64_t existingFramePresentationTime;
AAudioStream_getTimestamp(stream, CLOCK_MONOTONIC, &existingFrameIndex, &existingFramePresentationTime);
// Get the write index for the next audio frame
int64_t writeIndex = AAudioStream_getFramesWritten(stream);
// Calculate the number of frames between our known frame and the write index
int64_t frameIndexDelta = writeIndex - existingFrameIndex;
// Calculate the time which the next frame will be presented
int64_t frameTimeDelta = (frameIndexDelta * NANOS_PER_SECOND) / sampleRate_;
int64_t nextFramePresentationTime = existingFramePresentationTime + frameTimeDelta;
// Assume that the next frame will be written into the stream at the current time
int64_t nextFrameWriteTime = get_time_nanoseconds(CLOCK_MONOTONIC);
// Calculate the latency
*latencyMillis = (double) (nextFramePresentationTime - nextFrameWriteTime) / NANOS_PER_MILLISECOND;
A caveat: This method relies on accurate timestamps being reported by the audio hardware. I know this works on Google Pixel devices but have heard reports that it isn't so accurate on other devices so YMMV.
Following donturner's answer, here's a Java version (that also uses other methods depending on the SDK version):
/** The audio latency has not been estimated yet */
private static long AUDIO_LATENCY_NOT_ESTIMATED = Long.MIN_VALUE+1;
/** The audio latency default value if we cannot estimate it */
private static long DEFAULT_AUDIO_LATENCY = 100L * 1000L * 1000L; // 100ms
/**
* Estimate the audio latency
*
* Not accurate at all, depends on SDK version, etc. But that's the best
* we can do.
*/
private static long estimateAudioLatency(AudioTrack track, long audioFramesWritten) {
    long estimatedAudioLatency = AUDIO_LATENCY_NOT_ESTIMATED;
    // First method. SDK >= 19.
    if (Build.VERSION.SDK_INT >= 19 && track != null) {
        AudioTimestamp audioTimestamp = new AudioTimestamp();
        if (track.getTimestamp(audioTimestamp)) {
            // Calculate the number of frames between our known frame and the write index
            long frameIndexDelta = audioFramesWritten - audioTimestamp.framePosition;
            // Calculate the time at which the next frame will be presented
            long frameTimeDelta = _framesToNanoSeconds(frameIndexDelta);
            long nextFramePresentationTime = audioTimestamp.nanoTime + frameTimeDelta;
            // Assume that the next frame will be written at the current time
            long nextFrameWriteTime = System.nanoTime();
            // Calculate the latency
            estimatedAudioLatency = nextFramePresentationTime - nextFrameWriteTime;
        }
    }
    // Second method. SDK >= 18.
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED && Build.VERSION.SDK_INT >= 18) {
        Method getLatencyMethod;
        try {
            getLatencyMethod = AudioTrack.class.getMethod("getLatency", (Class<?>[]) null);
            estimatedAudioLatency = (Integer) getLatencyMethod.invoke(track, (Object[]) null) * 1000000L;
        } catch (Exception ignored) {}
    }
    // If no method has successfully given us a value, let's try a third method
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        AudioManager audioManager = (AudioManager) CRT.getInstance().getSystemService(Context.AUDIO_SERVICE);
        try {
            Method getOutputLatencyMethod = audioManager.getClass().getMethod("getOutputLatency", int.class);
            estimatedAudioLatency = (Integer) getOutputLatencyMethod.invoke(audioManager, AudioManager.STREAM_MUSIC) * 1000000L;
        } catch (Exception ignored) {}
    }
    // No method gave us a value. Let's use a default value. Better than nothing.
    if (estimatedAudioLatency == AUDIO_LATENCY_NOT_ESTIMATED) {
        estimatedAudioLatency = DEFAULT_AUDIO_LATENCY;
    }
    return estimatedAudioLatency;
}
private static long _framesToNanoSeconds(long frames) {
return frames * 1000000000L / SAMPLE_RATE;
}
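A hedged usage example (the variable names are assumptions; audioFramesWritten would typically come from counting the frames you have passed to AudioTrack.write()):

long latencyNanos = estimateAudioLatency(audioTrack, totalFramesWrittenSoFar); // hypothetical names
long latencyMillis = latencyNanos / 1000000L;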
The Android MediaPlayer class is notoriously slow to begin audio playback. I experienced an issue in an app I was creating where there was a greater-than-one-second delay before an audio clip began playing. I resolved it by switching to ExoPlayer, which resulted in playback starting within 100 ms. I've also read that ffmpeg has an even faster audio startup time than ExoPlayer, but I haven't used it so I can't make any promises.
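For reference, a minimal ExoPlayer setup looks roughly like this (a hedged sketch against a recent ExoPlayer 2.x API; on older versions the builder class is SimpleExoPlayer.Builder, and the dependency line and URI below are assumptions):

// Gradle dependency (assumption): implementation "com.google.android.exoplayer:exoplayer-core:2.x"
ExoPlayer player = new ExoPlayer.Builder(context).build();
player.setMediaItem(MediaItem.fromUri("https://example.com/clip.mp3")); // assumption: remote clip
player.prepare();
player.play();
// ... later, when done:
player.release();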

writing custom codecs for android using FFmpeg

I am doing a video compression project for Android and I am thinking of implementing it by designing a new video codec (from scratch; I have already designed the algorithm). I have already read the basics of video compression, the relevant algorithms, and codec basics. I have also found that FFmpeg may serve as quite a good solution on Android.
Now my questions come:
How do I write a new video codec as in FFmpeg? I am still a beginner at writing codecs, but how do I start? I have a rough idea that you have to write at least a demuxer first and then the specific encoder and decoder, etc. (Asking for references here, please.)
My codec doesn't simply adjust video properties like fps, resolution, bit rate, etc.
Is reading the MediaCodec API and MediaPlayer API in the official Android SDK enough for writing new codecs? (The last time I looked, they only supported MPEG-4 SP, H.263 and H.264, and I was unable to find out whether you can directly write your own classes and functions.)
Thanks.
You can use ffmpeg as a tool or the ffmpeg set of libraries (libavcodec, libavformat, …) on Android. You can add or change ffmpeg codecs in a cross-platform manner, because this project puts a strong emphasis on platform independence. Or you can use the MediaCodec API instead. But there is no way to extend the MediaCodec API (update: it is possible to extend MediaCodec; it is documented at http://source.android.com/devices/media.html#codecs) and no easy way to make ffmpeg use this API.
If you are a newb and "just want to do it in SW", then just do it in SW. I am assuming your algorithm does not need to be real-time and compress video data on the fly; otherwise you would need to use a HW codec.
This is from Android MediaCodec Reference
MediaCodec codec = MediaCodec.createDecoderByType(type);
codec.configure(format, ...);
codec.start();
ByteBuffer[] inputBuffers = codec.getInputBuffers();
ByteBuffer[] outputBuffers = codec.getOutputBuffers();
MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
for (;;) {
    int inputBufferIndex = codec.dequeueInputBuffer(timeoutUs);
    if (inputBufferIndex >= 0) {
        // fill inputBuffers[inputBufferIndex] with valid data
        ...
        codec.queueInputBuffer(inputBufferIndex, ...);
    }
    int outputBufferIndex = codec.dequeueOutputBuffer(bufferInfo, timeoutUs);
    if (outputBufferIndex >= 0) {
        // outputBuffer is ready to be processed or rendered.
        ...
        codec.releaseOutputBuffer(outputBufferIndex, ...);
    } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
        outputBuffers = codec.getOutputBuffers();
    } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        // Subsequent data will conform to new format.
        MediaFormat format = codec.getOutputFormat();
        ...
    }
}
codec.stop();
codec.release();
codec = null;
On the line that reads "// outputBuffer is ready to be processed or rendered" apply your codec.
That is, your first frame will be in outputBuffers[outputBufferIndex]. Save off outputBufferIndex, i.e. outputBufferIndex_old = outputBufferIndex; your next frame will then be in whatever index dequeueOutputBuffer() returns next. But these buffers are recycled in a circular fashion, so copy each frame's bytes out into your own buffer before releasing it...
something like this:
// init
int old = 0;
int buffLen = outputBuffers.length;
... // outputBuffer ready
int len = bufferInfo.size; // size reported by dequeueOutputBuffer()
byte[] processBuffer = new byte[len];
ByteBuffer outputBuffer = outputBuffers[outputBufferIndex % buffLen];
for (int i = 0; i < len; i++) {
    processBuffer[i] = outputBuffer.get(i); // copy this frame's bytes out of the codec's buffer
}
old = outputBufferIndex;
Here is a good example. You may want to look into MediaMetadataRetriever to get information about the input video (height, width, bytes per pixel, etc.) if you want your encoder to be robust to different types of video. Anyway, that should get you started.
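A small hedged sketch of pulling that metadata with MediaMetadataRetriever (the file path is an assumption, and null checks on the returned strings are omitted for brevity):

MediaMetadataRetriever retriever = new MediaMetadataRetriever();
retriever.setDataSource("/sdcard/input.mp4"); // assumption: local file path
int width = Integer.parseInt(
        retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_WIDTH));
int height = Integer.parseInt(
        retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_HEIGHT));
long durationMs = Long.parseLong(
        retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION));
retriever.release();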
I strongly recommend Matlab (or GNU Octave) for prototyping a video codec. It will save you a ton of time: make sure your intended codec algorithm works before trying to implement it on a system that is nearly impossible to debug, like Android.
Hope this helps.
If someone stumbles across this old question the answer is:
Write your Program.
Where you want the "Codec" to go, simply add a 'null Codec' (copy Input to Output) - see the sketch after this list.
Test that your Program still works and that you can read the (so-called) encoded File.
Add your Codec where the 'null Codec' was (call a Function to avoid big edits to a working File).
Re-Test your Program to ensure it still works and read the Output to make sure it is correct.
That is all. ;)
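As a trivial illustration of that 'null Codec' placeholder (a hedged sketch; the method name and the byte-array framing are assumptions about how your program passes frames around):

// A "null codec": the encode step simply copies input to output.
// Swap the body for your real algorithm once the surrounding pipeline works.
static byte[] encodeFrame(byte[] inputFrame) {
    byte[] outputFrame = new byte[inputFrame.length];
    System.arraycopy(inputFrame, 0, outputFrame, 0, inputFrame.length);
    return outputFrame;
}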
Things to consider:
A "Video Player" can drop Frames, a "Video Recorder" had better NOT
drop Frames.
A 'Software Codec' (no Hardware assist) will be slow,
run it on a different Core, if available.
A Hardware Codec (called from Software) will be necessary unless you are just making a
Demo.
Split your Program into pieces that can run separately so it can be threaded and those Threads can be assigned to different Cores. You will need to detect the number of Cores and assess their speed so you can do some of the partitioning dynamically at Runtime.
Use of the NDK and Assembly Language Programming will be necessary to get enough speed to compress a decent sized Video at a wanted frame rate (IE: you do not want your finished Program to only support 320x176 @ 5 FPS Videos). The Compressor MUST run faster than its Input arrives.
Designing your own Codec to beat an existing Codec (x265) will take you years (without help).
If you're a Wiz at Java, C, and ARM Assembly (and a Software Engineer) it will take more than a couple of months of work; so commit or quit. Try to find some Open Source as a base to start from.

Android : multiple audio tracks in a VideoView?

I've got some .MP4 video files that must be read in a VideoView in an Android activity. These videos include several audio tracks, each one corresponding to a user language (e.g. English, French, Japanese...).
I've got unexpected trouble finding any help or documentation to provide such a feature. I'm currently able to load the video and play it in a VideoView with a MediaController, but not to change audio tracks.
I'm not sure the Android SDK provides any easy way to do this, which leaves me quite clueless on how to solve my problem. I was thinking of extracting every audio track, loading the audio that I want into a MediaPlayer depending on the language, then make audio and video play together. But I fear that some sync issues could arise and prevent me from doing this.
If you have any clue, any advice to help me getting started with this problem, you're more than welcome.
No 3rd party library required:
mVideoView.setVideoURI(Uri.parse("")); // set video source
mVideoView.setOnInfoListener(new MediaPlayer.OnInfoListener() {
    @Override
    public boolean onInfo(MediaPlayer mp, int what, int extra) {
        MediaPlayer.TrackInfo[] trackInfoArray = mp.getTrackInfo();
        for (int i = 0; i < trackInfoArray.length; i++) {
            // you can switch out the language comparison logic to whatever works for you
            if (trackInfoArray[i].getTrackType() == MediaPlayer.TrackInfo.MEDIA_TRACK_TYPE_AUDIO
                    && trackInfoArray[i].getLanguage().equals(Locale.getDefault().getISO3Language())) {
                mp.selectTrack(i);
                break;
            }
        }
        return true;
    }
});
As far as I can tell, audio track languages should be encoded as 3-letter ISO 639-2 codes in order to be recognized correctly.
Haven't tested myself yet, but it seems that Vitamio library has support for multiple audio tracks (among other interesting features). It is API-compatible with VideoView class from Android.
Probably you would have to use Vitamio VideoView.setAudioTrack() to set audio track (for example based on locale). See Vitamio API docs for details.
Now you can play multiple audio tracks through ExoPlayer.
Here are the details:
https://exoplayer.dev/track-selection.html (ExoPlayer Track Selection)
The VideoView class can't support your requirement. You must parse the file yourself to get the audio stream data you want and play it with the AudioTrack class in the Java layer.
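A hedged sketch of that parsing approach using MediaExtractor to find an audio track by language (the file path and language code are assumptions; decoding the selected track to PCM before feeding an AudioTrack is still required and is omitted here):

MediaExtractor extractor = new MediaExtractor();
extractor.setDataSource("/sdcard/movie.mp4"); // assumption: local file path

int selectedAudioTrack = -1;
for (int i = 0; i < extractor.getTrackCount(); i++) {
    MediaFormat format = extractor.getTrackFormat(i);
    String mime = format.getString(MediaFormat.KEY_MIME);
    String language = format.getString(MediaFormat.KEY_LANGUAGE); // may be "und" if untagged
    if (mime != null && mime.startsWith("audio/") && "fra".equals(language)) { // assumption: French
        selectedAudioTrack = i;
        break;
    }
}

if (selectedAudioTrack >= 0) {
    extractor.selectTrack(selectedAudioTrack);
    // From here, feed the extractor's samples into a MediaCodec decoder and
    // write the decoded PCM into an AudioTrack (omitted for brevity).
}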

Wrong duration of sound file on android 2.2

I am developing an application in which there is a list of audio files. When I click on an item it plays the corresponding audio and also shows its duration. The problem is that on an Android 2.1 device and emulator the duration is correct, but on the Android 2.2 emulator the wrong duration is shown. Does anyone have an idea how to solve this? Is there a reliable method to get the correct duration of the sound files? The audio files are in the res/raw folder. And one more thing: for the same sounds the iPhone shows the correct duration.
Yes
It is probably due to VBR files. Variable-bitrate files state a rate in the header, which is probably used by the Android software to calculate the duration of the MP3 from its length.
I remember having seen a utility that can calculate a 'correct' effective bitrate and prefix a separate MP3 data frame at the start just to make it report the 'correct' (average) bitrate.
Try VBRFix
Throughout a song there are points that require high quality and points that require low quality (i.e. silence). Instead of having the whole file at one quality, VBR (Variable Bit Rate) provides a variable quality within the file. This allows the file space to be used more efficiently. The problem is that many MP3-playing programs estimate the duration of an MP3 from the first bitrate they find and the file size. Also, when jumping through a file the positions aren't the same: halfway through a VBR MP3 may not be halfway through the song. Ogg Vorbis is a more advanced free music format and uses VBR by default without this problem.
It is also in the repositories for Ubuntu (Debian likely): sudo apt-get install vbrfix
I might be a bit late answering this, but anyway, I was able to fix a similar kind of problem using Ringdroid. This is what you'd have to do to get the audio duration in milliseconds from VBR files using Ringdroid:
public class AudioUtils
{
    public static long getDuration(CheapSoundFile cheapSoundFile)
    {
        if (cheapSoundFile == null)
            return -1;
        int sampleRate = cheapSoundFile.getSampleRate();
        int samplesPerFrame = cheapSoundFile.getSamplesPerFrame();
        int frames = cheapSoundFile.getNumFrames();
        cheapSoundFile = null;
        // Use long arithmetic so the multiplication doesn't overflow an int
        return (1000L * frames * samplesPerFrame) / sampleRate;
    }

    public static long getDuration(String mediaPath)
    {
        if (mediaPath != null && mediaPath.length() > 0)
            try
            {
                return getDuration(CheapSoundFile.create(mediaPath, null));
            } catch (FileNotFoundException e) {
            } catch (IOException e) {
            }
        return -1;
    }
}
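For example, a hedged usage (the file path is an assumption):

long durationMs = AudioUtils.getDuration("/sdcard/Music/song.mp3");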
Hope this helps
