I am doing a video compression project for Android and I am thinking of implementing it by designing a new video codec from scratch (I have already designed the algorithm). I have read up on the basics of video compression, the relevant algorithms, and codec fundamentals. I have also found that FFmpeg may serve as a good solution on Android.
Now my questions come:
How do I write a new video codec, as in FFmpeg? I am still a beginner at writing codecs, so how do I start? I have a rough idea that you have to write at least a demuxer first and then the specific encoder and decoder, etc. (Asking for references here, please.)
Note that my codec doesn't simply adjust video properties like fps, resolution, bit-rate, etc.
Is reading the MediaCodec API and MediaPlayer API in the official Android SDK enough for writing new codecs? (Last time I looked, it only had support for MPEG-4 SP, H.263 and H.264. I was unable to find out whether you could directly write your own classes and functions.)
Thanks.
You can use ffmpeg as a tool or the ffmpeg set of libraries (libavcodec, libavformat, …) on Android. You can add or change ffmpeg codecs in a cross-platform manner, because this project puts a strong emphasis on platform independence. Alternatively, you can use the MediaCodec API. But there is no way to extend the MediaCodec API (update: it is possible to extend MediaCodec; it is documented at http://source.android.com/devices/media.html#codecs ) and no easy way to let ffmpeg use this API.
If you are a newbie and "just want to do it in SW", then just do it in SW. I am assuming your algorithm does not need to run in real time and compress video data on the fly; otherwise you would need to use a HW codec.
This is from the Android MediaCodec reference:
MediaCodec codec = MediaCodec.createDecoderByType(type);
codec.configure(format, ...);
codec.start();
ByteBuffer[] inputBuffers = codec.getInputBuffers();
ByteBuffer[] outputBuffers = codec.getOutputBuffers();
MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
for (;;) {
    int inputBufferIndex = codec.dequeueInputBuffer(timeoutUs);
    if (inputBufferIndex >= 0) {
        // fill inputBuffers[inputBufferIndex] with valid data
        ...
        codec.queueInputBuffer(inputBufferIndex, ...);
    }
    int outputBufferIndex = codec.dequeueOutputBuffer(bufferInfo, timeoutUs);
    if (outputBufferIndex >= 0) {
        // outputBuffer is ready to be processed or rendered.
        ...
        codec.releaseOutputBuffer(outputBufferIndex, ...);
    } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
        outputBuffers = codec.getOutputBuffers();
    } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        // Subsequent data will conform to new format.
        MediaFormat format = codec.getOutputFormat();
        ...
    }
}
codec.stop();
codec.release();
codec = null;
On the line that reads "// outputBuffer is ready to be processed or rendered", apply your codec.
That is, each time dequeueOutputBuffer returns an index >= 0, the decoded frame is in outputBuffers[outputBufferIndex], and the BufferInfo filled in by dequeueOutputBuffer gives you the offset and size of the valid data. Copy that data into your own buffer before calling releaseOutputBuffer, then run your compression step on the copy.
Something like this:
// init
byte[] processBuffer = new byte[0];
... // outputBuffer ready (outputBufferIndex >= 0)
ByteBuffer outputBuffer = outputBuffers[outputBufferIndex];
if (processBuffer.length < bufferInfo.size) {
    processBuffer = new byte[bufferInfo.size]; // grow the scratch buffer as needed
}
outputBuffer.position(bufferInfo.offset);
outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
outputBuffer.get(processBuffer, 0, bufferInfo.size); // copy the decoded frame out
// run your own compression on processBuffer[0 .. bufferInfo.size)
Here is a good example. You may want to look into MediaMetadataRetriever to get information about the input video (height, width, etc., and bytes per pixel) if you want your encoder to be robust to different types of video. Anyway, that should get you started.
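For illustration, a minimal sketch of pulling those properties out with MediaMetadataRetriever (the input path is hypothetical, and extractMetadata can return null for keys the file does not carry):
MediaMetadataRetriever retriever = new MediaMetadataRetriever();
retriever.setDataSource("/sdcard/input.mp4"); // hypothetical input file
int width  = Integer.parseInt(retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_WIDTH));
int height = Integer.parseInt(retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_HEIGHT));
long durationMs = Long.parseLong(retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_DURATION));
retriever.release();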
I strongly recommend MATLAB (or GNU Octave) for prototyping a video codec. It will save you a ton of time: make sure your intended codec algorithm works before trying to implement it on a system that is nearly impossible to debug, like Android.
Hope this helps.
If someone stumbles across this old question, the answer is:
Write your Program.
Where you want the "Codec" to go simply add a 'null Codec' (copy Input to Output; see the sketch after this list).
Test that your Program still works and that you can read the (so-called) encoded File.
Add your Codec where the 'null Codec' was (call a Function to avoid big edits to a working File).
Re-Test your Program to ensure it still works and read the Output to make sure it is correct.
That is all. ;)
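A sketch of what that 'null Codec' stub might look like (the function name is hypothetical; the point is that the caller never has to change when you swap in the real compression step later):
// Hypothetical 'null codec': copies the input frame to the output unchanged.
static byte[] encodeFrame(byte[] inputFrame) {
    return java.util.Arrays.copyOf(inputFrame, inputFrame.length);
}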
Things to consider:
A "Video Player" can drop Frames, a "Video Recorder" had better NOT
drop Frames.
A 'Software Codec' (no Hardware assist) will be slow,
run it on a different Core, if available.
A Hardware Codec (called from Software) will be necessary unless you are just making a
Demo.
Split your Program into pieces that can run separately so it can be threaded and those Threads can be assigned to different Cores. You will need to detect the number of Cores and assess their speed so you can do some of the partitioning dynamically at Runtime.
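For example, a small sketch of sizing a worker pool from the core count at runtime (keeping one core free is just one possible choice, not a rule):
int cores = Runtime.getRuntime().availableProcessors(); // detect the number of cores
java.util.concurrent.ExecutorService pool =
        java.util.concurrent.Executors.newFixedThreadPool(Math.max(1, cores - 1));
// Submit independent pieces of the pipeline (read, compress, write) as separate tasks.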
Use of the NDK and Assembly Language Programming will be necessary to get enough speed to compress a decent sized Video at the wanted frame rate (i.e. you do not want your finished Program to only support 320x176 @ 5 FPS Videos). The Compressor MUST run faster than its Input arrives.
Designing your own Codec to beat an existing Codec (x265) will take you years (without help).
If you're a whiz at Java, C, and ARM Assembly (and a Software Engineer), it will still take more than a couple of months of work; so commit or quit. Try to find some Open Source as a base to start from.
Related
I'm using the Android oboe library for high performance audio in a music game.
In the assets folder I have 2 .raw files (both 48000 Hz, 16-bit PCM, about 60 kB each):
std_kit_sn.raw
std_kit_ht.raw
These are loaded into memory as SoundRecordings and added to a Mixer. kSampleRateHz is 48000:
stdSN= SoundRecording::loadFromAssets(mAssetManager, "std_kit_sn.raw");
stdHT= SoundRecording::loadFromAssets(mAssetManager, "std_kit_ht.raw");
mMixer.addTrack(stdSN);
mMixer.addTrack(stdFT);
// Create a builder
AudioStreamBuilder builder;
builder.setFormat(AudioFormat::I16);
builder.setChannelCount(1);
builder.setSampleRate(kSampleRateHz);
builder.setCallback(this);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);
LOGD("After creating a builder");
// Open stream
Result result = builder.openStream(&mAudioStream);
if (result != Result::OK){
LOGE("Failed to open stream. Error: %s", convertToText(result));
}
LOGD("After openstream");
// Reduce stream latency by setting the buffer size to a multiple of the burst size
mAudioStream->setBufferSizeInFrames(mAudioStream->getFramesPerBurst() * 2);
// Start the stream
result = mAudioStream->requestStart();
if (result != Result::OK){
LOGE("Failed to start stream. Error: %s", convertToText(result));
}
LOGD("After starting stream");
They are triggered to play at the required times with standard code (as per the Google tutorials):
stdSN->setPlaying(true);
stdHT->setPlaying(true); //Nasty Sound
The audio callback is standard (as per Google tutorials):
DataCallbackResult SoundFunctions::onAudioReady(AudioStream *mAudioStream, void *audioData, int32_t numFrames) {
// Play the stream
mMixer.renderAudio(static_cast<int16_t*>(audioData), numFrames);
return DataCallbackResult::Continue;
}
std_kit_sn.raw plays fine, but std_kit_ht.raw has a nasty distortion. Both play with low latency. Why does one play fine while the other is distorted?
I loaded your sample project and I believe the distortion you hear is caused by clipping/wraparound during mixing of sounds.
The Mixer object from the sample is a summing mixer. It just adds the values of each track together and outputs the sum.
You need to add some code to reduce the volume of each track to avoid exceeding the limits of an int16_t (although you're welcome to file a bug on the oboe project and I'll try to add this in an upcoming version). If you exceed this limit you'll get wraparound, which is what causes the distortion.
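To illustrate the wraparound (a standalone Java sketch, not oboe code; the sample values and the 0.5 attenuation are arbitrary):
short a = 30000, b = 20000;              // two loud 16-bit samples
short wrapped = (short) (a + b);         // 50000 wraps to -15536: audible distortion
// Attenuate each track before summing, then clamp, to stay inside the int16 range.
int mixed = (int) (a * 0.5f) + (int) (b * 0.5f);
short safe = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, mixed));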
Additionally, your app is hardcoded to run at 22050 frames/sec. This will result in sub-optimal latency across most mobile devices because the stream is forced to upsample to the audio device's native frame rate. A better approach would be to leave the sample rate undefined when opening the stream - this will give you the optimal frame rate for the current audio device - then use a resampler on your source files to supply audio at this frame rate.
I want to be able to use mp4v-es instead of avc on some devices. The encoder runs fine using avc, but when I replace it with mp4v-es, the muxer reports:
E/MPEG4Writer(12517): Missing codec specific data
as in MediaMuxer error "Failed to stop the muxer", and the video cannot be played. The difference is that I am adding the correct track/format to the muxer, without receiving any error:
...else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
MediaFormat newFormat = encoder.getOutputFormat();
mTrackIndex[encID] = mMuxer.addTrack(newFormat);
Is there any difference in handling mp4v-es compared to avc? One more thing: I just skip the buffer when "bufferInfo.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG" occurs, as for avc it was not needed. Thanks.
Just as Ganesh pointed out, unfortunately it does seem that this isn't possible right now, without modifying the platform source.
There are actually two ways that the codec specific data can be passed to the internal MPEG4Writer class, but neither of them actually works without modifications.
As Ganesh found, the logic for remapping MediaFormat keys to the internal format seems to be missing handling of codec specific data for any other video codec than H264. A tested modification that fixes this issue is as follows:
diff --git a/media/libstagefright/Utils.cpp b/media/libstagefright/Utils.cpp
index 25afc5b..304fe59 100644
--- a/media/libstagefright/Utils.cpp
+++ b/media/libstagefright/Utils.cpp
@@ -549,14 +549,14 @@ void convertMessageToMetaData(const sp<AMessage> &msg, sp<MetaData> &meta) {
// reassemble the csd data into its original form
sp<ABuffer> csd0;
if (msg->findBuffer("csd-0", &csd0)) {
- if (mime.startsWith("video/")) { // do we need to be stricter than this?
+ if (mime == MEDIA_MIMETYPE_VIDEO_AVC) {
sp<ABuffer> csd1;
if (msg->findBuffer("csd-1", &csd1)) {
char avcc[1024]; // that oughta be enough, right?
size_t outsize = reassembleAVCC(csd0, csd1, avcc);
meta->setData(kKeyAVCC, kKeyAVCC, avcc, outsize);
}
- } else if (mime.startsWith("audio/")) {
+ } else if (mime == MEDIA_MIMETYPE_AUDIO_AAC || mime == MEDIA_MIMETYPE_VIDEO_MPEG4) {
int csd0size = csd0->size();
char esds[csd0size + 31];
reassembleESDS(csd0, esds);
Secondly, instead of passing the codec specific data as csd-0 in MediaFormat, you could in principle pass the same buffer (with the MediaCodec.BUFFER_FLAG_CODEC_CONFIG flag set) to MediaMuxer.writeSampleData. This approach doesn't work currently since this method doesn't check for the codec config flag at all - it could be fixed with this modification:
diff --git a/media/libstagefright/MediaMuxer.cpp b/media/libstagefright/MediaMuxer.cpp
index c7c6f34..d612e01 100644
--- a/media/libstagefright/MediaMuxer.cpp
+++ b/media/libstagefright/MediaMuxer.cpp
@@ -193,6 +193,9 @@ status_t MediaMuxer::writeSampleData(const sp<ABuffer> &buffer, size_t trackInde
if (flags & MediaCodec::BUFFER_FLAG_SYNCFRAME) {
sampleMetaData->setInt32(kKeyIsSyncFrame, true);
}
+ if (flags & MediaCodec::BUFFER_FLAG_CODECCONFIG) {
+ sampleMetaData->setInt32(kKeyIsCodecConfig, true);
+ }
sp<MediaAdapter> currentTrack = mTrackList[trackIndex];
// This pushBuffer will wait until the mediaBuffer is consumed.
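For reference, if the MediaMuxer change above were applied, the app-side call described earlier would hypothetically look like this (csdBuffer and videoTrackIndex are placeholders for the codec-config ByteBuffer returned by the encoder and the track you added):
MediaCodec.BufferInfo csdInfo = new MediaCodec.BufferInfo();
csdInfo.set(0, csdBuffer.remaining(), 0, MediaCodec.BUFFER_FLAG_CODEC_CONFIG);
muxer.writeSampleData(videoTrackIndex, csdBuffer, csdInfo); // only works with the fix above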
As far as I can see, there's no way to mux MPEG4 video with MediaMuxer right now while using the public API, without modifying the platform source. Given the issues in Utils.cpp above, you can't mux any video format that requires codec specific data, except for H264. If VP8 is an option, you can mux that into webm files (together with vorbis audio), but hardware encoders for VP8 are probably much less common than hardware encoders for MPEG4.
I presume you have the ability to modify the Stagefright sources and hence, I have a proposed solution for your problem, but one which requires a customization.
Background:
When an encoder completes encoding, the first buffer will have the csd information which is usually tagged with OMX_BUFFERFLAG_CODECCONFIG flag. When such a buffer is returned to the MediaCodec, it shall store the same as csd-0 in MediaCodec::amendOutputFormatWithCodecSpecificData.
Now, when this buffer is given to MediaMuxer, the same is processed as part of addTrack, in which convertMessageToMetaData is invoked. If you refer to the implementation of the same, we can observe that only AVC is handled for video, and it defaults to audio for ESDS creation.
EDIT:
Here, my recommendation is to modify this line as below and try your experiment
}
if (mime.startsWith("audio/") || !strcmp(mime.c_str(), MEDIA_MIMETYPE_VIDEO_MPEG4)) {
With this change, I feel it should work for the MPEG4 video track also. The change converts the else if into an if, since the previous video check will also try to process the data, but only for AVC.
I am transcoding videos based on the example given by Google (https://android.googlesource.com/platform/cts/+/master/tests/tests/media/src/android/media/cts/ExtractDecodeEditEncodeMuxTest.java)
Basically, transcoding of MP4 files works, but on some phones I get some weird results. If, for example, I transcode a video with audio on an HTC One, the code won't give any errors but the file cannot be played afterward on the phone. If I have a 10-second video, it jumps to almost the last second and you only hear some crackling noise. If you play the video with VLC, the audio track is completely muted.
I did not alter the code in terms of encoding/decoding and the same code gives correct results on a Nexus 5 or MotoX for example.
Anybody having an idea why it might fail on that specific device?
Best regards and thank you,
Florian
I made it work on Android 4.4.2 devices with the following changes:
Set AAC profile to AACObjectLC instead of AACObjectHE
private static final int OUTPUT_AUDIO_AAC_PROFILE = MediaCodecInfo.CodecProfileLevel.AACObjectLC;
When creating the output audio format, use the sample rate and channel count of the input format instead of fixed values:
MediaFormat outputAudioFormat = MediaFormat.createAudioFormat(OUTPUT_AUDIO_MIME_TYPE,
inputFormat.getInteger(MediaFormat.KEY_SAMPLE_RATE),
inputFormat.getInteger(MediaFormat.KEY_CHANNEL_COUNT));
Put a check just before muxing the audio track to keep the presentation timestamps monotonic (to avoid the "timestampUs X < lastTimestampUs X for Audio track" error):
if (audioPresentationTimeUsLast == 0) { // Defined at the beginning of the method
audioPresentationTimeUsLast = audioEncoderOutputBufferInfo.presentationTimeUs;
} else {
if (audioPresentationTimeUsLast > audioEncoderOutputBufferInfo.presentationTimeUs) {
audioEncoderOutputBufferInfo.presentationTimeUs = audioPresentationTimeUsLast + 1;
}
audioPresentationTimeUsLast = audioEncoderOutputBufferInfo.presentationTimeUs;
}
// Write data
if (audioEncoderOutputBufferInfo.size != 0) {
muxer.writeSampleData(outputAudioTrack, encoderOutputBuffer, audioEncoderOutputBufferInfo);
}
Hope this helps...
If the original CTS tests fail, you need to go to the device vendor and ask for fixes.
In my Android application, I am encoding some media in webm (vp8) format using MediaCodec. The encoding is working as expected. However, I need to ensure that I create a sync frame once in a while. Here is what I do:
encoder.queueInputBuffer(..., MediaCodec.BUFFER_FLAG_SYNC_FRAME);
Later in the code, I check for sync frame:
encoder.dequeueOutputBuffer(bufferInfo, 0);
boolean isSyncFrame = (bufferInfo.flags & MediaCodec.BUFFER_FLAG_SYNC_FRAME) != 0;
The problem is that isSyncFrame never gets a true value.
I am wondering if I am making a mistake in my encoding configuration. Maybe there is a better way to tell the encoder to create a sync frame once in a while.
I hope it is not a bug in MediaCodec. Thank you in advance for your help.
There is no way (current as of Android 4.3) to request an on-demand sync frame using MediaCodec encoders. This is partly due to OMX, the underlying codec implementation in Android, which does not provide a way to specify which input frame should be encoded as a sync frame, although it does have a way to trigger a sync frame "in the near future".
feisal's answer is the only currently supported way to control sync frames, but you have to do it at configuration time.
Edit (re: jesup):
You can trigger a sync frame in the near future using MediaCodec.setParameters (API 19+):
Bundle params = new Bundle();
params.putInt(MediaCodec.PARAMETER_KEY_REQUEST_SYNC_FRAME, 0);
mCodec.setParameters(params);
Unfortunately, there is no (reliable) way to tell in MediaCodec whether an encoded buffer is a sync frame, other than doing it on your own by inspecting the encoded bytes.
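For VP8 specifically, that inspection is simple: per RFC 6386, the lowest bit of the first byte of the frame tag is 0 for key frames. A sketch, assuming the buffer starts at the frame tag:
// Bit 0 of the first byte of a VP8 frame is 0 for key frames, 1 for interframes.
static boolean isVp8KeyFrame(ByteBuffer encoded, MediaCodec.BufferInfo info) {
    return (encoded.get(info.offset) & 0x01) == 0;
}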
You can set the rate of I-frames in the MediaFormat object of your encoder with setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, secs_between_iframes).
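A minimal sketch of what that configuration could look like for a VP8 encoder (resolution, bit rate and color format are arbitrary examples; the color format must be one your encoder actually supports, and encoder is assumed to be your MediaCodec instance):
MediaFormat format = MediaFormat.createVideoFormat("video/x-vnd.on2.vp8", 640, 480);
format.setInteger(MediaFormat.KEY_BIT_RATE, 1000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420SemiPlanar);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1); // request a sync frame every second
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);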
I am trying to display video buffers on an android. I am using the media codec API released in Android 4.1 Jelly Bean.
The sample goes like this:
MediaCodec codec = MediaCodec.createDecoderByType(type);
codec.configure(format, ...);
The configure method accepts 3 other arguments apart from MediaFormat. I have been able to figure out MediaFormat somehow, but I am not sure about the other 3 parameters (below):
Surface, MediaCrypto and flags.
Any leads?
Also, what should I do with the MediaCrypto argument if I am not encrypting my video buffers?
Requirements:
1) Decode the buffers on the Android device,
2) Display them on the screen.
You can see the article here:
http://dpsm.wordpress.com/2012/07/28/android-mediacodec-decoded/
Just for completeness:
To decode -
Surface is the surface to render the frames to (or null if not rendering)
MediaCrypto should be null if there is no encryption
flags should be 0 for decoding, or MediaCodec.CONFIGURE_FLAG_ENCODE for encoding (a short configure() sketch follows below)
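Putting that together, a minimal decode-only sketch (format would typically come from MediaExtractor and surface from your UI; both are assumed here):
MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
// Pass a Surface to render directly, or null to receive decoded frames in the output buffers.
codec.configure(format, surface, /* crypto: no DRM */ null, /* flags: 0 = decode */ 0);
codec.start();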