Does RTMP support the Display Orientation SEI Message in h264 streams? - android

I'm streaming H.264 video and AAC audio over RTMP on Android using the native MediaCodec APIs. Video and audio look great; however, while the video is shot in portrait mode, playback on the web or in VLC is always in landscape.
Having read through the H.264 spec, I see that this sort of extra metadata can be specified in a Supplemental Enhancement Information (SEI) message, so I've gone about adding it to the raw H.264 bit stream. My SEI NAL unit follows this rudimentary format (I plan to optimize later):
val displayOrientationSEI = {
    val prefix = byteArrayOf(0, 0, 0, 1)
    val nalHeader = byteArrayOf(6) // forbidden_zero_bit:0; nal_ref_idc:0; nal_unit_type:6
    val display = byteArrayOf(47 /* Display orientation type */, 3 /* payload size */)
    val displayOrientationCancelFlag = "0" // u(1); Rotation information follows
    val horFlip = "1" // hor_flip; u(1); Flip horizontally
    val verFlip = "1" // ver_flip; u(1); Flip vertically
    val anticlockwiseRotation = "0100000000000000" // u(16); value / 2^16 -> 90 degrees
    val displayOrientationRepetitionPeriod = "010" // ue(v); Persistent till next video sequence
    val displayOrientationExtensionFlag = "0" // u(1); No other value is permitted by the spec atm
    val byteAlignment = "1"
    val bitString = displayOrientationCancelFlag +
            horFlip +
            verFlip +
            anticlockwiseRotation +
            displayOrientationRepetitionPeriod +
            displayOrientationExtensionFlag +
            byteAlignment
    prefix + nalHeader + display + BigInteger(bitString, 2).toByteArray()
}()
Using JCodec's SEI class, I can see that my SEI message is parsed properly. I write these packets out to the RTMP stream using an Android JNI wrapper for LibRtmp.
Despite this, ffprobe does not show the orientation metadata, and the video when played remains in landscape.
At this point I think I'm missing a very small detail about how FLV headers work when the raw H.264 units are written out by LibRtmp. I have tried appending this displayOrientationSEI NAL unit:
To the initial SPS and PPS configuration only.
To each raw h264 NAL units straight from the encoder.
To both.
What am I doing wrong? Going through the source of some RTMP libraries, like rtmp-rtsp-stream-client-java, it seems the SEI message is dropped when creating FLV tags.
Help is much, much appreciated.

Does RTMP support the Display Orientation SEI Message in h264 streams?
RTMP is unaware of the very concept. From RTMP's perspective, the SEI is just a series of bytes that it copies. It never looks at them; it never parses them.
The things that need to support it are the H.264 decoder (which RTMP is also unaware of) and the player software. If it is not working for you, you must check the player, or the validity of the encoded SEI, not the transport.
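As a quick sanity check on that last point, one could scan the exact bytes handed to LibRtmp for the display-orientation SEI just before they leave the app. A minimal Java sketch, assuming an Annex B byte stream with 4-byte start codes and a single-byte SEI payload type:
// Returns true if the buffer contains an SEI NAL (nal_unit_type 6) whose first
// payload type byte is 47 (display orientation).
static boolean containsDisplayOrientationSei(byte[] data) {
    for (int i = 0; i + 5 < data.length; i++) {
        boolean startCode = data[i] == 0 && data[i + 1] == 0
                && data[i + 2] == 0 && data[i + 3] == 1;
        if (!startCode) continue;
        int nalUnitType = data[i + 4] & 0x1F;    // lower 5 bits of the NAL header
        int seiPayloadType = data[i + 5] & 0xFF; // first SEI payload byte
        if (nalUnitType == 6 && seiPayloadType == 47) {
            return true;
        }
    }
    return false;
}
If the NAL is still present at that point but ffprobe never reports it, the next place to look is the FLV/RTMP packaging step, which the question already suspects of dropping the SEI.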

Related

An AAC audio stream is playable in VLC for Android, but not in Exoplayer

I have an RTMP stream I want to play in my app using the Exoplayer library. My setup for that is as follows:
TrackSelector trackSelector = new DefaultTrackSelector();
RtmpDataSourceFactory rtmpDataSourceFactory = new RtmpDataSourceFactory(bandwidthMeter);
ExtractorsFactory extractorsFactory = new DefaultExtractorsFactory();
factory = new ExtractorMediaSource.Factory(rtmpDataSourceFactory);
factory.setExtractorsFactory(extractorsFactory);
createSource();

mPlayer = ExoPlayerFactory.newSimpleInstance(mActivity, trackSelector, new DefaultLoadControl(
        new DefaultAllocator(true, C.DEFAULT_BUFFER_SEGMENT_SIZE),
        1000, // min buffer
        3000, // max buffer
        1000, // playback
        2000, // playback after rebuffer
        DefaultLoadControl.DEFAULT_TARGET_BUFFER_BYTES,
        true
));

vwExoPlayer.setPlayer(mPlayer);
mPlayer.addListener(mVideoStreamHandler);
mPlayer.addVideoListener(new VideoListener() {
    @Override
    public void onVideoSizeChanged(int width, int height, int unappliedRotationDegrees, float pixelWidthHeightRatio) {
        Log.d("hasil", "onVideoSizeChanged: w:" + width + ", h:" + height);
        String res = width + "x" + height;
        resolution.setText(res);
    }

    @Override
    public void onRenderedFirstFrame() {
    }
});
Where createSource() is as follows:
private void createSource() {
    mMediaSource180 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_180));
    mMediaSource360 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_360));
    mMediaSource720 = factory.createMediaSource(Uri.parse(API.GAME_VIDEO_STREAM_URL_720));
    mMediaSourceAudio = factory.createMediaSource(Uri.parse(API.GAME_AUDIO_STREAM_URL));
}
My current problem is that only the first three ExtractorMediaSources (the video ones) work fine in ExoPlayer. mMediaSourceAudio refuses to play in ExoPlayer, but works just fine in VLC Media Player for Android.
Right now I suspect that the format is AAC-LTP, or some other AAC variant that requires a codec available in VLC but not in stock Android. However, I do not have access to the encoding process, so I don't know for sure.
If this isn't the case, what is it?
EDIT:
I've been debugging the BandwidthMeter and added a MediaSourceEventListener. When I use the normal video sources, onDownstreamFormatChanged() gets called, but not when I use the audio stream source.
In addition, the BandwidthMeter works fine, with bytes always being downloaded in all parts of the stream and more bytes arriving when the video stream comes in. With the audio-only stream, however, mPlayer.getBufferedPosition() always returns 0. Also, when I use the audio stream source, no OMX code is called - no decoders are set up.
Am I dealing with a malformed audio stream, or do I need to change my ExoPlayer settings?
EDIT 2:
Further debugging reveals that the same FlvExtractor is used for all the video streams and for the audio stream, even though the video streams have an avc video track and an mp4a-latm audio track. Is this normal?
It turns out the stream was recognized as having two tracks/SampleQueues: one audio track, and one track with a null format. That null track was supposed to be the video track, which was expected to exist according to the stream's flvHeader flag.
For now, I get around this by creating a custom MediaSource with a custom MediaPeriod. The custom MediaPeriod separates the video and audio tracks into their own SampleQueues, then uses the audio-only SampleQueue[] instead of the source SampleQueue[] when I want to play the audio-only stream.
This does give me another point of concern, though: there is something one can do to alter the 'has audio track (flag & 0x04) and has video track (flag & 0x01)' flag in the RTMP stream, right?
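For reference, that flag is the fifth byte of the 9-byte FLV file header ("FLV" signature, version, flags, data offset). A minimal Java sketch of reading it (the helper name is hypothetical, not part of ExoPlayer):
// Hypothetical helper: inspects the flags byte of a 9-byte FLV file header.
static void inspectFlvHeaderFlags(byte[] flvHeader) {
    int flags = flvHeader[4] & 0xFF;          // 5th byte of the header
    boolean hasAudio = (flags & 0x04) != 0;   // audio track present
    boolean hasVideo = (flags & 0x01) != 0;   // video track present
    System.out.println("hasAudio=" + hasAudio + " hasVideo=" + hasVideo);
}
So whichever component produces the stream would have to set that byte correctly for ExoPlayer's FlvExtractor to trust it.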
Thanks for the comments; I'm new to ExoPlayer, but they helped me debug the issue and find multiple workarounds.
I tried using a custom MediaSource and a custom MediaPeriod to address this audio issue. I observed the video format data arriving after the audio data in the case of a video+audio Wowza stream, so the function maybeFinishPrepare() will wait until both the video and audio format tag data have been received before invoking onPrepared(), in case the video tag data is received first. If the audio data is received first, it won't wait and will call onPrepared() right away.
With the above changes, I was able to play both audio-only and video+audio Wowza streams, where the RTMP tag headers arrived with the video tag data first, followed by the audio data.
I wasn't able to use the same patch with an SRS server to play both audio-only and video+audio streams, because the SRS server delivers the tag data in the order audio first, then video.
So I debugged further in FlvExtractor. In readFlvHeader, I overrode the hasAudio and hasVideo variables so that they are set based on the first few tag headers (5 or 6). I called peekFully on the input 6 times in a loop; in each iteration, after fetching the tagType and tagDataSize, the tagDataSize is passed to input.advancePeekPosition() and the tagType is used to identify whether the tag data carries audio or video format data. After peeking at the first 6 consecutive tag headers, I had the actual values of hasAudio and hasVideo and could ignore the flvHeader.flags that were originally used to set these variables.
The custom FlvExtractor workaround looked cleaner than the custom MediaSource/MediaPeriod approach, since only as many tracks as necessary are created once proper hasVideo/hasAudio values are set.
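A rough sketch of that idea (not the actual patch; it assumes the peek position is at the first tag header, i.e. after the 9-byte FLV header and the first 4-byte PreviousTagSize field, and uses ExoPlayer's ExtractorInput peek API):
// Peek the first few FLV tag headers and derive hasAudio/hasVideo from the tag
// types (8 = audio, 9 = video) instead of trusting flvHeader.flags.
private static final int TAG_TYPE_AUDIO = 8;
private static final int TAG_TYPE_VIDEO = 9;

private void sniffTrackTypes(ExtractorInput input) throws IOException, InterruptedException {
    byte[] tagHeader = new byte[11]; // tagType(1) + dataSize(3) + timestamp(3+1) + streamId(3)
    boolean hasAudio = false;
    boolean hasVideo = false;
    for (int i = 0; i < 6; i++) {
        input.peekFully(tagHeader, 0, tagHeader.length);
        int tagType = tagHeader[0] & 0x1F;
        int dataSize = ((tagHeader[1] & 0xFF) << 16)
                | ((tagHeader[2] & 0xFF) << 8)
                | (tagHeader[3] & 0xFF);
        if (tagType == TAG_TYPE_AUDIO) hasAudio = true;
        if (tagType == TAG_TYPE_VIDEO) hasVideo = true;
        // Skip the tag body plus the trailing 4-byte PreviousTagSize field.
        input.advancePeekPosition(dataSize + 4);
    }
    input.resetPeekPosition();
    // ... use hasAudio/hasVideo here instead of flvHeader.flags when creating the tracks
}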

MediaCodec - intentionally adding silence to the audio track

I've managed to combine multiple videos with audio tracks, but then I realized that if I combine multiple videos with one of them not having an audio track, I have to add silence to the combined audio track.
So, how do I go about doing it? Should I encode a ByteBuffer filled with 0s with timestamps for silence?
So, how do I go about doing it? Should I encode a ByteBuffer filled with 0s with timestamps for silence?
Essentially yes. I am using the function below to encode silence at a certain presentation time.
For the length of your video with no audio, you should be encoding silence at a regular interval. I determined that the interval should match the audio before it. So in my case, the period between audio presentation times of my first video was 21333 us.
Using that info I started encoding silence:
from the last presentation time of the first video's audio + 21333,
at intervals of 21333 until I encoded enough silence to last the full video
I am still trying to figure out how to use a video with no audio (as the first video) followed by a video with audio. I will update my answer if I figure it out.
private byte[] zerodArray = new byte[2048]; // Used to encode silent audio... Not really sure how big this should be ......

private void encodeSilenceForFrame(long presentationTime) {
    // mAudioEncoder is the audio encoder you are using to combine the other videos' audio.
    final int TIMEOUT_USEC = 10000;
    int encoderInputBufferIndex = mAudioEncoder.dequeueInputBuffer(TIMEOUT_USEC);
    if (encoderInputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
        if (VERBOSE) Log.d(TAG, "no audio encoder input buffer");
        return; // no input buffer available right now, so bail out
    }
    if (VERBOSE) {
        Log.d(TAG, "audio encoder: returned input buffer: " + encoderInputBufferIndex);
    }
    ByteBuffer encoderInputBuffer = mAudioEncoder.getInputBuffer(encoderInputBufferIndex);
    encoderInputBuffer.position(0);
    encoderInputBuffer.put(zerodArray);
    Log.d(TAG, "audio silence: pending buffer for time " + presentationTime);
    mAudioEncoder.queueInputBuffer(
            encoderInputBufferIndex,
            0,
            zerodArray.length,
            presentationTime, 0);
}
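Building on the function above, a usage sketch for the silent gap itself could look like the following; silenceStartUs and silenceEndUs are hypothetical bounds of the soundless segment, and 21333 µs matches the interval mentioned earlier (it corresponds to 1024 samples per AAC frame at 48 kHz):
long frameIntervalUs = 21333L; // one AAC frame of 1024 samples at 48 kHz
for (long t = silenceStartUs; t < silenceEndUs; t += frameIntervalUs) {
    encodeSilenceForFrame(t);
    // remember to drain mAudioEncoder's output buffers here and feed them to your muxer
}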

use ffmpeg to parse info about presentation time in h264 stream encoded by MediaCodec

I have seen the example below for encoding/decoding using the MediaCodec API.
https://android.googlesource.com/platform/cts/+/jb-mr2-release/tests/tests/media/src/android/media/cts/EncodeDecodeTest.java
In it, there is a comparison of the guessed presentation time and the presentation time received from the decoded info.
assertEquals("Wrong time stamp", computePresentationTime(checkIndex),
info.presentationTimeUs);
Because the decoder just decodes the data in the encoded buffer, I think there is timestamp info that could be parsed from this encoder's output H.264 stream.
I am writing an Android application which muxes an H.264 stream (.h264) encoded by MediaCodec into an MP4 container using ffmpeg (libavformat).
I don't want to use MediaMuxer because it requires Android 4.3 (API 18), which is too high.
However, ffmpeg does not seem to recognize the presentation timestamp in a packet encoded by MediaCodec, so I always get the NO_PTS value when trying to read a frame from the stream.
Anyone know how to get the correct presentation timestamp in this situation?
To send timestamps from the MediaCodec encoder to ffmpeg, you need to convert them like this:
jint Java_com_classclass_WriteVideoFrame(JNIEnv * env, jobject this, jbyteArray data, jint datasize, jlong timestamp) {
    ....
    AVPacket pkt;
    av_init_packet(&pkt);
    AVCodecContext *c = m_pVideoStream->codec;
    pkt.pts = (long)((double)timestamp * (double)c->time_base.den / 1000.0);
    pkt.stream_index = m_pVideoStream->index;
    pkt.data = rawjBytes;
    pkt.size = datasize;
where time_base depends on framerate
Update, regarding how timestamps flow through the pipeline:
Neither the decoder nor the encoder knows timestamps on its own. Timestamps are passed to these components via
decoder.queueInputBuffer(inputBufIndex, 0, info.size, info.presentationTimeUs, info.flags);
or
encoder.queueInputBuffer(inputBufIndex, 0, 0, ptsUsec, info.flags);
These timestamps could be taken from the extractor, from the camera, or generated by the app, but the decoder/encoder just passes them through without changing them. As a result, timestamps go unchanged from source to sink (the muxer).
There are, of course, some exceptions: if the frame frequency is changed (frame rate conversion, for example), or if the encoder uses B-frames and reordering happens.
The encoder can also add timestamps to the encoded frame header - this is optional and not mandated by the standard. I think none of this applies to the current Android version, codecs, or your usage scenario.
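To make the hand-off concrete, here is a rough Java-side sketch, assuming the JNI function above is exposed to Java as writeVideoFrame(byte[] data, int size, long timestampMs) (a hypothetical name): the presentationTimeUs reported in the encoder's BufferInfo is what gets forwarded, converted to the milliseconds the native code expects.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int outIndex = encoder.dequeueOutputBuffer(info, 10000 /* timeout in us */);
if (outIndex >= 0) {
    // pre-API-21 buffer access, matching the pre-4.3 constraint in the question
    ByteBuffer outBuf = encoder.getOutputBuffers()[outIndex];
    byte[] frame = new byte[info.size];
    outBuf.position(info.offset);
    outBuf.get(frame);
    // forward the encoder's presentation time (us) as the ms value the C code above scales
    writeVideoFrame(frame, frame.length, info.presentationTimeUs / 1000);
    encoder.releaseOutputBuffer(outIndex, false);
}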

Phonegap Media record mp3 file corrupt

I am using PhoneGap Media to record audio as MP3. After recording, it plays fine on my Android device and in Windows Media Player. However, when I try it in the browser, it says that the file is corrupt.
Exact errors:
Chrome: "We cannot play this audio file right now."
Firefox: "Video can't be played because the file is corrupt."
IE: Opens the file in WMP and it plays.
I used the code from the example. http://docs.phonegap.com/en/2.6.0/cordova_media_media.md.html#media.startRecord
// Record audio
//
function recordAudio() {
    var src = "myrecording.mp3";
    var mediaRec = new Media(src,
        // success callback
        function() {
            console.log("recordAudio():Audio Success");
        },
        // error callback
        function(err) {
            console.log("recordAudio():Audio Error: " + err.code);
        });

    // Record audio
    mediaRec.startRecord();
}
Thanks in advance.
Edit:
Here is an example. http://blrbrdev.azurewebsites.net/voice/blrbr_130419951008830874.mp3
This plays in WMP but not the browser.
The file you provided as an mp3 does not appear to be an mp3 file.
I have attached the details of the file below. As you can see, it is AMR audio packed in an MPEG-4/3GPP container. I would say no current browser can decode that natively (though software like VLC can play it back).
If you are attempting to play an audio file in a browser - say, via an HTML5 audio element - you need to provide a compatible format. Have a look here for a compatibility table.
This is expected behavior as stated here:
Android devices record audio in Adaptive Multi-Rate format. The specified file should end with a .amr extension.
If you want to play it in a browser/HTML5 audio tag, you would need to post-process the file to convert it to a valid MP3 file (and add an Ogg version for full browser coverage). Server side, this can be done with a program called ffmpeg, for example. I am no specialist in PhoneGap development, so I cannot point you to a valid library that does this client side, but maybe this has already been asked on SO.
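Purely to illustrate the server-side route, a minimal Java sketch that shells out to ffmpeg could look like this (it assumes ffmpeg is installed and on the PATH; the class and file names are placeholders):
import java.io.File;
import java.io.IOException;

class AmrToMp3Converter {
    // Runs: ffmpeg -y -i input.amr output.mp3
    static int convert(File amr, File mp3) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "ffmpeg", "-y", "-i", amr.getAbsolutePath(), mp3.getAbsolutePath());
        pb.redirectErrorStream(true); // merge stderr into stdout for simpler logging
        Process process = pb.start();
        return process.waitFor();     // exit code 0 means the conversion succeeded
    }
}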
File specs:
General
Complete name : C:\wamp\www\stack\sample\thisTest.mp3
Format : MPEG-4
Format profile : 3GPP Media Release 4
Codec ID : 3gp4
File size : 10.5 KiB
Duration : 4s 780ms
Overall bit rate mode : Constant
Overall bit rate : 18.0 Kbps
Performer : LGE
Encoded date : UTC 2014-04-15 00:24:57
Tagged date : UTC 2014-04-15 00:24:57
Audio
ID : 1
Format : AMR
Format/Info : Adaptive Multi-Rate
Format profile : Narrow band
Codec ID : samr
Duration : 4s 780ms
Bit rate mode : Constant
Bit rate : 12.8 Kbps
Channel(s) : 1 channel
Sampling rate : 8 000 Hz
Bit depth : 13 bits
Stream size : 7.47 KiB (71%)
Title : SoundHandle
Writing library :
Language : English
Encoded date : UTC 2014-04-15 00:24:57
Tagged date : UTC 2014-04-15 00:24:57
First, thanks to @Forestan06 for pointing me in the right direction.
For those of you recording on Android devices in .amr format and needing said recording in .mp3 format on your server using .Net C#, this is how I did it.
Install-Package MediaToolkit --> http://www.nuget.org/packages/MediaToolkit/
Write this code:
// Requires: using MediaToolkit; using MediaToolkit.Model;
var fileName = "myVoice.mp3";
string fileNameWithPath = Server.MapPath("~/Voice/" + fileName);
if (request.FileName.EndsWith(".amr"))
{
    var amrFileName = "myVoice.amr";
    string amrFileNameWithPath = Server.MapPath("~/Voice/Amr/" + amrFileName);
    request.SaveAs(amrFileNameWithPath);

    var inputFile = new MediaFile { Filename = amrFileNameWithPath };
    var outputFile = new MediaFile { Filename = fileNameWithPath };
    using (var engine = new Engine())
    {
        engine.Convert(inputFile, outputFile);
    }
}
else
{
    request.SaveAs(fileNameWithPath);
}

Android - Include native StageFright features in my own project

I am currently developing an application that needs to record audio, encode it as AAC, stream it, and do the same in reverse - receiving stream, decoding AAC and playing audio.
I successfully recorded AAC (wrapped in an MP4 container) using MediaRecorder, and successfully up-streamed audio using the AudioRecord class. But I need to be able to encode the audio as I stream it, and neither of those classes seems to help me do that.
I researched a bit, and found that most people that have this problem end up using a native library like ffmpeg.
But I was wondering: since Android already includes Stagefright, which has native code that can do encoding and decoding (for example, AAC encoding and AAC decoding), is there a way to use this native code in my application? How can I do that?
It would be great if I only needed to implement some JNI classes with their native code. Plus, since it is an Android library, there would be no licensing problems whatsoever (correct me if I'm wrong).
Yes, you can use libstagefright; it's very powerful.
Since stagefright is not exposed through the NDK, you will have to do some extra work.
There are two ways:
(1) Build your project inside the full Android source tree. This takes a few days to set up; once ready, it's very easy, and you can take full advantage of stagefright.
(2) You can just copy the include files into your project; they are inside this folder:
android-4.0.4_r1.1/frameworks/base/include/media/stagefright
Then you will have to export the library functions by dynamically loading libstagefright.so, and link them into your JNI project.
Encoding/decoding using stagefright is very straightforward; a few hundred lines of code will do.
I used stagefright to capture screenshots in order to create a video, which will be available in our Android VNC server, to be released soon.
The following is a snippet; I think it's better than using ffmpeg to encode a movie. You can add an audio source as well.
class ImageSource : public MediaSource {
public:
    ImageSource(int width, int height, int colorFormat)
        : mWidth(width),
          mHeight(height),
          mColorFormat(colorFormat)
    {
    }

    virtual status_t read(
            MediaBuffer **buffer, const MediaSource::ReadOptions *options) {
        // here you can fill the buffer with your pixels
    }
    ...
};

int width = 720;
int height = 480;
sp<MediaSource> img_source = new ImageSource(width, height, colorFormat);

sp<MetaData> enc_meta = new MetaData;
// enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_H263);
// enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_MPEG4);
enc_meta->setCString(kKeyMIMEType, MEDIA_MIMETYPE_VIDEO_AVC);
enc_meta->setInt32(kKeyWidth, width);
enc_meta->setInt32(kKeyHeight, height);
enc_meta->setInt32(kKeySampleRate, kFramerate);
enc_meta->setInt32(kKeyBitRate, kVideoBitRate);
enc_meta->setInt32(kKeyStride, width);
enc_meta->setInt32(kKeySliceHeight, height);
enc_meta->setInt32(kKeyIFramesInterval, kIFramesIntervalSec);
enc_meta->setInt32(kKeyColorFormat, colorFormat);

sp<MediaSource> encoder =
    OMXCodec::Create(
            client.interface(), enc_meta, true, img_source);

sp<MPEG4Writer> writer = new MPEG4Writer("/sdcard/screenshot.mp4");
writer->addSource(encoder);

// you can add an audio source here if you want to encode audio as well
//
// sp<MediaSource> audioEncoder =
//     OMXCodec::Create(client.interface(), encMetaAudio, true, audioSource);
// writer->addSource(audioEncoder);

writer->setMaxFileDuration(kDurationUs);
CHECK_EQ(OK, writer->start());
while (!writer->reachedEOS()) {
    fprintf(stderr, ".");
    usleep(100000);
}
status_t err = writer->stop();
