I have an audio and video stream and would like to generate an mp4. However, all the samples I've seen generate the mp4 only once they have the entire stream. What I want to accomplish is to encode the stream to mp4 while I'm still receiving the audio/video from the web.
It isn't clear whether an mp4 can be created on-the-fly or needs to wait until the final input data has been received. I believe that this has to be possible because when you record a video on your device using the camera, as soon as you stop recording, the video is already available. This means that it was encoding it on-the-fly. So it really shouldn't matter whether the frames come from the camera or from a video stream over the web. Or am I possibly missing something?
Without knowing how mp4 is actually generated, I'm guessing that even if the encoder receives a few seconds of frames and audio, it still can generate that portion of the mp4 and save it to disk. At any point in time, my app should be able to stop the streaming and the encoder just saves the final portion of the stream. What would be nice is that if the app were to crash during encoding, that at least the part that was encoded still gets saved to disk as a valid mp4. Not sure if that is even possible.
Ultimately I would like to do this using MediaCodec, OpenGL ES and a Surface.
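For reference, the kind of incremental writing I have in mind would look roughly like this: a minimal sketch assuming a MediaCodec encoder and MediaMuxer as the container writer, with the track setup and surrounding loop omitted (the method name is just for illustration):

```java
import android.media.MediaCodec;
import android.media.MediaMuxer;
import java.nio.ByteBuffer;

// Drain whatever the encoder has produced so far and append it to the .mp4 immediately.
// Called repeatedly while samples keep arriving; nothing waits for the end of the stream.
static void drainEncoder(MediaCodec encoder, MediaMuxer muxer, int trackIndex, boolean endOfStream) {
    MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
    while (true) {
        int index = encoder.dequeueOutputBuffer(info, 10_000 /* us */);
        if (index == MediaCodec.INFO_TRY_AGAIN_LATER) {
            if (!endOfStream) break;               // nothing more right now; come back later
        } else if (index == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            // in a full implementation: muxer.addTrack(encoder.getOutputFormat()) + muxer.start()
        } else if (index >= 0) {
            ByteBuffer encoded = encoder.getOutputBuffer(index);
            if (info.size > 0 && (info.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) == 0) {
                muxer.writeSampleData(trackIndex, encoded, info);   // written out right away
            }
            encoder.releaseOutputBuffer(index, false);
            if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) break;
        }
    }
}
```

One caveat on the crash question: as far as I know, MediaMuxer only finalizes the MP4 index when stop() is called, so a file interrupted mid-write is usually not directly playable; that part may need a fragmented-MP4 muxer or some other recovery strategy.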
Related
I have to get the video/audio tracks, or if possible a MediaStream object, from an Android WebView which plays an HLS stream (".m3u8").
I load the HLS stream in the WebView using loadUrl("...m3u8"). It plays without issues, but I can't figure out how to get the live video and audio tracks from it. I've read a lot of articles and was not able to find any solution. So my question is: is it possible to get the audio and video tracks from an HLS stream that is playing in a WebView? I need them because I have to send them via a PeerConnection (WebRTC), which accepts a MediaStream or individual audio and video tracks. Any ideas and examples would be helpful. Thank you in advance.
HLS works by defining a playlist (your .m3u8 file) which points to chunks of the video (say segment00000.ts, segment00001.ts, ...). It can contain different streams for different resolutions, but the idea (very simplified) is that your player can download a chunk, play it right away, and download the next one in the meantime.
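For illustration, a live playlist looks roughly like this (segment names and durations are made up); the player re-fetches it periodically and new segments appear at the bottom:

```
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:12340
#EXTINF:6.000,
segment12340.ts
#EXTINF:6.000,
segment12341.ts
#EXTINF:6.000,
segment12342.ts
```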
A WebRTC video track takes RTP packets. I don't know exactly what the Android API exposes, but it feels like either you can pass it an m3u8 as a source (though that may be a bit weird for a generic API), or you can't, and you have to read the HLS stream yourself, extract RTP frames, and pass them to WebRTC.
For instance, gstreamer's hlsdemux seems to be able to read from an HLS source. From there you could extract RTP frames and feed them to your WebRTC track.
Probably that's a bit lower-level than you hoped, but if the Android API doesn't do it for you, you have to do it yourself (or find a library that does it).
I have an app that makes calls using WebRTC. But during a call, I need to record the microphone. WebRTC has an object, WebRtcAudioRecord, to record audio, but the resulting file is very large (16-bit PCM). I want to record, but at a smaller size.
I've tried MediaRecorder, but it doesn't work because WebRTC is already recording and MediaRecorder does not have permission to record while the call is in progress.
Has anyone done this, or have any idea that could help me?
WebRTC is considered a comparatively good pre-processing tool for audio and video.
WebRTC's native development includes fully optimized native C and C++ classes in order to maintain good speech quality and intelligibility of audio and video.
Reference link: https://github.com/jitsi/webrtc/tree/master/examples
As the problem states:
I want to record, but at a smaller size. I've tried MediaRecorder and it doesn't work because WebRTC is already recording and MediaRecorder does not have permission to record during the call.
First of all, to reduce or minimize the size of your recorded data (audio bytes), you should look at speech codecs, which reduce the size of the recorded data while maintaining sound quality at an acceptable level. Well-known speech codecs include:
OPUS
SPEEX
G.711 (and the other G-series speech codecs)
As far as the size of the audio data is concerned, it basically depends on the sample rate and the duration of each recorded chunk or audio packet.
Suppose time = 40 ms at a sample rate of 8000 Hz (16-bit mono), then recorded data = 640 bytes (or 320 shorts).
The size of the recorded data is **directly proportional** to both the time and the sample rate.
Sample rate = 8000 Hz, 16000 Hz, etc. (the greater the sample rate, the greater the size).
For more detail, see the fundamentals of audio data representation. WebRTC mainly processes audio in 10 ms chunks for pre-processing, which at these settings works out to 160 bytes per packet.
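To make that arithmetic concrete, here is a tiny sketch (the sample rate, channel count and chunk length are just the example values from above):

```java
public class PcmSizeDemo {
    public static void main(String[] args) {
        int sampleRate = 8000;   // Hz
        int bytesPerSample = 2;  // 16-bit PCM
        int channels = 1;        // mono
        int chunkMs = 40;        // duration of one recorded chunk

        // size = sampleRate * seconds * bytesPerSample * channels
        int chunkBytes = sampleRate * chunkMs / 1000 * bytesPerSample * channels;
        System.out.println(chunkMs + " ms -> " + chunkBytes + " bytes (" + chunkBytes / 2 + " shorts)");

        // WebRTC's 10 ms processing frame at the same settings
        int frameBytes = sampleRate * 10 / 1000 * bytesPerSample * channels;
        System.out.println("10 ms -> " + frameBytes + " bytes");
    }
}
```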
Secondly, using multiple AudioRecord instances at the same time is practically impossible. Since WebRTC is already recording from the microphone, a MediaRecorder instance would not be able to do anything, as this answer depicts: audio-record-multiple-audio-at-a-time. WebRTC has the following methods to manage audio bytes:
1. Push input PCM data into `ProcessCaptureStream` to process in place.
2. Get the processed PCM data from `ProcessCaptureStream` and send to far-end.
3. The far end pushes the received data into `ProcessRenderStream`.
I maintain a complete tutorial on audio processing with WebRTC; for more details see Android-Audio-Processing-Using-Webrtc.
There are two parts for the solution:
Get the raw PCM audio frames from webrtc
Save them to a local file in a compressed format so that they can be played back later
For the first part you have to attach a SamplesReadyCallback while creating the audio device module, by calling the setSamplesReadyCallback method of the JavaAudioDeviceModule builder. This callback will give you the raw audio frames that WebRTC's AudioRecord captures from the mic.
For the second part you have to encode the raw frames and write them to a file. Check out this sample from Google on how to do it: https://android.googlesource.com/platform/frameworks/base/+/master/packages/SystemUI/src/com/android/systemui/screenrecord/ScreenInternalAudioRecorder.java#234
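A rough sketch of the first part, assuming the org.webrtc JavaAudioDeviceModule that ships with current WebRTC Android builds (the encodeToAac helper is hypothetical and stands in for the MediaCodec encoding from the second part):

```java
import android.content.Context;
import org.webrtc.PeerConnectionFactory;
import org.webrtc.audio.JavaAudioDeviceModule;

// Tap the PCM that WebRTC itself captures, instead of opening a second recorder.
PeerConnectionFactory createFactoryWithAudioTap(Context context) {
    JavaAudioDeviceModule adm = JavaAudioDeviceModule.builder(context)
            .setSamplesReadyCallback(samples ->
                    // samples.getData() is the raw PCM buffer from WebRTC's AudioRecord;
                    // encodeToAac(...) is a placeholder for the MediaCodec/AAC encoding step.
                    encodeToAac(samples.getData(), samples.getSampleRate(), samples.getChannelCount()))
            .createAudioDeviceModule();

    return PeerConnectionFactory.builder()
            .setAudioDeviceModule(adm)
            .createPeerConnectionFactory();
}
```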
There is some good documentation on a site called Big Flake about how to use MediaMuxer and MediaCodec to encode and then decode video as MP4, or extract video and then encode it again, and more.
But there doesn't seem to be a way to encode audio together with the video at the same time; there is no documentation or code about this. It doesn't seem like it should be impossible.
Question
Do you know of any stable way of doing it that will work on all devices running API level 18 and above?
Why has no one implemented it? Is it hard to implement?
You have to create two MediaCodec instances, one for video and one for audio, and then use MediaMuxer to mux the video with the audio after encoding. You can take a look at ExtractDecodeEditEncodeMuxTest.java and at this project, which captures the camera/mic and saves to an MP4 file using MediaMuxer and MediaCodec.
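A rough outline of that setup (the resolution, bitrates and output path are example values; the drain loops themselves are only indicated by comments):

```java
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import java.io.IOException;

void setUpEncodersAndMuxer() throws IOException {
    // Video encoder, typically fed from a Surface (camera preview / OpenGL).
    MediaFormat videoFormat = MediaFormat.createVideoFormat("video/avc", 1280, 720);
    videoFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
    videoFormat.setInteger(MediaFormat.KEY_BIT_RATE, 2_000_000);
    videoFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
    videoFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);
    MediaCodec videoEncoder = MediaCodec.createEncoderByType("video/avc");
    videoEncoder.configure(videoFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

    // Audio encoder, fed with raw PCM from the mic.
    MediaFormat audioFormat = MediaFormat.createAudioFormat("audio/mp4a-latm", 44100, 1);
    audioFormat.setInteger(MediaFormat.KEY_AAC_PROFILE,
            MediaCodecInfo.CodecProfileLevel.AACObjectLC);
    audioFormat.setInteger(MediaFormat.KEY_BIT_RATE, 64_000);
    MediaCodec audioEncoder = MediaCodec.createEncoderByType("audio/mp4a-latm");
    audioEncoder.configure(audioFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

    // One muxer; both tracks must be added (after each encoder's first
    // INFO_OUTPUT_FORMAT_CHANGED) before start() is called.
    MediaMuxer muxer = new MediaMuxer("/sdcard/out.mp4",
            MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);

    // In the drain loops, once both output formats are known:
    //   int videoTrack = muxer.addTrack(videoEncoder.getOutputFormat());
    //   int audioTrack = muxer.addTrack(audioEncoder.getOutputFormat());
    //   muxer.start();
    // and for every encoded buffer:
    //   muxer.writeSampleData(videoTrack /* or audioTrack */, buffer, bufferInfo);
}
```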
I have a working app that streams video to Chromecast (using NanoHTTPD) and everything works fine. Now my problem is: videos recorded on newer devices are too large to stream, so I want to re-encode them to a lower bitrate.
I tried FFmpeg, but the results are not satisfactory and it would increase the APK size by 14 MB.
Now I am trying the MediaCodec API. It is faster than FFmpeg, but it takes an input file and writes to an output file, whereas I want to re-encode the byte data that is served by NanoHTTPD.
One solution that comes to mind is to transcode the video and stream the output file, but that has two drawbacks:
What if the file is very large and the user doesn't watch the whole video? A lot of CPU and battery would be wasted.
What if the user fast-forwards a long video to a point that hasn't been re-encoded yet?
1. MediaCodec does just one thing: decode/encode. It gives you the raw bytes of the newly encoded data, and it is up to you whether to dump them into a container (an .mp4 file) using a muxer or to serve them directly, so there is no need to write everything back into a file.
2. Seek to the proper chunk of data and restart MediaCodec.
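For point 2, a minimal sketch of what seeking could look like when a MediaExtractor feeds the decoder (the method name is just illustrative):

```java
import android.media.MediaCodec;
import android.media.MediaExtractor;

// Jump the source to the key frame at or before the requested position and reset the
// codec, so decoding/re-encoding resumes from roughly where the user seeked to.
void seekAndRestart(MediaExtractor extractor, MediaCodec decoder, long targetUs) {
    extractor.seekTo(targetUs, MediaExtractor.SEEK_TO_PREVIOUS_SYNC);
    decoder.flush();  // drop anything buffered from before the seek
    // ...then continue the usual readSampleData / queueInputBuffer / drain loop
}
```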
I'm trying to build a system that live-streams video and audio captured by Android phones. Video and audio are captured on the Android side using MediaRecorder and then pushed directly to a server written in Python. Clients access this live feed through their browser, so I implemented the streaming part of the system in Flash. Right now both video and audio content appear on the client side, but the problem is that they are out of sync. I'm sure this is caused by wrong timestamp values in Flash (currently I increment the timestamp by 60 ms per video frame, but clearly this value should be variable).
The audio is encoded to AMR on the Android phone, so I know each AMR frame is exactly 20 ms. However, this is not the case with the video, which is encoded to H.264. To synchronize them, I would have to know exactly how many milliseconds each H.264 frame lasts, so that I can timestamp the frames later when delivering the content through Flash. My question is: is this kind of information available in the NAL units of H.264? I tried to find the answer in the H.264 standard, but the information there is just overwhelming.
Can someone please point me in the right direction? Thanks.
Timestamps are not in NAL units, but are typically part of RTP. RTP/RTCP also takes care of media synchronisation.
The RTP payload format for H.264 might also be of interest to you.
If you are not using RTP, are you just sending raw data units over the network?
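If you do end up timestamping frames yourself, note that RTP for H.264 uses a 90 kHz clock (RFC 6184) and AMR-NB an 8 kHz clock (RFC 4867), so the per-frame increments work out as in this small sketch (the frame rate is just an example, and real camera output is rarely perfectly constant):

```java
public class RtpTimestampDemo {
    public static void main(String[] args) {
        // H.264 over RTP: 90 kHz clock, so each video frame advances by 90000 / fps ticks.
        long videoClock = 90_000;
        double fps = 25.0;                                   // example frame rate
        long videoStep = Math.round(videoClock / fps);       // 3600 ticks per frame at 25 fps

        // AMR-NB over RTP: 8 kHz clock, 20 ms frames -> 160 ticks per frame.
        long audioClock = 8_000;
        long audioStep = audioClock * 20 / 1000;

        System.out.println("video step = " + videoStep + " ticks, audio step = " + audioStep + " ticks");
    }
}
```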