I would like to use the Opus codec in my Linphone application,
but I have a few questions. If someone with Opus codec knowledge could help me out, I would appreciate it.
OPUS. Does this codec compress as well as package the data?
What is the output data structure from OPUS?
Is the output data streaming or packets?
What does the audio sampling scheme look like?
and….
Within the audio sampling scheme, what are the values for silence?
Within the audio sampling scheme, what are the values for speech?
Thanks in advance.
I see you asked this on the mailing list too, but I'll answer here. I'm not sure what you mean in some of the questions, but here's a start. You've tagged your post as relating to Android; I mostly know about the reference C implementation, so if you're asking about the Java interface available to Android applications this won't be much help.
OPUS. Does this codec compress as well as package the data?
The Opus codec compresses PCM audio data into packets. There's internal structure, but the codec requires a transport layer like RTP to keep track of the boundaries between compressed packets.
What is the output data structure from OPUS?
The reference encoder accepts a given duration of PCM audio data and fills a given buffer with compressed data up to a maximum requested size. See opus_encode() and opus_encode_float() in the encoder documentation for details.
Is the output data streaming or packets?
Opus produces a sequence of packets.
What does the audio sampling scheme look like? and….
The reference encoder accepts interleaved mono, stereo, or surround PCM audio data with either 16-bit signed integer or floating-point samples at 8, 12, 16, 24, or 48 kHz.
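To make the opus_encode() calling convention above concrete, here is a minimal sketch in Java, assuming a hypothetical JNI wrapper (OpusJni, createEncoder, and encode are invented names, not part of any real binding) whose native side forwards to opus_encoder_create() and opus_encode() from the reference library:

```java
// Hypothetical JNI wrapper; the native methods are assumed to forward
// directly to opus_encoder_create() and opus_encode() from libopus.
public final class OpusJni {
    static { System.loadLibrary("opusjni"); } // placeholder library name

    // Returns an opaque handle to an OpusEncoder*.
    public static native long createEncoder(int sampleRate, int channels, int application);
    // Encodes frameSize samples per channel of interleaved 16-bit PCM,
    // writes at most out.length bytes of compressed data, returns the bytes written.
    public static native int encode(long encoder, short[] pcm, int frameSize, byte[] out);

    public static void example() {
        final int SAMPLE_RATE = 48000;        // 8, 12, 16, 24, or 48 kHz are accepted
        final int CHANNELS = 1;
        final int APPLICATION_VOIP = 2048;    // OPUS_APPLICATION_VOIP from opus_defines.h
        final int FRAME_SIZE = 960;           // 20 ms at 48 kHz, per channel

        long enc = createEncoder(SAMPLE_RATE, CHANNELS, APPLICATION_VOIP);
        short[] pcm = new short[FRAME_SIZE * CHANNELS]; // one 20 ms frame (all zeros = silence)
        byte[] packet = new byte[4000];                  // maximum requested packet size

        int len = encode(enc, pcm, FRAME_SIZE, packet);
        // Each call produces one Opus packet of `len` bytes; the transport layer
        // (e.g. RTP) is responsible for preserving packet boundaries.
    }
}
```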
Within the audio sampling scheme, what are the values for silence?
Zero PCM values are silence. As a perceptual codec, Opus will try to encode low-level noise if there is no other signal. There is also support for special zero-data compressed packets for sending silence or handling discontinuous transmission.
Within the audio sampling scheme, what are the values for speech?
I'm not sure what you're asking here. Speech is treated the same as music, and will sound equally normal down to 64 kbps. The codec can maintain transparency for speech down to much lower bitrates than for music (something like 24 kbps for mono) and is intelligible down to 6 kbps for narrowband speech.
Related
I have an app that makes calls using WebRTC, and during a call I need to record the microphone. WebRTC has a WebRtcAudioRecord object for recording audio, but the resulting file is very large (16-bit PCM). I want to record at a smaller size.
I've tried MediaRecorder, but it doesn't work because WebRTC is already recording from the microphone, so MediaRecorder cannot record during the call.
Has anyone done this, or have any idea that could help me?
WebRTC is generally considered a very good pre-processing tool for audio and video.
WebRTC native development includes highly optimized native C and C++ classes, in order to maintain good speech quality and intelligibility for audio and video.
Reference link: https://github.com/jitsi/webrtc/tree/master/examples
As the problem states:
"I want to record, but at a smaller size. I've tried MediaRecorder and it doesn't work, because WebRTC is already recording and MediaRecorder cannot record while a call is in progress."
First of all, to reduce or minimize the size of your recorded data (audio bytes), you should look at speech codecs, which shrink the recorded data while keeping sound quality at an acceptable level. Well-known speech codecs include:
OPUS
SPEEX
G.711 (G-series speech codecs)
As far as the size of the audio data is concerned, it depends on the sample rate and the time span covered by each chunk or audio packet.
Suppose time = 40 ms at 8 kHz, mono, 16-bit; then the recorded data = 640 bytes (or 320 shorts).
The size of the recorded data is **directly proportional** to both time and sample rate.
Sample rate = 8000 or 16000, etc. (the greater the sample rate, the larger the size).
For more detail, see the fundamentals of audio data representation. WebRTC mainly processes audio in 10 ms chunks for pre-processing, which at 8 kHz, 16-bit mono works out to 160 bytes per chunk.
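As a quick sanity check of those numbers, the size of an uncompressed chunk in bytes is just sampleRate × duration × channels × bytesPerSample. A tiny sketch (plain Java, no Android dependencies):

```java
public class PcmSizeCalc {
    // Size in bytes of an uncompressed 16-bit PCM chunk.
    static int pcmChunkBytes(int sampleRateHz, int durationMs, int channels, int bytesPerSample) {
        return sampleRateHz * durationMs / 1000 * channels * bytesPerSample;
    }

    public static void main(String[] args) {
        // 8 kHz, mono, 16-bit, 40 ms -> 640 bytes (320 shorts), as in the example above
        System.out.println(pcmChunkBytes(8000, 40, 1, 2));   // 640
        // WebRTC's 10 ms processing chunk at 8 kHz, mono, 16-bit -> 80 samples = 160 bytes
        System.out.println(pcmChunkBytes(8000, 10, 1, 2));   // 160
        // Doubling the sample rate doubles the size
        System.out.println(pcmChunkBytes(16000, 40, 1, 2));  // 1280
    }
}
```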
Secondly, using multiple AudioRecord instances at the same time is practically impossible. Since WebRTC is already recording from the microphone, a MediaRecorder instance will not be able to do anything, as this answer explains: audio-record-multiple-audio-at-a-time. WebRTC has the following methods for managing the audio bytes:
1. Push input PCM data into `ProcessCaptureStream` to process in place.
2. Get the processed PCM data from `ProcessCaptureStream` and send to far-end.
3. The far end pushes the received data into `ProcessRenderStream`.
I maintain a complete tutorial on audio processing using WebRTC; for more details see Android-Audio-Processing-Using-Webrtc.
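Purely as an illustration of that capture/render flow, here is a hedged sketch using a hypothetical ApmWrapper interface whose method names mirror the steps above; this is not a real WebRTC Java API, and the actual calls depend on which JNI wrapper you build or use:

```java
public final class ApmLoop {
    // Hypothetical JNI wrapper around WebRTC's audio processing module.
    public interface ApmWrapper {
        // Steps 1 + 2: process the near-end (mic) PCM in place and return it.
        short[] processCaptureStream(short[] nearEndPcm10ms);
        // Step 3: feed the far-end (received) PCM so echo cancellation has a reference.
        void processRenderStream(short[] farEndPcm10ms);
    }

    // One iteration, called for every 10 ms chunk of audio.
    public static short[] onAudioChunk(ApmWrapper apm, short[] micChunk, short[] receivedChunk) {
        apm.processRenderStream(receivedChunk);     // far-end reference
        return apm.processCaptureStream(micChunk);  // cleaned-up mic audio to send to the far end
    }
}
```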
There are two parts to the solution:
Get the raw PCM audio frames from WebRTC.
Save them to a local file in a compressed format so they can be played back later.
For the first part, you have to attach a SamplesReadyCallback while creating the audio device module, by calling the setSamplesReadyCallback method of the JavaAudioDeviceModule builder. This callback will give you the raw audio frames captured from the mic by WebRTC's AudioRecord.
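A minimal sketch of attaching that callback, assuming the org.webrtc JavaAudioDeviceModule builder API found in recent WebRTC Android builds (the PcmSink interface is a placeholder for whatever consumes the PCM):

```java
import android.content.Context;

import org.webrtc.PeerConnectionFactory;
import org.webrtc.audio.JavaAudioDeviceModule;

public final class RecordingFactoryBuilder {
    // Builds a PeerConnectionFactory whose audio device module hands us every
    // raw PCM buffer that WebRTC's AudioRecord captures from the microphone.
    // Assumes PeerConnectionFactory.initialize(...) has already been called.
    public static PeerConnectionFactory build(Context appContext, PcmSink sink) {
        JavaAudioDeviceModule adm = JavaAudioDeviceModule.builder(appContext)
                .setSamplesReadyCallback(samples -> {
                    // samples.getData() is raw 16-bit PCM; hand it to our encoder/writer
                    sink.onPcm(samples.getData(),
                               samples.getSampleRate(),
                               samples.getChannelCount());
                })
                .createAudioDeviceModule();

        return PeerConnectionFactory.builder()
                .setAudioDeviceModule(adm)
                .createPeerConnectionFactory();
    }

    // Placeholder interface for whatever consumes the PCM (e.g. an AAC encoder).
    public interface PcmSink {
        void onPcm(byte[] pcm, int sampleRateHz, int channels);
    }
}
```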
For the second part, you have to encode the raw frames and write them into a file. Check out this sample from Google on how to do it: https://android.googlesource.com/platform/frameworks/base/+/master/packages/SystemUI/src/com/android/systemui/screenrecord/ScreenInternalAudioRecorder.java#234
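For that encoding side, here is a rough sketch of the same idea as the linked sample: configure an AAC (LC) encoder with MediaCodec and write its output packets through MediaMuxer. The class name is a placeholder, it assumes 16-bit PCM chunks that each fit into one input buffer, and stop/release and end-of-stream handling are omitted, so treat it as an outline rather than a drop-in implementation:

```java
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import java.io.IOException;
import java.nio.ByteBuffer;

public final class AacPcmWriter {
    private final MediaCodec encoder;
    private final MediaMuxer muxer;
    private int trackIndex = -1;
    private boolean muxerStarted;
    private long presentationTimeUs;

    public AacPcmWriter(String outputPath, int sampleRate, int channels) throws IOException {
        MediaFormat format = MediaFormat.createAudioFormat(
                MediaFormat.MIMETYPE_AUDIO_AAC, sampleRate, channels);
        format.setInteger(MediaFormat.KEY_AAC_PROFILE,
                MediaCodecInfo.CodecProfileLevel.AACObjectLC);
        format.setInteger(MediaFormat.KEY_BIT_RATE, 64_000); // far smaller than raw PCM

        encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC);
        encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        encoder.start();

        muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
    }

    // Called with each raw PCM chunk (e.g. from the SamplesReadyCallback above).
    public void writePcm(byte[] pcm, int sampleRate, int channels) {
        int in = encoder.dequeueInputBuffer(10_000);
        if (in >= 0) {
            ByteBuffer buf = encoder.getInputBuffer(in);
            buf.clear();
            buf.put(pcm); // assumes the chunk fits in one input buffer
            encoder.queueInputBuffer(in, 0, pcm.length, presentationTimeUs, 0);
            // 16-bit PCM: pcm.length / (2 * channels) samples per channel
            presentationTimeUs += 1_000_000L * (pcm.length / (2 * channels)) / sampleRate;
        }
        drainEncoder();
    }

    private void drainEncoder() {
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int out;
        while ((out = encoder.dequeueOutputBuffer(info, 0)) >= 0
                || out == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            if (out == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                // The muxer needs the encoder's output format (with the AAC codec config).
                trackIndex = muxer.addTrack(encoder.getOutputFormat());
                muxer.start();
                muxerStarted = true;
            } else {
                boolean isConfig = (info.flags & MediaCodec.BUFFER_FLAG_CODEC_CONFIG) != 0;
                if (muxerStarted && info.size > 0 && !isConfig) {
                    muxer.writeSampleData(trackIndex, encoder.getOutputBuffer(out), info);
                }
                encoder.releaseOutputBuffer(out, false);
            }
        }
    }
}
```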
I want to trim an existing AAC/MP4 audio file. As a first step I want to "trim" 0 bytes, i.e. just copy the file using MediaCodec/MediaExtractor.
Questions:
Is the header a fixed size, so that I can just copy it from the old file? Or does it contain information about the track duration that I would need to update? If it is a fixed size, what is that size (so I know how many bytes to copy from the old file)?
Should I only use the extractor's readSampleData(ByteBuffer, offset) and advance(), or should I also use MediaCodec to extract (decode) the samples and then encode them again with an encoder and write the encoded values?
If you use MediaExtractor, you probably aren't going to read the raw file yourself, so I don't see what header you're proposing to copy. This is probably easiest to do with MediaExtractor + MediaMuxer; just copy the MediaFormat and the packets you get from MediaExtractor to MediaMuxer.
This depends on how you want to do the trimming. It's absolutely simplest to not involve MediaCodec at all, but just copy packets from MediaExtractor to MediaMuxer, and skip the packets at the start that you want to omit (or use seekTo() for seeking to the right start position).
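For instance, a minimal sketch of that packet-copy approach (the class name, file paths, and trimStartUs are placeholders, and error handling is omitted; with trimStartUs = 0 it simply copies the audio track):

```java
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;
import android.media.MediaMuxer;

import java.io.IOException;
import java.nio.ByteBuffer;

public final class AacTrimmer {
    // Copies the audio track of srcPath into dstPath, skipping everything
    // before trimStartUs. With trimStartUs = 0 this is a plain copy.
    public static void trim(String srcPath, String dstPath, long trimStartUs) throws IOException {
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(srcPath);

        // Find and select the first audio track.
        int srcTrack = -1;
        for (int i = 0; i < extractor.getTrackCount(); i++) {
            MediaFormat f = extractor.getTrackFormat(i);
            if (f.getString(MediaFormat.KEY_MIME).startsWith("audio/")) {
                srcTrack = i;
                break;
            }
        }
        extractor.selectTrack(srcTrack);
        extractor.seekTo(trimStartUs, MediaExtractor.SEEK_TO_CLOSEST_SYNC);

        MediaMuxer muxer = new MediaMuxer(dstPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        int dstTrack = muxer.addTrack(extractor.getTrackFormat(srcTrack));
        muxer.start();

        ByteBuffer buffer = ByteBuffer.allocate(256 * 1024);
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        while (true) {
            info.size = extractor.readSampleData(buffer, 0);
            if (info.size < 0) break;                           // end of stream
            info.offset = 0;
            // Shift timestamps so the output starts at zero.
            info.presentationTimeUs = extractor.getSampleTime() - trimStartUs;
            info.flags = extractor.getSampleFlags();
            muxer.writeSampleData(dstTrack, buffer, info);
            extractor.advance();
        }

        muxer.stop();
        muxer.release();
        extractor.release();
    }
}
```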
But keep in mind that audio frames have a certain length; for AAC-LC it's usually 1024 samples, which for 48 kHz audio is about 21 milliseconds. So if you only copy individual packets, you can't get any finer trimming granularity than about 21 milliseconds at 48 kHz. This is probably fine for most cases, but if the audio has a lower sample rate, say 8 kHz, the granularity ends up as high as 128 ms.
If you want to trim to a more exact position than the individual packets allow you, you need to decode using MediaCodec, skip the right amount of samples, repackage output frames from the decoder into new full frames for the encoder, and encode this.
I don't have much knowledge of codecs. What I know is that "codec" stands for coder/decoder. Codecs are built into mobile devices, and external libraries can be used as an alternative. Codecs play a big role for audio/video: they determine the format a file is encoded in and how it is decoded for playback.
Problem:
Android API 16 ships with MediaCodec, which can do the encoding/decoding work. MediaCodec can be created for the MIME type
"video/mp4v-es"
Is this the same as the MPEG-4 Part 2 (MPEG-4 Visual) codec format?
Note: there is also the MPEG-4 Part 10 format, which is the H.264/AVC format. I just need confirmation, or any documentation or blog links that can help me with this.
Yes.
By default "video/mp4v-es" maps to the Google's MPEG4 Part-2 Video Software Codec. See media_codecs_google_video_xml for details. However on a real device, it will be implemented by a hardware video codec as software-video-codecs are processor-intensive.
For MPEG-4 Part 10 (H.264), "video/avc" has to be used.
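To make the distinction concrete, a small sketch of requesting each format by MIME type (whether a hardware or software implementation comes back depends on the device's codec list):

```java
import android.media.MediaCodec;

import java.io.IOException;

public final class VideoDecoders {
    public static MediaCodec createMpeg4Part2Decoder() throws IOException {
        // MPEG-4 Part 2 (MPEG-4 Visual), i.e. "mp4v-es"
        return MediaCodec.createDecoderByType("video/mp4v-es");
    }

    public static MediaCodec createAvcDecoder() throws IOException {
        // MPEG-4 Part 10 (H.264 / AVC)
        return MediaCodec.createDecoderByType("video/avc");
    }
}
```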
It's actually quite ambiguously defined, but I believe that MP4V-ES is an MPEG-4 audio/visual stream that has been fragmented and mapped to RTP packets for transport using the RTP streaming protocol.
The RFC describing this outlines an efficient and pragmatic mapping of the audio and video packets to RTP packets - for example it does not simply assume that there is a one to one mapping.
More info is available in the RFC defining the format: https://www.rfc-editor.org/rfc/rfc6416
I have a project where I have been asked to display a video stream in Android. The stream is raw H.264; I am connecting to a server and will receive a byte stream from it.
Basically I'm wondering is there a way to send raw bytes to a decoder in android and display it on a surface?
I have been successful in decoding H.264 wrapped in an MP4 container using the new MediaCodec and MediaExtractor APIs in Android 4.1; unfortunately I have not found a way to decode a raw H.264 file or stream using these APIs.
I understand that one way is to compile and use FFmpeg, but I'd rather use a built-in method that can use HW acceleration. I also understand RTSP streaming is supported in Android, but this is not an option. Android version is not an issue.
I can't provide any code for this unfortunately, but I'll do my best to explain it based on how I got it to work.
So here is my overview of how I got raw H.264 encoded video to work using the MediaCodec class.
Using the link above, there is an example of getting the decoder set up and how to use it; you will need to set it up for decoding H.264 AVC.
The format of H.264 is that it’s made up of NAL Units, each starting with a start prefix of three bytes with the values 0x00, 0x00, 0x01 and each unit has a different type depending on the value of the 4th byte right after these 3 starting bytes. One NAL Unit IS NOT one frame in the video, each frame is made up of a number of NAL Units.
Basically I wrote a method that finds each individual unit and passes it to the decoder (one NAL Unit being the starting prefix and any bytes there after up until the next starting prefix).
Now if you have the decoder setup for decoding H.264 AVC and have an InputBuffer from the decoder then you are ready to go. You need to fill this InputBuffer with a NAL Unit and pass it back to the decoder and continue doing this for the length of the stream.
But, to make this work I had to pass the decoder an SPS (Sequence Parameter Set) NAL unit first. This unit has a byte value of 0x67 after the starting prefix (the 4th byte); on some devices the decoder would crash unless it received this unit first.
Basically, until you find this unit, ignore all other NAL units and keep parsing the stream until you get it; then you can pass all other units to the decoder.
Some devices didn't need the SPS first and some did, but you are better off passing it in first.
Now if you had a surface that you passed to the decoder when you configured it then once it gets enough NAL units for a frame it should display it on the surface.
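Here is a condensed, hedged sketch of that flow. The class name, width/height, buffer sizes, and the fake timestamps are placeholders; error handling and the "feed the SPS (and PPS) first" bookkeeping described above are omitted; and the start-code search only handles the 3-byte 0x00 0x00 0x01 prefix:

```java
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.view.Surface;

import java.io.IOException;
import java.nio.ByteBuffer;

public final class RawH264Player {
    private final MediaCodec decoder;

    public RawH264Player(Surface surface, int width, int height) throws IOException {
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
        decoder = MediaCodec.createDecoderByType("video/avc");
        decoder.configure(format, surface, null, 0); // decoded frames render straight to the Surface
        decoder.start();
    }

    // Returns the index of the next 0x00 0x00 0x01 start prefix at or after 'from', or -1.
    private static int nextStartCode(byte[] data, int from) {
        for (int i = from; i + 2 < data.length; i++) {
            if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) return i;
        }
        return -1;
    }

    // Feeds one NAL unit at a time into the decoder and renders decoded frames.
    public void play(byte[] stream) {
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        int start = nextStartCode(stream, 0);
        long ptsUs = 0;
        while (start >= 0) {
            int next = nextStartCode(stream, start + 3);
            int end = (next >= 0) ? next : stream.length;

            int in = decoder.dequeueInputBuffer(10_000);
            if (in >= 0) {
                ByteBuffer buf = decoder.getInputBuffer(in);
                buf.clear();
                buf.put(stream, start, end - start);    // one NAL unit, start prefix included
                decoder.queueInputBuffer(in, 0, end - start, ptsUs, 0);
                ptsUs += 33_000;                        // fake ~30 fps timestamps
            }

            int out = decoder.dequeueOutputBuffer(info, 0);
            if (out >= 0) {
                decoder.releaseOutputBuffer(out, true); // render this frame to the Surface
            }
            start = next;
        }
    }
}
```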
You can download the raw H.264 from the server, then offer it via a local HTTP server running on the phone and then let VLC for Android do playback from that HTTP server. You should use VLC's http/h264:// scheme to force the demuxer to raw H.264 (if you don't force the demuxer VLC may not be able to recognize the stream, even when the MIME type returned by the HTTP server is set correctly). See
https://github.com/rauljim/tgs-android/blob/integrate_record/src/com/tudelft/triblerdroid/first/VideoPlayerActivity.java#L211
for an example on how to create an Intent that will launch VLC.
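Roughly, the linked code builds an ACTION_VIEW Intent pointed at the local HTTP URL. A hedged sketch of the same idea (the org.videolan.vlc package name, the port, and the http/h264:// scheme handling are assumptions based on that example and may differ between VLC versions):

```java
import android.content.Intent;
import android.net.Uri;

public final class VlcLauncher {
    // Launches VLC for Android against a local HTTP server serving the raw H.264,
    // forcing the demuxer to raw H.264 via VLC's scheme trick.
    public static Intent buildIntent() {
        Intent intent = new Intent(Intent.ACTION_VIEW);
        intent.setPackage("org.videolan.vlc"); // assume VLC for Android is installed
        intent.setDataAndType(Uri.parse("http/h264://127.0.0.1:8080/stream"), "video/h264");
        return intent;
    }
}
```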
Note: raw H.264 apparently has no timing info, so VLC will play as fast as possible.
Embedding it in MPEG-TS first would be better; I haven't found an Android lib that will do that yet.
Here are the resources I've found helpful in a similar project:
This video has been super insightful in understanding how MediaCodec handles raw H.264 streams at a high level.
This thread goes into a bit more detail as to handling the SPS/PPS NALUs specifically. As was mentioned above, you need to separate individual NAL Units using the start prefix, and then hand the remaining data to the MediaCodec.
This repo (libstreaming) is a great example of decoding an H264 stream in Android using RTSP/RTP for transmission.
I am having problems figuring out how to detect whether an AAC audio source is compatible with Android. The supported media formats page for Android says 'AAC LC/LTP' when delivered as 3GP, MPEG-4, or raw ADTS AAC. It appears LC means 'Low Complexity' and LTP means 'Long Term Prediction', but my biggest frustration is determining which AAC profiles/modules are supported on Android. When I run the input through ffmpeg, I see it's AAC, but no extended information about the AAC. An example source is http://6693.live.streamtheworld.com:80/WTMJAMAAC_SC . Anyone have any ideas?
You can get extended media information programmatically using the MediaInfo library available here:
http://mediainfo.sourceforge.net/en/Download
The "DLL" or other media downloads include example code in C, C#, etc.
If you do not want to write any code, the same website has downloads for "MediaInfo", a program that uses the library to display information.
Your Android supported media formats link says: "Mono/Stereo content in any combination of standard bit rates up to 160 kbps and sampling rates from 8 to 48kHz". Notice the sample below shows all of those: Channel(s), Overall bit rate, and sampling rate.
It may be necessary to test for yourself whether "up to 160 kbps" means "Up to 160 kbps overall" or "No part of the file, including those encoded with variable bit rates (VBR), may surpass 160kbps."
It is noteworthy that I have played movies on my single-core Android phone that have 256 kbps VBR AAC 6-channel audio, though obviously I did not hear the rear surround channels. Because of this, I suspect the limitations in the link are minimums required by Google, but that the audio formats supported in practice are much broader.
Here is an example from an actual AAC file (using the MediaInfo program):
Format : ADTS
Format/Info : Audio Data Transport Stream
File size : 176 KiB
Duration : 30s 707ms
Overall bit rate : 46.8 Kbps
Audio
Format : AAC
Format/Info : Advanced Audio Codec
Format version : Version 4
Format profile : LC
Format settings, SBR : Yes
Format settings, PS : Yes
Muxing mode : ADTS
Duration : 30s 707ms
Bit rate mode : Constant
Bit rate : 46.8 Kbps
Channel(s) : 2 channels
Sampling rate : 44.1 KHz
Stream size : 176 KiB (100%)
I wrote a wrapper library in C# for MediaInfo. It isn't necessary to use MediaInfo, but makes its use much easier and more ".NET-friendly". It can be found here: MediaInfo.Net.
Load the source into Media Player Classic.
View its properties.
In the MediaInfo tab it would list:
Format : AAC
Format profile : LC
If you just want to check the profile used for a few files, you may use VLC or any other program (as Sheepy already suggested). In VLC it's under Extras -> Media Information -> Codec Details, and in your example stream it's AAC SBR+PS (a High-Efficiency profile), which is decodable by Android.
If you do have control over the media you want to play through Android, you may want to check out this blog article on cross-platform mobile media for the correct encoding. If not (e.g. because users might be able to choose their own URLs or files), you should instead catch any errors and display a message. That way you are also future-proof against new media types that might be supported in future Android versions.
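For the "catch errors and show a message" approach, here is a minimal sketch with MediaPlayer (the class name, URL, and message handling are placeholders); unsupported codecs or profiles usually surface through the error callback rather than as an exception, so both paths are handled:

```java
import android.media.MediaPlayer;

import java.io.IOException;

public final class SafePlayer {
    public static void play(String url) {
        MediaPlayer player = new MediaPlayer();
        // Unsupported codecs/profiles typically arrive here as an error callback.
        player.setOnErrorListener((mp, what, extra) -> {
            showMessage("This stream cannot be played on this device (error " + what + ").");
            mp.release();
            return true; // error handled
        });
        player.setOnPreparedListener(MediaPlayer::start);
        try {
            player.setDataSource(url);
            player.prepareAsync(); // don't block the UI thread on a network stream
        } catch (IOException | IllegalArgumentException e) {
            showMessage("Invalid or unreachable stream: " + url);
            player.release();
        }
    }

    private static void showMessage(String text) {
        // Placeholder: show a Toast or dialog in a real app.
        System.out.println(text);
    }
}
```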