I am trying to design a hardware-accelerated video encoder on Android. I have been researching this for some time but have not found much that is useful.
I did come across GStreamer (http://gstreamer.freedesktop.org/), which is said to provide hardware video encoding. However, after reading the manual, I found nothing about encoders.
Does anyone know about this stuff? Thank you!
It's going to be dependent on your hardware. What device are you running on?
If your processor contains an IP core that implements video encoding/decoding, the manufacturer needs to either offer a driver so you can call this hardware, or ideally go a step further and offer a specific plugin for GStreamer that does it.
For example, the Freescale i.MX6 processor (used in the Wandboard and CuBox) has a driver maintained by Freescale: https://github.com/Freescale/gstreamer-imx
TI OMAP processors have support: http://processors.wiki.ti.com/index.php/GStreamer, also see TI Distributed Codec Engine.
Broadcom processors have support: https://packages.debian.org/wheezy/gstreamer0.10-crystalhd
There are also several standard interfaces to video accelerator hardware, including VDPAU, VAAPI, and OpenMax IL. If your processor is not one of the above, someone may have written a driver that maps one of these standard interfaces to your hardware.
The Raspberry Pi is apparently supported by the OpenMax IL plugin: http://gstreamer.freedesktop.org/releases/gst-omx/1.0.0.html
If you don't know whether your processor is supported, I'd search for the name and various combinations of "VDPAU", "VAAPI", etc.
There is a wide variety of encoding options in GStreamer to take a raw stream and encode it. Pretty much any element ending in "enc" can be used to do the encoding. Here is a good example of a few encoding pipelines:
https://developer.ridgerun.com/wiki/index.php/TVP5146_GStreamer_example_pipelines
With that said, I'd caution that video encoding is extremely hardware intensive. If your stream is large, I would look at getting a special-purpose hardware encoder rather than doing software encoding via GStreamer.
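To see which "enc" elements a particular GStreamer install actually exposes, one option is to filter the output of `gst-inspect-1.0`. The following is a minimal Python sketch, assuming the usual `plugin:  element: description` line layout. The helper names `encoder_elements` and `installed_encoders` and the sample text are made up for illustration; real output depends on the plugins installed on your device.

```python
import subprocess

def encoder_elements(inspect_output: str) -> list[str]:
    """Return element names ending in 'enc' from gst-inspect-1.0 style output."""
    found = []
    for line in inspect_output.splitlines():
        parts = [p.strip() for p in line.split(":")]
        # Expect at least "plugin", "element", "description".
        if len(parts) >= 3 and parts[1].endswith("enc"):
            found.append(parts[1])
    return sorted(set(found))

def installed_encoders() -> list[str]:
    """Run gst-inspect-1.0 (must be on PATH) and list encoder-like elements."""
    result = subprocess.run(["gst-inspect-1.0"], capture_output=True,
                            text=True, check=True)
    return encoder_elements(result.stdout)

# Illustrative sample only, not captured from a real device.
SAMPLE = """\
x264:  x264enc: x264 H.264 Encoder
omx:  omxh264enc: OpenMAX H.264 Video Encoder
coreelements:  filesink: File Sink
"""
print(encoder_elements(SAMPLE))
```

An element showing up in this list only means the plugin is installed. Whether it is backed by hardware still depends on the driver behind it.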
I'm currently attempting to minimize audio latency for a simple application:
I have a video on a PC, and I'm transmitting the video's audio through RTP to a mobile client. With a very similar buffering algorithm, I can achieve 90ms of latency on iOS, but a dreadful ~180ms on Android.
I'm guessing the difference stems from the well-known latency issues on Android.
However, after reading around for a bit, I came upon this article, which states that:
Low-latency audio is available since Android 4.1/4.2 in certain devices.
Low-latency audio can be achieved using libpd, which is Pure Data library for Android.
I have 2 questions, directly related to those 2 statements:
Where can I find more information on the new low-latency audio in Jellybean? This is all I can find but it's sorely lacking in specific information. Should the changes be transparent to me, or is there some new class/API calls I should be implementing for me to notice any changes in my application? I'm using the AudioTrack API, and I'm not even sure if it should reap benefits from this improvement or if I should be looking into some other mechanism for audio playback.
Should I look into using libpd? It seems to me like it's the only chance I have of achieving lower latencies, but since I've always thought of PD as an audio synthesis utility, is it really suited for a project that just grabs frames from a network stream and plays them back? I'm not really doing any synthesizing. Am I following the wrong trail?
As an additional note, before someone mentions OpenSL ES, this article makes it quite clear that no improvements in latency should be expected from using it:
"As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides."
For lowest latency on Android as of version 4.2.2, you should do the following, ordered from least to most obvious:
Pick a device that supports FEATURE_AUDIO_PRO if possible, or FEATURE_AUDIO_LOW_LATENCY if not. ("Low latency" is 50ms one way; pro is <20ms round trip.)
Use OpenSL. The Dalvik GC has a low amortized cost, but when it runs it takes more time than a low-latency audio thread can allow.
Process audio in a buffer queue callback. The system runs buffer queue callbacks in a thread that has more favorable scheduling than normal user-mode threads.
Make your buffer size a multiple of AudioManager.getProperty(PROPERTY_OUTPUT_FRAMES_PER_BUFFER). Otherwise your callback will occasionally get two calls per timeslice rather than one. Unless your CPU usage is really light, this will probably end up glitching. (On Android M, it is very important to use EXACTLY the system buffer size, due to a bug in the buffer handling code.)
Use the sample rate provided by AudioManager.getProperty(PROPERTY_OUTPUT_SAMPLE_RATE). Otherwise your buffers take a detour through the system resampler.
Never make a syscall or lock a synchronization object inside the buffer callback. If you must synchronize, use a lock-free structure. For best results, use a completely wait-free structure such as a single-reader single-writer ring buffer. Loads of developers get this wrong and end up with glitches that are unpredictable and hard to debug.
Use vector instructions such as NEON, SSE, or whatever the equivalent instruction set is on your target processor.
Test and measure your code. Track how long it takes to run--and remember that you need to know the worst-case performance, not the average, because the worst case is what causes the glitches. And be conservative. You already know that if it takes more time to process your audio than it does to play it, you'll never get low latency. But on Android this is even more important, because the CPU frequency fluctuates so much. You can use perhaps 60-70% of CPU for audio, but keep in mind that this will change as the device gets hotter or cooler, or as the wifi or LTE radios start and stop, and so on.
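The single-reader single-writer ring buffer mentioned in the list can be sketched as follows. This is Python for readability only: a real audio callback would use C or C++ with atomic loads and stores, and Python cannot give real-time guarantees. The class name `SpscRingBuffer` is made up for illustration. What the sketch shows is the index discipline: the producer writes only the head index, the consumer writes only the tail index, and neither side ever waits.

```python
class SpscRingBuffer:
    """Fixed-capacity single-producer/single-consumer queue of audio buffers."""

    def __init__(self, capacity: int):
        # One slot stays empty so that "full" and "empty" are distinguishable.
        self._slots = [None] * (capacity + 1)
        self._head = 0  # next slot to write (touched by the producer only)
        self._tail = 0  # next slot to read (touched by the consumer only)

    def try_push(self, item) -> bool:
        nxt = (self._head + 1) % len(self._slots)
        if nxt == self._tail:
            return False  # full: the caller drops or retries later, never waits
        self._slots[self._head] = item
        self._head = nxt  # publish only after the slot has been written
        return True

    def try_pop(self):
        if self._tail == self._head:
            return None  # empty: an audio callback would output silence here
        item = self._slots[self._tail]
        self._tail = (self._tail + 1) % len(self._slots)
        return item


rb = SpscRingBuffer(2)
print(rb.try_push("a"), rb.try_push("b"), rb.try_push("c"))
print(rb.try_pop(), rb.try_pop(), rb.try_pop())
```

In this arrangement the network thread would call `try_push` and the buffer queue callback would call `try_pop`, so the callback never takes a lock.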
Low-latency audio is no longer a new feature for Android, but it still requires device-specific changes in the hardware, drivers, kernel, and framework to pull off. This means that there's a lot of variation in the latency you can expect from different devices, and given how many different price points Android phones sell at, there probably will always be differences. Look for FEATURE_AUDIO_PRO or FEATURE_AUDIO_LOW_LATENCY to identify devices that meet the latency criteria your app requires.
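The buffer-size and CPU-budget advice above comes down to a little arithmetic, sketched here in Python. The function names and the device values (240 frames at 48000 Hz) are hypothetical. On a real device the two numbers would come from PROPERTY_OUTPUT_FRAMES_PER_BUFFER and PROPERTY_OUTPUT_SAMPLE_RATE, and the 60% default budget is just the low end of the range suggested above.

```python
import math

def round_up_to_native(desired_frames: int, native_frames: int) -> int:
    """Smallest multiple of the native buffer size that holds desired_frames."""
    return max(1, math.ceil(desired_frames / native_frames)) * native_frames

def buffer_duration_ms(frames: int, sample_rate: int) -> float:
    """Playback time of one buffer, which is also the callback deadline."""
    return 1000.0 * frames / sample_rate

def within_budget(worst_case_ms: float, frames: int, sample_rate: int,
                  budget: float = 0.6) -> bool:
    """True if the worst observed callback time fits the CPU budget."""
    return worst_case_ms <= budget * buffer_duration_ms(frames, sample_rate)

# Hypothetical device values, for illustration only.
native_frames, sample_rate = 240, 48000
frames = round_up_to_native(500, native_frames)
print(frames, buffer_duration_ms(frames, sample_rate))
print(within_budget(8.0, frames, sample_rate))
```

Note that the check uses the worst-case callback time rather than the average, as recommended above.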
From the link at your point 1:
"Low-latency audio
Android 4.2 improves support for low-latency audio playback, starting
from the improvements made in Android 4.1 release for audio output
latency using OpenSL ES, Soundpool and tone generator APIs. These
improvements depend on hardware support — devices that offer these
low-latency audio features can advertise their support to apps through
a hardware feature constant."
Your citation in complete form:
"Performance
As OpenSL ES is a native C API, non-Dalvik application threads which
call OpenSL ES have no Dalvik-related overhead such as garbage
collection pauses. However, there is no additional performance benefit
to the use of OpenSL ES other than this. In particular, use of OpenSL
ES does not result in lower audio latency, higher scheduling priority,
etc. than what the platform generally provides. On the other hand, as
the Android platform and specific device implementations continue to
evolve, an OpenSL ES application can expect to benefit from any future
system performance improvements."
So, the API to communicate with the drivers and then the hardware is OpenSL (in the same fashion OpenGL does with graphics). The earlier versions of Android have a bad design in the drivers and/or hardware, though. These problems were addressed and corrected in versions 4.1 and 4.2, so if the hardware has the power, you get low latency using OpenSL.
Again, from this note on the Pure Data library website, it is evident that the library itself uses OpenSL to achieve low latency:
Low latency support for compliant devices
The latest version of Pd for
Android (as of 12/28/2012) supports low-latency audio for compliant
Android devices. When updating your copy, make sure to pull the latest
version of both pd-for-android and the libpd submodule from GitHub.
At the time of writing, Galaxy Nexus, Nexus 4, and Nexus 10 provide a
low-latency track for audio output. In order to hit the low-latency
track, an app must use OpenSL, and it must operate at the correct
sample rate and buffer size. Those parameters are device dependent
(Galaxy Nexus and Nexus 10 operate at 44100Hz, while Nexus 4 operates
at 48000Hz; the buffer size is different for each device).
As is its wont, Pd for Android papers over all those complexities as
much as possible, providing access to the new low-latency features
when available while remaining backward compatible with earlier
versions of Android. Under the hood, the audio components of Pd for
Android will use OpenSL on Android 2.3 and later, while falling back
on the old AudioTrack/AudioRecord API in Java on Android 2.2 and
earlier.
When using OpenSL ES you should fulfil the following requirements to get low latency output on Jellybean and later versions of Android:
The audio should be mono or stereo, linear PCM.
The audio sample rate should be the same as the output's native rate (this might not actually be required on some devices, because the FastMixer is capable of resampling if the vendor configures it to do so. But in my tests I got very noticeable artifacts when upsampling from 44.1 to 48 kHz in the FastMixer).
Your BufferQueue should have at least 2 buffers. (This requirement has since been relaxed. See this commit by Glenn Kasten. I'm not sure in which Android version this first appeared, but a guess would be 4.4).
You can't use certain effects (e.g. Reverb, Bass Boost, Equalization, Virtualization, ...).
The SoundPool class will also attempt to make use of fast AudioTracks internally when possible (the same criteria as above apply, except for the BufferQueue part).
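As a way of keeping the criteria above in one place, here is a small Python sketch of a pre-flight check. It only encodes the list from this answer and does not query Android in any way. `OutputConfig` and `fast_path_problems` are hypothetical names, and the buffer-count rule reflects the original requirement, before it was relaxed.

```python
from dataclasses import dataclass, field

@dataclass
class OutputConfig:
    channels: int
    is_linear_pcm: bool
    sample_rate: int
    buffer_count: int
    effects: list = field(default_factory=list)

def fast_path_problems(cfg: OutputConfig, native_sample_rate: int) -> list:
    """Return the reasons a config misses the criteria listed above."""
    problems = []
    if cfg.channels not in (1, 2):
        problems.append("audio must be mono or stereo")
    if not cfg.is_linear_pcm:
        problems.append("audio must be linear PCM")
    if cfg.sample_rate != native_sample_rate:
        problems.append("sample rate differs from the native output rate")
    if cfg.buffer_count < 2:
        problems.append("buffer queue needs at least 2 buffers")
    if cfg.effects:
        problems.append("effects are attached: " + ", ".join(cfg.effects))
    return problems


cfg = OutputConfig(channels=2, is_linear_pcm=True, sample_rate=44100,
                   buffer_count=2, effects=["Reverb"])
for problem in fast_path_problems(cfg, native_sample_rate=48000):
    print(problem)
```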
For those of you interested in Android's 10 Millisecond Problem, i.e. low-latency audio on Android: we at Superpowered created the Android Audio Path Latency Explainer. Please see here:
http://superpowered.com/androidaudiopathlatency/#axzz3fDHsEe56
Another database of audio latencies and buffer sizes used:
http://superpowered.com/latency/#table
Source code:
https://github.com/superpoweredSDK/SuperpoweredLatency
There is a new C++ library, Oboe, which helps reduce audio latency. I have used it in my projects and it works well.
It has these features which help reduce audio latency:
Automatic latency tuning
Chooses the audio API (OpenSL ES on API 16+ or AAudio on API 27+)
An application for measuring sampleRate and bufferSize is available at https://code.google.com/p/high-performance-audio/source/checkout, and a database of results at http://audiobuffersize.appspot.com/
I am doing a bit of research about SVC for the H.264 codec. As far as I know, SVC is an extension of the earlier AVC and uses an AVC-compatible base layer, so that it works on a mobile device (preferably Android).
My question is: is it possible to enhance this base layer on a mobile device using SVC? Is a mobile device powerful enough (memory, RAM, etc.) to do this?
Thanks
Your question can not really be answered, it depends...
FWIW here's my 2 cents:
Modern mobile phones such as the Samsung Galaxy S2 have a 1.2 GHz dual-core processor and 1 GB of RAM. While other phones may have lower specifications, mobiles in general are constantly improving. I see no reason why such devices could not decode an SVC stream. However, this also depends on other factors such as the resolution and complexity of the video, the number of SVC layers and, very importantly, the efficiency of the decoder implementation.
While Android does have an H.264 decoder, I suspect it may be some time until it supports SVC.
I'm not sure I completely understand the question, but I'll try to answer anyway.
An SVC stream is always composed of a base layer, which is H.264 compatible, and one or more enhancement layers (temporal, spatial or quality), which can only be decoded by an SVC decoder.
Most mobile devices use a HW accelerator to decode the H.264 stream, so the CPU is hardly loaded while decoding the base layer.
To decode the enhancement layer(s) on Android you will need an SVC decoder for ARM, which I'm not sure exists at all. You can try to port an open source project like opensvc yourself.
Since the decoding of the enhancement layers is highly dependent on the base layer, you will not be able to use the H.264 HW accelerator for the base layer, because the HW accelerator cannot supply the metadata needed for the enhancement layer decode process.
So in terms of processing power, you will need to load the CPU for both the base layer and the enhancement layers. Whether it will run depends on the following:
1. performance of svc decoder code
2. resolution and fps of the video
3. complexity of the content
4. number and type of enhancement layers
hope this answers your question
I'm looking for some information about encoding video on an Android phone using hardware acceleration. I know some phones (if anyone has a list?) support encoding for the camera, and was hoping I could access the chip to encode a live feed supplied through say wifi, usb.
Also, I'm interested in the latency any such chip would provide.
EDIT: Apparently Android uses PacketVideo, however not much documentation to be found for encoding.
EDIT: Android documentation shows a video-encoder: http://developer.android.com/reference/android/media/MediaRecorder.VideoEncoder.html. However it does not say anything about hardware acceleration.
MediaCodec should fit your needs. The documentation says nothing about hardware acceleration, but according to logs and benchmarks, it relies on the OpenMAX library.
I want to develop an Android app which will use my client's SIP server. My client is exposing a couple of REST APIs from the SIP server for communicating with the apps.
I want to know which would be the best codec type for this app?
Basically, I want to create a SIP stack and send the SIP packets to the server. So, there should be an encoding and decoding system for the packets. My client prefers 16 kb/sec but I am not sure which codec I should use.
As others have said, SIP does not transfer audio or video. Although in theory you can send the media over any transport, including ATM, analog lines, a DS0, etc., in the real world RTP is the most common. RTP (Real-time Transport Protocol) together with RTCP (RTP Control Protocol), or SRTP (Secure RTP), usually carries the audio and video.
As far as codecs go, you will be limited by what your server supports. Here are a few common codecs and some pros and cons of each.
G.711 - Toll quality (i.e. as good as a good analog phone line, or even a bit better). "Universal" in that virtually every device supports G.711. Takes a lot of bandwidth, since it doesn't really compress data (G.711 is a "compander"). The baseline G.711 is pretty bare-bones (it's really a couple of lookup tables). Appendix I adds packet loss concealment (PLC) and Appendix II adds silence suppression and comfort noise generation.
GSM - used on cellphones, sounds ok, good PLC, good compression
G.729A - widely used, near toll quality, good compression (8Kbps)
G.723.1 - widely used, almost as good as G.729, better compression (5.3 or 6.3Kbps)
G.722 - sounds better than G.711, wideband (twice the audio bandwidth of G.711 or an analog call), same bandwidth used on the line as G.711
GIPS - various implementations exist, one is free. IIRC, uses about 13.5Kbps on the line, sound is not as good as G.723.1 (but this is a perceptual metric, YMMV). Takes a lot of processor.
All the codecs use some processor and other system resources, as a rule of thumb the more aggressive the codec (the smaller the bandwidth) the more processor used. Also, all of these particular codecs are lossy codecs--they lose some of the data. This means that there is compression, not that portions of the audio are dropped due to poor routing and poor line quality. Much like an MP3 is considered a LOSSY codec while FLAC is considered Lossless. If you're interested the following wikipedia article explains in further detail: http://en.wikipedia.org/wiki/Lossy_compression
You need to know what codecs and protocols this SIP server will support. If you control both ends and want to stick to 16Kbps, you'll want iLBC (no royalties) or G.729 (royalties apply). G.711 and (now) G.722 have no royalties either, but both use ~64Kbps.
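One thing worth checking against a 16Kbps target is whether it refers to the codec payload or to what actually goes on the wire, because every RTP packet also carries IP, UDP and RTP headers. The Python sketch below assumes IPv4, no RTP header extensions and 20 ms packets. The payload rates are nominal figures, and the function name `on_wire_kbps` is made up for illustration.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no extensions

def on_wire_kbps(codec_kbps: float, packet_ms: int = 20) -> float:
    """Codec bit rate plus IPv4/UDP/RTP header overhead, in kbit/s."""
    packets_per_second = 1000 / packet_ms
    header_kbps = IP_UDP_RTP_HEADER_BYTES * 8 * packets_per_second / 1000
    return codec_kbps + header_kbps

# Nominal payload bit rates in kbit/s.
CODECS = {
    "G.711": 64.0,
    "G.722": 64.0,
    "iLBC (20 ms mode)": 15.2,
    "GSM": 13.0,
    "G.729": 8.0,
}

for name, rate in CODECS.items():
    verdict = "fits" if rate <= 16 else "exceeds"
    print(f"{name}: payload {rate} kbit/s ({verdict} a 16 kbit/s payload budget), "
          f"about {on_wire_kbps(rate):.1f} kbit/s with headers")
```

Under these assumptions the headers alone cost 16 kbit/s at 20 ms packets, so even G.729 lands at roughly 24 kbit/s at the IP layer. Longer packets reduce that overhead at the cost of added delay.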
The list given before is good, with a few issues:
"GIPS - various implementations exist, one is free. IIRC, uses about 13.5Kbps on the line, sound is not as good as G.723.1 (but this is a perceptual metric, YMMV). Takes a lot of processor."
GIPS is not a codec. iLBC and iSAC are codecs designed by GIPS. iLBC is free and standardized. iLBC is high quality, 13 or 15Kbps, and is very resilient to packet loss compared to G.729 or even G.711. You can have 30 or even 50% loss with iLBC and still be understood. I'm not sure I'd say it uses a lot of CPU compared to, say, G.729.
"All the codecs use some processor and other system resources, as a rule of thumb the more aggressive the codec (the smaller the bandwidth) the more processor used. Also, all codecs are lossy codecs--they lose some of the data."
Well, G.711 isn't really lossy per se (in theory yes, but it's almost more quantization-level loss). 64K G.722 isn't very lossy either. G.723 sucks dead gerbils through garden hoses. :-)
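To make the "quantization-level loss" point concrete, here is a Python sketch of G.711 mu-law companding in the style of the classic public reference code (bias 0x84, clip 32635). It is an illustration rather than a certified implementation, and the function names are my own. Each 16-bit sample is squeezed into 8 bits with a logarithmic step size, so the round-trip error is tiny for quiet samples and grows for loud ones.

```python
BIAS = 0x84   # 132, added before the segment search
CLIP = 32635  # largest magnitude that still fits after adding BIAS

def linear_to_ulaw(sample: int) -> int:
    """Compress one 16-bit signed PCM sample to an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent) by locating the highest set bit.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF  # bits are stored inverted

def ulaw_to_linear(byte: int) -> int:
    """Expand an 8-bit mu-law byte back to a 16-bit signed PCM sample."""
    byte = ~byte & 0xFF
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if byte & 0x80 else magnitude

# Print the original sample, the round-tripped value and the absolute error.
for pcm in (0, 100, 1000, 10000, 30000):
    decoded = ulaw_to_linear(linear_to_ulaw(pcm))
    print(pcm, decoded, abs(decoded - pcm))
```

In practice both directions are usually precomputed as lookup tables.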
It sounds like a bad idea to do it yourself. Developing a SIP client is not a trivial task, since there are several protocols you would have to implement. Choosing the codec is not a very important decision compared to the rest.
imho you should use one of the open source SIP stacks available (like pjsip) or build your application on top of an open source SIP client (like sipdroid).
But since you asked for a codec: use the GSM codec. It saves bandwidth and sounds OK. G.711 is otherwise the standard codec that 99% of all SIP servers support.
Any?
SIP does not send and deal with ANY data packets. SIP is the Session Initiation Protocol and it handles the NEGOTIATION OF SESSIONS.
The sessions themselves, in the case of audio and video, are carried over RTP (with RTCP for control); SIP is what does the signalling. So, your question indicates a REAL lack of knowledge of what you need to do. The real question is: which RTP-compatible codec do you need?
Which is similarly senseless. RTP is just a carrier protocol. This is like asking "what is an HTTP-compatible image format". HTTP does not care. The browser does.
In the case of RTP, it means RTP does not care. It can transport ANY data. What is important is that BOTH SIDES know the same codec. So, in your case it means:
If you program both sides then it is your choice.
If you program only one side (like a SIP phone system), then the question is what "normal" programs handle.
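The "both sides know the same codec" step is what the SDP body carried inside SIP messages is for: one side lists the codecs it offers, and the other picks one it also supports. Here is a minimal Python sketch of that selection. The trimmed SDP text and the function names are made up for illustration, only the first audio line is considered, and the static payload numbers are a subset of those defined in RFC 3551.

```python
# Static RTP payload types from RFC 3551 (subset).
STATIC_PAYLOADS = {0: "PCMU", 3: "GSM", 4: "G723", 8: "PCMA", 9: "G722", 18: "G729"}

def audio_codecs(sdp: str) -> list[str]:
    """Codec names offered on the first m=audio line, in preference order."""
    payload_types, names = [], dict(STATIC_PAYLOADS)
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m=audio") and not payload_types:
            # m=audio <port> <proto> <pt> <pt> ...
            payload_types = [int(pt) for pt in line.split()[3:]]
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<pt> <name>/<clock rate>
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            names[int(pt)] = encoding.split("/")[0]
    return [names[pt] for pt in payload_types if pt in names]

def first_common_codec(offer: str, supported: list[str]):
    """First offered codec that we also support, or None if there is none."""
    wanted = {c.upper() for c in supported}
    for codec in audio_codecs(offer):
        if codec.upper() in wanted:
            return codec
    return None

# Trimmed, hypothetical offer; a real SDP body has more lines.
OFFER = """\
v=0
m=audio 49170 RTP/AVP 18 97 0
a=rtpmap:18 G729/8000
a=rtpmap:97 iLBC/8000
a=rtpmap:0 PCMU/8000
"""
print(audio_codecs(OFFER))
print(first_common_codec(OFFER, ["ilbc", "PCMU"]))
```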
There are two things you need to take into consideration here:
What other devices/servers that handle media are being deployed or are planned to be deployed
Is your customer looking for narrowband or wideband solution - this will affect the voice quality of the call greatly
Once you have nailed down the answers to the two questions above you will be able to select.
For mobile devices, the voice codecs usually used are AMR-NB or AMR-WB. For SIP it is usually G.729 or G.722.x.
You also have Speex, ISAC and SILK to choose from.
You will probably need to do G.711 in any case just to interoperate with everything - bandwidth will be higher though.
No easy answer here. If your customer can select, or state what other devices are being used - it will be easier for you to select.
My program takes MP4 video as input, and I want it to output an MP3 (without any server-side processing). Since Android (and my app) needs to run on many different hardware configurations, this means I probably cannot use FFMPEG. I know this may be very battery and processing power intensive, especially for a mobile phone, but I need this option for my users. I cannot find any native libraries for Java that don't use FFMPEG.
I see little problem with FFMPEG, since apparently it runs on 11 architectures supported by Debian. The only architecture not supported is apparently m68k; the others are old versions in ports to the FreeBSD kernel or the Hurd kernel. And from what I know of Android, the fact that it's based on ARM isn't going to change any time soon.
Of course, there could be some issues with Java wrappers around native code. Is that the issue? I'm not an Android nor a Java programmer, but I'm sure you can detect the platform and dynamically load appropriate native wrapper.
MPEG-4 Part 14 (.mp4 file extension) is a container format. In other words, it specifies how multiple media streams can be packaged together. Processing container formats is much less computationally expensive than, for example, compressing or decompressing video. I would be surprised if it turned out to be too computationally expensive to read through an .mp4 file and extract an audio stream on a cell phone ARM processor.
I haven't seen any immediately suitable Java libraries either. It probably wouldn't be too hard to build your own library. Parsing container formats is much simpler than decompressing video. And you do have the libavformat implementation in ffmpeg as a reference. The MPEG-4 Part 14 standards can be found here:
http://webstore.iec.ch/preview/info_isoiec14496-14%7Bed1.0%7Den.pdf
and here:
http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
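To give a feel for how light container parsing is, here is a minimal Python sketch that walks the boxes of an MP4 file. Each box starts with a 4-byte big-endian size and a 4-byte type, where size 1 means a 64-bit size follows and size 0 means the box runs to the end of the data. The function names and the synthetic test data are my own; a real extractor would go on to read the sample tables inside `moov` to locate the audio samples, which this sketch does not attempt.

```python
import struct

def iter_boxes(data: bytes, start: int = 0, end: int = None):
    """Yield (type, payload_offset, payload_size) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:
            size = end - pos  # box extends to the end of the enclosing data
        if size < header or pos + size > end:
            break  # truncated or malformed; stop rather than guess
        yield box_type.decode("latin-1"), pos + header, size - header
        pos += size

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size header (helper for the demo data)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

# Synthetic data standing in for a real file.
sample = (make_box(b"ftyp", b"isom\x00\x00\x02\x00")
          + make_box(b"moov", make_box(b"trak", b""))
          + make_box(b"mdat", b"\x00" * 4))
print([(box, size) for box, _, size in iter_boxes(sample)])
```

Container boxes such as `moov` can be walked by calling `iter_boxes` again on the payload range that was yielded for them.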
I haven't used it, but I downloaded the IBM Toolkit for MPEG-4 and am looking at its API. It looks a little light on data access features, but the implementation is pure Java. It looks like they've obfuscated their codec jars.