This question is not about whether ffmpeg code can be used on Android. I know that it can. I'm just asking whether anybody has made real performance progress with it.
I created this question after several weeks of experimenting with the stuff, and I've had enough.
I don't want to post in threads where people don't even say what kind of video they decode (resolution, codec) and only talk about some mystical FPS; I just don't understand what they are trying to do. I'm also not going to develop an application only for my own phone, or only for Android 2.2+ phones that have extended OpenGL features. I have a quite popular phone, the HTC Desire, so if the application doesn't work on it, what's the point?
Well, what do I have?
1. FFmpeg source from the latest HEAD. Actually, I could not build it with NDK r5, so I decided to use a stolen one.
2. Bambuser's build script (bash) with a matching ffmpeg source (http://bambuser.com/r/opensource/ffmpeg-4f7d2fe-android-2011-03-07.tar.gz).
It builds fine with NDK r5 after some corrections.
3. RockPlayer's stripped-down ffmpeg source code with a huge Android.mk serving as the build script (http://www.rockplayer.com/download/rockplayer_ffmpeg_git_20100418.zip).
It builds with NDK r3 and NDK r5 after some corrections. RockPlayer is probably the coolest media player for Android, and I assumed its build would give me some advantages.
I had a suitable video for the project (neither big nor small): 600x360, H.264.
Both libraries from items 2 and 3 provide the ability to get frames from the video (frame by frame, seek, etc.). I did not try to get the audio track because I did not need one for this project. I'm not publishing my source here because I think it's the traditional approach and it's easy to find.
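For reference (this is not my exact source, just the usual shape of it), here is a minimal sketch of that traditional frame-by-frame decode loop using the FFmpeg C API of that era (avcodec_decode_video2); stream selection and error handling are omitted, and function names vary between FFmpeg versions:

```c
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

/* Decode every video frame of a file; video_stream_index is assumed known. */
void decode_all_frames(const char *path, int video_stream_index)
{
    AVFormatContext *fmt = NULL;
    av_register_all();
    avformat_open_input(&fmt, path, NULL, NULL);
    avformat_find_stream_info(fmt, NULL);

    AVCodecContext *ctx = fmt->streams[video_stream_index]->codec;
    AVCodec *codec = avcodec_find_decoder(ctx->codec_id);
    avcodec_open2(ctx, codec, NULL);

    AVFrame *frame = avcodec_alloc_frame();
    AVPacket pkt;
    int got_frame;

    while (av_read_frame(fmt, &pkt) >= 0) {
        if (pkt.stream_index == video_stream_index) {
            avcodec_decode_video2(ctx, frame, &got_frame, &pkt);
            if (got_frame) {
                /* frame->data / frame->linesize now hold the YUV planes;
                   hand them to the scaler / renderer here */
            }
        }
        av_free_packet(&pkt);
    }

    av_free(frame);
    avcodec_close(ctx);
    avformat_close_input(&fmt);
}
```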
Well, what are the results with video?
HTC Desire, Android 2.2
600x360, H.264
decoding and rendering are in different threads
Bambuser (NDK r5 build for armv5te, RGBA8888): 33 ms/frame average.
RockPlayer (NDK r3 build for NEON, RGB565): 27 ms/frame average.
It doesn't look bad at first glance, but keep in mind that these numbers are for decoding frames only.
If somebody has much better results with decoding time, let me know.
The hardest part of video playback is rendering. If we have a 600x360 bitmap, we have to scale it somehow before painting, because different phones have different screen sizes and we cannot expect the video to match the screen size exactly.
So what options do we have for rescaling a frame to fit the screen?
I was able to check these cases (same phone and video source):
sws_scale() C function in Bambuser's build: 70 ms/frame. Unacceptable. (A sketch of the call follows this list.)
Naive bitmap rescaling in Android (Bitmap.createScaledBitmap): 65 ms/frame. Unacceptable.
OpenGL rendering of a textured quad in an orthographic projection. In this case I did not need to scale the frame; I just had to prepare a 1024x512 texture (RGBA8888 in my case) containing the frame pixels and then upload it to the GPU (gl.glTexImage2D). Result: ~220 ms/frame to render. Unacceptable. I did not expect glTexImage2D to be that slow on a Snapdragon.
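For completeness, this is roughly how the sws_scale() case above is set up; a sketch assuming YUV420P input and an RGB565 destination, with the scaling flag chosen for illustration (newer FFmpeg versions spell the pixel formats AV_PIX_FMT_*):

```c
#include <libswscale/swscale.h>
#include <libavcodec/avcodec.h>

static struct SwsContext *sws_ctx = NULL;

/* Scale a decoded YUV420P frame to the screen size as RGB565. */
void scale_frame(const AVFrame *src, int src_w, int src_h,
                 uint8_t *dst_pixels, int dst_w, int dst_h)
{
    sws_ctx = sws_getCachedContext(sws_ctx,
                                   src_w, src_h, PIX_FMT_YUV420P,
                                   dst_w, dst_h, PIX_FMT_RGB565,
                                   SWS_FAST_BILINEAR, NULL, NULL, NULL);

    uint8_t *dst_data[4]     = { dst_pixels, NULL, NULL, NULL };
    int      dst_linesize[4] = { dst_w * 2, 0, 0, 0 };  /* RGB565 = 2 bytes/pixel */

    sws_scale(sws_ctx, (const uint8_t * const *)src->data, src->linesize,
              0, src_h, dst_data, dst_linesize);
}
```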
That's all for now. I know there is a way to use a fragment shader to convert YUV pixels on the GPU, but we would still have the same glTexImage2D and ~200 ms just for texture uploading.
But this is not the end. ...my only friend, the end... :) The situation is not hopeless.
If you try RockPlayer, you will definitely wonder how they do that damn frame scaling so fast. I suppose they have really good experience with the ARM architecture. They most probably use avcodec_decode_video2 and then img_convert (as I did in the RP version), but then they apply some tricks (depending on the ARM version) for the scaling.
Maybe they also have some "magic" build configuration for ffmpeg that decreases decoding time, but the Android.mk they published is not THE Android.mk they actually use. I don't know.
So, for now it looks like you cannot just build a simple JNI bridge for ffmpeg and get a real media player for the Android platform. You can only do this if you have a suitable video that you do not need to scale.
Any ideas?
I did compile ffmpeg on Android. From that point on, playing video is purely implementation dependent, so there is no point in measuring latencies on things which can be highly optimised where needed instead of using the standard swscale. And yes, you can build a simple JNI bridge and use it in the NDK to perform ffmpeg calls, but that would already be player code.
In my experience, YUV to RGB conversion has always been a bottleneck. Therefore, using an OpenGL shader for this proved to give a significant boost.
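For illustration, a minimal sketch of such a shader, assuming the Y, U and V planes are uploaded as three separate single-channel (GL_LUMINANCE) textures; the BT.601-style constants are the common full-range approximation and may need adjusting for your source:

```c
/* Fragment shader source for YUV420p -> RGB conversion on the GPU. */
static const char *yuv_fragment_shader =
    "precision mediump float;                              \n"
    "varying vec2 v_texcoord;                              \n"
    "uniform sampler2D y_tex;                              \n"
    "uniform sampler2D u_tex;                              \n"
    "uniform sampler2D v_tex;                              \n"
    "void main() {                                         \n"
    "    float y = texture2D(y_tex, v_texcoord).r;         \n"
    "    float u = texture2D(u_tex, v_texcoord).r - 0.5;   \n"
    "    float v = texture2D(v_tex, v_texcoord).r - 0.5;   \n"
    "    gl_FragColor = vec4(y + 1.402 * v,                \n"
    "                        y - 0.344 * u - 0.714 * v,    \n"
    "                        y + 1.772 * u,                \n"
    "                        1.0);                         \n"
    "}                                                     \n";
```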
I use http://writingminds.github.io/ffmpeg-android-java/ for my project. There are some workarounds needed for complex commands, but for simple commands the wrapper works very well for me.
I am recording the audio stream using the AudioRecord class.
I read samples into a buffer of 2040 samples.
After reading, the samples are processed and reduced to 170 samples.
Even then there are too many samples to draw in real time; I have not managed to configure the SciChart library to show these samples correctly. They come out compressed, whereas I need the chart to be wider.
I am processing an ECG signal, so all the main components, such as the R peak and the QRS complex, should be left unchanged.
Are there any techniques that could be applied to reduce the sample count without spoiling the signal?
I think techniques like a moving average won't work, since they are meant for smoothing the signal, and in my case I don't want to smooth it.
I would be grateful for any help.
Thanks.
SciChart Android should be able to draw 2,040 samples in real time (actually, millions). It includes methods to automatically downsample waveforms without visual loss of data, and in performance benchmarks it can draw a million points on Android devices.
Have you got a code sample? Maybe something is wrong in the configuration.
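If you do end up reducing the sample count yourself rather than relying on the chart, one common peak-preserving approach is min/max bucket decimation; a sketch (the bucket size is an assumption, and a more careful version would emit the min and max in their original time order):

```c
/* Keep the minimum and maximum of each bucket of input samples, so that
   narrow spikes such as R peaks survive the reduction.
   out must have room for 2 * (in_len / bucket) values. */
void minmax_decimate(const float *in, int in_len, int bucket, float *out)
{
    int o = 0;
    for (int start = 0; start + bucket <= in_len; start += bucket) {
        float lo = in[start], hi = in[start];
        for (int i = start + 1; i < start + bucket; ++i) {
            if (in[i] < lo) lo = in[i];
            if (in[i] > hi) hi = in[i];
        }
        out[o++] = lo;
        out[o++] = hi;
    }
}
```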
So I'm trying to get the camera pixel data, monitor any major changes in luminosity, and then save the image. I decided to use OpenGL because I figured it would be quicker to do the luminosity checks in the fragment shader.
I bind a SurfaceTexture to the camera to get the image into the shader, and I'm currently using glReadPixels to get the pixels back, which I then put into a bitmap and save.
The bottleneck on glReadPixels is crazy, so I looked into other options and saw that EGL_KHR_image_base was probably my best bet since I'm using OpenGL ES 2.0.
Unfortunately I have no experience with extensions and don't know where to find exactly what I need. I've downloaded the NDK but am pretty stumped. Could anyone point me in the direction of some documentation and help explain it if I don't understand fully?
Copying pixels with glReadPixels() can be slow, though it may vary significantly depending on the specific device and pixel format. Some tests using glReadPixels() to save frames from video data (which is also initially YUV) found that 96.5% of the time was spent in PNG compression and file I/O on a Nexus 5.
In some cases, the time required goes up substantially if the source and destination formats don't match. On one particular device I found that copying to RGBA, instead of RGB, reduced the time required.
The EGL calls can work but require non-public API calls. And it's a bit tricky; see e.g. this answer. (I think the comment in the edit would allow it to work, but I never got back around to trying it, and I'm not in a position to do so now.)
The only solution would be to use a pixel pack buffer (PBO), where the read is asynchronous. However, to exploit that asynchrony you need a pair of PBOs and have to use them as a ping-pong buffer.
I refer to http://www.jianshu.com/p/3bc4db687546, where I reduced the read time for 1080p from 40 ms to 20 ms.
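A sketch of that ping-pong idea (OpenGL ES 3.0, where GL_PIXEL_PACK_BUFFER is available; the two-buffer scheme, RGBA format and sizes are assumptions for illustration, not the exact code from the linked article):

```c
#include <GLES3/gl3.h>
#include <string.h>

static GLuint pbo[2];
static int frame_index = 0;

void init_pbos(int width, int height)
{
    glGenBuffers(2, pbo);
    for (int i = 0; i < 2; ++i) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4, NULL, GL_STREAM_READ);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

/* Kick off an async read into one PBO and map the one filled last frame. */
void read_pixels_async(int width, int height, unsigned char *out_rgba)
{
    int next = (frame_index + 1) % 2;

    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[frame_index]);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, 0); /* returns immediately */

    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[next]);
    void *ptr = glMapBufferRange(GL_PIXEL_PACK_BUFFER, 0,
                                 width * height * 4, GL_MAP_READ_BIT);
    if (ptr) {
        memcpy(out_rgba, ptr, width * height * 4);
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);

    frame_index = next;
}
```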
I am trying to play a synthesized sound (basically 2 sine waves and some noise) using the AudioTrack class. It doesn't seem to be any different from the SourceDataLine in javax.sound.sampled, BUT the synthesis is REALLY SLOW. Even by ARM standards, it's unrealistic that 32768 samples (16-bit, stereo, for a total of 65536) take over 1 second to render on a Nexus 4 (measured with System.nanoTime(), the write to AudioTrack excluded).
The synthesis part is almost identical to this http://audioprograming.wordpress.com/2012/10/18/a-simple-synth-in-android-step-by-step-guide-using-the-java-sdk/; the only difference is that I play stereo sound (I can't reduce it to mono because it's a binaural tone).
Any ideas? What can I do?
Thanks in advance
Marko's answer seems very good. But if you're still in the experimental/investigative phase of your project, you might want to consider using Pure Data, which is already available as a combined Android Java/NDK library and would allow you to synthesize many sounds and interact with them in a relatively simple manner.
The libpd distribution is the Android implementation of Pure Data. Some good starting references can be found at the SoundOnSound site and also at this site.
Addendum: I found a basic but functional implementation of an Android Midi Driver through this discussion link. The relevant code can be found here (github, project by billthefarmer, named mididriver).
You can view how I use it in my Android app (imSynt link leads you to Google Play), or on YouTube.
The performance of audio synthesis on ARM is actually very respectable with native code that makes good use of the NEON unit. Dalvik's JIT compiler is never going to get close to this level of performance for floating-point-intensive code.
A look at the enormous number of soft-synth apps for iOS provides ample evidence of what should be possible on ARM devices with similar levels of performance.
However, the performance you are reporting is several orders of magnitude short of what I would expect. You might consider the following:
Double-precision floating-point arithmetic is particularly expensive on ARM Cortex-A NEON units, whereas single precision is very fast and highly parallelizable. Math.sin() returns a double, so it is unnecessarily precise and liable to be slow. The 24-bit mantissa provided by a single-precision floating-point value is already substantially larger than the 16-bit int used by the audio subsystem.
You could precompute sin(x) and then perform a table lookup in your render loop (see the sketch after this list).
There is a previous post on SO concerning Math.sin(x) on Android suggesting that performance degrades as x becomes large, as it is likely to do in this case over time.
For a more advanced table-based synthesiser, you might consider using a DDS Oscillator.
Ultimately, you might consider using native code for synthesis, with the NDK.
You should be able to render multiple oscillators with filters and envelopes and still have CPU time left over. Check your inner loops to make sure there are no system calls.
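To make the first two points concrete, here is a sketch of a single-precision table-lookup oscillator in native code; the table size and the 16-bit output scaling are arbitrary choices for illustration:

```c
#include <math.h>
#include <stdint.h>

#define TABLE_SIZE 4096
static float sine_table[TABLE_SIZE];

/* Fill the lookup table once, outside the render loop. */
void init_sine_table(void)
{
    for (int i = 0; i < TABLE_SIZE; ++i)
        sine_table[i] = sinf(2.0f * (float)M_PI * i / TABLE_SIZE);
}

/* Render one mono sine oscillator into a 16-bit buffer using a phase
   accumulator; no math library calls inside the loop. */
void render_sine(int16_t *out, int num_samples,
                 float freq_hz, float sample_rate, float *phase)
{
    float phase_inc = freq_hz * TABLE_SIZE / sample_rate;
    for (int n = 0; n < num_samples; ++n) {
        out[n] = (int16_t)(sine_table[(int)*phase] * 32767.0f);
        *phase += phase_inc;
        if (*phase >= TABLE_SIZE)
            *phase -= TABLE_SIZE;
    }
}
```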
Are you on a very old phone? You did not mention the hardware or OS version.
You might want to try using JSyn. It is a free modular Java synthesizer that runs on any Java platform including desktops, Raspberry Pi and Android.
https://github.com/philburk/jsyn
Have you tried profiling your code? It sounds like something else may be causing your slowdown; profiling would help to highlight the cause.
Mike
What is the best choice for rendering video frames obtained from a decoder bundled into my app (FFmpeg, etc.)?
I would naturally tend to choose OpenGL as mentioned in Android Video Player Using NDK, OpenGL ES, and FFmpeg.
But in OpenGL in Android for video display, a comment notes that OpenGL isn't the best method for rendering video.
What then? The jnigraphics native library? And a non-GL SurfaceView?
Please note that I would like to use a native API for rendering the frames, such as OpenGL or jnigraphics. But Java code for setting up a SurfaceView and such is ok.
PS: MediaPlayer is irrelevant here; I'm talking about decoding and displaying the frames myself. I can't rely on the default Android codecs.
I'm going to attempt to elaborate on and consolidate the answers here based on my own experiences.
Why openGL
When people think of rendering video with openGL, most are attempting to exploit the GPU to do color space conversion and alpha blending.
For instance, converting YV12 video frames to RGB. Color space conversions like YV12 -> RGB require you to calculate the value of each pixel individually. For a 1280 x 720 frame that is 921,600 pixels, or roughly 27 million per-pixel conversions per second at 30 fps.
What I've just described is really what SIMD was made for - performing the same operation on multiple pieces of data in parallel. The GPU is a natural fit for color space conversion.
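As a concrete illustration, here is a scalar sketch of that per-pixel work (the usual integer BT.601 coefficients; a GPU shader or a NEON routine performs exactly this arithmetic, just on many pixels at once):

```c
#include <stdint.h>

static inline uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one YV12 pixel (its Y value plus the shared U/V values) to RGB. */
void yuv_to_rgb_pixel(uint8_t y, uint8_t u, uint8_t v,
                      uint8_t *r, uint8_t *g, uint8_t *b)
{
    int c = (int)y - 16;
    int d = (int)u - 128;
    int e = (int)v - 128;

    *r = clamp_u8((298 * c + 409 * e + 128) >> 8);
    *g = clamp_u8((298 * c - 100 * d - 208 * e + 128) >> 8);
    *b = clamp_u8((298 * c + 516 * d + 128) >> 8);
}
```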
Why !openGL
The downside is the process by which you get texture data into the GPU. Consider that for each frame you have to Load the texture data into memory (CPU operation) and then you have to Copy this texture data into the GPU (CPU operation). It is this Load/Copy that can make using openGL slower than alternatives.
If you are playing low resolution videos then I suppose it's possible you won't see the speed difference because your CPU won't bottleneck. However, if you try with HD you will more than likely hit this bottleneck and notice a significant performance hit.
The way this bottleneck has been traditionally worked around is by using Pixel Buffer Objects (allocating GPU memory to store texture Loads). Unfortunately GLES2 does not have Pixel Buffer Objects.
Other Options
For the above reasons, many have chosen to use software decoding combined with available CPU extensions like NEON for the color space conversion. An implementation of YUV 2 RGB for NEON exists here. The means by which you draw the frames, SDL or openGL, should not matter for RGB, since you are copying the same number of pixels either way.
You can determine if your target device supports NEON enhancements by running cat /proc/cpuinfo from adb shell and looking for NEON in the features output.
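If you would rather check from native code than parse /proc/cpuinfo by hand, the NDK also ships a small cpufeatures helper library; a sketch (link the cpufeatures static module in your Android.mk):

```c
#include <cpu-features.h>

/* Returns non-zero if the device is an ARM with NEON support. */
int device_has_neon(void)
{
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
}
```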
I have gone down the FFmpeg/OpenGLES path before, and it's not very fun.
You might try porting ffplay.c from the FFmpeg project, which has been done before using an Android port of the SDL. That way you aren't building your decoder from scratch, and you won't have to deal with the idiosyncrasies of AudioTrack, an audio API unique to Android.
In any case, it's a good idea to do as little NDK development as possible and rely on porting, since the ndk-gdb debugging experience is pretty lousy right now in my opinion.
That being said, I think OpenGLES performance is the least of your worries. I found the performance to be fine, although I admit I only tested on a few devices. The decoding itself is fairly intensive, and I wasn't able to do very aggressive buffering (from the SD card) while playing the video.
Actually, I have deployed a custom video player system, and almost all of my work was done on the NDK side. We are getting full-frame video at 720p and above, including our custom DRM system. OpenGL is not your answer, because pixel buffers are not supported on Android, so you are basically blasting your textures to the GPU every frame, and that screws up OpenGL ES's caching. You frankly need to shove the video frames through the natively supported Bitmap on Froyo and above; before Froyo you're hosed. I also wrote a lot of NEON intrinsics for color conversion, rescaling, etc. to increase throughput. I can push 50-60 frames per second through this model on HD video.
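For reference, the "natively supported Bitmap" path mentioned here is the jnigraphics library (android/bitmap.h, available since Froyo/API 8); a minimal sketch, assuming the frame has already been converted to RGBA8888 at the bitmap's exact size:

```c
#include <jni.h>
#include <stdint.h>
#include <string.h>
#include <android/bitmap.h>

/* Lock an android.graphics.Bitmap's pixels from native code, copy one
   decoded frame into it, then let Java draw the Bitmap to the screen. */
int copy_frame_into_bitmap(JNIEnv *env, jobject bitmap,
                           const uint8_t *frame_rgba, size_t frame_bytes)
{
    AndroidBitmapInfo info;
    void *pixels;

    if (AndroidBitmap_getInfo(env, bitmap, &info) < 0) return -1;
    if (info.format != ANDROID_BITMAP_FORMAT_RGBA_8888) return -1;
    if (AndroidBitmap_lockPixels(env, bitmap, &pixels) < 0) return -1;

    memcpy(pixels, frame_rgba, frame_bytes);

    AndroidBitmap_unlockPixels(env, bitmap);
    return 0;
}
```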
My program takes MP4 video in, and I want it to output an MP3 (without any server-side processing). Since Android (and my app) needs to run on many different hardware configurations, this means I probably cannot use FFmpeg. I know this may be very battery- and processing-intensive, especially on a mobile phone, but I need this option for my users. I cannot find any native Java libraries that don't use FFmpeg.
I see little problem with FFmpeg, since it apparently runs on all 11 architectures supported by Debian. The only architecture apparently not supported is m68k; the others are old versions in the ports to the FreeBSD and Hurd kernels. And from what I know of Android, the fact that it's based on ARM isn't going to change any time soon.
Of course, there could be some issues with Java wrappers around native code. Is that the issue? I'm neither an Android nor a Java programmer, but I'm sure you can detect the platform and dynamically load the appropriate native wrapper.
MPEG-4 Part 14 (.mp4 file extension) is a container format. In other words, it specifies how multiple media streams can be packaged together. Processing container formats is much less computationally expensive than, for example, compressing or decompressing video. I would be surprised if it turned out to be too expensive to read through an .mp4 file and extract an audio stream on a cell phone's ARM processor.
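To illustrate how lightweight container parsing is, here is a sketch that just walks the top-level boxes ("atoms") of an MP4 file; real code would also handle 64-bit box sizes and descend into moov/trak to locate the audio track:

```c
#include <stdio.h>
#include <stdint.h>

/* Each MP4 box starts with a 32-bit big-endian size and a 4-character type. */
void list_top_level_boxes(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f) return;

    uint8_t header[8];
    while (fread(header, 1, 8, f) == 8) {
        uint32_t size = ((uint32_t)header[0] << 24) | (header[1] << 16) |
                        (header[2] << 8) | header[3];
        printf("box '%c%c%c%c', %u bytes\n",
               header[4], header[5], header[6], header[7], size);
        if (size < 8) break;                 /* size 0 or 1 needs special handling */
        fseek(f, (long)size - 8, SEEK_CUR);  /* skip the box payload */
    }
    fclose(f);
}
```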
I haven't seen any immediately suitable Java libraries either, but it probably wouldn't be too hard to build your own. Parsing container formats is much simpler than decompressing video, and you have the libavformat implementation in ffmpeg as a reference. The MPEG-4 Part 14 standards can be found here:
http://webstore.iec.ch/preview/info_isoiec14496-14%7Bed1.0%7Den.pdf
and here:
http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
I haven't used it, but I downloaded the IBM Toolkit for MPEG-4 and am looking at its API. It seems a little light on data access features, though the implementation is pure Java. It looks like they've obfuscated their codec JARs.