Is achartengine ready for realtime graphing? - android

I'm trying to graph some real-time data, "realtime" here means < 10msec data, ideally as low as possible. I've been able to get Android to fetch and process data this fast but ACE just looks like it's not been designed for real-time use in mind. First symptoms are that garbage collector kicks in like there's no tomorrow and totally kills the app. I'm visualizing data on a "sliding window" fashion so it's not like I'm expecting ACE to plot in real time hundreds of thousandths of points.
I've taken a look at it and the onDraw for XYChart certainly allocates very heavily in cases where it looks like it's convenient and probably makes the code much more readable but not really required.
This might even be worse than it used to be so it might not have been noticed yet. I saw bugfix for issue #225 solved a concurrency problem changing:
return mXY.subMap(start, stop);
for:
return new TreeMap<Double, Double>(mXY.subMap(start, stop));
This creates huge allocations (still backed by the original subMap though) when it would probably be better to queue updates while onDraw is going on and process them later on atomic updates or something on that line to avoid concurrency issues.
The real pity here is that ACE is certainly fast enough for what I need. It can do what I need on my HW perfectly but since it allocates so heavily on repaint Android goes crazy with GC. It soon starts allocating while GC is running so it has to wait and my app starts looking like a stop-motion movie.
The real question though is: Is it reasonable to expect to be able to repaint 4 or 6 linecharts (tablet app) on realtime (sub 200ms refresh rate) with ACE or is it simply not prepared for that kind of abuse?
If the answer is no. Any other options there you'd recommend?
EDIT 20130109:
Revision 471 improves things quite a bit for small data sets. 2.000 points / 4 charts / 100 msec refresh rate is doable and smooth. Logs still see "GC_CONCURRENT freed" like crazy (around 10/sec) but no "WAIT_FOR_CONCURRENT_GC blocked" which are the showstoppers that make your app stop-motion like.
At 3.000 points / 1 chart / 100 msec it's clearly not smooth. We get again the avalanche of "WAIT_FOR_CONCURRENT_GC blocked" on logcat and the stuttering app. Again it looks like we do not have a speed problem, only a memory management problem.
It may look like I might be asking ACE to do magic but I hit this wall after refactoring all my code to retrieve and store telemetry data at 1KHz. Once I finally saw my app retrieve and store all of that realtime without triggering GC at all I pulled my hair with ACE when trying to graph :)

First of all thanks for the great question and for the point you raised. You were definitely right about the huge memory allocation that was done under the onDraw() method. I fixed that and checked in the code in SVN. I have also added a synchronized block inside the onDraw() method such as it will hopefully not throw ConcurrentModificationException when adding new data to the dataset during repaints.
Please checkout the code from SVN and do an ant dist in order to build new AChartEngine jar file and embed it in your application. Please see instructions here.
To answer your question: AChartEngine is definitely ready for dynamic charting. The issue you reported was a show-stopper, but it should be fixed now. I have written dynamic charting using it. However, you need to make sure you don't add 100000s of data values to the datasets. Old data can be removed from the datasets in order to gain performance.
It is definitely reasonable to paint the 5 or so line charts if they have up to a few 1000s points.

After a big effort on optimizing everything else on my app I still can't make it plot in what I understand for "realtime".
The library is great, and very fast but the way in which every onDraw allocates memory inevitably triggers an avalanche of garbage collection which collides with its own allocation and thus android freezes your app momentarily causing stutter which is totally incompatible with "realtime" graphing. The "stutter" here can range from 50-250ms (yes milliseconds) but that's enough to kill a realtime app.
AChartengine will allow you to create "dynamic" graphs just as long as you don't require them to be "realtime" (10 frames/sec, (<100ms refresh rate) or what you'd call "smooth").
If someone needs more information on the core problem here, or why I'm saying that the library is fast enough but memory allocation patterns end up causing performance problems take a look at Google I/O 2009 - Writing Real-Time Games for Android

Ok, so after some time looking for another graphing library I found nothing good enough :)
That made me look into ACE again and I ended up doing a small patch which makes it "usable" for me although far from ideal.
On XYSeries.java I added a new method:
/**
* Removes the first value from the series.
* Useful for sliding, realtime graphs where a standard remove takes up too much time.
* It assumes data is sorted on the key value (X component).
*/
public synchronized void removeFirst() {
mXY.removeByIndex(0);
mMinX = mXY.getXByIndex(0);
}
I found that on top of the memory issues there where some real speed issues at high speed framerates too. I was spending around 90% of the time on the remove function as I removed points which scrolled out of view. The reason is that when you remove a min|max point ACE calls InitRange which iterates over every point to recalculate those min/max points it uses internally. As I'm processing 1000 telemetry frames per sec and a have a small viewport (forced by ACE memory allocation strategies) I hit min/max points very often on the remove function.
I created a new method used to remove the first point of a series which will normally be called as soon as you add a point which makes that one scroll out of your viewport. If your points are sorted on key value (classic dateTime series) then we can adjust mMinX so the viewport still looks nice and do that very fast.
We can't update minY or maxY fast with the current ACE implementation (not that I looked into it though), but if you set up an appropriate initial range it may not be required. In fact I manually adjust the range from time to time using extra information I have because I know what I'm plotting and what the normal ranges are at different points in time.
So, this might be good enough for someone else but I insist that a memory allocation refactoring is still required on ACE for any serious realtime graphing.

Related

Detecting face landmarks points in android

I am developing app in which I need to get face landmarks points on a cam like mirror cam or makeup cam. I want it to be available for iOS too. Please guide me for a robust solution.
I have used Dlib and Luxand.
DLIB: https://github.com/tzutalin/dlib-android-app
Luxand: http://www.luxand.com/facesdk/download/
Dlib is slow and having a lag of 2 sec approximately (Please look at the demo video on the git page) and luxand is ok but it's paid. My priority is to use an open source solution.
I have also use the Google vision but they are not offering much face landmarks points.
So please give me a solution to make the the dlib to work fast or any other option keeping cross-platform in priority.
Thanks in advance.
You can make Dlib detect face landmarks in real-time on Android (20-30 fps) if you take a few shortcuts. It's an awesome library.
Initialization
Firstly you should follow all the recommendations in Evgeniy's answer, especially make sure that you only initialize the frontal_face_detector and shape_predictor objects once instead of every frame. The frontal_face_detector will initialize faster if you deserialize it from a file instead of using the get_serialized_frontal_faces() function. The shape_predictor needs to be initialized from a 100Mb file, and takes several seconds. The serialize and deserialize functions are written to be cross-platform and perform validation on the data, which is robust but makes it quite slow. If you are prepared to make assumptions about endianness you can write your own deserialization function that will be much faster. The file is mostly made up of matrices of 136 floating point values (about 120000 of them, meaning 16320000 floats in total). If you quantize these floats down to 8 or 16 bits you can make big space savings (e.g. you can store the min value and (max-min)/255 as floats for each matrix and quantize each separately). This reduces the file size down to about 18Mb and it loads in a few hundred milliseconds instead of several seconds. The decrease in quality from using quantized values seems negligible to me but YMMV.
Face Detection
You can scale the camera frames down to something small like 240x160 (or whatever, keeping aspect ratio correct) for faster face detection. It means you can't detect smaller faces but it might not be a problem depending on your app. Another more complex approach is to adaptively crop and resize the region you use for face detections: initially check for all faces in a higher res image (e.g. 480x320) and then crop the area +/- one face width around the previous location, scaling down if need be. If you fail to detect a face one frame then revert to detecting the entire region the next one.
Face Tracking
For faster face tracking, you can run face detections continuously in one thread, and then in another thread, track the detected face(s) and perform face feature detections using the tracked rectangles. In my testing I found that face detection took between 100 - 400ms depending on what phone I used (at about 240x160), and I could do 7 or 8 face feature detections on the intermediate frames in that time. This can get a bit tricky if the face is moving a lot, because when you get a new face detection (which will be from 400ms ago), you have to decide whether to keep tracking from the new detected location or the tracked location of the previous detection. Dlib includes a correlation_tracker however unfortunately I wasn't able to get this to run faster than about 250ms per frame, and scaling down the resolution (even drastically) didn't make much of a difference. Tinkering with internal parameters produced increase speed but poor tracking. I ended up using a CAMShift tracker based on the chroma UV planes of the preview frames, generating the color histogram based on the detected face rectangles. There is an implementation of CAMShift in OpenCV, but it's also pretty simple to roll your own.
Hope this helps, it's mostly a matter of picking the low hanging fruit for optimization first and just keep going until you're happy it's fast enough. On a Galaxy Note 5 Dlib does face+feature detections at about 100ms, which might be good enough for your purposes even without all this extra complication.
Dlib is fast enough for most cases. The most of processing time is taken to detect face region on image and its slow because modern smartphones are producing high-resolution images (10MP+)
Yes, face detection can take 2+ seconds on 3-5MP image, but it tries to find very small faces of 80x80 pixels size. I am really sure, that you dont need such small faces on high resolution images and the main optimization here is to reduce the size of image before finding faces.
After the face region is found, the next step - face landmarks detection is extremely fast and takes < 3 ms for one face, this time does not depend on resolution.
dlib-android port is not using dlib's detector the right way for now. Here is a list of recommendations how to make dlib-android port work much faster:
https://github.com/tzutalin/dlib-android/issues/15
Its very simple and you can implement it yourself. I am expecting performance gain about 2x-20x
Apart from OpenCV and Google Vision, there are widely-available web services like Microsoft Cognitive Services. The advantage is that it would be completely platform-independent, which you've listed as a major design goal. I haven't personally used them in an implementation yet but based on playing with their demos for awhile they seem quite powerful; they're pretty accurate and can offer quite a few details depending on what you want to know. (There are similar solutions available from other vendors as well by the way).
The two major potential downsides to something like that are the potential for added network traffic and API pricing (depending on how heavily you'll be using them).
Pricing-wise, Microsoft currently offers up to 5,000 transactions a month for free with added transactions beyond that being some fraction of a penny (depending on traffic, you can actually get a discount for high volume), but if you're doing, for example, millions of transactions per month the fees can start adding up surprisingly quickly. This is actually a fairly typical pricing model; before you select a vendor or implement this kind of a solution make sure you understand how they're going to charge you and how much you're likely to end up paying and how much you could be paying if you scale your user base. Depending on your traffic and business model it could be either very reasonable or cost-prohibitive.
The added network traffic may or may not be a problem depending on how your app is written and how much data you're sending. If you can do the processing asynchronously and be guaranteed reasonably fast Wi-Fi access that obviously wouldn't be a problem but unfortunately you may or may not have that luxury.
I am currently working with the Google Vision API and it seems to be able to detect landmarks out of the box. Check out the FaceTracker here:
google face tracker
This solution should detect the face, happiness, and left and right eye as is. For other landmarks, you can call the getLandmarks on a Face and it should return everything you need (thought I have not tried it) according to their documentation: Face reference

how to measure and improve battery use in iPhone/iPad game (Android also)

My game uses too much battery. I don't know exactly how much it uses as compared to comparable games, but it uses too much. Players complain that it uses a lot, and a number of them note that it makes their device "run hot". I'm just starting to investigate this and wanted to ask some theoretical and practical questions to narrow the search space. This is mainly about the iOS version of my game, but probably many of the same issues affect the Android version. Sorry to ask many sub-questions, but they all seemed so interrelated I thought it best to keep them together.
Side notes: My game doesn't do network access (called out in several places as a big battery drain) and doesn't consume a lot of battery in the background; it's the foreground running that is the problem.
(1) I know there are APIs to read the battery level, so I can do some automated testing. My question here is: About how long (or perhaps: about how much battery drain) do I need to let the thing run to get a reliable reading? For instance, if it runs for 10 minutes is that reliable? If it drains 10% of the battery, is that reliable? Or is it better to run for more like an hour (or, say, see how long it takes the battery to drain 50%)? What I'm asking here is how sensitive/reliable the battery meter is, so I know how long each test run needs to be.
(2) I'm trying to understand what are the likely causes of the high battery use. Below I list some possible factors. Please help me understand which ones are the most likely culprits:
(2a) As with a lot of games, my game needs to draw the full screen on each frame. It runs at about 30 fps. I know that Apple says to "only refresh the screen as much as you need to", but I pretty much need to draw every frame. Actually, I could put some work into only drawing the parts of the screen that had changed, but in my case that will still be most of the screen. And in any case, even if I can localize the drawing to only part of the screen, I'm still making an OpenGL swap buffers call 30 times per second, so does it really matter that I've worked hard to draw a bit less?
(2b) As I draw the screen elements, there is a certain amount of floating point math that goes on (e.g., in computing texture UV coordinates), and some (less) double precision math that goes on. I don't know how expensive these are, battery-wise, as compared to similar integer operations. I could probably cache a lot of these values to not have to repeatedly compute them, if that was a likely win.
(2c) I do a certain amount of texture switching when rendering the scene. I had previously only been worried about this making the game too slow (it doesn't), but now I also wonder whether reducing texture switching would reduce battery use.
(2d) I'm not sure if this would be practical for me but: I have been reading about shaders and OpenCL, and I want to understand if I were to unload some of the CPU processing to the GPU, whether that would likely save battery (in addition to presumably running faster for vector-type operations). Or would it perhaps use even more battery on the GPU than on the CPU?
I realize that I can narrow down which factors are at play by disabling certain parts of the game and doing iterative battery test runs (hence part (1) of the question). It's just that that disabling is not trivial and there are enough potential culprits that I thought I'd ask for general advice first.
Try reading this article:
Android Documents on optimization
What works well for me, is decreasing the use for garbage collection e.g. when programming for a desktop computer, you're (or i'm) used to defining variables inside loops when they are not needed out side of the loop, this causes a massive use of garbage collection (and i'm not talking about primitive vars, but big objects.
try avoiding things like that.
One little tip that really helped me get Battery usage (and warmth of the device!) down was to throttle FPS in my custom OpenGL Engine.
Especially while the scene is static (e.g. a turn-based game or the user tapped pause) throttle down FPS.
Or throttle if the user isn't responsive for more then 10 seconds, like a screensaver on a desktop pc. In the real world users often get distracted while using mobile devices. Don't let your app drain battery while your user figures out what subway-station he's in ;)
Also on the iPhone, sometimes 60FPS is the default, throttling this manually to 30 FPS is barely visible and safes you about half of the gpu cycles (and therefore a lot of battery!).

Strange "stutter" in box2D on different android devices

I'm developing an engine and a game at the same time in C++ and I'm using box2D for the physics back end. I'm testing on different android devices and on 2 out of 3 devices, the game runs fine and so do the physics. However, on my galaxy tab 10.1 I'm sporadically getting a sort of "stutter". Here is a youtube video demonstrating:
http://www.youtube.com/watch?v=DSbd8vX9FC0
The first device the game is running on is an Xperia Play... the second device is a Galaxy Tab 10.1. Needless to say the Galaxy tab has much better hardware than the Xperia Play, yet Box2D is lagging at random intervals for random lengths of time. The code for both machines is exactly the same. Also, the rest of the engine/game is not actually lagging. The entire time, it's running at solid 60 fps. So this "stuttering" seems to be some kind of delay or glitch in actually reading values from box2D.
The sprites you see moving check to see if they have an attached physical body at render time and set their positional values based on the world position of the physical body. So it seems to be in this specific process that box2D is seemingly out of sync with the rest of the application. Quite odd. I realize it's a long shot but I figured I'd post it here anyway to see if anyone had ideas... since I'm totally stumped. Thanks for any input in advance!
Oh, P.S. I am using a fixed time step since that seems to be the most commonly suggested solution for things like this. I moved to a fixed time step while developing this on my desktop, I ran into a similar issue just more severe and the fixed step was the solution. Also like I said the game is running steady at 60 fps, which is controlled by a low latency timer so I doubt simple lag is the issue. Thanks again!
As I mentioned in the comments here, this came down to being a timer resolution issue. I was using a timer class which was supposed to access the highest resolution system timer, cross platform. Everything worked great, except when it came to Android, some versions worked and some versions it did not. The galaxy tab 10.1 was one such case.
I ended up re-writing my getSystemTime() method to use a new addition to C++11 called std::chrono::high_resolution_clock. This also worked great (everywhere but Android)... except it has yet to be implemented in any NDK for android. It is supposed to be implemented in version 5 of the crystax NDK R7, which at the time of this post is 80% complete.
I did some research into various methods of accessing the system time or something by which I could base a reliable timer on the NDK side, but what it comes down to is that these various methods are not supported on all platforms. I've went through the painful process of writing my own engine from scratch simply so that I could support every version of android, so betting on methods that are inconsistently implemented is nonsensical.
The only sensible solution for anyone facing this problem, in my opinion, is to simply abandon the idea of implementing such code on the NDK side. I'm going to do this on the Java end instead, since thus far in all my tests this has been sufficiently reliable across all devices that I've tested on. More on that here:
http://www.codeproject.com/Articles/189515/Androng-a-Pong-clone-for-Android#Gettinghigh-resolutiontimingfromAndroid7
Update
I have now implemented my proposed solution, to do timing on the java side and it has worked. I also discovered that handling any relatively large number, regardless of data type (a number such as the nano seconds from calling the monotonic clock) in the NDK side also results in serious lagging on some versions of android. As such I've optimized this as much as possible by passing around a pointer to the system time, to ensure we're not passing-by-copy.
One last thing too, my statement that calling the monotonic clock from the NDK side is unreliable is however, it would seem, false. From the Android docks on System.nanoTime(),
...and System.nanoTime(). This clock is guaranteed to be monotonic,
and is the recommended basis for the general purpose interval timing
of user interface events, performance measurements, and anything else
that does not need to measure elapsed time during device sleep.
So it would seem, if this can be trusted, that calling the clock is reliable, but as mentioned there are other issues that then arise, like handling allocating and dumping the massive number that results which alone nearly cut my framerate in half on the Galaxy Tab 10.1 with Android 3.2. Ultimate conclusion: supporting all android devices equally is either damn near or flat out impossible and using native code seems to make it worse.
I am very new to game development, and you seem a lot more experienced and it may be silly to ask, but are you using delta time to update your world? Altough you say you have a constant frame rate of 60 fps, maybe your frame counter calculates something wrong, and you should use delta time to skip some frames when the FPS is low, or your world seem to "stay behind". I am pretty sure that you are familiar with this, but I think a good example is here : DeltaTimeExample altough it is a C implementation. If you need I can paste some code from my Android Projects of how I use delta time, that I've developed following this book : Beginning Android Games.

Simple particle system on Android using OpenGL ES 1.0

I'm trying to put a particle system together in Android, using OpenGL. I want a few thousand particles, most of which will probably be offscreen at any given time. They're fairly simple particles visually, and my world is 2D, but they will be moving, changing colour (not size - they're 2x2), and I need to be able to add and remove then.
I currently have an array which I iterate through, handling velocity changes, managing lifecyling (killing old ones, adding new ones), and plotting them, using glDrawArrays. What OpenGl is pointing at, though, for this call, is a single vertex; I glTranslatex it to the relevant co-ords for each particle I want to plot, one at a time, set the colour with glColor4x then glDrawArrays it. It works, but it's a bit slow and only works for a few hundred particles. I'm handling the clipping myself.
I've written a system to support static particles which I have loaded into a vertex/colourarray and plot using glDrawArrays, but this approach only seems suitable for particles which will never change relative location (ie I move all of them using glTranslate), colour and where I don't need to add/remove particles. A few tests on my phone (HTC Desire) suggest that trying to alter the contents of those arrays (which are ByteBuffers, pointed to by OpenGL) is extremely slow.
Perhaps there's some way of manually writing the screen myself with the CPU. If I'm just plotting 1x1/2x2 dots on the screen, and I'm purely interested in writing and not doing any blending/antialiasing, is this an option? Would it be quicker than whatever OpenGl is doing?
(200 or so particles on a 1ghz machine with megs of ram. This is way slower than I was getting 20 years ago on a 7mhz machine with <500k of ram! I appreciate I'm using Java here, but surely there must be a better solution. Do I have to use the NDK to get the power of C++, or is what I'm after possible)
I've been hoping somebody might answer this definitively, as I'll be needing particles on Android myself. (I'm working in C++, though -- Currently using glDrawArrays(), but haven't pushed particles to the limit yet.)
I found this thread on gamedev.stackexchange.com (not Android-specific), and nobody can agree on the best approach there, but you might want to try a few things out and see for yourself.
I was going to suggest glDrawArrays(GL_POINTS, ...) with glPointSize(), but the guy asking the question there seemed unhappy with it.
Let us know if you find a good solution!

Android drawBitmap Performance For Lots of Bitmaps?

I'm in the process of writing an Android game and I seem to be having performance issues with drawing to the Canvas. My game has multiple levels, and each has (obviously) a different number of objects in it.
The strange thing is that in one level, which contains 45 images, runs flawlessly (almost 60 fps). However, another level, which contains 81 images, barely runs at all (11 fps); it is pretty much unplayable. Does this seem odd to anybody besides me?
All of the images that I use are .png's and the only difference between the aforementioned levels is the number of images.
What's going on here? Can the Canvas simply not draw this many images each game loop? How would you guys recommend that I improve this performance?
Thanks in advance.
Seems strange to me as well. I am also developing a game, lots of levels, I can easily have a 100 game objects on screen, have not seen a similar problem.
Used properly, drawbitmap should be very fast indeed; it is little more than a copy command. I don't even draw circles natively; I have Bitmaps of pre-rendered circles.
However, the performance of Bitmaps in Android is very sensitive to how you do it. Creating Bitmaps can be very expensive, as Android can by default auto-scale the pngs which is CPU intensive. All this stuff needs to be done exactly once, outside of your rendering loop.
I suspect that you are looking in the wrong place. If you create and use the same sorts of images in the same sorts of ways, then doubling the number of screen images should not reduce performance by a a factor of over 4. At most it should be linear (a factor of 2).
My first suspicion would be that most of your CPU time is spent in collision detection. Unlike drawing bitmaps, this usually goes up as the square of the number of interacting objects, because every object has to be tested for collision against every other object. You doubled the number of game objects but your performance went down to a quarter, ie according to the square of the number of objects. If this is the case, don't despair; there are ways of doing collision detection which do not grow as the square of the number of objects.
In the mean time, I would do basic testing. What happens if you don't actually draw half the objects? Does the game run much faster? If not, its nothing to do with drawing.
I think this lecture will help you. Go to the 45 minute . There is a graph comparing the Canvas method and the OpenGl method. I think it is the answer.
I encountered a similar problem with performance - ie, level 1 ran great and level 2 didn't
Turned it wasn't the rendering that was a fault (at least not specifically). It was something else specific to the level logic that was causing a bottleneck.
Point is ... Traceview is your best friend.
The method profiling showed where the CPU was spending its time and why the glitch in the framerate was happening. (incidentally, the rendering cost was also higher in Level 2 but wasn't the bottleneck)

Categories

Resources