How would you improve Dalvik? Android's Virtual Machine - android

I am currently writing a paper on the Android platform. After some research, it's clear that Dalvik has room for improvement. I was wondering, what do you think would be the best use of a developer's time with this goal?
JIT compilation seems like the big one, but then i've also heard this would be of limited use on such a low resource machine. Does anyone have a resource or data that backs this up?
Are there any other options that should be considered? Aside from developing a robust native development kit to bypass the VM.
For those who are interested, there is a lecture that has been recorded and put online regarding the Dalvik VM.
Any thoughts welcome, as this question appears subjective i'll clarify that the answer I'll accept must have some justification for proposed changes. Any data to back it up, such as the improvement in the Sun JVM when it was introduced, would be a massive plus.

Better garbage collection: compacting at minimum (to eliminate memory fragmentation problems experienced today), ideally less CPU intensive at doing the collection itself (to reduce the "my game frame rates suck" complaints)
JIT, as you cite
Enough documentation that, when coupled with an NDK, somebody sufficiently crazy could compile Dalvik bytecode to native code for an AOT compilation option
Make it separable from Android itself, such that other projects might experiment with it and community contributions might arrive in greater quantity and at a faster clip
I'm sure I could come up other ideas if you need them.

JIT. The stuff about it not helping is a load of crap. You might be more selective about what code you JIT but having 1/10th the performance of native code is always going to be limiting
Decent GC. Modern generational garbage collectors do not have big stutters.
Better code analysis. There are lot of cases where allocations/frees don't need to be made, locks held, and so on. It allows you to write clean code rather than doing optimizations that the machine is better at
In theory most of the higher level languages (Java, Javascript, python,...) should be within 20% of native code performance for most cases. But it requires the platform vendor to spend 100s+ developer man years. Sun Java is getting good. They have also been working on it for 10 years.

One of the main problems with Dalvik is performance, which is terrible I heard, but one of the things I would like most is the addition of more languages.
The JVM has had community projects getting Python and Ruby running on the platform, and even special languages such as Scala, Groovy and Closure developed for it. It would be nice to see these (and/or others) on the Dalvik platform as well. Sun has been working on the Da Vinci machine as well, a dynamic typing extension of the JVM, which indicates a major shift away from the "one language fits all" philosophy Sun has followed over the last 15 years.

Related

Localized strings (Android) took too much memory

Good day,
I've quite big project currently localized to 21 languages where one language has around 9000 words! This is not so interesting, but
compiled and started application with all these resources, took after start around 11 MB in memory (simply measured by Debug.getNativeHeapSize())
when I remove 20 languages and keep only default one, it has after start only 7.5 MB
Because biggest problem of my app are devices with low memory available for a single process (mainly older devices with 2.X android), this is very serious question for me.
So here comes two questions, hope that someone will have any useful suggestion
I expect that Android loads only required resources, so how it's possible that these additional languages makes so huge difference, when in memory should be in worst case just list of available resources
if there is not any explanation for point 1., is there any way to precompile resources into separate packages and download them on request? For example in some starting activity where user select language he wants to use?
any suggestions are more then welcome. Thanks
Memory measurement on Android is reminiscent of a portion of Zork: a maze of twisty little passages, all alike.
If your concerns are stuff in the VM, the best way to try to get answers (IMHO) is to use MAT. Particularly on Android 3.0 and higher, if you are allocating memory in the SDK, it should show up here.
For a more thorough discussion of non-MAT ways of measuring memory usage, and their pitfalls, I recommend Dianne Hackborn's epic answer on the subject.

Android sound synthesis

I am trying to play a synthesized sound (basically 2 sine waves and some noise) using the AudioTrack class. It doesn't seem to be any different than the SourceDataLine in javax.sound.sampled, BUT the synthesis is REALLY SLOW. Even for ARM standards, it's unrealistic to think that 32768 samples (16 bit, stereo, for a total of 65536) take over 1 second to render on a Nexus 4 (measured with System.nanotime(), write to AudioTrack excluded).
The synthesis part is almost identical to this http://audioprograming.wordpress.com/2012/10/18/a-simple-synth-in-android-step-by-step-guide-using-the-java-sdk/, the only difference is that I play stereo sound (I can't reduce it to mono because it's a binaural tone).
Any ideas? what can I do?
Thanks in advance
Marko's answer seems very good. But if you're still in the experimental/investigational phase of your project, you might want to consider using Pure Data, which already is implemented as a combination Android library/NDK library and which would allow you to synthesize many sounds and interact with them in a relatively simple manner.
The libpd distribution is the Android implementation of Pure Data. Some good starting references can be found at the SoundOnSound site and also at this site.
Addendum: I found a basic but functional implementation of an Android Midi Driver through this discussion link. The relevant code can be found here (github, project by billthefarmer, named mididriver).
You can view how I use it in my Android app (imSynt link leads you to Google Play), or on YouTube.
The performance of audio synthesis on ARM is actually very respectable with native code that makes good use of the NEON unit. The Dalvik's JIT compiler is never going to get close to this level of performance for floating-point intensive code.
A look at the enormous number of soft-synth apps for iOS provides ample evidence of what should be possible on ARM devices with similar levels of performance.
However, the performance you are reporting is several orders of magnitude short of what I would expect. You might consider the following:
Double precision float-point arithmetic is particularly expensive on ARM Cortex A-x NEON units, where as single precision is very fast and highly parallelizable. Math.sin() returns a double, so is unnecessarily precise, and liable to be slow. The 24-mantissa provided by single precision floating point value is substantially larger than the 16-bit int used by the audio subsystem.
You could precompute sin(x) and then perform a table-lookup in your render loop.
There is a previous post on SO concerning Math.sin(x) on android suggesting degrading performance as x becomes large, as it's likely to in this case over time.
For a more advanced table-based synthesiser, you might consider using a DDS Oscillator.
Ultimately, you might consider using native code for synthesis, with the NDK.
You should be able render multiple oscillators with filters and envelopes and still have CPU time left over. Check your inner loops to make sure that there are no system calls.
Are you on a very old phone? You did not mention the hardware or OS version.
You might want to try using JSyn. It is a free modular Java synthesizer that runs on any Java platform including desktops, Raspberry Pi and Android.
https://github.com/philburk/jsyn
Have you tried profiling your code? It sounds like something else is possibly causing your slow down, profiling would help to highlight the cause.
Mike

Is the NDK required for good performance in the development of an Android Game?

I heard if I develop Android Game without using the NDK, the performance is significantly lower. Is this the truth?
I see 3 main reasons to use NDK:
you want to reuse C/C++ code base (e.g. game engine written in C++, crossplatform games)
get more memory (you can use much more memory via NDK)
your game is very CPU intensive and you need all your device power.
In all other cases you can choose whatever you like. SDK/Java allows you to use OpenGL same way as NDK, so your graphics won't slow. You should be careful with GC to get smooth gameplay. If your game is very CPU intensive you can write some methods in C++ and call them via JNI. By the way, Dalvik has JIT so Java code can be as fast (even faster sometimes) as C code.
No, that is simply not true.
It very much depends on what your particular game is doing. The essential question is: is your game going to be CPU intensive (or require larger amounts of memory).
If it's not, stick to the SDK.
If you don't know, because you haven't written many games yet in the past, by all means stick to the SDK.
Even it there turn out to be parts of the game that could do with the extra processing power, you can always extract those parts to native code during development as needed.
One reason to choose the NDK over the SDK is having a huge background in C++, which might make you more productive in that environment. However, given the current state of the toolsets (convenient debugging, build times, easy access to SDK libraries etc), this is rarely effective.
No you dont need to use the ndk unless you want to make a super-realistic 3d game such as Real Racing 3 if it is a game with simple graphics or not much time critical stuff java is ok

To use the JNI, or not to use the JNI (Android performance)

I just added some computationally expensive code to an Android game I am developing. The code in question is a collection of collision detection routines that get called very often (every iteration of the game-loop) and are doing a large amount of computation. I feel my collision detection implementation is fairly well developed, and as reasonably fast as I can make it in Java.
I've been using Traceview to profile the code, and this new piece of collision detection code has somewhat unsurprisingly doubled the duration of my game logic. That's obviously a concern since for certain devices, this performance hit could take my game from a playable to an unplayable state.
I have been considering different ways to optimize this code, and I am wondering if by moving the code into C++ and accessing it with the JNI, if I will get some noticeable performance savings?
The above question is my main concern and my reason for asking. I've determined that the two following reasons would be other positive results from using the JNI. However, it is not enough to persuade me to port my code to C++.
This would make the code cleaner. Since most of the collision detection is some sort of vector math, it is much cleaner to be able to use overloaded operators rather than using some more verbose vector classes in Java.
Memory management would be simpler. Simpler you say? Well, this is a game so the garbage collector running is not welcome because the GC could end up ruining the performance of your game if it constantly has to interrupt to clean up. In C I don't have to worry about the garbage collector, so I can avoid all the ugly things I do in Java with temporary static variables and just rely on the good old stack memory of C++
Long-winded as this question may be, I think I covered all my points. Given this information, would it be worth porting my code from Java to C++ and accessing it with the JNI (for reasons of improving performance)? Also, is there a way to measure or estimate a potential performance gain?
EDIT:
So I did it. Results? Well from TraceView's perspective, it was a 6x increase in speed of my collision detection routine.
It wasn't easy getting there though. Besides having to do the JNI dance, I also had to make some optimizations that I did not expect. Mainly, using a directly allocated float buffer to pass data from Java to native. My initial attempt just used a float array to hold the data in question because the conversion from Java to C++ was more natural, but that was realllly reallllly slow. The direct buffer completely side-stepped performance issues with array copying between java and native, and left me with a 6x bump.
Also, instead of rolling my own vector class, I just used the Eigen math library. I'm not sure how much of an affect this has had on performance, but at the least, it saved me the time of dev'ing my own (less efficient) vector class.
Another lesson learned is that excessive logging is bad for performance (jic that isn't obvious).
Not really a direct answer to your question, but the following links might be of use to you:
Android Developers, JNI Tips.
Android Developers, Designing for Performance
In the second link the following is written:
Native code isn't necessarily more efficient than Java. For one thing,
there's a cost associated with the Java-native transition, and the JIT
can't optimize across these boundaries. If you're allocating native
resources (memory on the native heap, file descriptors, or whatever),
it can be significantly more difficult to arrange timely collection of
these resources. You also need to compile your code for each
architecture you wish to run on (rather than rely on it having a JIT).
You may even have to compile multiple versions for what you consider
the same architecture: native code compiled for the ARM processor in
the G1 can't take full advantage of the ARM in the Nexus One, and code
compiled for the ARM in the Nexus One won't run on the ARM in the G1.
Native code is primarily useful when you have an existing native
codebase that you want to port to Android, not for "speeding up" parts
of a Java app.
If you are still at a fairly early stage of game development, you can consider using a Game Engine which provides a good collision detection mechanism, like Libgdx which does a fairly good job of box2d collision detection.

Android - Debug slow running code

Is there a good way (proper way, or effective way) to debug slow running code?
I have a thread which runs multiple loops and then recurses and my code is running very slow.
Is there a good way to debug different loops or sections of code to find out which is running slowest?
If the debugger already does this, can someone please explain how,
Many thanks
What you need is not a debugger, but a profiler. Check this tool from Android's SDK: traceview
One of the most primitive ways to determine slow points is to litter the code with print statements. Bottlenecks then show up as delays between prints. This can be improved by printing the system time as you move from one loop to another, making it trivial to determine the slowest loops.
A solution that is potentially easier and more thorough is to use a performance profiler. Most mainstream languages will have standalone profilers or debuggers with performance profiling built-in. A good profiler will determine the percentage of execution time spent on each area of your code and offer useful information for optimizing performance bottlenecks.
If you need more specific information it would be helpful to post the language you are using as well as relevant sections of the code.

Categories

Resources