loops efficiency - android

I came across a presentation (dalvik-vm-internals) on the Dalvik VM. It says that, of the loops below, we should use (2) and (3) and avoid (7).
(1) for (int i = initializer; i >= 0; i--)
(2) int limit = calculate limit;
for (int i = 0; i < limit; i++)
(3) Type[] array = get array;
for (Type obj : array)
(4) for (int i = 0; i < array.length; i++)
(5) for (int i = 0; i < this.var; i++)
(6) for (int i = 0; i < obj.size(); i++)
(7) Iterable list = get list;
for (Type obj : list)
Comments: I feel that (1) and (2) are the same.
(4) has to read the length of the array on every iteration, so this can be avoided.
(6) is the same as (4): it calculates the size every time.
(7) should be avoided because list is of an Iterable type?
One more thing: if we have infinite data (assume the data arrives as a stream), which loop should we choose for better efficiency?
Please comment on this.

If that's what they recommend, that's what they've optimized the compiler and VM for. The ones you feel are the same aren't necessarily implemented the same way: the compiler can use all sorts of tricks with data and path analysis to avoid naively expensive operations. For instance, the array.length value can be cached, since the length of an array never changes once it has been created.
They're ranked from most to least efficient: but (1) is 'unnatural'. I agree, wouldn't you? The trouble with (7) is that an iterator object is created and has to be GC'ed.
Note carefully when the advice should be heeded. It's clearly intended for bounded iteration over a known collection, not the stream case. It's only relevant if the loop has significant effect on performance and energy consumption ('operating on the computer scale'). The first law of optimization is "Don't optimize". The second law (for experts) is "Don't optimize, yet.". Measure first (both execution times and CPU consumption), optimize later: this applies even to mobile devices.
What you should consider is the preceding slides: try to sleep as often and as long as possible, while responding quickly to changes. How you do that depends on what kind of stream you're dealing with.
Finally, note that the presentation is two years old, and may not fully apply to 2.2 devices where among other things JIT is implemented.
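To make the difference between forms (2), (3) and (7) concrete, here is a minimal Java sketch. The class and method names are made up for illustration, and it says nothing about how any particular Dalvik or ART version actually compiles these loops; it only shows what each form asks the runtime to do.

```java
import java.util.Arrays;
import java.util.List;

class LoopForms {
    // Form (2): the limit is computed once and kept in a local variable.
    static int sumWithCachedLimit(int[] values) {
        int limit = values.length;
        int sum = 0;
        for (int i = 0; i < limit; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Form (3): enhanced for over an array; no iterator object is involved.
    static int sumArrayForEach(int[] values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Form (7): enhanced for over an Iterable. The compiler turns this into
    // iterator(), hasNext() and next() calls, so an Iterator is obtained
    // for every execution of the loop.
    static int sumIterable(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Convenience helper for trying form (7) with a List.
    static List<Integer> sampleList() {
        return Arrays.asList(1, 2, 3, 4);
    }
}
```

All three return the same result for the same data; the only difference is the bookkeeping around the loop, which is why it only matters in hot code paths.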

With infinite data, none of the examples are good enough. Best would be to do
for(;;) {
    list.poll(); //handle concurrency, in java for example, use a blocking queue
}
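A rough sketch of the blocking-queue variant mentioned in the comment, using java.util.concurrent.BlockingQueue. The END marker and the StreamConsumer class are assumptions made up for this example; a real stream would need its own way to signal shutdown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

class StreamConsumer {
    // Hypothetical end-of-stream marker, only here so the example can stop.
    static final String END = "<end>";

    // Handles items until END arrives and returns what was handled.
    static List<String> drain(BlockingQueue<String> queue) {
        List<String> handled = new ArrayList<>();
        for (;;) {
            String item;
            try {
                // take() blocks until data is available, so the thread
                // sleeps instead of spinning while the stream is idle.
                item = queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt
                return handled;
            }
            if (item.equals(END)) {
                return handled;
            }
            handled.add(item);
        }
    }
}
```

A producer thread would call queue.put(...) or queue.add(...) on something like a LinkedBlockingQueue while the consumer sits in drain().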

1) and 2) are really different. 2) needs an extra subtraction to evaluate i < limit; the comparison against 0 in 1) doesn't.
Even better, on most processors (and with well-optimized code) no comparison is needed for i >= 0. The processor can use the negative flag that results from the last decrement (i--).
So the end of the loop -1 looks like (in pseudo assembler)
while loop #2
limit-i # set negative flag if i >limit
That doesn't make a big difference unless the code in your loop is really small (like a basic C string operation).
It might not apply to interpreted languages.


Sampling from an array in android kotlin

I need an idea for doing this. I'm not good at math.
Maybe there is a built-in function which I haven't found yet.
I have an array which consists of 2048 values.
I need to get 250 values out of it.
I'm thinking of
2048/250 = 8.19
which means I take the value at every 8th position in the array.
Is there a function to do this?
Not that I'm aware of, I think the problem is to balance iterations and the randomness of the sampling.
So the naive approach
dataSet.mapIndexedNotNull { i, data ->
    if (i % 8 == 0) data else null
}
That runs through the whole array, even though you only need 250 iterations, not dataSet.size iterations. So what if we iterate 250 times and take every 8th element instead:
val sample = mutableListOf<DataType>()
for (i in 1..250) {
    val positionInDataSet = (i * 8) - 1 // minus one because indices are zero-based
    val case = dataSet[positionInDataSet]
    sample.add(case)
}
Another alternative would be to simply use copy methods from collections, but the problem is you lose your sampling
dataSet.copyOfRange(0, 250)
The sub-array doesn't sample the data in a pseudo-random way; it only takes the first 250 values, which would be biased. The upside is that array copy methods are usually implemented as a fast bulk copy.
Another option would be to randomize things even more by not getting data each 8 but a random position until we hit our desired sample size.
val sample = mutableSetOf<DataType>()
while (sample.size != 250) {
    val randomPosition = Random.nextInt(0, dataSet.size)
    val randomSelection = dataSet[randomPosition]
    sample.add(randomSelection)
}
Here we use a set, because a Set guarantees unique elements, so you get 250 completely random elements from your data set. The problem is that the same randomPosition can be drawn more than once, so the loop may run more than 250 times. For 250 out of 2048 the overhead is small, but as the sample size approaches the size of the data set the expected number of draws grows quickly (roughly n log n to hit all n positions). Note also that the set deduplicates by value, not by position: if the data contains fewer than 250 distinct values, the loop never terminates.
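For comparison, here is a sketch of two approaches that always finish in exactly 250 steps. It is written in Java rather than Kotlin, but the arithmetic carries over unchanged. The Sampler class and its method names are made up for this example, and both methods assume k is no larger than the array. The first spreads the picks over the whole array with a fractional step (2048/250 = 8.192) instead of a fixed step of 8; the second is a partial Fisher-Yates shuffle, which picks random positions without ever repeating one.

```java
import java.util.Arrays;
import java.util.Random;

class Sampler {
    // Picks k values at evenly spaced positions covering the whole array.
    static int[] evenlySpaced(int[] data, int k) {
        int[] sample = new int[k];
        for (int i = 0; i < k; i++) {
            // Integer form of floor(i * n / k); long avoids overflow.
            int index = (int) ((long) i * data.length / k);
            sample[i] = data[index];
        }
        return sample;
    }

    // Picks k values at distinct random positions (partial Fisher-Yates).
    static int[] randomWithoutReplacement(int[] data, int k, Random rng) {
        int[] copy = data.clone(); // leave the caller's array untouched
        for (int i = 0; i < k; i++) {
            // Swap a random not-yet-chosen element into slot i.
            int j = i + rng.nextInt(copy.length - i);
            int tmp = copy[i];
            copy[i] = copy[j];
            copy[j] = tmp;
        }
        return Arrays.copyOf(copy, k);
    }
}
```

With 2048 values and k = 250, evenlySpaced reads positions 0, 8, 16, ... up to 2039, so the tail of the array is covered as well.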

How to generate unique ids for RecyclerView, in case of 2 ids per item?

Suppose I have a RecyclerView, which has items that can only be unique if you look at 2 ids they have, but not just one of them.
The first Id is the primary one. Usually there aren't 2 items that have the same primary ID, but sometimes it might occur, which is why there is a secondary ID.
In my
The problem
The RecyclerView adapter needs to have a "long" type being returned:
What I've tried
The easy way to overcome this is to have a HashMap and a counter.
The HashMap will contain the combined keys, and the value will be the id that should be returned. The counter is used to generate the next id in case of a new combined key. The combined key can be a "Pair" class in this case.
Suppose each item in the RecyclerView data has 2 long-type keys:
HashMap<Pair<Long,Long>,Long> keyToIdMap=new HashMap<>();
long idGenerator=0;
this is what to do in getItemId :
Pair<Long,Long> combinedKey=new Pair<>(item.getPrimaryId(), item.getSecondary());
Long uniqueId=keyToIdMap.get(combinedKey);
if(uniqueId==null) // new combined key: hand out the next id
    keyToIdMap.put(combinedKey,uniqueId=idGenerator++);
return uniqueId;
This has the drawback of taking more and more memory. Not much though, and it's very small and proportional to the data you already have, but still...
However, this has the advantage of being able to handle all types of IDs, and you can use even more IDs as you wish (just need something similar to Pair).
Another advantage is that it will use all IDs starting from 0.
The question
Is there perhaps a better way to achieve this?
Maybe a mathematical way? I remember I learned in the past of using prime numbers for similar tasks. Will it work here somehow?
Do the existing primary and secondary ids use the entire 64-bit range of longs? If not then it's possible to compute a unique 64-bit long from their values with e.g bit slicing.
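As an illustration of the bit-slicing idea, here is a minimal sketch that assumes both ids are non-negative and fit in 32 bits each. That assumption, and the CombinedId class itself, are mine; adjust the split if your ids have a different range.

```java
class CombinedId {
    // Packs two ids of at most 32 bits each into one long:
    // primary goes to the upper half, secondary to the lower half.
    static long pack(long primary, long secondary) {
        if ((primary >>> 32) != 0 || (secondary >>> 32) != 0) {
            throw new IllegalArgumentException("id does not fit in 32 bits");
        }
        return (primary << 32) | secondary;
    }

    // Recovers the primary id from a packed value.
    static long primaryOf(long packed) {
        return packed >>> 32;
    }

    // Recovers the secondary id from a packed value.
    static long secondaryOf(long packed) {
        return packed & 0xFFFFFFFFL;
    }
}
```

Because the mapping is reversible, two different (primary, secondary) pairs can never produce the same long, and no map has to be kept in memory.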
Another approach would be to hash the two together with a hash with very low collisions (a crypto hash like SHA2 for example) and use the first 64 bits of the result. Having a range of 64 bits means you can comfortably have millions of items before a collision becomes likely: the chance of a collision reaches roughly 50% when you add about sqrt(2**64) = 2**32 items, which is more than 4 billion.
Finally, having a unique independent mapping is very versatile, and assuming the map is always accessible it's fine (it gets tricky when you try to synchronize that new id across machines etc). In Java you can attempt to increase performance by avoiding the boxed Longs and a separate Pair instance using a custom map implementation, but that's micro-optimizing.
Example using SHA1:
With Guava - the usage is clean and obvious.
HashFunction hf = Hashing.sha1();
long hashedId = hf.newHasher()
        .putLong(primary).putLong(secondary)
        .hash().asLong();
With just the standard JDK it's pretty horrible and can probably be made more efficient; it should look something like this (checked exceptions are simply propagated):
static void updateDigestWithLong(MessageDigest md, long l) {
    // feed all 8 bytes of the long, least significant byte first
    for (int shift = 0; shift < 64; shift += 8) {
        md.update((byte)(l >> shift));
    }
}
// this is from the Guava sources, can reimplement if you prefer
static long padToLong(byte[] bytes) {
    long retVal = (bytes[0] & 0xFF);
    for (int i = 1; i < Math.min(bytes.length, 8); i++) {
        retVal |= (bytes[i] & 0xFFL) << (i * 8);
    }
    return retVal;
}
static long hashLongsToLong(long primary, long secondary) throws NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance("SHA-1");
    updateDigestWithLong(md, primary);
    updateDigestWithLong(md, secondary);
    return padToLong(md.digest());
}
I think my original idea is the best one I can think of.
Should cover all possible ids, with least possible collision.

What are ways to cache queries from SQLite database?

I am developing the following functionality:
a user picks a date and gets a ListView populated by a SimpleCursorLoader (queries are executed in the background).
The user frequently chooses adjacent dates, so there might be a lot of duplicate queries.
I tested the application and discovered that with high-frequency requests it runs very slowly.
In order to speed up my application I decided to implement a cache where the results of queries will be stored. Key - date and value-?
Is it worth doing, and what techniques could you advise?
1) Yes, it's really worth doing since DB access is relatively slow (even with such a great thing like SQLite)
2) Considering what I've got from your post I'd suggest using LongSparseArray: key will be date from database (long), stored value - your cached data object (Bundle etc). The reasons are it's:
naturally sorted
sort order is maintained on changes
memory efficient
3) When you need to load an overlapping/adjacent interval you have to check the bounds and load only the absent part
4) If it is possible that you cache non-adjacent intervals, you need to manage the bounds of the loaded intervals as well. But if you do it only for list scroll purposes you may omit this (if you don't stop loading data on a fling gesture)
About my experience: I've got about a 3 times payoff using caching. But actual results depend on the database schema etc. You may get even more
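To sketch the general idea of a date-keyed cache in plain Java: the example below swaps the suggested LongSparseArray for a LinkedHashMap in access order, so that the least recently used date can be evicted and memory stays bounded. DateQueryCache, its Loader interface and the size limit are all hypothetical names chosen for this example; the loader stands in for whatever actually runs the SQLite query.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DateQueryCache<V> {
    // Hypothetical callback that performs the real database query.
    interface Loader<V> {
        V load(long dateMillis);
    }

    private final int maxEntries;
    private final Loader<V> loader;
    private final LinkedHashMap<Long, V> entries;

    DateQueryCache(int maxEntries, Loader<V> loader) {
        this.maxEntries = maxEntries;
        this.loader = loader;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<Long, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, V> eldest) {
                return size() > DateQueryCache.this.maxEntries;
            }
        };
    }

    // Returns the cached result for the date, querying only on a miss.
    V get(long dateMillis) {
        V cached = entries.get(dateMillis);
        if (cached == null) {
            cached = loader.load(dateMillis);
            entries.put(dateMillis, cached);
        }
        return cached;
    }

    int size() {
        return entries.size();
    }
}
```

This version is not thread-safe, so with a background loader you would call it from one thread or add synchronization. Cached results also go stale when the underlying table changes, so entries need to be dropped on writes.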
I found MatrixCursor useful for the purpose of caching. I keep a HashMap as the cache.
Logic: if the request has not been made yet, issue it, get the Cursor, convert it to a MatrixCursor and write that to the cache.
Here is the snippet for the conversion:
private MatrixCursor cursorToMatrixCursor(Cursor c) {
    MatrixCursor result = new MatrixCursor(c.getColumnNames());
    if (c.moveToFirst()) {
        do {
            ArrayList<String> columnValues = new ArrayList<>();
            final int nOfColumns = c.getColumnCount();
            for(int col = 0; col < nOfColumns; ++col) {
                columnValues.add(c.getString(col)); // copy every column as a string
            }
            result.addRow(columnValues);
        } while (c.moveToNext());
    }
    return result;
}

Android: why is native code so much faster than Java code

In the following SO question: https://stackoverflow.com/questions/2067955/fast-bitmap-blur-for-android-sdk @zeh claims a port of a Java blur algorithm to C runs 40 times faster.
Given that the bulk of the code includes only calculations, and all allocations are only done "one time" before the actual algorithm number crunching - can anyone explain why this code runs 40 times faster? Shouldn't the Dalvik JIT translate the bytecode and dramatically reduce the gap to native compiled code speed?
Note: I have not confirmed the x40 performance gain myself for this algorithm, but all the serious image manipulation algorithms I encounter for Android use the NDK - so this supports the notion that NDK code will run much faster.
For algorithms that operate over arrays of data, there are two things that significantly change performance between a language like Java, and C:
Array bound checking: Java will check every access, bmap[i], and confirm i is within the array bounds. If the code tries to access out of bounds, you will get a useful exception. C & C++ do not check anything and just trust your code. The best case response to an out of bounds access is a page fault. A more likely result is "unexpected behavior".
Pointers: You can significantly reduce the operations by using pointers.
Take this innocent example of a common filter (similar to blur, but 1D):
for(int i = 0; i < ndata - ncoef; ++i) {
    z[i] = 0;
    for(int k = 0; k < ncoef; ++k) {
        z[i] += coef[k] * d[i + k];
    }
}
When you access an array element such as coef[k], the CPU has to:
Load address of array coef into register;
Load value k into a register;
Sum them;
Go get memory at that address.
Every one of those array accesses can be improved because you know that the indexes are sequential. Neither the compiler, nor the JIT can know that the indexes are sequential so they cannot optimize fully (although they keep trying).
In C++, you would write code more like this:
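int d[10000];
int z[10000];
int coef[10];
int* zptr;
int* dptr;
int* cptr;
dptr = &(d[0]); // Just being overly explicit here, more likely you would dptr = d;
zptr = &(z[0]); // or zptr = z;
for(int i = 0; i < (ndata - ncoef); ++i) {
    *zptr = 0;
    cptr = coef;   // restart at the first coefficient
    dptr = d + i;  // start of the current data window
    for(int k = 0; k < ncoef; ++k) {
        *zptr += *cptr * *dptr;
        ++cptr;    // step to the next coefficient
        ++dptr;    // step to the next data value
    }
    ++zptr;        // next output element
}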
When you first do something like this (and succeed in getting it correct) you will be surprised how much faster it can be. All the array address calculations of fetching the index and summing the index and base address are replaced with an increment instruction.
For 2D array operations such as blur on an image, an innocent-looking access like data[r,c] involves two value fetches, a multiply and a sum. So with 2D arrays, pointers allow you to remove the multiply operations.
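The same idea can be sketched in Java with a flat, row-major pixel array. Java has no pointers, so a running index plays that role here. The FlatImage class is made up for illustration, and this is only a sketch of the arithmetic being saved: Java still bounds-checks each access, and a JIT may well perform this transformation on its own.

```java
class FlatImage {
    // Straightforward form: a multiply and an add per access
    // to locate element (r, c) in the flat array.
    static long sumIndexed(int[] pixels, int width, int height) {
        long sum = 0;
        for (int r = 0; r < height; r++) {
            for (int c = 0; c < width; c++) {
                sum += pixels[r * width + c];
            }
        }
        return sum;
    }

    // Running-index form: elements are visited in storage order, so the
    // position is simply incremented, much like a C pointer would be.
    static long sumRunning(int[] pixels, int width, int height) {
        long sum = 0;
        int p = 0;
        for (int r = 0; r < height; r++) {
            for (int c = 0; c < width; c++) {
                sum += pixels[p++];
            }
        }
        return sum;
    }
}
```

Both methods visit the same elements in the same order; only the address arithmetic differs.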
So the language allows real reduction in the operations the CPU must perform. The cost is that the C++ code is horrendous to read and debug. Errors in pointers and buffer overflows are food for hackers. But when it comes to raw number grinding algorithms, the speed improvement is too tempting to ignore.
Another factor not mentioned above is the garbage collector. The problem is that garbage collection takes time, plus it can run at any time. This means that a Java program which creates lots of temporary objects (note that some types of String operations can be bad for this) will often trigger the garbage collector, which in turn will slow down the program (app).
The following is a list of programming languages by level:
Assembly Language (Machine Language, Lower Level)
C Language (Middle Level)
C++, Java, .NET (Higher Level)
A lower-level language has more direct access to the hardware. As the level increases, access to the hardware decreases. So assembly code runs at the highest speed, while code in other languages runs at a speed that depends on their level.
This is the reason that C code runs much faster than Java code.

Android: benchmarking two algorithms

I have implemented two algorithms for the same problem and want to find out, in a professional way, which one is better.
The basic idea was:
final static int LOOP_COUNT = 500;
long totalTime = 0;
for(int i = 0; i < LOOP_COUNT; i++) {
    long startTime = System.currentTimeMillis();
    // run the algorithm under test here
    long endTime = System.currentTimeMillis();
    totalTime += endTime - startTime;
}
return totalTime / LOOP_COUNT;
And do that for both algorithms.
How can I ensure that the Android system does not do any system calculations in the background and skew the data?
Is there a way I can also compare the memory that both methods need?
If you want professional, statistically relevant results and you want to minimize the influence of Android background processes, you will need to run your algorithm a number of times and compare the averages. Thanks to the law of large numbers, the averages converge towards the true mean execution times as the number of runs grows.
How many runs you need depends on the standard deviation of the execution time and on how certain you want to be. If you're familiar with basic statistics, you can determine your sample size with some basic formulas and, for example, run a t-test to compare the averages of both algorithms if your sample distribution is approximately normal. This automatically accounts for the influence of background processes: they appear randomly, so over a large number of iterations their influence on the comparison averages out.
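As a rough sketch of what such a comparison could look like in code, the helper below computes the mean, the sample variance and Welch's t statistic for two arrays of measured run times. The TimingStats class is made up for this example, and it stops at the t statistic: turning that into a p-value needs the t distribution, for which you would use a statistics library or a table.

```java
class TimingStats {
    // Arithmetic mean of the measurements.
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1); needs at least 2 values.
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) {
            squares += (x - m) * (x - m);
        }
        return squares / (xs.length - 1);
    }

    // Welch's t statistic: difference of the means divided by its
    // estimated standard error, without assuming equal variances.
    static double welchT(double[] a, double[] b) {
        double standardError = Math.sqrt(
                sampleVariance(a) / a.length + sampleVariance(b) / b.length);
        return (mean(a) - mean(b)) / standardError;
    }
}
```

A large absolute value of t means the difference between the two averages is big compared to the run-to-run noise; a value near zero means the measurements do not separate the two algorithms.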
Also take a look at the garbage collector, if you have a lot of object creation during the execution of your algorithm, it will affect your results but it should, as it will also affect the real world usage of the algorithm.
You could try to analyze your code and find the time complexity. If you have a nested loop:
for(int i = 0; i< max; i++){
    for(int j = 0; j< max; j++){
        c = i + j;
    }
}
This would have the time complexity O(n^2). The space complexity is O(1)
Another example is this:
for(int i = 0; i< max; i++){
    list[i] = "hello";
}
for(int j = 0; j< max; j++){
    list2[j] = "hello";
}
This would have the time complexity of O(2n) which is the same as O(n), and space complexity of O(2n) which is O(n).
The latter has a better runtime but uses more memory.
The recommended approach to measure specific inner loop performance is the Jetpack Microbenchmark library. You can find code samples on GitHub.

