Very slow key generation on certain OS versions with SpongyCastle - android

I am building an app that uses the SpongyCastle library (a repackaged version of Bouncy Castle for Android), but I hit a problem when I run this:
KeyParameter key = (KeyParameter) generator.generateDerivedMacParameters(keyLength * 8); // key length in bits
On any phone running Android 6.0 or below, the operation is extremely slow. I have pinpointed the exact code that runs slowly (note that this code is inside the SpongyCastle library):
for (int count = 1; count < iterationCount; count++)
{
hMac.update(state, 0, state.length);
hMac.doFinal(state, 0);
for (int j = 0; j != state.length; j++)
{
out[outOff + j] ^= state[j];
}
}
The iteration count is always 800000 because I need the derived key to be very secure, but this loop takes almost 5 minutes to execute on these devices. Interestingly, on Android 4.4 it takes only about a minute. Is there a workaround that does not reduce the iteration count? Should I use Bouncy Castle directly, or something else?
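For readers unfamiliar with the quoted loop: it has the shape of the PBKDF2 block function (re-MAC the state, then XOR it into the output). The sketch below reproduces that shape with the standard javax.crypto.Mac API and compares the result with the platform's own PBKDF2WithHmacSHA1 provider. It assumes HMAC-SHA1 as the MAC, which the question does not state; the class and method names (Pbkdf2BlockSketch, firstBlock, jceReference) are hypothetical, and the demo uses a small iteration count so it finishes quickly.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

public class Pbkdf2BlockSketch {

    // Hypothetical helper: computes the first 20-byte PBKDF2 block by hand.
    // Assumption: the MAC is HMAC-SHA1 (not stated in the question).
    static byte[] firstBlock(byte[] password, byte[] salt, int iterationCount) {
        try {
            Mac hMac = Mac.getInstance("HmacSHA1");
            hMac.init(new SecretKeySpec(password, "HmacSHA1"));
            // U_1 = HMAC(password, salt || INT(1))
            hMac.update(salt);
            byte[] state = hMac.doFinal(new byte[] {0, 0, 0, 1});
            byte[] out = state.clone();
            // Same shape as the quoted loop: re-MAC the state, XOR it into the output.
            for (int count = 1; count < iterationCount; count++) {
                state = hMac.doFinal(state);
                for (int j = 0; j != state.length; j++) {
                    out[j] ^= state[j];
                }
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reference value from the platform's PBKDF2 provider, used only for comparison.
    static byte[] jceReference(char[] password, byte[] salt, int iterationCount) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterationCount, 160);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] salt = "salt".getBytes(StandardCharsets.UTF_8);
        int iterations = 10_000; // far below the question's 800000, to keep the demo quick

        long start = System.nanoTime();
        byte[] manual = firstBlock("password".getBytes(StandardCharsets.UTF_8), salt, iterations);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        byte[] reference = jceReference("password".toCharArray(), salt, iterations);
        System.out.println("blocks match: " + Arrays.equals(manual, reference));
        System.out.println("manual loop took " + elapsedMs + " ms");
    }
}
```

The cost grows linearly with the iteration count, so timing a small count on the affected devices gives a rough estimate for 800000. Whether the platform provider is actually faster than the library loop on those devices is not something this sketch establishes; it only makes the two easy to compare.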

Related

For loop incredibly slow in Android Studio debugger, fast without it

I am initializing an array of 30 MB in Android Studio.
byte[] myarray = new byte[30 * 1024 * 1024];
for (int i = 0; i < myarray.length; i++) {
myarray[i] = 0;
}
I wrapped this in a SystemClock time measurement that calculates how many milliseconds the loop takes.
It takes 2.5 minutes when the app is launched from Android Studio. No breakpoints are involved, of course.
It takes 0.5 seconds when the app is launched directly, without Android Studio.
When I call other operations on this array, such as System.arraycopy, I don't see such a huge difference. I understand that debugging adds overhead, but this is a factor of 300.
What is happening here and how can I modify this so I can debug my app efficiently?

Android production app, rare out-of-bounds-exception crash

My production app has a rare out-of-bounds-exception crash, which has only ever occurred on a Samsung Galaxy Tab A (2016), 2048 MB RAM, Android 8.1. I cannot properly diagnose which index is OOB, and I cannot see how any of the indexes could possibly be wrong anyway. Am I missing something obvious? Can anyone help, please?
The app has prod versions 1.0, 1.1, 1.2 & 1.3.
There are crash reports of this happening in v 1.0 on March 30th, and again this week in v 1.2 on May 11th.
Although I could not diagnose it, I attempted some fixes for the 1.0 March 30th crash. These ‘fixes’ are live in version 1.2. So, moving on, 1.2 had a similar crash this week on May 11th (actually 5 crashes over several hours, all on the same device).
The code is:
public Bitmap[][] balloonBitmap = new Bitmap[6][6];
public int[] dynamicObjectRId = new int[10];
for (int j = 1; j <= totalNoOfDynamicImages; j++) {
// PROD CRASH March 30th: OOB was on the next line
balloonBitmap[correctOptionColourNo][j] = ImageUtil.loadImage(res, db.dynamicObjectRId[j],
dynamicImageWidthQT3, dynamicImageHeightQT3);
// PROD CRASH May 11th seems to be on this line:
byteCountforBitmaps += balloonBitmap[correctOptionColourNo][j].getByteCount();
}
The fix I tried is done earlier in the method: (live in v1.2)
if ((correctOptionColourNo < 0) || (correctOptionColourNo > 5)) {
correctOptionColourNo = 3;
}
So correctOptionColourNo should be ok, not OOB.
The OOB on May 11th appears to be on the next statement:
byteCountforBitmaps += balloonBitmap[correctOptionColourNo][j].getByteCount();
However, I’m not sure if I can fully believe this, because surely the previous statement would have OOB’d first. Anyway my v 1.2 code backup points to this line of code – I just don’t believe it somehow. But whichever is the actual offending line of code, it’s still the same problem, which index is OOB and why is it happening?
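I suspect the problem is here: for (int j = 1; j <= totalNoOfDynamicImages; j++)
Java arrays are zero-indexed, so the second dimension of balloonBitmap (size 6) only accepts j from 0 to 5. If totalNoOfDynamicImages ever reaches 6, then j = 6 is out of bounds.
Try changing it to for (int j = 0; j < totalNoOfDynamicImages; j++)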
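A minimal sketch of the difference between the two loop shapes, using a plain String array in place of the Bitmap row. The class and method names are hypothetical, and the value 6 for the image count is an assumption: the question never says what totalNoOfDynamicImages was when the crash happened.

```java
public class LoopBoundsSketch {
    static final int SLOTS = 6; // same size as one row of balloonBitmap in the question

    // Mirrors the question's loop: j runs from 1 to total inclusive.
    // Throws ArrayIndexOutOfBoundsException as soon as j == row.length.
    static int fillOneBased(String[] row, int total) {
        int filled = 0;
        for (int j = 1; j <= total; j++) {
            row[j] = "image-" + j;
            filled++;
        }
        return filled;
    }

    // Zero-based variant, additionally clamped to the array length so an
    // unexpectedly large total cannot run past the end of the row.
    static int fillZeroBased(String[] row, int total) {
        int limit = Math.min(total, row.length);
        int filled = 0;
        for (int j = 0; j < limit; j++) {
            row[j] = "image-" + j;
            filled++;
        }
        return filled;
    }

    public static void main(String[] args) {
        System.out.println("zero-based filled: " + fillZeroBased(new String[SLOTS], 6));
        try {
            fillOneBased(new String[SLOTS], 6);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("one-based loop overflowed: " + e.getMessage());
        }
    }
}
```

Note that switching to a zero-based loop also changes which entries of db.dynamicObjectRId are read. If the resource ids are stored at indexes 1 to n, the alternative is to keep the one-based loop and make the arrays one slot larger.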

why kotlin code so long on first execution [duplicate]

I'm curious about this.
I wanted to check which function was faster, so I wrote a small piece of code and ran it many times.
public static void main(String[] args) {
long ts;
String c = "sgfrt34tdfg34";
ts = System.currentTimeMillis();
for (int k = 0; k < 10000000; k++) {
c.getBytes();
}
System.out.println("t1->" + (System.currentTimeMillis() - ts));
ts = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
Bytes.toBytes(c);
}
System.out.println("t2->" + (System.currentTimeMillis() - ts));
}
The "second" loop is faster, so I thought that the Bytes class from Hadoop was faster than the method from the String class. Then I swapped the order of the loops, and c.getBytes() became the faster one. I ran it many times, and my conclusion is that, for a reason I don't understand, something happens in my VM after the first loop executes that makes the second loop run faster.
This is a classic Java benchmarking issue. The HotSpot JIT compiles your code as you use it, so it gets faster during the run.
Run the loop at least 3000 times (10000 on a server or 64-bit JVM) first, then do your measurements.
You know there's something wrong, because Bytes.toBytes calls c.getBytes internally:
public static byte[] toBytes(String s) {
try {
return s.getBytes(HConstants.UTF8_ENCODING);
} catch (UnsupportedEncodingException e) {
LOG.error("UTF-8 not supported?", e);
return null;
}
}
The source is taken from here. This tells you that the call cannot possibly be faster than the direct call - at the very best (i.e. if it gets inlined) it would have the same timing. Generally, though, you'd expect it to be a little slower, because of the small overhead in calling a function.
This is the classic problem with micro-benchmarking in interpreted, garbage-collected environments, where background components such as the garbage collector run at arbitrary times. On top of that, there are hardware optimizations, such as caching, that skew the picture. As a result, the best way to see what is going on is often to look at the source.
The "second" loop is faster, so,
When you execute a method at least 10000 times, it triggers the whole method to be compiled. This means that your second loop can be either
faster, because it is already compiled by the time you run it, or
slower, because when it was optimised there was no good information (counters) on how that code is executed.
The best solution is to place each loop in a separate method so one loop doesn't optimise the other AND run this a few times, ignoring the first run.
e.g.
for(int i = 0; i < 3; i++) {
long time1 = doTest1(); // timed using System.nanoTime();
long time2 = doTest2();
System.out.printf("Test1 took %,d on average, Test2 took %,d on average%n",
time1/RUNS, time2/RUNS);
}
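A self-contained version of the pattern above might look like the following sketch. The names doTest1, doTest2 and RUNS come from the snippet; everything else is an assumption. In particular, doTest2 uses getBytes(StandardCharsets.UTF_8) as a stand-in for Hadoop's Bytes.toBytes so the example runs without Hadoop on the classpath, and the sink field is a hypothetical device to keep the JIT from discarding the work.

```java
import java.nio.charset.StandardCharsets;

public class WarmupBenchmarkSketch {
    static final int RUNS = 1_000_000;
    static final String INPUT = "sgfrt34tdfg34";

    // Results are accumulated here so the JIT cannot treat the calls as dead code.
    static long sink;

    // Each loop lives in its own method so one loop's compilation
    // does not affect how the other is optimised.
    static long doTest1() {
        long start = System.nanoTime();
        for (int i = 0; i < RUNS; i++) {
            sink += INPUT.getBytes().length;
        }
        return System.nanoTime() - start;
    }

    // Stand-in for Bytes.toBytes(c): an explicit-charset getBytes call.
    static long doTest2() {
        long start = System.nanoTime();
        for (int i = 0; i < RUNS; i++) {
            sink += INPUT.getBytes(StandardCharsets.UTF_8).length;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Run several rounds and ignore the first, which includes warm-up.
        for (int round = 0; round < 3; round++) {
            long time1 = doTest1();
            long time2 = doTest2();
            System.out.printf("Round %d: Test1 took %,d ns on average, Test2 took %,d ns on average%n",
                    round, time1 / RUNS, time2 / RUNS);
        }
    }
}
```

For anything beyond a quick check, a dedicated harness such as JMH handles warm-up and dead-code elimination more rigorously than a hand-rolled loop.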
Most likely, the code was still compiling or not yet compiled at the time the first loop ran.
Wrap the entire method in an outer loop so you can run the benchmarks a few times, and you should see more stable results.
Read: Dynamic compilation and performance measurement.
It might simply be that your calls to getBytes() allocate so many objects that the JVM garbage collector kicks in and cleans up the unreferenced ones (taking out the trash).
A few more observations:
As pointed out by @dasblinkenlight above, Hadoop's Bytes.toBytes(c) internally calls String.getBytes("UTF-8").
The variant of String.getBytes() that takes a character set as input is faster than the one that takes none. So, for a given string, getBytes("UTF-8") would be faster than getBytes(). I have tested this on my machine (Windows 8, JDK 7) by running two loops in sequence with an equal number of iterations, one with getBytes("UTF-8") and the other with getBytes().
long ts;
String c = "sgfrt34tdfg34";
ts = System.currentTimeMillis();
for (int k = 0; k < 10000000; k++) {
c.getBytes("UTF-8");
}
System.out.println("t1->" + (System.currentTimeMillis() - ts));
ts = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
c.getBytes();
}
System.out.println("t2->" + (System.currentTimeMillis() - ts));
this gives:
t1->1970
t2->2541
and the results are the same even if you change the order in which the loops execute. To rule out any JIT effects, I would suggest running the tests in separate methods to confirm this (as suggested by @Peter Lawrey above).
So, Bytes.toBytes(c) should always be faster than String.getBytes()

Android: Initial audio processing method call takes a long time

I'm getting a very peculiar issue with my audio callbacks in my Android app (which uses NDK/OpenSL ES). I'm streaming audio output at 44.1 kHz with 512 frames per buffer (which gives me a callback period of 11.6 ms). In the callback, I'm synthesizing a couple of waveforms, filters, etc. (like a synthesizer). Thanks to optimization, I never use more than 5 ms of the callback time. However, when I turn on a specific effect (a digital delay line), the callback starts to take radically longer. The callback time jumps from 7.5 ms (after all voices/filters have been processed) to between 100 and 350 ms.
This is the most confusing part; after maybe 1 or 2 seconds, the digital delay execution time will jump from the extremely high time to 0.2 ms completion time per callback.
Why would the Android app take a long time to complete my digital delay processing code the first few callbacks and then die down to a very short and audio-happy time? I'm kind of at a loss right now and not sure how to fix this. To confirm, this only happens with the delay processing method. It's just a standard digital delay line (you can find some on github) and I feel like the algorithm isn't the problem here...
Kind of a pseudocode/rough sketch of what my audio callback code looks like:
static bool myAudioCallback(void *userData, short int *audIO, int numSamples, int srate) {
AudioData *data = (AudioData *)userData;
// Resets pointer array values to 0
for (int i = 0; i < numSamples; i++) data->buffer[i] = 0;
// Voice Generation Block
for (int voice = 0; voice < data->numVoices; voice++) {
// Reset voice buffers:
for (int i = 0; i < numSamples; i++) data->voiceBuffer[i] = 0;
// Generate Voice
data->voiceManager[voice]->generateVoiceBlock(data->voiceBuffer, numSamples);
// Sum voices
for (int i = 0; i < numSamples; i++) data->buffer[i] += data->voiceBuffer[i];
}
// When app first starts, delayEnabled = false so user must click on a
// button on the UI to enable it.
// Trouble is that when we enable processDelay(double *buffer, int frames) the
// first time, we get a long execution time.
if (data->delayEnabled) {
data->delay->processDelay(data->buffer, numSamples);
}
// Conversion loop
for (int i = 0; i < numSamples; i++) {
double sample = clipOutput(data->buffer[i]);
audIO[2*i] = audIO[(2*i)+1] = CONV_FLT_TO_16BIT(sample * data->volume);
}
}
Thanks!
Not a great solution, but this is what I did:
Before the user is able to do anything on the app, I turned on the delay and let it run its course for like 2 seconds before switching it off. This allows the callback to do its weird long 300 ms execution time while not destroying the audio.
Obviously this is not a great answer and if anyone can figure out a more logical explanation I would be more than happy to mark that as the answer.

Android / Cordova performance comparison

I'm trying to compare "Cross platform mobile application development tools" vs "Android Native development" from a performance perspective. To do that, I developed an application that calculates a series. Below I reproduce the Android and Phonegap code.
Android
double serie;
long t1 = System.currentTimeMillis();
serie = 0;
for (int j = 1; j <= 5; j++) {
for (int k = 1; k <= 100000; k++) {
serie = serie + Math.log(k) / Math.log(2) + (3 * k / (2 * j)) + Math.sqrt(k) + Math.pow(k, j - 1);
}
}
long duration = System.currentTimeMillis() - t1;
Phonegap
var start = new Date().getTime();
var serie = 0;
for ( var j=1; j <= 5; j++ ){
for ( var k=1; k <= 100000; k++ ){
serie = serie + ( Math.log(k)/Math.LN2 ) + (3*k/2*j) + Math.sqrt(k) + Math.pow(k, j-1);
}
}
var end = new Date().getTime();
var duration = end - start;
Each timing was taken thirty times and the results were averaged.
Results
Android average time = 532.93ms
Phonegap average time = 230.33ms
The results are far from what I expected. I don't understand why Android performance is worse than Phonegap's. Both applications are run as release versions.
The device is a Moto G2 (Android 4.4)
Am I missing something?
I am not sure if this performance comparison makes any sense. You simply perform some computation in Java and then in JavaScript. After that, you measure computation time. It doesn't prove anything.
The only conclusion you can draw is that JavaScript performed this particular computation faster than Java for some reason. Maybe JavaScript optimized something under the hood. As JavaScript is a dynamically typed language, you should check whether the Java and JavaScript code return the same result, because I'm not sure they do. Moreover, you are measuring time in different ways in the two tests. Maybe System.currentTimeMillis() simply takes more time than new Date().getTime()?
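On the question of whether both versions compute the same thing: as written, they do not. The Java term (3 * k / (2 * j)) uses int arithmetic with integer division, while the JavaScript term (3*k/2*j) is floating point and, by operator precedence, evaluates as ((3*k)/2)*j. The sketch below reproduces both terms in Java; the class and method names are hypothetical.

```java
public class SeriesTermSketch {

    // The term exactly as written in the question's Java code:
    // all operands are int, so the division truncates.
    static double javaTerm(int k, int j) {
        return 3 * k / (2 * j);
    }

    // The term as the question's JavaScript evaluates it: floating point,
    // left to right, i.e. ((3 * k) / 2) * j rather than (3 * k) / (2 * j).
    static double jsTerm(int k, int j) {
        return 3.0 * k / 2 * j;
    }

    public static void main(String[] args) {
        int k = 5;
        int j = 2;
        System.out.println("Java-style term: " + javaTerm(k, j)); // 15 / 4 truncates to 3
        System.out.println("JS-style term:   " + jsTerm(k, j));   // 7.5 * 2 = 15.0
    }
}
```

Because the two programs sum different series, the timings compare different work, independently of any difference between the runtimes.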
In real life, you move non-deterministic and long-running operations to a separate thread, off the UI thread. When the operation is done, you pass the result back to the UI thread. Bad project structure and bad programming practices can slow down your application. With cross-platform tools like Phonegap, you have no control over the Java code and very limited access to low-level optimization techniques and multi-threading. You also have no direct access to the Android SDK.
If you really want to analyze performance, you should prepare a more realistic instrumentation test. For example, create an application (in two versions: Android and Phonegap) that reads some data from a file and images from disk, then displays them in a list; the instrumentation test can then scroll the list down to the bottom. Afterwards, you can measure the time of the whole instrumentation test in both cases. With something like that, you can start to draw conclusions.
