How can I stress my phone's CPU programmatically? (Android)

So I overclocked my phone to 1.664 GHz, and I know there are apps that test your phone's CPU performance and stress it, but I would like to make my own somehow. What is the best way to really make the CPU work? I was thinking of just making a for loop do a million iterations of some time-consuming math... but that did not work, because my phone finished it in a few milliseconds, I think. I tried trillions of iterations; the app froze, but my task manager did not show the CPU even being used by the app. Usually stress test apps show up as red and say CPU: 85%, RAM: 10 MB. So how can I really make my processor seriously think?

To compile a regex string (note that in Java source the backslashes in the pattern must be escaped):
Pattern p1 = Pattern.compile("a*b"); // a simple regex
// slightly more complex regex: an attempt at validating email addresses
Pattern p2 = Pattern.compile("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)\\b");
You need to launch these in background threads:
import java.util.regex.Pattern;

class RegexThread extends Thread {
    RegexThread() {
        // Create a new, second thread
        super("Regex Thread");
        start(); // Start the thread
    }

    // This is the entry point for the second thread.
    public void run() {
        while (true) {
            Pattern p = Pattern.compile("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)\\b");
        }
    }
}

class CPUStresser {
    static final int NUM_THREADS = 10, RUNNING_TIME = 120; // run 10 threads for 120 s

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < NUM_THREADS; ++i) {
            new RegexThread(); // create a new thread
        }
        Thread.sleep(1000 * RUNNING_TIME);
    }
}
(above code appropriated from here)
See how that goes.
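One thing to watch out for: if a loop's result is never used, the optimizer may remove the work entirely, which may be why your original for loop seemed to finish in milliseconds. A minimal sketch of the same idea, assuming you want one busy thread per core and a volatile sink to keep the work alive (the math in the loop body is arbitrary):

class MathStresser {
    static volatile double sink; // volatile writes keep the loop from being optimized away

    public static void main(String[] args) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        for (int i = 0; i < cores; i++) {
            Thread t = new Thread(() -> {
                double x = 1.0;
                while (true) {
                    x = Math.sqrt(x + Math.PI) * Math.E; // arbitrary floating-point churn
                    sink = x;
                }
            }, "stress-" + i);
            t.setDaemon(true); // so the VM can exit when main returns
            t.start();
        }
        Thread.sleep(120000); // keep stressing for two minutes
    }
}

With one thread per core, a task manager should show the process pinned near 100% CPU rather than idling.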

I would suggest a slightly different kind of test, not just simple mathematical algorithms and functions. There are plenty of odd-looking benchmark apps whose results appear in every review: you launch the application, it runs for a while, and then it gives you a result in standardized points. The more points (or, in some benchmarks, the fewer), the better the device is considered to be. But what those scores mean in real life is not always clear, and not to everyone.
As for mathematics, the first thing that comes to mind is counting a massive number of decimal places, and so the task of computing the number pi.
OK, no problem, we will do it.
Here is test number one, "The Number Pi": how long does it take your phone to calculate ten million digits of pi (3.14...)? (If someone had said that sentence a hundred years ago, they would have been sent straight to a psychiatric hospital.)
You can feel it when the phone is slow as you flip and scroll through the interface, but how to measure that is unclear.
Angry Birds launches in a different amount of time on different devices, so perhaps an "Angry Birds" test.
Thinking further, we get a couple more tests: "heavy book" and "large page".
The calculation algorithms:
Test "Pi"
Measure the speed of computing pi: count ten million digits using the slow Abraham Sharp series. Repeat the measurement several times and take the average.
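For illustration, a minimal Java sketch of such a calculation (the series is the one named above; the class name and digit counts are mine, BigDecimal.sqrt needs Java 9+ or a recent Android API level, and a true ten-million-digit run would take a very long time with this deliberately slow series):

import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class PiTest {
    // Abraham Sharp series: pi = sqrt(12) * sum over k >= 0 of (-1/3)^k / (2k + 1).
    static BigDecimal sharpPi(int digits) {
        MathContext mc = new MathContext(digits + 5, RoundingMode.HALF_EVEN);
        BigDecimal sum = BigDecimal.ZERO;
        BigDecimal powerOfThird = BigDecimal.ONE; // (1/3)^k, starting at k = 0
        BigDecimal three = BigDecimal.valueOf(3);
        // Stop once the next term can no longer affect the requested digits.
        for (int k = 0; powerOfThird.movePointRight(digits).compareTo(BigDecimal.ONE) > 0; k++) {
            BigDecimal term = powerOfThird.divide(BigDecimal.valueOf(2L * k + 1), mc);
            sum = (k % 2 == 0) ? sum.add(term, mc) : sum.subtract(term, mc);
            powerOfThird = powerOfThird.divide(three, mc);
        }
        return BigDecimal.valueOf(12).sqrt(mc).multiply(sum, mc);
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        BigDecimal pi = sharpPi(10000); // start with ten thousand digits, scale up as needed
        System.out.println("first digits: " + pi.toPlainString().substring(0, 12));
        System.out.println("elapsed ms: " + (System.nanoTime() - t0) / 1000000);
    }
}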
Test "Angry Birds"
Take the very first Angry Birds (not required, but these versions are not the most optimized)
Measure the time from launch to the first sounds of music. Exit. Immediately run over and over again. Repeat several times and take the average.
Test "Large Page"
Measure the load time of heavy site pages. You can do it with your favorite browser :)
You can use This link (sorry for the Cyrillic)
This page is maintained by using "computers browser" along with pictures. Total turns out 6.5 Mb and 99 files (I'm still on this page in its stored version of a small sound file)
All 99 files upload to the phone. Turn off Wi-Fi and mobile Internet (this is important!)
Page opens with your browser. Click the "back" button. And now click "Forward" and measure the time the page is fully loaded. And so a few times. Back-forward, backward-forward. As usual, we take the average.
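If you want to automate that measurement instead of using a stopwatch, here is a hedged sketch with Android's WebView (the view id and asset path are made-up placeholders, and onPageFinished only approximates "fully loaded", so treat the numbers as rough):

import android.graphics.Bitmap;
import android.os.SystemClock;
import android.util.Log;
import android.webkit.WebView;
import android.webkit.WebViewClient;

// Call from an Activity whose layout contains a WebView with id "webview".
void runLargePageTest() {
    WebView webView = (WebView) findViewById(R.id.webview);
    webView.setWebViewClient(new WebViewClient() {
        private long startMs;

        @Override
        public void onPageStarted(WebView view, String url, Bitmap favicon) {
            startMs = SystemClock.elapsedRealtime(); // page load began
        }

        @Override
        public void onPageFinished(WebView view, String url) {
            Log.d("LargePageTest",
                    "load took " + (SystemClock.elapsedRealtime() - startMs) + " ms");
        }
    });
    webView.loadUrl("file:///android_asset/page/index.html"); // hypothetical local copy
}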
All results are given in seconds.
During testing, every device that supports microSD cards used one and the same card, a Transcend 16 GB class 10, with all the data on it.
And here are the actual results of the tests for some devices: TEST RESULT

https://play.google.com/store/apps/details?id=xcom.saplin.xOPS - the app crunches numbers (integer and float) on multiple threads (2x number of cores) and builds performance and CPU temperature graphs.
https://github.com/maxim-saplin/xOPS-Console/blob/master/Saplin.xOPS/Compute.cs - that's the core of the app

Related

Scheduling latency of Android sensors handlers

Rather than an answer, I'm looking for an idea here.
I'd like to measure the scheduling latency of sensor sampling in Android. In particular, I want to measure the time from the sensor interrupt request to when the bottom half, which is in charge of the data read, is executed.
The bottom half already has, besides the data read, a timestamping instruction. Indeed, samples are collected by applications (Java or native, no difference) as a tuple [measurement, timestamp].
The timestamp follows the clock source clock_gettime(CLOCK_MONOTONIC, &t);
So, assuming that the bottom half is not preempted, this timestamp gives an indication of the task scheduling instant. What is missing is a direct or indirect way to find out the corresponding IRQ instant.
Safely assume that we can request any sampling rate from the sensor. The driver skeleton is the following (Galaxy S3 gyroscope):
err = request_threaded_irq(data->client->irq, NULL,
        lsm330dlc_gyro_interrupt_thread,
        IRQF_TRIGGER_RISING | IRQF_ONESHOT,
        "lsm330dlc_gyro", data);

static irqreturn_t lsm330dlc_gyro_interrupt_thread(int irq, void *lsm330dlc_gyro_data_p)
{
    ...
    struct lsm330dlc_gyro_data *data = lsm330dlc_gyro_data_p;
    ...
    res = lsm330dlc_gyro_read_values(data->client,
            &data->xyz_data, data->entries);
    ...
    input_report_rel(data->input_dev, REL_RX, gyro_adjusted[0]);
    input_report_rel(data->input_dev, REL_RY, gyro_adjusted[1]);
    input_report_rel(data->input_dev, REL_RZ, gyro_adjusted[2]);
    input_sync(data->input_dev);
    ...
}
The key constraint is that I need to (well, I only have enough resources to) perform this measurement from user space, on a commercial device, without touching and recompiling the kernel, and hopefully with a limited impact on the experiment's accuracy. I don't know if such an experiment is possible under this constraint, and so far I couldn't figure out any reasonable method.
I might also consider recompiling the kernel if that makes the experiment straightforward.
Thanks.
First, it's not possible to perform this measurement without touching the kernel.
Second, I don't see any bottom half configured in your ISR code.
Third, if a bottom half is scheduled and the kernel can be recompiled, you can sample the jiffies value in the ISR and sample it again in the bottom half; take the difference between the two samples and subtract that offset from the timestamp that is exported to user space.

Accurate POSIX thread timing using NDK

I'm writing a simple NDK OpenSL ES audio app that records the user's touches on a virtual piano keyboard and then plays them back forever over a set loop. After much experimenting and reading, I've settled on using a separate POSIX thread loop to achieve this. As you can see in the code, it subtracts any processing time taken from the sleep time in order to make the interval of each loop as close as possible to the desired sleep interval (in this case, 5,000,000 nanoseconds).
void init_timing_loop() {
    pthread_t fade_in;
    pthread_create(&fade_in, NULL, timing_loop, (void*) NULL);
}

void* timing_loop(void* args) {
    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &timing.start_time_s);
        tic_counter();    // simple logic gates that cycle the current tic
        play_all_parts(); // for-loops through all parts and plays any notes
                          // (from an OpenSL buffer) that fall on the current tic
        clock_gettime(CLOCK_MONOTONIC, &timing.finish_time_s);
        timing.diff_time_s.tv_nsec =
                5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec);
        nanosleep(&timing.diff_time_s, NULL);
    }
    return NULL;
}
The problem is that even though the results are better with this approach, they are quite inconsistent: sometimes notes will be delayed by perhaps as much as 50 ms, which makes for very wonky playback.
Is there a better way of approaching this? To debug, I ran the following code:
gettimeofday(&timing.curr_time, &timing.tzp);
__android_log_print(ANDROID_LOG_DEBUG, "timing_loop", "gettimeofday: %d %d",
timing.curr_time.tv_sec, timing.curr_time.tv_usec);
This gives a fairly consistent readout that doesn't reflect the playback inaccuracies at all. Are there other forces at work in Android preventing accurate timing? Or is OpenSL ES a potential issue? All the buffer data is loaded into memory; could there be bottlenecks there?
Happy to post more OpenSL code if needed, but at this stage I'm trying to figure out whether this thread loop is accurate or whether there's a better way to do it.
You should take seconds into account as well when using clock_gettime: you may get a greater timing.start_time_s.tv_nsec than timing.finish_time_s.tv_nsec, because tv_nsec starts from zero again whenever tv_sec increases. So instead of
timing.diff_time_s.tv_nsec =
        (5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec));
try something like
#define NS_IN_SEC 1000000000LL
((long long) timing.finish_time_s.tv_sec * NS_IN_SEC + timing.finish_time_s.tv_nsec) -
((long long) timing.start_time_s.tv_sec * NS_IN_SEC + timing.start_time_s.tv_nsec)

Performance of BigInteger method 'add' on Android phone

I've been working on a cryptographic (POT) protocol in Java for my master's thesis.
It uses cryptographic Pairings and therefore makes use of an external java library called jPBC (http://gas.dia.unisa.it/projects/jpbc/).
As I want one side of the protocol to run on a mobile device, I've made a simple GUI in Android ADT with a single button that starts the protocol. However, the protocol runs about 200 times slower on my phone (Samsung S2 Plus, ARM Cortex-A9 32-bit processor) than on my laptop (Intel Core i7, but only using half a core). As the difference in processors might explain a factor of 10, but certainly not a factor of 100 to 200, I figured the difference in performance must be due to the inefficiency of the jPBC library on Android.
The jPBC library makes extensive use of BigInteger for all of its calculations, so I decided to investigate whether BigInteger could be extra inefficient on Android (it's not super efficient on normal computers either). I executed a loop of 1200-bit BigInteger calculations on the phone and on the laptop, and I've come up with some results that I cannot explain:
10^6 additions and subtractions take 205 ms on the laptop and 48,025 ms on the phone (x200).
10^5 multiplications and divisions take 814 ms on the laptop and 13,705 ms on the phone (x17).
10^3 modular exponentiations (modPow) take 5,079 ms on the laptop and 22,153 ms on the phone (x4.5).
As there is much to be said about these results, I'll just stick to this simple question:
Can anyone either reproduce these results and confirm that BigInteger addition is immensely slow on Android, or tell me what I've done wrong?
The code:
Java method:
import java.math.BigInteger;
import java.util.Random;

public static long bigIntCalculations() {
    System.out.println("starting bigIntCalculations");
    Random random = new Random();
    BigInteger start = new BigInteger(1200, random);
    BigInteger temp = new BigInteger(start.toString());
    long nOfIterations = 1000000L;
    long time1 = System.nanoTime() / 1000000;
    for (long i = 0; i < nOfIterations; i++) {
        start = start.add(temp);
        start = start.subtract(temp);
    }
    long result = (System.nanoTime() / 1000000) - time1;
    System.out.println(result);
    return result;
}
In Android:
/** Called when the user clicks button1 */
public void runProtocol(View view) {
    long duration = Test.bigIntCalculations();
    String result = "Calculations take: " + duration + " ms";
    Intent intent = new Intent(this, DisplayMessageActivity.class);
    intent.putExtra(CALC_RESULT, result);
    startActivity(intent);
}
Many thanks!
Only x4.5 for 1200-bit modular exponentiation is a terrific result, considering the underpowered hardware. It's also a testament to how bad the JDK's BigInteger implementation is.
The Android standard library uses OpenSSL's BigNum for some operations under the hood. Without peeking, I would guess that modular exponentiation and modular inverse are handled in native code, while simpler arithmetic is handled in Java code.
For tight loops of addition and multiplication you would be generating lots of garbage, and the GC performance disparity between the platforms could also be having a large impact; my guess is that some warmup plus a much smaller benchmark will show closer results, along the lines of the sketch below.
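A minimal sketch of that warmup idea (the method name and iteration counts are mine; the arithmetic mirrors the benchmark in the question): run the loop once untimed so the JIT can compile the hot path, then time a smaller pass.

import java.math.BigInteger;
import java.util.Random;

public static long warmedUpBigIntCalculations() {
    BigInteger start = new BigInteger(1200, new Random());
    BigInteger temp = start; // BigInteger is immutable, so an alias is fine
    for (long i = 0; i < 100000L; i++) { // warmup pass, untimed
        start = start.add(temp).subtract(temp);
    }
    long t0 = System.nanoTime();
    for (long i = 0; i < 100000L; i++) { // measured pass
        start = start.add(temp).subtract(temp);
    }
    return (System.nanoTime() - t0) / 1000000; // elapsed ms
}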
My performance pain point is modular exponentiation, so I'm pretty happy with the Android performance. If that were not the case, I'd be looking at porting libraries such as gmp4j or gmp-java (there are two libraries by the latter name) to Android. Two of these provide a BigInteger-compatible API; the other offers a more direct mapping to GMP, which can be ideal in terms of memory management (GMP numbers are mutable).

clock_gettime does not update instantly

Update
After checking the time resolution, we tried to debug the problem in kernel space.
unsigned long long task_sched_runtime(struct task_struct *p)
{
    unsigned long flags;
    struct rq *rq;
    u64 ns = 0;

    rq = task_rq_lock(p, &flags);
    ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
    task_rq_unlock(rq, &flags);
    /* printk("task_sched_runtime\n"); */
    return ns;
}
Our new experiment shows that p->se.sum_exec_runtime is not updated instantly. But if we add a printk() inside the function, the time is updated instantly.
Old
We are developing an Android program.
However, the time measured by the function threadCpuTimeNanos() is not always correct on our platform.
After experimenting, we found that the time returned from clock_gettime is not updated instantly.
Even after several while loop iterations, the time we get still doesn't change.
Here's our sample code:
while (1)
{
    test = 1;
    test = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
    printf("clock gettime test 1 %lx, %lx, ret = %d\n", now.tv_sec, now.tv_nsec, test);
    pre = now.tv_nsec;
    sleep(1);
}
This code runs okay on an x86 PC, but it does not run correctly on our embedded platform, an ARM Cortex-A9 with kernel 2.6.35.13.
Any ideas?
I changed clock_gettime to use CLOCK_MONOTONIC_RAW, assigned the thread to one CPU, and I get different values.
I am also working with a dual Cortex-A9:
while (1)
{
    test = 1;
    test = clock_gettime(CLOCK_MONOTONIC_RAW, &now);
    printf("clock gettime test 1 %lx, %lx, ret = %d\n", now.tv_sec, now.tv_nsec, test);
    pre = now.tv_nsec;
    sleep(1);
}
The resolution of clock_gettime is platform-dependent. Use clock_getres() to find the resolution on your platform. According to the results of your experiment, the clock resolutions on x86 and on your target platform are different.
In the Android CTS there is a test case with the same problem: it reads the timer twice but gets the same value.
testThreadCpuTimeNanos fails with junit.framework.AssertionFailedError at android.os.cts.DebugTest.testThreadCpuTimeNanos
$man clock_gettime
...
Note for SMP systems
The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU.
If the CPUs in an SMP system have different clock sources then there is no way to maintain a correlation between the timer registers since each CPU will run at a slightly different frequency. If that is the case then clock_getcpuclockid(0) will return ENOENT to signify this condition. The two clocks will then only be useful if it can be ensured that a process stays on a certain CPU.
The processors in an SMP system do not start all at exactly the same time and therefore the timer registers are typically running at an offset. Some architectures include code that attempts to limit these offsets on bootup. However, the code cannot guarantee to accurately tune the offsets. Glibc contains no provisions to deal with these offsets (unlike the Linux Kernel). Typically these offsets are small and therefore the effects may be negligible in most cases.
The CLOCK_THREAD_CPUTIME_ID clock measures CPU time spent, not real time, and you're spending almost zero CPU time. Also, CLOCK_THREAD_CPUTIME_ID (the thread-specific CPU time) is implemented incorrectly on Linux/glibc and likely does not work at all. CLOCK_PROCESS_CPUTIME_ID, or whatever that one's called, should work better.

Android - Scheduling Events to Occur Every 10 ms?

I'm working on creating an app that allows very low-bandwidth communication via high-frequency sound waves. I've gotten to the point where I can create a frequency and do the Fourier transform (with the help of Moonblink's open source code for Audalyzer).
But here's my problem: I'm unable to get the code to run with the correct timing. Let's say I want a piece of code to execute every 10 ms; how would I go about doing this?
I've tried using a TimerTask, but there is a huge delay before the code actually executes, up to 100 ms.
I also tried simply polling the current time and executing only when it had elapsed, but there is still a delay problem. Do you have any ideas?
Thread analysis = new Thread(new Runnable()
{
    @Override
    public void run()
    {
        android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_DISPLAY);
        long executeTime = System.currentTimeMillis();
        manualAnalyzer.measureStart();
        while (FFTransforming)
        {
            if (System.currentTimeMillis() >= executeTime)
            {
                // Reset the timer to execute again in 10ms
                executeTime += 10;
                // Perform Fourier Transform
                manualAnalyzer.doUpdate(0);
                // TODO: Analyze the results of the transform here...
            }
        }
        manualAnalyzer.measureStop();
    }
});
analysis.start();
analysis.start();
I would recommend a very different approach: do not try to run your code in real time.
Instead, rely on only the low-level audio code running in real time, by recording (or playing) continuously for a period of time encompassing the events of interest.
Your code then runs somewhat asynchronously to this, decoupled by the audio buffers. Your code's sense of time is determined not by the system clock as it executes, but rather by the defined inter-sample interval of the audio data you work with (i.e., if you are sampling at 48 kHz, then 10 ms later is 480 samples later).
You may need to modify the protocol governing interaction between the devices to widen the time window in which transmissions can be expected to occur. That is, you can have precise timing with respect to the actual modulation and symbols within a "packet", but you should not expect nearly the same order of precision in determining when a packet is sent or received; you will have to "find" it amidst a longer recording containing noise.
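To make that concrete, here is a minimal, hedged sketch of a capture loop whose time base is the sample count rather than the system clock (48 kHz, the names, and the stop flag are my assumptions; error handling and the RECORD_AUDIO permission are omitted):

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

volatile boolean running = true; // assumed stop flag, cleared elsewhere

// 48 kHz mono 16-bit PCM assumed: each 480-sample block spans exactly 10 ms
// of signal time, no matter when the code that processes it gets scheduled.
void captureLoop() {
    final int sampleRate = 48000;
    final int blockSamples = 480; // 10 ms at 48 kHz
    int minBuf = AudioRecord.getMinBufferSize(sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
    AudioRecord rec = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate,
            AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT,
            Math.max(minBuf, blockSamples * 2 * 4)); // a few blocks of slack
    short[] block = new short[blockSamples];
    long samplesRead = 0;
    rec.startRecording();
    while (running) {
        int n = rec.read(block, 0, blockSamples); // blocks until a full 10 ms arrives
        if (n > 0) {
            double blockStartSec = samplesRead / (double) sampleRate;
            samplesRead += n;
            // analyze 'block' here; its position in time is blockStartSec,
            // derived from the sample count, not from the system clock
        }
    }
    rec.stop();
    rec.release();
}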
Your thread/loop strategy is probably roughly as close as you're going to get. However, 10ms is not a lot of time, most Android devices are not super-powerful, and a Fourier transform is a lot of work to do. I find it unlikely that you'll be able to fit that much work in 10ms. I suspect you're going to have to increase that period.
I changed your code so that it takes the execution time of doUpdate into account. The use of System.nanoTime() should also increase accuracy.
public void run() {
    android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_DISPLAY);
    long executeTime = 0;
    long nextTime = System.nanoTime();
    manualAnalyzer.measureStart();
    while (FFTransforming)
    {
        if (System.nanoTime() >= nextTime)
        {
            executeTime = System.nanoTime();
            // Perform Fourier Transform
            manualAnalyzer.doUpdate(0);
            // TODO: Analyze the results of the transform here...
            executeTime = System.nanoTime() - executeTime;
            // guard against the case that doUpdate took longer than 10ms
            final long i = executeTime / 10000000;
            // set the timer to execute again at the next full 10ms interval
            nextTime += 10000000 + i * 10000000;
        }
    }
    manualAnalyzer.measureStop();
}
What else could you do?
eliminate Garbage Collection
go native with the NDK (just an idea; it might well give no benefit)
