I am trying to set the Performance Monitor User Mode Enable register on all cpus on a Nexus 4 running a mako kernel.
Right now I am setting the registers in a loadable module:
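#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/smp.h>
#include <linux/cpumask.h>

static void enable_registers(void *info)
{
    unsigned int set = 1;
    /* enable user-mode access to the performance counters (PMUSERENR bit 0) */
    asm volatile ("mcr p15, 0, %0, c9, c14, 0\n\t" : : "r" (set));
}

int init_module(void)
{
    unsigned int online = num_online_cpus();
    unsigned int possible = num_possible_cpus();
    unsigned int present = num_present_cpus();

    printk(KERN_INFO "Online Cpus=%u\nPossible Cpus=%u\nPresent Cpus=%u\n", online, possible, present);
    /* runs enable_registers on every CPU that is online right now, and waits for it */
    on_each_cpu(enable_registers, NULL, 1);
    return 0;
}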
The problem is that on_each_cpu only runs the function on online CPUs and, as shown by the printk statement:
Online Cpus=1
Possible Cpus=4
Present Cpus=4
only one of the four is online when I call on_each_cpu. So my question is: how do I force a CPU to be online, or how can I force a certain CPU to execute code?
Thanks
You don't need to run the code on every cpu right now. What you need to do is arrange so that when the offline cpus come back online, your code is able to execute and enable the access to the PMU.
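One way to achieve that would be with a CPU hotplug notifier: the kernel calls your callback when a CPU comes online, and the callback can then set the register on that CPU.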
I am trying to obtain a bitmap of the number of cores which are online in an android device. I am trying to create a command line tool in C++ that does some additional functionality based on how many cores are on and in particular which cores are available.
I have tried to use the following to try and get the number of cores on in C++:
cpus = sysconf( _SC_NPROCESSORS_ONLN );
This gives me the number of cores in the system but not which cores are presently ON.
Does anyone know a potential way to do this?
There's no clear cut answer to this problem.
You can use nproc to see how many cores you have available, but this won't tell you which cores are online.
You can use top to view the utilization of each core. You can then parse the information from top to infer which cores are presently on.
I was able to get the core online status using this:
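int numCPU = 1;
char status[32] = {0};
char path[64];
snprintf(path, sizeof(path), "/sys/devices/system/cpu/cpu%d/online", numCPU);
FILE *online = fopen(path, "r");
if (online)
{
    /* the file contains "1" (online) or "0" (offline), followed by a newline */
    size_t size = fread(status, sizeof(char), sizeof(status) - 1, online);
    status[size] = '\0';
    fclose(online);
}
/* status is text, so print it with %s; it stays empty if the file could not be opened */
printf("Core %d status=%s", numCPU, status);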
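To get a bitmap of all cores at once rather than one file per core, a possible approach is to read /sys/devices/system/cpu/online, which the kernel exposes as a range list such as "0-2,4". The sketch below is only an illustration: the function names parse_cpu_list and online_cpu_mask are made up for this example, and it assumes at most 64 CPUs.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a kernel CPU list such as "0-2,4" into a bitmask (bit n set = CPU n).
 * Returns 0 for an empty or malformed list. CPUs numbered 64 and above are
 * ignored (assumption: a 64-bit mask is enough for the device). */
unsigned long long parse_cpu_list(const char *list)
{
    unsigned long long mask = 0;
    const char *p = list;

    while (*p && *p != '\n') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p)
            return 0;                 /* no digits where a number was expected */
        long last = first;
        p = end;
        if (*p == '-') {              /* a range "first-last" */
            last = strtol(p + 1, &end, 10);
            if (end == p + 1)
                return 0;
            p = end;
        }
        if (last > 63)
            last = 63;                /* clamp to the width of the mask */
        for (long cpu = first; cpu <= last; cpu++)
            if (cpu >= 0)
                mask |= 1ULL << cpu;
        if (*p == ',')
            p++;
    }
    return mask;
}

/* Read the online-CPU list from sysfs and return it as a bitmask,
 * or 0 if the file cannot be read. */
unsigned long long online_cpu_mask(void)
{
    char buf[256] = {0};
    FILE *f = fopen("/sys/devices/system/cpu/online", "r");
    if (!f)
        return 0;
    if (!fgets(buf, sizeof(buf), f))
        buf[0] = '\0';
    fclose(f);
    return parse_cpu_list(buf);
}
```

Calling online_cpu_mask() and testing bit n then tells you whether core n is on. Cores can go on and offline at any time, so the mask is only a snapshot.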
I've been using a few Android apps that hook onto another process, scan its allocated memory and edit it. Obviously, I was using it to mess around with some games.
Then, it got me thinking, "How are they doing it?"
I know how to get the list of currently running apps but hooking onto another process and scanning and editing the process' memory are.. Beyond my knowledge.
It seems that I'd need some kind of "root" privileges to execute code like that but I don't mind. I just want to know how these app developers did it to sate my curiosity.
So..
Assuming root privileges are enabled..
1) How can I hook onto a currently running different app?
2) How can I scan its memory regions?
3) How can I edit its memory regions?
inb4 "Have you tried googling?"
I thought about it and did a tonne of Googling (1+ hours) but got no results, because the words "RAM" and "memory" just give me stuff like how to track the current app's memory allocations and whatnot. In other words, not what I am looking for.
So, I finally turned to opening a thread here.
Putting this here for posterity
After a fair bit of research (read, 5 days straight), as far as Linux is concerned, one may attach to a process, read its memory and detach by simply doing this:
Heavily commented for the newbies like me, uncomment and whatever if you're better
#include <errno.h> //For errno
#include <sys/ptrace.h> //For ptrace()
#include <sys/wait.h> //For waitpid()
int main () {
int pid = 1337; //The process id you wish to attach to
int address = 0x13371337; //The address you wish to read in the process
//First, attach to the process
//All ptrace() operations that fail return -1, the exceptions are
//PTRACE_PEEK* operations
if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
//Read the value of errno for details.
//To get a human readable, call strerror()
//strerror(errno) <-- Returns a human readable version of the
//error that occurred
return 0;
}
//Now, attaching doesn't mean we can read the value straight away
//We have to wait for the process to stop
int status;
//waitpid() returns -1 on failure
//W.I.F, not W.T.F
//WIFSTOPPED() returns true if the process was stopped when we attached to it
if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
//Failed, read the value of errno or strerror(errno)
return 0;
}
errno = 0; //Set errno to zero
//We are about to perform a PTRACE_PEEK* operation, it is possible that the value
//we read at the address is -1, if so, ptrace() will return -1 EVEN THOUGH it succeeded!
//This is why we need to 'clear' the value of errno.
long value = ptrace(PTRACE_PEEKDATA, pid, (void*)(long)address, NULL); //PTRACE_PEEKDATA returns a long (one word)
if (value == -1 && errno != 0) {
//Failed, read the value of errno or strerror(errno)
return 0;
} else {
//Success! Read the value
}
//Now, we have to detach from the process
ptrace(PTRACE_DETACH, pid, NULL, NULL);
return 0;
}
References:
http://linux.die.net/man/2/ptrace
http://linux.die.net/man/2/waitpid
How does this relate to editing Android app memory values?
Well, the headers for ptrace and wait exist in the Android NDK. So, to read/write an app's RAM, you will need native code in your app.
Also, attaching to another app's process with ptrace() requires root privileges, because each Android app runs under its own user ID.
Why did it take you this long?
I've never written this kind of code before.
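The code above reads one word at an address you already know. For the "scan its memory regions" part of the question, one common approach on Linux is to read /proc/<pid>/maps first, which lists each mapping as "start-end perms ...". The following is a rough sketch, not the method any particular app uses; struct mem_region, parse_maps_line and list_writable_regions are names invented for this example, and reading another app's maps file needs the same root access as ptrace().

```c
#include <stdio.h>

/* One mapping taken from a line of /proc/<pid>/maps. */
struct mem_region {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* one past the last address */
    char perms[5];         /* permission string, e.g. "rw-p" */
};

/* Parse the leading "start-end perms" fields of one maps line.
 * Returns 1 on success, 0 if the line does not have that shape. */
int parse_maps_line(const char *line, struct mem_region *out)
{
    return sscanf(line, "%lx-%lx %4s",
                  &out->start, &out->end, out->perms) == 3;
}

/* Print every writable region of the given process.
 * Returns the number of regions printed, or -1 if maps cannot be opened. */
int list_writable_regions(int pid)
{
    char path[64], line[512];
    snprintf(path, sizeof(path), "/proc/%d/maps", pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int count = 0;
    while (fgets(line, sizeof(line), f)) {
        struct mem_region r;
        /* perms[1] is 'w' for writable mappings such as heap and stack */
        if (parse_maps_line(line, &r) && r.perms[1] == 'w') {
            printf("%lx-%lx %s\n", r.start, r.end, r.perms);
            count++;
        }
    }
    fclose(f);
    return count;
}
```

Each address range printed this way could then be walked word by word with PTRACE_PEEKDATA while attached, as in the example above.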
As far as Linux is concerned, the kernel forbids an unprivileged process from modifying memory that belongs to another user's processes, and on Android each app runs as its own user. So without root, one app cannot touch another app's memory.
What you are often actually doing is editing Shared Preferences. They are stored as plain-text XML, and that means they can be edited if you have access to them (root).
You can check out the CheatDroid application on the Play Store. Also, if you want to develop a similar app yourself, you can check this link on creating your first root app: http://www.xda-developers.com/android/how-to-build-an-android-app-part-2-writing-a-root-app-xda-tv/
I am writing a network communication program with the Android NDK, using epoll.
I found that epoll_wait does not wake up very accurately:
while(1){
    struct epoll_event events[3];
    log_string("epoll_wait start"); // prints the start time
    // Wait up to 20 seconds. For this test I monitor a pipe for EPOLLIN instead of a socket.
    int events_len = epoll_wait(_epoll_fd, events, 3, 20 * 1000);
    if (events_len <= 0) {
        // Normally events_len is 0 and errno is 0
        log_string("epoll_wait end events_len=%d,errno=%d", events_len, errno);
    }
}
When the above code runs on a PC (such as an Ubuntu PC), it behaves normally, as expected.
When it runs on an Android phone (in an Android Service, on a separate thread), it behaves as expected at first.
After some time, epoll_wait becomes inaccurate: sometimes it returns -1 with errno=4, and sometimes it waits much longer than the timeout.
So I only know the symptom, but not the cause.
Can you tell me why, and what the best practices are for using epoll on Android?
thx
errno 4 is EINTR, which means a signal interrupted the call. This isn't really an error; just call epoll_wait again.
Regarding "waited very long", does your app hold at least a partial wakelock?
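A minimal sketch of restarting after EINTR, assuming you want the total wait to stay close to the original timeout instead of starting a fresh 20 seconds after every signal. The names epoll_wait_retry and elapsed_ms are made up for this example and are not part of the NDK.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/epoll.h>
#include <time.h>

/* Milliseconds elapsed on the monotonic clock since *start. */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000L
         + (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/* Like epoll_wait(), but if a signal interrupts the call (EINTR) it waits
 * again for whatever is left of the original timeout.
 * A negative timeout_ms waits indefinitely, as with epoll_wait(). */
int epoll_wait_retry(int epfd, struct epoll_event *events,
                     int maxevents, int timeout_ms)
{
    struct timespec start;
    int remaining = timeout_ms;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (;;) {
        int n = epoll_wait(epfd, events, maxevents, remaining);
        if (n >= 0 || errno != EINTR)
            return n;   /* got events, timed out (0), or hit a real error */
        if (timeout_ms >= 0) {
            long left = timeout_ms - elapsed_ms(&start);
            remaining = left > 0 ? (int)left : 0;
        }
    }
}
```

In the loop from the question this would replace the direct epoll_wait call. Note that CLOCK_MONOTONIC does not advance while the device is suspended, so this does nothing about the "waited very long" case if the phone goes to sleep.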
Update
After checking the time resolution, we tried to debug the problem in kernel space.
unsigned long long task_sched_runtime(struct task_struct *p)
{
unsigned long flags;
struct rq *rq;
u64 ns = 0;
rq = task_rq_lock(p, &flags);
ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
task_rq_unlock(rq, &flags);
//printk("task_sched runtime\n");
return ns;
}
Our new experiment shows that p->se.sum_exec_runtime is not updated instantly. But if we add a printk() inside the function, the time is updated instantly.
Old
We are developing an Android program.
However, the time measured by the function threadCpuTimeNanos() is not always correct on our platform.
After experimenting, we found that the time returned from clock_gettime is not updated instantly.
Even after several while loop iterations, the time we get still doesn't change.
Here's our sample code:
while(1)
{
test = 1;
test = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
printf(" clock gettime test 1 %lx, %lx , ret = %d\n",now.tv_sec , now.tv_nsec,test );
pre = now.tv_nsec;
sleep(1);
}
This code runs okay on an x86 PC. But it does not run correctly in our embedded platform ARM Cortex-A9 with kernel 2.6.35.13.
Any ideas?
I changed the clock_gettime call to use CLOCK_MONOTONIC_RAW and pinned the thread to one CPU, and now I get different values on each iteration.
I am also working with a dual-core Cortex-A9.
while(1)
{
test = 1;
test = clock_gettime(CLOCK_MONOTONIC_RAW, &now);
printf(" clock gettime test 1 %lx, %lx , ret = %d\n",now.tv_sec , now.tv_nsec, test );
pre = now.tv_nsec;
sleep(1);
}
The resolution of clock_gettime is platform dependent. Use clock_getres() to find the resolution on your platform. According to the results of your experiment, clock resolutions on pc-x86 and on your target platform are different.
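As a quick way to check this on both machines, something like the following could be run on the PC and on the target. It is only a sketch; clock_resolution_ns and print_clock_resolutions are names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Resolution of the given clock in nanoseconds,
 * or -1 if the platform does not support that clock. */
long long clock_resolution_ns(clockid_t id)
{
    struct timespec res;
    if (clock_getres(id, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}

/* Print the resolution of the clocks discussed in this thread. */
void print_clock_resolutions(void)
{
    printf("CLOCK_MONOTONIC:          %lld ns\n",
           clock_resolution_ns(CLOCK_MONOTONIC));
    printf("CLOCK_PROCESS_CPUTIME_ID: %lld ns\n",
           clock_resolution_ns(CLOCK_PROCESS_CPUTIME_ID));
    printf("CLOCK_THREAD_CPUTIME_ID:  %lld ns\n",
           clock_resolution_ns(CLOCK_THREAD_CPUTIME_ID));
}
```

Keep in mind that the reported resolution is what the kernel claims; how often the underlying counter is actually updated can still differ, which is what the kernel-side experiment in the question suggests.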
In the Android CTS there is a test case with the same problem: it reads the timer twice and gets the same value both times.
testThreadCpuTimeNanos fail junit.framework.AssertionFailedError at android.os.cts.DebugTest.testThreadCpuTimeNanos
$man clock_gettime
...
Note for SMP systems
The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU.
If the CPUs in an SMP system have different clock sources then there is no way to maintain a correlation between the timer registers since each CPU will run at a slightly different frequency. If that is the case then clock_getcpuclockid(0) will return ENOENT to signify this condition. The two clocks will then only be useful if it can be ensured that a process stays on a certain CPU.
The processors in an SMP system do not start all at exactly the same time and therefore the timer registers are typically running at an offset. Some architectures include code that attempts to limit these offsets on bootup. However, the code cannot guarantee to accurately tune the offsets. Glibc contains no provisions to deal with these offsets (unlike the Linux Kernel). Typically these offsets are small and therefore the effects may be negligible in most cases.
The CLOCK_THREAD_CPUTIME_ID clock measures CPU time spent, not realtime, and you're spending almost-zero CPU time. Also, CLOCK_THREAD_CPUTIME_ID (the thread-specific CPU time) is implemented incorrectly on Linux/glibc and likely does not even work at all on glibc. CLOCK_PROCESS_CPUTIME_ID or whatever that one's called should work better.
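To see the difference between CPU time and real time, a small experiment along these lines may help. It is a sketch under the assumption that the platform supports CLOCK_THREAD_CPUTIME_ID; the helper names are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Current value of a clock in nanoseconds. */
static long long now_ns(clockid_t id)
{
    struct timespec ts;
    clock_gettime(id, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* CPU time (ns) the calling thread consumes while sleeping for ms milliseconds.
 * Expected to be close to zero: a sleeping thread is not running. */
long long cpu_ns_during_sleep(long ms)
{
    struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
    long long before = now_ns(CLOCK_THREAD_CPUTIME_ID);
    nanosleep(&req, NULL);
    return now_ns(CLOCK_THREAD_CPUTIME_ID) - before;
}

/* CPU time (ns) the calling thread consumes while busy-waiting for about
 * ms milliseconds of wall-clock time. Expected to be close to ms. */
long long cpu_ns_during_spin(long ms)
{
    long long before = now_ns(CLOCK_THREAD_CPUTIME_ID);
    long long deadline = now_ns(CLOCK_MONOTONIC) + ms * 1000000LL;
    while (now_ns(CLOCK_MONOTONIC) < deadline)
        ;   /* burn CPU on purpose */
    return now_ns(CLOCK_THREAD_CPUTIME_ID) - before;
}
```

In the loop from the question almost all the time is spent in sleep(1), so the thread CPU clock should barely move between iterations even on a platform where it works correctly.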
So I overclocked my phone to 1.664 GHz. I know there are apps that test and stress your phone's CPU, but I would like to make my own somehow. What is the best way to really make your CPU work? I was thinking of just making a for loop do 1 million iterations of some time-consuming math, but that did not work because my phone finished it in a few milliseconds, I think. I tried trillions of iterations: the app froze, but my task manager did not show the CPU even being used by the app. Usually stress test apps show up in red and say something like cpu: 85% ram: 10mb. So how can I really make my processor seriously think?
To compile a regex string:
Pattern p1 = Pattern.compile("a*b"); // a simple regex
// slightly more complex regex: an attempt at validating email addresses
// (backslashes must be doubled inside a Java string literal)
Pattern p2 = Pattern.compile("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)\\b");
You need to launch these in background threads:
import java.util.regex.Pattern;

class RegexThread extends Thread {
    RegexThread() {
        // Create a new, second thread
        super("Regex Thread");
        setDaemon(true); // let the JVM exit once main() returns
        start(); // Start the thread
    }
    // This is the entry point for the second thread.
    public void run() {
        while(true) {
            Pattern p = Pattern.compile("[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)\\b");
        }
    }
}
class CPUStresser {
    static final int NUM_THREADS = 10, RUNNING_TIME = 120; // run 10 threads for 120s
    public static void main(String args[]) throws InterruptedException {
        for(int i = 0; i < NUM_THREADS; ++i) {
            new RegexThread(); // create a new thread
        }
        Thread.sleep(1000 * RUNNING_TIME);
    }
}
(above code appropriated from here)
See how that goes.
I would suggest a slightly different kind of test, not just simple mathematical algorithms and functions. There are plenty of odd-looking benchmarks whose results show up in every review. You launch the application, it runs for a while, and then it gives you a result in arbitrary points. The more points (or the fewer), the better the device is considered to be. But what those comparison results mean in real life is not always clear, and not to everyone.
As for mathematics, the first thing that comes to mind is computing a huge number of decimal places, for example the digits of pi.
OK. No problem, we will do it:
Here's test number one, "The Number Pi": how long it takes your phone to calculate ten million digits of pi (3.14...). (Anyone who said that sentence a hundred years ago would have been sent straight to a psychiatric hospital.)
Then there is the feeling that the phone is slow when you scroll and swipe through the interface. But how to measure that is unclear.
Angry Birds takes a different amount of time to start on different devices, so perhaps an "Angry Birds" test.
Thinking further, we get a couple more tests: "heavy book" and "large page".
Algorithm of calculation:
Test "Pi"
Take Speed Pi.
Compute ten million digits using the slow "Abraham Sharp series" algorithm. Repeat the measurement several times and take the average.
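For reference, the Sharp series comes from arctan(1/sqrt(3)) = pi/6, which gives pi = 2*sqrt(3) * (1 - 1/(3*3) + 1/(5*3^2) - 1/(7*3^3) + ...). The sketch below only shows the series in ordinary double precision, so it is good for about 15 digits. It is not what Speed Pi does internally, which would need arbitrary-precision arithmetic to reach ten million digits, and the name pi_sharp is made up for this example.

```c
/* sqrt(3) written out as a constant so the example needs no math library. */
static const double SQRT3 = 1.7320508075688772;

/* Approximate pi with the Sharp series
 *   pi = 2*sqrt(3) * sum over k >= 0 of (-1)^k / ((2k + 1) * 3^k)
 * using the first `terms` terms, in double precision. */
double pi_sharp(int terms)
{
    double sum = 0.0;
    double power = 1.0;              /* holds (-1/3)^k */

    for (int k = 0; k < terms; k++) {
        sum += power / (2 * k + 1);
        power /= -3.0;
    }
    return 2.0 * SQRT3 * sum;
}
```

Each term is roughly a third of the previous one, so about 40 terms are enough to exhaust double precision. To use it as a CPU load you would call it in a loop from several threads, or switch to a big-number implementation as the test above does.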
Test "Angry Birds"
Take the very first Angry Birds (not required, but the early versions are not the most optimized).
Measure the time from launch to the first sounds of music. Exit. Immediately launch it again. Repeat several times and take the average.
Test "Large Page"
Measure the load time of a heavy web page. You can do it with your favorite browser :)
You can use This link (sorry for the Cyrillic)
This page was saved with a desktop browser, along with its pictures. In total it comes to 6.5 MB and 99 files (I also added a small sound file to the saved version of the page).
Upload all 99 files to the phone. Turn off Wi-Fi and mobile Internet (this is important!)
Open the page in your browser. Press the "Back" button. Now press "Forward" and measure the time until the page is fully loaded. Do this a few times: back and forward, back and forward. As usual, take the average.
All results are given in seconds.
During testing, all devices that support microSD cards used one and the same card, a Transcend 16 GB, class 10, with all the data on it.
Well, here are the actual test results for some devices: TEST RESULT
https://play.google.com/store/apps/details?id=xcom.saplin.xOPS - the app crunches numbers (integer and float) on multiple threads (2x number of cores) and builds performance and CPU temperature graphs.
https://github.com/maxim-saplin/xOPS-Console/blob/master/Saplin.xOPS/Compute.cs - that's the core of the app