Kmsg timestamps are 500ms in the future - android

I am trying to keep track of when the system wakes and suspends (ideally when monotonic_time starts and stops) so that I can accurately correlate monotonic time-stamps to the realtime clock.
On android the first method that came to mind was to monitor kmsg for a wakeup message and use its timestamp as a fairly accurate mark. As I was unsure of the accuracy of this timestamp, I decided to log the current monotonic time as well.
The following code runs in a standalone executable:
if (fgets(mLineBuffer, sizeof(mLineBuffer), mKmsgFile) != NULL) {
    // Sample the monotonic clock as soon as the line has been read
    clock_gettime(CLOCK_MONOTONIC, &mMono);
    // Find first space
    char *messageContent = strchr(mLineBuffer, ' ');
    // Offset one to get character after space
    if (messageContent != NULL && strncmp(messageContent + 1, "Enabling non-boot CPUs ...", 25) == 0) {
        std::cout << mLineBuffer;
        std::cout << std::to_string(mMono.tv_sec) << "." << std::to_string(mMono.tv_nsec) << "\n";
    }
}
I expected the time returned by clock_gettime to be at some point after the kmsg log timestamp, but instead it is anywhere from 600ms before to 200ms after.
<6>[226692.217017] Enabling non-boot CPUs ...
<6>[226692.626100] Enabling non-boot CPUs ...
<6>[226693.305535] Enabling non-boot CPUs ...
During this particular session, CLOCK_MONOTONIC consistently differed from the kmsg timestamp by roughly -500ms, only once flipping over to +179ms over the course of 10 wakeups. During a later session it was consistently off by -200ms.
The same consistent offset is present when monitoring all kmsg entries during normal operation (not suspending or waking). Perhaps returning from suspend occasionally delays my process long enough to produce a timestamp that is ahead of kmsg, resulting in the single +179ms difference.
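To make the comparison above concrete, here is a minimal sketch of how the bracketed kmsg timestamp could be parsed and subtracted from a CLOCK_MONOTONIC reading. The helper names kmsg_timestamp_us and timespec_to_us are hypothetical, and the sketch assumes the "<level>[seconds.microseconds]" line layout shown in the log excerpt.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>

// Parse a line such as "<6>[226692.217017] message" and return the
// bracketed timestamp in microseconds, or -1 if the layout does not match.
// (Hypothetical helper, assumes six fractional digits as in the excerpt.)
int64_t kmsg_timestamp_us(const char *line) {
    int level = 0;
    long long sec = 0;
    long long usec = 0;
    if (std::sscanf(line, "<%d>[%lld.%6lld]", &level, &sec, &usec) != 3) {
        return -1;
    }
    return sec * 1000000LL + usec;
}

// Convert a timespec (for example from clock_gettime) to microseconds.
int64_t timespec_to_us(const timespec &ts) {
    return static_cast<int64_t>(ts.tv_sec) * 1000000LL + ts.tv_nsec / 1000;
}
```

Subtracting kmsg_timestamp_us(line) from timespec_to_us(mMono) gives the signed offset in microseconds, so a negative value means the kmsg timestamp is ahead of the monotonic reading.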
Is this expected behavior? Does the kernel run on a separate monotonic clock?
Is there any other way to get wakeup/suspend times that correlate to monotonic time?
The ultimate goal is to use this information to help graph the contents of wakeup_sources over time, with a particular focus on activity immediately after waking. Though, if the kmsg timestamps are "incorrect", then the wakeup_sources ones probably are too.
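One possible angle, offered as a sketch rather than a confirmed answer: on Linux, CLOCK_BOOTTIME keeps counting while the system is suspended and CLOCK_MONOTONIC does not, so the difference between the two is the total time spent suspended so far. Polling that difference and watching for jumps is an assumption about how it could be used here; the names to_ns and suspended_ns are hypothetical.

```cpp
#include <cstdint>
#include <ctime>

// Convert a timespec to nanoseconds.
int64_t to_ns(const timespec &ts) {
    return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Total time spent suspended since boot, in nanoseconds.
// CLOCK_MONOTONIC is read first so the result cannot go negative
// merely because of the delay between the two reads.
int64_t suspended_ns() {
    timespec mono{};
    timespec boot{};
    clock_gettime(CLOCK_MONOTONIC, &mono);
    clock_gettime(CLOCK_BOOTTIME, &boot);
    return to_ns(boot) - to_ns(mono);
}
```

If suspended_ns() grows between two samples, a suspend/resume cycle happened in between, and the growth is roughly how long CLOCK_MONOTONIC was stopped.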


Android NDK Sensor strange report interval to event queue

I am trying to access the accelerometer from the NDK. So far it works, but the way events are written to the event queue seems a little strange.
See the following code:
ASensorManager* AcquireASensorManagerInstance(void) {
    typedef ASensorManager *(*PF_GETINSTANCEFORPACKAGE)(const char *name);
    void* androidHandle = dlopen("", RTLD_NOW);
    PF_GETINSTANCEFORPACKAGE getInstanceForPackageFunc = (PF_GETINSTANCEFORPACKAGE) dlsym(androidHandle, "ASensorManager_getInstanceForPackage");
    if (getInstanceForPackageFunc) {
        return getInstanceForPackageFunc(kPackageName);
    }
    typedef ASensorManager *(*PF_GETINSTANCE)();
    PF_GETINSTANCE getInstanceFunc = (PF_GETINSTANCE) dlsym(androidHandle, "ASensorManager_getInstance");
    return getInstanceFunc();
}
void init() {
    sensorManager = AcquireASensorManagerInstance();
    accelerometer = ASensorManager_getDefaultSensor(sensorManager, ASENSOR_TYPE_ACCELEROMETER);
    accelerometerEventQueue = ASensorManager_createEventQueue(sensorManager, looper, LOOPER_ID_USER, NULL, NULL);
    auto status = ASensorEventQueue_enableSensor(accelerometerEventQueue, accelerometer);
    status = ASensorEventQueue_setEventRate(accelerometerEventQueue, accelerometer, SENSOR_REFRESH_PERIOD_US);
}
That's how I initialize everything. My SENSOR_REFRESH_PERIOD_US is 100,000 (100 ms), so 10 refreshes per second. I have the following method to receive the events from the event queue.
vector<sensorEvent> update() {
    ALooper_pollAll(0, NULL, NULL, NULL);
    vector<sensorEvent> listEvents;
    ASensorEvent event;
    while (ASensorEventQueue_getEvents(accelerometerEventQueue, &event, 1) > 0) {
        listEvents.push_back(sensorEvent{event.acceleration.x, event.acceleration.y, event.acceleration.z, (long long) event.timestamp});
    }
    return listEvents;
}
sensorEvent at this point is a custom struct which I use. This update method is called via JNI from Android every 10 seconds from an IntentService (to make sure it runs even when the app itself is killed). I would expect to receive 100 values (10 per second * 10 seconds). In different tests I received around 130, which is also completely fine for me even if it is a bit off. Then I read in the documentation of ASensorEventQueue_setEventRate that the sensor is not forced to follow the given refresh period, so getting more values than I asked for is totally fine.
But now the problem: sometimes I receive only about 13 values in 10 seconds, and when I call update again 10 seconds later I get the usual 130 values plus the missing 117 from the previous run. This happens completely at random, and sometimes the missing values show up not in the next run but in the fourth one after it, or something like that.
I am completely fine with deviating from the refresh period by receiving more values. But can anyone explain why so many values go missing and then appear 10 seconds later in the next run? Or is there a way to make sure I receive them in the run they belong to?
Your code is correct, and I see only one likely cause for this behaviour: to avoid draining the battery, the Android system reduces the rate of the accelerometer event stream some time after the app goes to the background or the device falls asleep.
You need to review all accelerometer-related logic and optimize it for Doze and App Standby.
You can also try working with the accelerometer in a foreground service.
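If late delivery cannot be avoided, one workaround is to assign each event to a run by its own timestamp instead of by the update call that happened to deliver it. The sketch below assumes the timestamps are the nanosecond values stored from event.timestamp; the function name count_per_window and its parameters are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Count events per window, keyed by a window index derived from each
// event's own timestamp (nanoseconds) rather than from its arrival time.
std::map<int64_t, int> count_per_window(const std::vector<int64_t> &timestamps_ns,
                                        int64_t start_ns, int64_t window_ns) {
    std::map<int64_t, int> counts;
    for (int64_t t : timestamps_ns) {
        if (t < start_ns) {
            continue;  // event was measured before the first window began
        }
        counts[(t - start_ns) / window_ns] += 1;
    }
    return counts;
}
```

With window_ns set to 10 seconds, the 117 events that arrive one run late would still be counted in the window in which they were measured.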

Scheduling latency of Android sensors handlers

Rather than an answer, I'm looking for an idea here.
I'd like to measure the scheduling latency of sensor sampling in Android. In particular I want to measure the time from the sensor interrupt request to when the bottom half, which is in charge of the data read, is executed.
Besides the data read, the bottom half already contains a timestamping instruction. Indeed, samples are collected by applications (Java or native, it makes no difference) as a tuple [measurement, timestamp].
The timestamp is taken from the clock source clock_gettime(CLOCK_MONOTONIC, &t).
So, assuming that the bottom half is not preempted, this timestamp gives an indication of the instant at which the task was scheduled. What is missing is a direct or indirect way to find out the corresponding IRQ instant.
It is safe to assume that we can request any sampling rate from the sensor. The driver skeleton is the following (Galaxy S3 gyroscope):
err = request_threaded_irq(data->client->irq, NULL,
"lsm330dlc_gyro", data);
static irqreturn_t lsm330dlc_gyro_interrupt_thread(int irq, void *lsm330dlc_gyro_data_p) {
    struct lsm330dlc_gyro_data *data = lsm330dlc_gyro_data_p;
    res = lsm330dlc_gyro_read_values(data->client,
        &data->xyz_data, data->entries);
    input_report_rel(data->input_dev, REL_RX, gyro_adjusted[0]);
    input_report_rel(data->input_dev, REL_RY, gyro_adjusted[1]);
    input_report_rel(data->input_dev, REL_RZ, gyro_adjusted[2]);
    return IRQ_HANDLED;
}
The key constraint is that I need to (well, I only have enough resources to) perform this measurement from user space, on a commercial device, without touching and recompiling the kernel, and hopefully with a limited impact on the accuracy of the experiment. I don't know whether such an experiment is possible under this constraint, and so far I couldn't figure out any reasonable method.
I might consider also recompiling the kernel if the experiment then becomes straightforward.
First, it's not possible to perform this measurement without touching the kernel.
Second, I didn't see any bottom half configured in your ISR code.
Third, if a bottom half is scheduled and the kernel can be recompiled, you can sample the jiffies value in the ISR and sample it again in the bottom half. Take the difference between the two samples and subtract that offset from the timestamp that is exported to user space.
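The arithmetic in the third point could look like the sketch below. It is only an illustration: irq_instant_ns and its parameters are hypothetical names, the two jiffies samples are assumed to come from a rebuilt kernel, and hz stands for the kernel's configured HZ value.

```cpp
#include <cstdint>

// Estimate the interrupt instant by subtracting the ISR-to-bottom-half
// delay (measured in jiffies) from the timestamp taken in the bottom half.
int64_t irq_instant_ns(int64_t bottom_half_timestamp_ns,
                       uint64_t jiffies_at_isr,
                       uint64_t jiffies_at_bottom_half,
                       int64_t hz) {
    const int64_t ns_per_jiffy = 1000000000LL / hz;
    const int64_t latency_ns =
        static_cast<int64_t>(jiffies_at_bottom_half - jiffies_at_isr) * ns_per_jiffy;
    return bottom_half_timestamp_ns - latency_ns;
}
```

Keep in mind that the result cannot be finer than one jiffy (10 ms at HZ=100), which may be too coarse for a scheduling-latency measurement.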

Accurate POSIX thread timing using NDK

I'm writing a simple NDK OpenSL ES audio app that records the user's touches on a virtual piano keyboard and then plays them back forever over a set loop. After much experimenting and reading, I've settled on using a separate POSIX thread loop to achieve this. As you can see in the code, it subtracts any processing time taken from the sleep time in order to make the interval of each loop as close to the desired sleep interval as possible (in this case 5000000 nanoseconds).
void init_timing_loop() {
    pthread_t fade_in;
    pthread_create(&fade_in, NULL, timing_loop, (void*)NULL);
}
void* timing_loop(void* args) {
    while (1) {
        clock_gettime(CLOCK_MONOTONIC, &timing.start_time_s);
        tic_counter(); // simple logic gates that cycle the current tic
        play_all_parts(); // for-loops through all parts and plays any notes (From an OpenSL buffer) that fall on the current tic
        clock_gettime(CLOCK_MONOTONIC, &timing.finish_time_s);
        timing.diff_time_s.tv_nsec = (5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec));
        nanosleep(&timing.diff_time_s, NULL);
    }
    return NULL;
}
The problem is that even with this approach the results are better but still quite inconsistent. Sometimes notes are delayed by perhaps as much as 50ms at a time, which makes for very wonky playback.
Is there a better way of approaching this? To debug I ran the following code:
gettimeofday(&timing.curr_time, &timing.tzp);
__android_log_print(ANDROID_LOG_DEBUG, "timing_loop", "gettimeofday: %d %d",
timing.curr_time.tv_sec, timing.curr_time.tv_usec);
Which gives a fairly consistent readout - that doesn't reflect the playback inaccuracies whatsoever. Are there other forces at work with Android preventing accurate timing? Or is OpenSL ES a potential issue? All the buffer data is loaded into memory - could there be bottlenecks there?
Happy to post more OpenSL code if needed... but at this stage I'm trying to figure out if this thread loop is accurate or if there's a better way to do it.
You should take the seconds into account when using clock_gettime as well: timing.start_time_s.tv_nsec may be greater than timing.finish_time_s.tv_nsec, because tv_nsec restarts from zero whenever tv_sec is incremented. Instead of
timing.diff_time_s.tv_nsec =
(5000000 - (timing.finish_time_s.tv_nsec - timing.start_time_s.tv_nsec));
try something like this for the elapsed time:
#define NS_IN_SEC 1000000000LL
(timing.finish_time_s.tv_sec * NS_IN_SEC + timing.finish_time_s.tv_nsec) -
(timing.start_time_s.tv_sec * NS_IN_SEC + timing.start_time_s.tv_nsec)
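As a fuller sketch of the same idea, the elapsed time can be computed from both fields and then turned into the remaining sleep time. The names elapsed_ns and remaining_sleep are hypothetical, and clamping an overrun to zero is an assumption about what the loop should do when processing takes longer than the interval.

```cpp
#include <cstdint>
#include <ctime>

const int64_t NS_IN_SEC = 1000000000LL;

// Elapsed time between two clock_gettime readings, using both
// tv_sec and tv_nsec so a seconds rollover is handled correctly.
int64_t elapsed_ns(const timespec &start, const timespec &finish) {
    return (static_cast<int64_t>(finish.tv_sec) - start.tv_sec) * NS_IN_SEC
           + (finish.tv_nsec - start.tv_nsec);
}

// Time left in the interval, as a timespec suitable for nanosleep().
// If the work overran the interval, the result is zero.
timespec remaining_sleep(const timespec &start, const timespec &finish,
                         int64_t interval_ns) {
    int64_t left = interval_ns - elapsed_ns(start, finish);
    if (left < 0) {
        left = 0;
    }
    timespec out{};
    out.tv_sec = static_cast<time_t>(left / NS_IN_SEC);
    out.tv_nsec = static_cast<long>(left % NS_IN_SEC);
    return out;
}
```

In the loop from the question this would be used as timing.diff_time_s = remaining_sleep(timing.start_time_s, timing.finish_time_s, 5000000); right before the nanosleep call.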

What is the meaning of Incl CPU Time, Excl CPU Time, Incl Real CPU Time, Excl Real CPU Time in traceview?

1) Exclusive time is the time spent in the method
2) Inclusive time is the time spent in the method plus the time spent in any called functions
3) We refer to calling methods as "parents" and called methods as "children."
My question is: what is the difference between
Incl CPU Time and Incl Real CPU Time?
Excl CPU Time and Excl Real CPU Time?
In one of my example trace files,
for Method1(): Incl CPU Time = 242 msec and Incl Real CPU Time = 5012 msec.
I cannot identify the reason behind the 5012 - 242 = 4770 msec gap between these two times.
Please help me if you have any idea.
Here's the DDMS documentation
Incl CPU time is the inclusive cpu time. It is the sum of the time spent in the function itself, as well as the sum of the times of all functions that it calls.
Excl CPU time is the exclusive cpu time. It is only the time spent in the function itself. You'll notice that it is always the same as the "incl time" of the "self" child.
The documentation doesn't clarify the difference between CPU time and real time, but I agree with Neetesh that CPU time is the time that the function is actually running (this would not include waiting on IO) and the real time is the wall clock time (which would include time spent doing IO).
CPU time is the time for which the process uses the CPU, and real CPU time is the total time from the start of the process to its end; it includes the time the process spends waiting to execute.
From the source code behind .trace files, you can see how CPU time differs from real CPU time, and it matches the description in the Android documentation:
CPU time considers only the time that the thread is actively using CPU time, and real time provides absolute timing information from the moment your app enters a method to when it exits that method—regardless of whether the thread is active or sleeping.
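The distinction can be reproduced outside traceview with two POSIX clocks. This is a sketch of the general idea, not of how traceview measures anything: CLOCK_THREAD_CPUTIME_ID stands in for CPU time, CLOCK_MONOTONIC for real time, and the names Timing, now_ns and time_a_sleep are hypothetical.

```cpp
#include <cstdint>
#include <ctime>

// Read the given clock and return its value in nanoseconds.
int64_t now_ns(clockid_t clock) {
    timespec ts{};
    clock_gettime(clock, &ts);
    return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

struct Timing {
    int64_t cpu_ns;   // time the thread actually spent on a CPU
    int64_t real_ns;  // wall-clock time that passed
};

// Sleep for sleep_ms and report how much CPU time and real time passed.
Timing time_a_sleep(long sleep_ms) {
    const int64_t cpu0 = now_ns(CLOCK_THREAD_CPUTIME_ID);
    const int64_t real0 = now_ns(CLOCK_MONOTONIC);
    timespec req{};
    req.tv_sec = sleep_ms / 1000;
    req.tv_nsec = (sleep_ms % 1000) * 1000000L;
    nanosleep(&req, nullptr);
    Timing t;
    t.cpu_ns = now_ns(CLOCK_THREAD_CPUTIME_ID) - cpu0;
    t.real_ns = now_ns(CLOCK_MONOTONIC) - real0;
    return t;
}
```

For a thread that mostly sleeps or waits, cpu_ns stays small while real_ns covers the whole wait, which may be the kind of gap seen between 242 msec and 5012 msec in the question.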
Just as Chris and David said, I did a test.
#include <unistd.h>
#define S ((long long)1000 * 1000 * 1000)
// My CPU frequency is 3 GHz
void run() {
    for (int i = 0; i < S; ++i);
}
void g() {
    for (int i = 0; i < S; ++i);
}
int main() {
    // run();
    return 0;
}
As you can see, the inclusive time of function g is 8 s and its exclusive time is 2 s.

clock_gettime can not update instantly

After checking the time resolution, we tried to debug the problem in kernel space.
unsigned long long task_sched_runtime(struct task_struct *p)
{
    unsigned long flags;
    struct rq *rq;
    u64 ns = 0;
    rq = task_rq_lock(p, &flags);
    ns = p->se.sum_exec_runtime + do_task_delta_exec(p, rq);
    task_rq_unlock(rq, &flags);
    //printk("task_sched runtime\n");
    return ns;
}
Our new experiment shows that p->se.sum_exec_runtime is not updated instantly. But if we add a printk() inside the function, the time is updated instantly.
We are developing an Android program.
However, the time measured by the function threadCpuTimeNanos() is not always correct on our platform.
After experimenting, we found that the time returned from clock_gettime is not updated instantly.
Even after several while loop iterations, the time we get still doesn't change.
Here's our sample code:
test = 1;
test = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
printf(" clock gettime test 1 %lx, %lx , ret = %d\n",now.tv_sec , now.tv_nsec,test );
pre = now.tv_nsec;
This code runs okay on an x86 PC. But it does not run correctly in our embedded platform ARM Cortex-A9 with kernel
Any ideas?
I changed the clock_gettime call to use CLOCK_MONOTONIC_RAW and assigned the thread to one CPU, and now I get different values.
I am also working with a dual Cortex-A9.
test = 1;
test = clock_gettime(CLOCK_MONOTONIC_RAW, &now);
printf(" clock gettime test 1 %lx, %lx , ret = %d\n",now.tv_sec , now.tv_nsec, test );
pre = now.tv_nsec;
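The answer does not show how the thread was assigned to one CPU. On Linux one way is sched_setaffinity, sketched below. Pinning to whichever CPU the thread is currently on is an assumption made for this example, and pin_to_current_cpu is a hypothetical name.

```cpp
#include <sched.h>

// Restrict the calling thread to the CPU it is currently running on.
// Returns that CPU's index, or -1 on failure.
int pin_to_current_cpu() {
    const int cpu = sched_getcpu();
    if (cpu < 0) {
        return -1;
    }
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // A pid of 0 means "the calling thread".
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        return -1;
    }
    return cpu;
}
```

Calling this once before the clock_gettime loop keeps all readings on a single CPU, which removes cross-CPU timer offsets as a possible explanation.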
The resolution of clock_gettime is platform dependent. Use clock_getres() to find the resolution on your platform. According to the results of your experiment, clock resolutions on pc-x86 and on your target platform are different.
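A minimal sketch of that check, with clock_resolution_ns as a hypothetical helper name:

```cpp
#include <cstdint>
#include <ctime>

// Resolution of the given clock in nanoseconds,
// or -1 if the clock is not supported on this platform.
int64_t clock_resolution_ns(clockid_t clock) {
    timespec res{};
    if (clock_getres(clock, &res) != 0) {
        return -1;
    }
    return static_cast<int64_t>(res.tv_sec) * 1000000000LL + res.tv_nsec;
}
```

Running it for CLOCK_THREAD_CPUTIME_ID and CLOCK_MONOTONIC on both the x86 PC and the target board shows whether the two platforms report different resolutions.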
In the Android CTS there is a test case with the same problem: it reads the timer twice, but both readings are the same.
testThreadCpuTimeNanos fail junit.framework.AssertionFailedError at
$man clock_gettime
Note for SMP systems
The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU.
If the CPUs in an SMP system have different clock sources then there is no way to maintain a correlation between the timer registers since each CPU will run at a slightly different frequency. If that is the case then clock_getcpuclockid(0) will return ENOENT to signify this condition. The two clocks will then only be useful if it can be ensured that a process stays on a certain CPU.
The processors in an SMP system do not start all at exactly the same time and therefore the timer registers are typically running at an offset. Some architectures include code that attempts to limit these offsets on bootup. However, the code cannot guarantee to accurately tune the offsets. Glibc contains no provisions to deal with these offsets (unlike the Linux Kernel). Typically these offsets are small and therefore the effects may be negligible in most cases.
The CLOCK_THREAD_CPUTIME_ID clock measures CPU time spent, not realtime, and you're spending almost-zero CPU time. Also, CLOCK_THREAD_CPUTIME_ID (the thread-specific CPU time) is implemented incorrectly on Linux/glibc and likely does not even work at all on glibc. CLOCK_PROCESS_CPUTIME_ID or whatever that one's called should work better.

