What are all the possible thread states during execution for native (C/C++) threads on an Android device? Are they the same as the Java Thread States? Are they Linux threads? POSIX threads?
Not required, but bonus points for providing examples of what can cause a thread to enter each state.
Edit: As requested, here's the motivation:
I'm designing the interface for a sampling profiler that works with native C/C++ code on Android. The profiler reports will show thread states over time. I need to know what all the states are in order to a) know how many distinct states I will need to possibly visually differentiate, and b) design a color scheme that visually differentiates and groups the desirable states versus the undesirable states.
I've been told that native threads on Android are just lightweight processes. This agrees with what I've found for Linux in general. Quoting this wiki page:
A process (which includes a thread) on a Linux machine can be in any of the following states:
TASK_RUNNING - The process is either executing on a CPU or waiting to be executed.
TASK_INTERRUPTIBLE - The process is suspended (sleeping) until some condition becomes true. Raising a hardware interrupt, releasing a system resource the process is waiting for, or delivering a signal are examples of conditions that might wake up the process (put its state back to TASK_RUNNING). Typically, blocking I/O calls (disk/network) will result in the task being marked TASK_INTERRUPTIBLE. As soon as the data it is waiting on is ready to be read, an interrupt is raised by the device and the interrupt handler changes the state of the task back to TASK_RUNNING. Processes in idle mode (i.e. not performing any task) should also be in this state.
TASK_UNINTERRUPTIBLE - Like TASK_INTERRUPTIBLE, except that delivering a signal to the sleeping process leaves its state unchanged. This process state is seldom used. It is valuable, however, under certain specific conditions in which a process must wait until a given event occurs without being interrupted. Ideally not too many tasks will be in this state.
For instance, this state may be used when a process opens a device file and the corresponding device driver starts probing for a corresponding hardware device. The device driver must not be interrupted until the probing is complete, or the hardware device could be left in an unpredictable state.
Atomic write operations may require a task to be marked as UNINTERRUPTIBLE
NFS access sometimes results in access processes being marked as UNINTERRUPTIBLE
reads/writes from/to disk can be marked thus for a fraction of a second
I/O following a page fault marks a process UNINTERRUPTIBLE
I/O to the same disk that is being accessed for page faults can result in a process marked as UNINTERRUPTIBLE
Programmers may mark a task as UNINTERRUPTIBLE instead of using INTERRUPTIBLE
TASK_STOPPED - Process execution has been stopped; the process enters this state after receiving a SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU signal.
TASK_TRACED - Process execution has been stopped by a debugger.
EXIT_ZOMBIE - Process execution is terminated, but the parent process has not yet issued a wait4() or waitpid() system call. The OS will not clear zombie processes until the parent issues a wait()-like call.
EXIT_DEAD - The final state: the process is being removed by the system because the parent process has just issued a wait4() or waitpid() system call for it. Changing its state from EXIT_ZOMBIE to EXIT_DEAD avoids race conditions due to other threads of execution that execute wait()-like calls on the same process.
Edit: And yet the Dalvik VM Debug Monitor provides different states. From its documentation:
"thread state" must be one of:
1 - running (now executing or ready to do so)
2 - sleeping (in Thread.sleep())
3 - monitor (blocked on a monitor lock)
4 - waiting (in Object.wait())
5 - initializing
6 - starting
7 - native (executing native code)
8 - vmwait (waiting on a VM resource)
"suspended" [a separate flag in the data structure] will be 0 if the thread is running, 1 if not.
If you are designing a system app that has to work with threads in a more advanced way than a typical app does, I'd start by examining what API is available on Android for accessing threads.
The answer is pthreads (POSIX threads), via the pthread.h header, implemented in the Bionic C library. That gives you a starting point for knowing what you can achieve.
Another thing to note is that Android doesn't implement the full pthread interface, only the subset needed for Android to run.
More on threads + Bionic here, and how they interact with Java and the VM is described here. Also, I believe a thread actually is a process underneath, as my code uses setpriority(PRIO_PROCESS, gettid(), pr); to set a new thread's priority - I don't recall where I got this info, but it works.
I assume that a thread may be in a running, finished, or blocked (e.g. waiting for a mutex) state, but that's my somewhat limited knowledge, since I never needed any other thread state.
Now the question is whether your app can actually retrieve these states using the API available in the NDK, and, if there are more states, whether your users would really be interested in knowing them.
Anyway, you could start by displaying a possibly incomplete set of thread states, and if your users really care, you'll learn about the other states from their feedback and requests.
Google:
Thread.State BLOCKED The thread is blocked and waiting for a lock.
Thread.State NEW The thread has been created, but has never been started.
Thread.State RUNNABLE The thread may be run.
Thread.State TERMINATED The thread has been terminated.
Thread.State TIMED_WAITING The thread is waiting for a specified amount of time.
Thread.State WAITING The thread is waiting.
These states are not very well explained - I don't see the difference between BLOCKED and WAITING, for example.
Interestingly, there is no 'RUNNING' state - do these devices ever do anything?
Related
I have a phone with the Snapdragon 632 Mobile Platform, and a random Android app which shows what your phone has inside (RAM, SoC, sensors, screen density, etc.) says it has 8 cores.
What does that mean from an Android app developer's perspective?
Can I (theoretically) start up to 8 independent processes which can do work in parallel? Or does this have to do with Java's Thread? Or neither, and I should find something else to study :) ?
Q : ...up to 8 independent processes which can do work in parallel?
Well, no. A process-based true-[PARALLEL] execution is far more complex than a just-[CONCURRENT] orchestration of processes (as is well known to every serious multitasking / multiprocessing O/S designer).
Q : What does it mean from Android app developer perspective?
The SoC's 1.8 [GHz] 8-core CPU reported by your system is just one class of resources the O/S has to coordinate all processes' work among - RAM being the next, then storage, RTC device(s), a (global) source of randomness, the light sensor, gyro sensor(s), etc.
All this sharing of resources is the sign of a just-[CONCURRENT] orchestration of processes. Opportunistic scheduling permits a process to go forward once some requested resource (CPU-core, RAM, storage, ...) becomes free to use, and the scheduler permits the next process waiting to do a small part of its work, then release and return all such resources, once either its time-quota expires, a signal-request arrives, some async awaiting forces the process to wait for an external, independently timed event (yes, operations across a network are the typical case), or it was ordered to go to sleep (so why block others, who need not wait and can work during that time).
The O/S may further restrict which CPU-cores a process may use. This way, a physically 8-core CPU might get reported as only a 6-core CPU to some processes, while the other 2 cores were affinity-mapped so that no user-level process will ever touch them. These remain, under any circumstances, free to serve background processes without interfering with the user-level processing bottlenecks that may happen on the remaining, less restricted 6 cores, where both system-level and user-level processes may get scheduled for execution.
On the processor level, further details matter. Some CPUs have SIMD instructions that can process many data items, if properly pre-aligned into SIMD registers, in one single CPU instruction. On the contrary, some 8+ core CPUs have to share just 2 physical FMA units that can multiply-add, each taking a pair of CPU clock cycles. So if all 8+ cores ask for this very same uop at once, well, "Houston, we have a small problem here..." CISC CPU designs have therefore introduced superscalar pipelining with out-of-order instruction re-ordering (RISC designs follow a completely different philosophy to avoid getting into this), so the 2 FMA units process each step just a pair out of the pack of 8 requested FMA-uops, interleaving them, at the uop level, with other (legally re-ordered) instructions. Here you can see that a deeper level of detail can surprise during execution, so HPC and hard real-time system designers have to pay attention even to this LoD ordering, if a System-under-Test has to prove its ultimate robustness for field deployment.
Threads are in principle much lighter than a fully-fledged O/S process, so they are much easier to put on / release from a CPU-core (cf. context-switching), and are therefore typically used for in-process [CONCURRENT] code execution. Threads share the O/S-delivered quota of CPU-time-sharing: when many O/S processes inside the scheduler's queue wait for their turn to execute on the shared CPU (cores), all their respective threads wait too (no sign of thread independence from the mother process). The same scheduling logic applies when an 8-core CPU ought to execute 888 threads spawned from a single O/S process, among 999 other system processes, all waiting in the scheduler queue for their turn. Memory management is also much easier for threads, as they share the same address space inherited from their mother process and can freely access only that address space, under a homogeneous memory-access policy (and restrictions - a thread cannot crash other O/S processes, yet it may devastate its own process's memory state... see thread-safety issues).
Q : ...something else to study :) ?
The best place to learn from the masters is to dive into O/S design practices - the best engineering comes from real-time systems - yet how easy or hard that will be to follow and learn from depends a lot on your endurance and experience.
Non-blocking, independent processes can work in a true-[PARALLEL] fashion, given that no resource blocking appears, and the results are deterministic in the TimeDOMAIN - all start + all execute + all finish - at the same time. Like an orchestra performing a piece by W. A. Mozart.
If a just-[CONCURRENT] orchestration were permitted for the same piece of music, the violins might start only after they had managed to borrow some or all fiddlesticks from the viola players, who might have been waiting in the concert-hall basement, as it was not yet their turn to even enter the dressing room; the piano soloist was still stuck downtown in a traffic jam and would not finish her part of the Concerto Grosso for about the next 3 hours; while the bass players had super-fast fiddled through all their notes, as nobody needed their super-long fiddlesticks, and were almost ready to leave the concert hall and move on to play another "party" in the neighbouring city, as their boss had promised...
Yes, this would be a just-[CONCURRENT] orchestration, where the resulting "performance" always depends on many local [states, parameters] and also heavily on externalities (the availability of taxis, the actual traffic jam and its dynamics, situations like some resource {under|over}-booking).
All that makes a just-[CONCURRENT] execution much simpler to run (no strict coordination of resources is needed - a "best effort", a "do it if and when someone can", typically suffices), but non-deterministic in the ordering of results.
Wolfgang Amadeus Mozart definitely designed his pieces of art in a true-[PARALLEL] fashion of orchestrated performance - this is why we all love Amadeus, and no one would ever dream of letting his work be executed in a just-[CONCURRENT] manner :o) Nobody could tell whether today's non-deterministically performed product was the same piece that was performed, under a different set of external conditions, last night or last week, so no one could say whether it was Mozart's piece at all... God bless, true-[PARALLEL] orchestration never devastates such lovely pieces of art, performing in the very way that every time the same result is (almost guaranteed to be) produced...
I'm using DDMS to monitor threads in my app, and I see that my app has a bunch of native threads, as shown in the following picture. And from time to time, the number of native threads increases as the user interacts with my app, which sometimes causes my app not to behave as I expect. Is there any way to kill these native threads?
There is no such thing as a "native thread" on Android, although some people might use that to refer to threads that are not attached to the VM (which would also make them invisible to DDMS). The threads happen to be executing (or waiting) in native code at the time you did a thread dump, but may spend most of their time executing bytecode. (A list of Dalvik thread states is available here.)
The names of the threads suggest that they were created without being given an explicit name. The one thread with a name, NsdManager, probably exists because you're using NsdManager, whose "responses to requests from an application are on listener callbacks on a seperate thread" [sic].
It's possible that you can glean some useful information from a stack trace. In DDMS, double-click the thread to get a backtrace. On a rooted device, you can kill -3 <pid> to get a full dump, including native stack frames.
Killing arbitrary threads is not allowed, as they might be holding locks or other resources. If you can determine what is starting them, and that they are unnecessary, you can prevent them from being started in the first place.
I'm fighting the known bug in Android that a blocked USB read thread cannot be unblocked - period. Nothing unblocks it; not closing the underlying object (as is typical with sockets), not using NIO and calling FileChannel.close (which sends an exception to the blocked thread), nothing. So I'm stuck crafting up some sort of workaround that tolerates this bug in Android.
The biggest problem is that since the thread won't die, it retains a reference to the underlying FileInputStream object (or the FileChannel object, or whatever you're using). Because that object still exists, you cannot reassociate with the connected USB device. You get the well-known "could not open /dev/usb_accessory" message of despair.
So... since the thread cannot be killed nor interrupted externally, and since it won't wake up on its own to release the object, I'm wondering when such a blocked thread and its associate resources are cleaned up by the operating system. In most OS's the thread would be part of the overall process, and when that process is terminated all threads and objects would get cleaned up at the same time - thus finally releasing the USB connection so something else can associate with it. But experiments suggest that the thread or object may live beyond the process. How, and in what context, I don't know, but so far I'm still getting that "could not open /dev/usb_accessory" message even after the previous process has been terminated (!?!).
So... what finally cleans up everything associated with a process, including all of its threads and instanced objects? How do I "clean the slate" so a new process has a fresh shot at associating with /dev/usb_accessory?
Thanks!
I am working on a strange issue with the i2c-omap driver. I am not sure whether the problem happens at other times, but it happens around 5% of the times I try to power off the system.
During system power-off, I write to some registers in the PMIC via I2C. In i2c-omap.c, I can see that the calling thread is waiting on wait_for_completion_timeout with the timeout set to 1 second. And I can see the IRQ handler call "complete" (I added a printk AFTER "complete"). However, after "complete" gets called, wait_for_completion_timeout does not return. Instead, it takes up to 5 MINUTES before it returns. And the return value of wait_for_completion_timeout is positive, indicating that there was no timeout. And the whole I2C transaction was successful.
In the meantime, I can see printk messages from other drivers. And the serial console still works. It is on Android, and if I use "top" I can see system_server is taking about 95% of the CPU. Killing system_server can make the wait_for_completion_timeout return immediately.
So my question is: what could a user-space app (system_server) do to keep a kernel wait_for_completion_timeout from waking up?
Thanks!
wait_for_completion_timeout only guarantees that the thread waiting on the condition becomes "runnable" when either (i) the completion happens or (ii) the timeout expires.
After that, it is the scheduler's job to pick that thread and change its state from "runnable" to "running". The completion framework only makes the thread runnable; actually running it is up to the scheduler.
As you have pointed out, system_server is consuming 95% of the CPU, making it hard for the completing thread to get scheduled. That explains the delay: the thread was woken, but not scheduled.
Well, I kind of figured it out.
In CFS scheduling, enqueue_entity does "vruntime += min_vruntime" under certain conditions, and dequeue_entity does the opposite under certain conditions. However, these are not always executed in pairs. So under some unknown condition, when min_vruntime is pretty big, vruntime can get pretty big too, so the task is put on the right side of the rbtree and does not get scheduled for a long time.
I am not sure what the best way to fix this at the root cause is. What I did is a hack in enqueue_entity: if I find vruntime > min_vruntime and the function is called for a WAKEUP, I always set vruntime = min_vruntime, so the task is put toward the left side of the rbtree.
The kernel version I am using is 2.6.37.
Does anyone have a suggestion on how this should be fixed in a better way?
I know the accepted, correct solutions for gracefully closing a thread.
But assume my game is supposed to be fault-tolerant, and when a new gameplay is started, it tries to gracefully close the (now paused) thread of the old gameplay. If it fails to join / close it (e.g. because the old thread is buggy and stuck in an infinite loop), it instantiates the new Thread and starts it. But I don't like the fact that the old thread is still there, eating resources.
Is there an accepted way to kill an unresponsive thread without killing the process? It seems to me there isn't; in fact, I read somewhere that a thread might not react to Thread.stop() either.
So there is no way of dealing with a thread stuck in an infinite loop (e.g. due to a bug), is there? Even if it reacts to Thread.stop(), the docs say that Thread.stop() may leave the Dalvik VM in an inconsistent state...
If you need this capability, you must design it and implement it. Obviously, if you don't design and implement a graceful way to shut down a thread, then there will be no way to gracefully shut down a thread. There is no generic solution because the solution is application-specific. For example, it depends on what resources the thread might hold and what shared state the thread may hold locks on or have corrupted.
The canonical answer is this: If you need this capability, don't use threads. Use processes.
The core reason is the way threads work. You acquire a lock and then you manipulate shared data. While you're manipulating that shared data, it can enter an inconsistent state. It is the absolute responsibility of a thread to restore the data to a consistent state before releasing the lock. (Consider, for example, deleting an object from a doubly-linked list. You must adjust the forward link or the reverse link first. In between those two operations, the linked-list is in an inconsistent state.)
Say you have this code:
1. Acquire a lock or enter a synchronized block.
2. Begin modifying the shared state the lock protects.
3. Bug: an unexpected exception is thrown.
4. Return the data the lock protects to a consistent state.
5. Release the lock.
So, now, what do we do? At step 3, the thread holds a lock and it has encountered a bug and triggered an exception. If we don't release the lock it acquired in step 1, every thread that tries to acquire that same lock will wait forever, and we're doomed. If we do release the lock it acquired in step 1, every thread that acquires the lock will then see the inconsistent shared state the thread failed to clean up because it never got to step 4. Either way, we're doomed.
If a thread encounters an exceptional condition the application programmer did not create a sane way to handle, the process is doomed.