Can't execute JavaVM->DetachCurrentThread(): "attempting to detach while still running code"

I have an Android app that uses the NDK: a regular Android Java app with a regular UI and a C++ core. There are places in the core where I need to call Java methods, which means I need a JNIEnv* for that thread, which in turn means that I need to call JavaVM->AttachCurrentThread() to get a valid env.
Previously, I was just calling AttachCurrentThread and didn't bother to detach at all. It worked fine in Dalvik, but ART aborts the application as soon as a thread that has called AttachCurrentThread exits without calling DetachCurrentThread. So I read the JNI reference, and indeed it says that I must call DetachCurrentThread. But when I do that, ART aborts the app with the following message:
attempting to detach while still running code
What's the problem here, and how do I call DetachCurrentThread properly?

Dalvik will also abort if the thread exits without detaching. This is implemented through a pthread key -- see threadExitCheck() in Thread.cpp.
A thread may not detach unless its call stack is empty. The reasoning behind this is to ensure that any resources like monitor locks (i.e. synchronized statements) are properly released as the stack unwinds.
The second and subsequent attach calls are, as defined by the spec, low-cost no-ops. There's no reference counting, so detach always detaches, no matter how many attaches have happened. One solution is to add your own reference-counted wrapper.
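A minimal sketch of what such a reference-counted wrapper might look like, assuming a per-thread nesting counter is enough for your code. ScopedAttach, FakeVm and FakeEnv are hypothetical names; FakeVm only stands in for JavaVM (with simplified signatures) so the sketch builds without jni.h. In real code the three calls would be GetEnv, AttachCurrentThread and DetachCurrentThread on the actual JavaVM.

```cpp
#include <cassert>

// Hypothetical stand-ins so the sketch builds without <jni.h>. FakeVm mimics
// only the three JavaVM calls used below (with simplified signatures) and
// counts how often a real attach or detach happens.
struct FakeEnv {};
constexpr int kOk = 0;         // plays the role of JNI_OK
constexpr int kDetached = -2;  // plays the role of JNI_EDETACHED

struct FakeVm {
    bool attached = false;
    int attachCalls = 0;
    int detachCalls = 0;
    FakeEnv env;

    int GetEnv(FakeEnv** out) {
        *out = attached ? &env : nullptr;
        return attached ? kOk : kDetached;
    }
    int AttachCurrentThread(FakeEnv** out) {
        attached = true;
        ++attachCalls;
        *out = &env;
        return kOk;
    }
    int DetachCurrentThread() {
        attached = false;
        ++detachCalls;
        return kOk;
    }
};

// RAII guard with a per-thread nesting counter. Only the outermost guard on a
// thread attaches, and it detaches again only if the thread was not already
// attached when that guard was created.
class ScopedAttach {
public:
    explicit ScopedAttach(FakeVm& vm) : vm_(vm) {
        const bool outermost = (depth_++ == 0);
        if (vm_.GetEnv(&env_) == kDetached && outermost) {
            attachedHere_ = (vm_.AttachCurrentThread(&env_) == kOk);
        }
    }
    ~ScopedAttach() {
        if (--depth_ == 0 && attachedHere_) {
            attachedHere_ = false;
            vm_.DetachCurrentThread();
        }
    }
    ScopedAttach(const ScopedAttach&) = delete;
    ScopedAttach& operator=(const ScopedAttach&) = delete;

    // nullptr means the attach failed.
    FakeEnv* env() const { return env_; }

private:
    FakeVm& vm_;
    FakeEnv* env_ = nullptr;
    static thread_local int depth_;
    static thread_local bool attachedHere_;
};

thread_local int ScopedAttach::depth_ = 0;
thread_local bool ScopedAttach::attachedHere_ = false;
```

With a guard like this, nested native helpers can each create their own ScopedAttach on the stack. The inner ones are cheap, and a thread that the VM attached itself (a Java thread calling into native code) is never detached by the guard, which is the situation that triggers the abort in the question.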
Another approach is to attach and detach every time. This is used by the app framework on certain callbacks. This wasn't so much a deliberate choice as a side-effect of wrapping Java sources around code developed primarily in C++, and trying to shoe-horn the functionality in. If you look at SurfaceTexture.cpp, particularly JNISurfaceTextureContext::onFrameAvailable(), you can see that when SurfaceTexture needs to invoke a Java-language callback function, it will attach the thread, invoke the callback, and then if the thread was just attached it will immediately detach it. The "needsDetach" flag is set by calling GetEnv to see if the thread was previously attached.
This isn't a great thing performance-wise, as each attach needs to allocate a Thread object and do some internal VM housekeeping, but it does yield the correct behavior.

I'll try a direct, practical approach (sample code, no classes) for the occasional developer who runs into this error on Android, particularly when code that used to work starts failing with this message after an OS or framework (Qt?) update.
JNIEXPORT void Java_com_package_class_function(JNIEnv* env.... {
    JavaVM* jvm;
    env->GetJavaVM(&jvm);
    JNIEnv* myNewEnv = nullptr; // the code may run on a different thread (signal connections, for example), so we get a 'new one'
    JavaVMAttachArgs jvmArgs;
    jvmArgs.version = JNI_VERSION_1_6;
    jvmArgs.name = nullptr;  // no thread name; avoids handing uninitialized fields to the VM
    jvmArgs.group = nullptr; // default thread group
    int attachedHere = 0; // remembers whether detaching at the end is necessary
    jint res = jvm->GetEnv((void**)&myNewEnv, JNI_VERSION_1_6); // checks whether the current thread is already attached
    if (JNI_EDETACHED == res) {
        // Supported but not attached yet, needs to call AttachCurrentThread
        res = jvm->AttachCurrentThread(reinterpret_cast<JNIEnv **>(&myNewEnv), &jvmArgs);
        if (JNI_OK == res) {
            attachedHere = 1;
        } else {
            // Failed to attach, cancel
            return;
        }
    } else if (JNI_OK == res) {
        // Current thread already attached, do not attach 'again'
        // We make sure to keep attachedHere = 0
    } else {
        // JNI_EVERSION: the specified version is not supported, cancel
        return;
    }
    // Execute code using myNewEnv
    // ...
    if (attachedHere) { // Key check
        jvm->DetachCurrentThread(); // Done only when the attachment was done here
    }
}
Everything made sense after seeing the Invocation API docs for GetEnv:
RETURNS:
If the current thread is not attached to the VM, sets *env to NULL, and returns JNI_EDETACHED. If the specified version is not supported, sets *env to NULL, and returns JNI_EVERSION. Otherwise, sets *env to the appropriate interface, and returns JNI_OK.
Credits to:
- The question Getting error "attempting to detach while still running code" when calling JavaVm->DetachCurrentThread, whose example made it clear that the attach state has to be checked every time (even though the example itself doesn't check before calling detach).
- @Michael, who clearly notes in the comments on this question when detach should not be called.
- What @fadden said: "There's no reference counting, so detach always detaches, no matter how many attaches have happened."

Related

Flutter/Dart: Bad State errors when trying to close down a stream pipeline

I'm building a mobile app in Flutter which pipes the mic audio (mic_stream lib) to a websocket. I'm really struggling to close down the stream pipeline when I'm done with it. I'm getting various "Bad State" exceptions such as Cannot add event while adding a stream. The particulars depend on how I set up the pipeline, but the root cause seems to be that the future returned by addStream never completes. Any ideas what would cause that?
As said above, the source stream is from the mic_stream lib, which pulls from native via Flutter's EventChannel.receiveBroadcastStream. The docs for this method say its returned stream will only close down when there are no more listeners. I try closing my websocket and get a similar error for the same reason (websocket internal bad state because addStream never completes). I tried wrapping the mic stream in a StreamController and closing that, but I get the error mentioned above.
Starting to feel like it's a bug. Maybe EventChannel's stream is special? Or is it related to it being a "broadcast" stream?
Feeling stuck. Any help appreciated...thx

Flutter makes this a little confusing by returning a stream from EventChannel that you can't really use in the normal pipeline chaining way if you ever need to close it. Perhaps they should have done internally what I'm about to show as the workaround.
First for clarity, when you use addStream on StreamController (StreamConsumer rather) it blocks you from "manual" control via the add() method and also the close() until that stream completes. This makes sense, if you think about it, since the source stream should determine when it closes. That's why addStream() returns a Future – so you know when you can resume using those methods, or add another stream. Doing so beforehand will trigger the Bad State errors mentioned above.
From the docs for EventChannel::receiveBroadcastStream()...
Stream activation happens only when stream listener count changes from 0 to 1. Stream deactivation happens only when stream listener count changes from 1 to 0.
So we need to decide when it is done, and to do this we need to control its subscription rather than bury it in a pipeline or in a StreamController's private internals via the addStream() method. Instead we'll listen to it directly, capturing the subscription so we can cancel it when we're done. Then we just proxy the data into a StreamController or pipeline manually via add().
Stream<Uint8List> micStream = await MicStream.microphone(
    sampleRate: AUDIO_SAMPLE_RATE,
    channelConfig: ChannelConfig.CHANNEL_IN_MONO,
    audioFormat: AudioFormat.ENCODING_PCM_16BIT); //,
    // audioSource: AudioSource.MIC); // iOS only supports default at the mo'
StreamController? s;
// We need to control the listener subscription
// so we can end this stream as per the docs instructions
final micListener = micStream.listen((event) {
  print('emitting...');
  // Feed the StreamController manually
  s!.add(event);
});
s = StreamController();
// Let the StreamController's close() trigger the listener's cancel()
s!.onCancel = () {
  print("onCancel");
  micListener.cancel();
};
s!.done.whenComplete(() {
  print("done");
});
// Further consumers will use the _StreamController's_ stream,
// _not_ the micStream above
s!.stream.listen((event) => print("listening..."));
// Now we can close the StreamController when we are done.
Future.delayed(Duration(seconds: 3), () {
  s!.close();
});

What is the best way to execute asynchronous code inside CameraX analyze()?

I'm using CameraX's image analysis use case that keeps calling the analyze() method in my custom Analyzer class. Inside analyze(), before doing anything else, I need to send a request to a connected device and wait for its response; the latency is very low and I'm already doing it synchronously with no issues, but I was told it's better to make it asynchronous just in case the device responds too slowly.
I know that MLKit's process() returns a Task<List<T>> and I already call addOnSuccessListener { } on it, so I was wondering whether I can use a similar approach (I can't return a Task<T> from my function; how do I create one?). Otherwise, would you suggest threads, coroutines, or something else?
Edit: below there's a simplified example of what I'm trying to do. For a given frame sent by the camera I just need to perform only the current analysis in line, then I return so that analyze() will be called again with the next frame, on which it will perform the next analysis.
It might look hacky but it's for an app that continuously runs in foreground on a single-purpose device (let's call it Dev A) with no user interaction provided by touch or other conventional means, so it needs some kind of trigger to start doing what is required.
The trigger might as well be when the first image analysis in line is successful, but running MLKit or TFLite models from real time camera feed all day long makes Dev A overheat excessively. The best solution so far seems to be waiting for the trigger to come from an external device (Dev B) that operates independently.
Since Dev B may respond with some delay I need to communicate with it asynchronously, hence the reason for the question in the first place. While there are certainly several architectural nuances to discuss, the current root of the problem is that I can't decide (or rather I don't know) how to handle the repeating "connection" with Dev B in a non-blocking way.
I mean, can I just treat this issue like any other case where multithreading is needed, or the fact that the camera is involved might pose additional threats? The backpressure strategy is set to STRATEGY_KEEP_ONLY_LATEST, so in theory if the current call to analyze() hasn't finished yet the new frames are dropped and nothing bad happens even if inside the method I'm still waiting for the async call to Dev B to finish, or am I missing something?
var connected = false
var result = false // lateinit is not allowed on primitive types such as Boolean
var analysis1 = true
var analysis2 = true
override fun analyze() {
    if (!connected) {
        result = connectToDevice() // needs to be async
        connected = true
    }
    // need positive result to proceed, otherwise start over
    if (!result) {
        connected = false
        return
    }
    if (analysis1) {
        // perform analysis #1...
        analysis1 = false
        // when an analysis is done, exit early and perform next analysis on next frame
        return
    }
    if (analysis2) {
        // perform analysis #2...
        analysis2 = false
        // same as above
        return
    }
    // when all analyses are done, reset all flags to start over
    connected = false
    analysis1 = true
    analysis2 = true
}

Android - JNI / NDK - crash with SIGSEGV - signal handling not triggered

I have Android native C++ code. However, sometimes when I send the app to the background and bring it back, it crashes with SIGSEGV. I want to debug it using my own signal handling and print a stack trace, but when this error occurs, my signal handler is not triggered at all.
In the JNI_OnLoad method, I have added:
struct sigaction sighandler;
memset(&sighandler, '\0', sizeof(sighandler));
sighandler.sa_sigaction = &android_sigaction;
sighandler.sa_flags = SA_SIGINFO;
int watched_signals[] = { SIGABRT, SIGILL, SIGSEGV, SIGINT, SIGKILL };
for (int signal : watched_signals)
{
    sigaction(signal, &sighandler, &old_sa[signal]);
}
And I have:
static struct sigaction old_sa[NSIG];
static void android_sigaction(int signal, siginfo_t *siginfo, void *context)
{
    MY_LOG("Sending PID: %ld, UID: %ld\n", (long)siginfo->si_pid, (long)siginfo->si_uid);
    old_sa[signal].sa_handler(signal);
}
However, android_sigaction is never triggered for the error that occurs when the app returns from the background. I have tried deliberately introducing a bug in the code (writing outside array bounds) and triggering it with a button push, and in that case the callback is called correctly.
What is going on?

Assuming that you're using an Android 5.0+ device, your problem may be caused by ART. It exposes its own signal() and sigaction(), so it has a chance to intercept the signal and pass it somewhere else.
For debugging purposes you could try a direct syscall:
for (int signal : watched_signals)
{
    // __NR_sigaction exists only on 32-bit ABIs; 64-bit kernels provide only rt_sigaction
    syscall(__NR_sigaction, signal, &sighandler, &old_sa[signal]);
}
Now your handler is registered directly with the kernel, and ART shouldn't change it.
Of course, this is OK only for debugging. If you want to use it in production, you need to develop some logic that respects the previous handler.
P.S. Checking the return value and errno is a good idea as well.
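A rough sketch of what "logic that respects the previous handler" could look like with plain sigaction(); going through syscall() would change only the registration call. All names here (install_handler, chain_to_previous, pretend_runtime_handler) are made up for illustration, and this is not a description of how ART chains handlers internally. It also checks the return value, which is how you would notice that SIGKILL from the question's list can never be caught.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>

struct sigaction g_previous[NSIG];         // handlers that were installed before ours
volatile sig_atomic_t g_seen = 0;          // last signal our handler observed
volatile sig_atomic_t g_previous_ran = 0;  // set by the stand-in "older" handler

// Hypothetical older handler, standing in for whatever was registered first.
void pretend_runtime_handler(int) { g_previous_ran = 1; }

// Forward the signal to whatever was registered before us.
void chain_to_previous(int signo, siginfo_t* info, void* context) {
    const struct sigaction& old = g_previous[signo];
    if (old.sa_flags & SA_SIGINFO) {
        old.sa_sigaction(signo, info, context);
    } else if (old.sa_handler == SIG_DFL) {
        sigaction(signo, &old, nullptr);  // restore the default action...
        raise(signo);                     // ...and let it run once we return
    } else if (old.sa_handler != SIG_IGN) {
        old.sa_handler(signo);
    }
}

void on_signal(int signo, siginfo_t* info, void* context) {
    g_seen = signo;  // real code would log here, using async-signal-safe calls only
    chain_to_previous(signo, info, context);
}

// Returns false when the kernel refuses the registration (check errno then).
bool install_handler(int signo) {
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    return sigaction(signo, &sa, &g_previous[signo]) == 0;
}
```

Note that the question's handler calls old_sa[signal].sa_handler(signal) unconditionally, which is itself a crash when the previous disposition was SIG_DFL (a null function pointer on common platforms) or when the previous handler was registered with SA_SIGINFO. The three-way branch above is there to cover those cases.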

Calling a function using the context pointer using JNI on android causes a segfault

I found this bit of code in one of the example Tango projects using the JNI and I have no idea what the context is nor how to use it. The example code works, but my code does not.
void OnXYZijAvailableRouter(void *context, const TangoXYZij *xyz_ij) {
    SynchronizationApplication *app =
        static_cast<SynchronizationApplication *>(context);
    app->OnXYZijAvailable(xyz_ij);
}
I tried mimicking it below:
void OnFrameAvailableRouter(void *context, const TangoCameraId id,
                            const TangoImageBuffer *buffer) {
    SynchronizationApplication *app =
        static_cast<SynchronizationApplication *>(context);
    LOGE("Before onframe call.");
    app->onFrameAvailable(id, buffer);
    LOGE("After onframe call.");
}
When I try to run it, however, I get this output:
Before onframe call.
Fatal signal 11 (SIGSEGV) at 0x00000308 (code=1), thread 15673 (Binder_2)
Now I managed to find the pointer that causes the seg fault, but I have no idea why it does not work.
Naturally, I might have done something wrong, but I have no idea what since I made an exact copy of the code in the example.
int SynchronizationApplication::TangoConnectCallbacks() {
    TangoErrorType depth_ret =
        TangoService_connectOnXYZijAvailable(OnXYZijAvailableRouter);
    depth_ret = TangoService_connectOnFrameAvailable(TangoCameraId::TANGO_CAMERA_COLOR, NULL,
                                                     OnFrameAvailableRouter);
    return depth_ret;
}
The functions I call from the routers.
void OnXYZijAvailable(const TangoXYZij *xyz_ij);
void onFrameAvailable(const TangoCameraId id, const TangoImageBuffer *buffer);
What exactly is the context? I have read some explanations, but I still do not understand why I can call the function using the context in the example above, nor why I need the router function at all. I have read this SO answer and the android page on the concept, but I see no link between the context and my class.

In OnXYZijAvailableRouter (the depth callback), the context is the instance passed in to the TangoService_connect function. I believe the application class should contain a line like this: TangoService_connect(this, tango_config_); so this becomes the context when the callback is called. This context also applies to the pose and event callbacks.
In the case of OnFrameAvailableRouter, the context is the instance you passed to TangoService_connectOnFrameAvailable. Here the code passes NULL as the context, and the callback then tries to call a member function on NULL. That's the crash point; the fault address 0x00000308 is a small offset from NULL, which is what a member access through a null pointer typically looks like.
I believe that if you change it to TangoService_connectOnFrameAvailable(TangoCameraId::TANGO_CAMERA_COLOR, this, OnFrameAvailableRouter); it should work fine.
The router function is needed for the callbacks; I haven't found a way of giving the API a pointer to a member function of an instance. But let me know if you find a way to do that, I would like to know as well.
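To make the role of the context pointer and the router concrete, here is a small self-contained sketch. FakeService, FrameCallback and App are hypothetical stand-ins, not part of the Tango API; the only assumption carried over is that the C-style API stores an opaque void* at registration time and hands it back unchanged on every callback.

```cpp
#include <cassert>

// Hypothetical C-style callback API: it accepts a plain function pointer plus
// an opaque context pointer, and passes that pointer back on each callback.
using FrameCallback = void (*)(void* context, int frameId);

struct FakeService {
    void* context = nullptr;
    FrameCallback callback = nullptr;

    void connect(void* ctx, FrameCallback cb) {
        context = ctx;
        callback = cb;
    }
    void deliver(int frameId) {
        if (callback) callback(context, frameId);
    }
};

class App {
public:
    // Register `this` as the context so the router can find the instance again.
    void connect(FakeService& service) { service.connect(this, &App::router); }
    int lastFrame() const { return lastFrame_; }

private:
    // Static "router": a plain function the C API can call. It recovers the
    // instance from the context and forwards to the real member function.
    static void router(void* context, int frameId) {
        if (context == nullptr) return;  // registering NULL would otherwise crash here
        static_cast<App*>(context)->onFrameAvailable(frameId);
    }
    void onFrameAvailable(int frameId) { lastFrame_ = frameId; }

    int lastFrame_ = -1;
};
```

The router exists because a pointer to a non-static member function is not convertible to an ordinary function pointer: it needs an object to be called on, and the void* context is the only channel the C API offers for carrying that object.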

Android. NDK. How to log calling destructor of global variable?

As we all know, Android doesn't unload *.so libraries after the application is closed. I found a workaround by adding "exit(0)" at the end, which solved the problem, but I want to know for sure that everything is OK.
The code works as expected with the workaround:
static int value = 0;
// In android_main
LOGI("value = %d", value); // always prints 0; without "exit(0)" at the end
                           // it printed 1 on the second run of the application
value = 1;
I want to test that with a class like:
class A {
public:
    A() {
        LOGI("Constructor");
    }
    ~A() {
        LOGI("Destructor");
    }
};
static A a;
This way only "Constructor" is printed.
Maybe that's because the destructor is called at a point when LOGI no longer works for an application that is being closed?
Question: why isn't LOGI in the destructor working? According to the first example above, the destructor really is being called.

This is not only pointless, but quite possibly counterproductive. If Android wants the memory used by your process, it will terminate the process to reclaim it; if it doesn't, it won't.
To specifically address your question: killing a process does not invoke destructors; it merely terminates execution, and the kernel bulk-releases all memory and (conventional) resources. Per the C++ standard, exit() does run the destructors of objects with static storage duration, but not those of objects on the stack.
Do not try to second-guess the system, as that can frequently result in killing a process only to have Android immediately restart it. Further, it can allegedly cause problems with a few Android IPC resources (like the camera), which may not be freed when the process of an application using them dies unexpectedly.
