I tried to use setprop libc.debug.malloc 1 to find a memory leak.
I made a demo program and introduced a memory leak into it, but the above flag does not detect the leak.
I tried the following commands:
adb shell setprop libc.debug.malloc 1
adb shell stop
adb shell start
jstring Java_com_example_hellojni_HelloJni_stringFromJNI(JNIEnv* env, jobject thiz)
{
    int *p = malloc(sizeof(int)); /* allocated but never freed: the intentional leak */
    p[1] = 100;                   /* note: only p[0] was allocated; this write is out of bounds */
    return (*env)->NewStringUTF(env, "Hello from JNI !");
}
Any help would be appreciated.
Thanks
libc.debug.malloc is not valgrind. It tracks native heap allocations, but doesn't really detect leaks directly. It works best in conjunction with DDMS; see this answer for information about using it for native leak chasing (and maybe this older answer).
(Note you can use valgrind on recent versions of Android, but getting it set up can be an adventure.)
FWIW, different levels of libc.debug.malloc are reasonably good at finding use-after-free and buffer overruns:
/* 1  - For memory leak detections.
 * 5  - For filling allocated / freed memory with patterns defined by
 *      CHK_SENTINEL_VALUE, and CHK_FILL_FREE macros.
 * 10 - For adding pre-, and post- allocation stubs in order to detect
 *      buffer overruns.
 */
For example, if you set libc.debug.malloc = 10 and add a free() call to your example above, you'll likely get a warning message from the library because you set p[1] rather than p[0].
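As a concrete illustration of that last point, here is a hedged sketch of the example above with a free() call added but the stray p[1] write left in place; with libc.debug.malloc set to 10, the post-allocation guard gets stomped and the library should complain when the block is freed (the exact message wording varies by Android version):

jstring Java_com_example_hellojni_HelloJni_stringFromJNI(JNIEnv* env, jobject thiz)
{
    int *p = malloc(sizeof(int)); /* one int allocated                      */
    p[1] = 100;                   /* overrun: writes past the allocation    */
    free(p);                      /* level 10 should warn about the overrun */
    return (*env)->NewStringUTF(env, "Hello from JNI !");
}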
I'm writing a kernel driver for a Linux kernel running on Android devices (Nexus 5X).
I have a kernel buffer and I want to expose a device to read from it. I can read and write from the kernel buffer but I cannot write to the userspace buffer received from the read syscall. The very strange thing is that copy_to_user works only for less than 128 bytes... it makes no sense to me.
The code is the following (truncated):
static ssize_t dev_read(struct file *filep, char __user *buffer, size_t len, loff_t *offset)
{
    unsigned long sent;
    // ...
    pr_err("MYLOGGER: copying from buffer: head=%d, tail=%d, cnt=%d, sent=%lu, access=%lu\n",
           head, tail, cnt, sent,
           access_ok(VERIFY_WRITE, buffer, sent));
    if (sent >= 1) {
        sent -= copy_to_user(buffer, mybuf + tail, sent);
        pr_err("MYLOGGER: sent %lu bytes\n", sent);
        // ...
    }
    // ...
}
The output is the following:
[ 56.476834] MYLOGGER: device opened
[ 56.476861] MYLOGGER: reading from buffer
[ 56.476872] MYLOGGER: copying from buffer: head=5666644, tail=0, cnt=5666644, sent=4096, access=1
[ 56.476882] MYLOGGER: sent 0 bytes
As you can see from the log, sent is 4096, so there is no integer overflow here.
When using dd I'm able to read up to 128 bytes per call (dd if=/dev/mylog bs=128). I think that with more than 128 bytes dd uses a buffer from the heap, which the kernel can no longer access, and that is what I cannot understand.
I'm using copy_to_user from the read syscall handler, I've also printed the current->pid and it is the same process.
The kernel sources can be found in the Google Android sources.
The copy_to_user function is defined in arch/arm64/include/asm/uaccess.h, and __copy_to_user can be found in arch/arm64/lib/copy_to_user.S.
Thank you for your time, I hope to get rid of this madness with your precious help.
-- EDIT --
I've written a small snippet to get the vm_area_struct of the destination userspace buffer, and I print out its permissions; this is the result:
MYLOGGER: buffer belongs to vm_area with permissions rw-p
So that address should be writable...
-- EDIT --
I've written more debugging code, logging the state of the memory page used by the userspace buffer.
MYLOGGER: page=(0x7e3782d000-0x7e3782e000) present=1
Long story short, it works when the page is present and therefore will not cause a page fault. This is insanely weird; the page fault should be handled by the virtual memory subsystem, which would load the page into main memory...
For some reason, if the page is not present in memory the kernel will not fetch it.
My best guess is that the cause lies in the __copy_to_user assembly function's exception handler, which returns the number of uncopied bytes.
This exception handler is executed before the virtual memory page-fault handling, so you won't be able to write to userspace unless the pages are already present in memory.
My current workaround is to preload those pages using get_user_pages.
I hope that this helps someone else :)
The problem was that I held a spin_lock.
copy_{to,from}_user must never be called while holding a spin_lock: they may fault and sleep, and sleeping is not allowed while a spinlock is held.
Using a mutex solves the problem.
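A minimal sketch of the fix, for illustration (mylog_lock, mybuf, tail and cnt are placeholders for the real driver state):

#include <linux/fs.h>
#include <linux/kernel.h>
#include <linux/mutex.h>
#include <linux/uaccess.h>

static DEFINE_MUTEX(mylog_lock);   /* replaces the old spinlock */

static ssize_t dev_read(struct file *filep, char __user *buffer,
                        size_t len, loff_t *offset)
{
    size_t sent;

    mutex_lock(&mylog_lock);              /* may sleep, which is fine here */
    sent = min(len, (size_t)cnt);         /* placeholder: bytes available  */
    if (sent > 0) {
        /* copy_to_user may fault and sleep; that is allowed under a mutex,
         * never under a spinlock. */
        sent -= copy_to_user(buffer, mybuf + tail, sent);
    }
    mutex_unlock(&mylog_lock);

    return sent;
}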
I feel so stupid to have wasted days on this...
I am trying to set the Performance Monitor User Mode Enable register on all cpus on a Nexus 4 running a mako kernel.
Right now I am setting the registers in a loadable module:
void enable_registers(void* info)
{
    unsigned int set = 1;
    /* enable user-mode access to the performance counter */
    asm volatile ("mcr p15, 0, %0, c9, c14, 0\n\t" : : "r" (set));
}

int init_module(void)
{
    int online = num_online_cpus();
    int possible = num_possible_cpus();
    int present = num_present_cpus();

    printk(KERN_INFO "Online Cpus=%d\nPossible Cpus=%d\nPresent Cpus=%d\n",
           online, possible, present);
    on_each_cpu(enable_registers, NULL, 1);
    return 0;
}
The problem is that on_each_cpu only runs the function on online CPUs, as shown by the printk statement:
Online Cpus=1
Possible Cpus=4
Present Cpus=4
Only one of the four is online when I call on_each_cpu. So my question is: how do I force a CPU to be online, or how can I force a certain CPU to execute code?
Thanks
You don't need to run the code on every CPU right now. What you need to do is arrange things so that when the offline CPUs come back online, your code runs and enables access to the PMU.
One way to achieve that would be with a CPU hotplug notifier.
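A minimal sketch of that approach, assuming the register_cpu_notifier-style API that kernels of that era provide (newer kernels replaced it with cpuhp_setup_state); enable_registers is the function from the question:

#include <linux/cpu.h>
#include <linux/notifier.h>
#include <linux/smp.h>

static int pmu_cpu_notify(struct notifier_block *nb,
                          unsigned long action, void *hcpu)
{
    unsigned int cpu = (unsigned long)hcpu;

    /* When a CPU comes online, run enable_registers on that CPU. */
    if ((action & ~CPU_TASKS_FROZEN) == CPU_ONLINE)
        smp_call_function_single(cpu, enable_registers, NULL, 1);

    return NOTIFY_OK;
}

static struct notifier_block pmu_cpu_notifier = {
    .notifier_call = pmu_cpu_notify,
};

int init_module(void)
{
    /* Enable on the CPUs that are online right now... */
    on_each_cpu(enable_registers, NULL, 1);
    /* ...and arrange for CPUs that come online later. */
    register_cpu_notifier(&pmu_cpu_notifier);
    return 0;
}

void cleanup_module(void)
{
    unregister_cpu_notifier(&pmu_cpu_notifier);
}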
Here's a simplified version of the code I'm using:
Java:
private native void malloc(int bytes);
private native void free();
// this is called when I want to create a very large buffer in native memory
malloc(32 * 1024 * 1024);
// EDIT: after allocating, we need to initialize it (on the native side) before Android sees it as anything other than a "reservation"
memset(blob, '\0', sizeof(char) * bytes);
...
// and when I'm done, I call this
free()
C:
static char* blob = NULL;

void Java_com_example_MyClass_malloc(JNIEnv* env, jobject this, jint bytes)
{
    blob = (char*) malloc(sizeof(char) * bytes);
    if (NULL == blob) {
        __android_log_print(ANDROID_LOG_DEBUG, DEBUG_TAG, "Failed to allocate memory\n");
    } else {
        char m[50];
        sprintf(m, "Allocated %d bytes", (int)(sizeof(char) * bytes));
        __android_log_print(ANDROID_LOG_DEBUG, DEBUG_TAG, "%s", m);
    }
}

void Java_com_example_MyClass_free(JNIEnv* env, jobject this)
{
    free(blob);
    blob = NULL;
}
Now when I call malloc() from MyClass.java, I would expect to see 32M of memory allocated and that I would be able to observe this drop in available memory somewhere.
I haven't seen any indication of that however, either in adb shell dumpsys meminfo or adb shell cat /proc/meminfo. I am pretty new to C, but have a bunch of Java experience. I'm looking to allocate a bunch of memory outside of Dalvik's heap (so it's not managed by Android/dalvik) for testing purposes. Hackbod has led me to believe that Android currently does not place restrictions on the amount of memory allocated in Native code, so this seems to be the correct approach. Am I doing this right?
You should see an increase in "private / dirty" pages after the memset(). If you have the extra developer command-line utilities installed on the device, you can run procrank or showmap <pid> to see this easily. Requires a rooted device.
Failing that, have the process copy the contents of /proc/self/maps to a file before and after the allocation. (Easiest is to write it to external storage; you'll need the WRITE_EXTERNAL_STORAGE permission in your manifest.) By comparing the map output you should either see a new 32MB region, or an existing region expanding by 32MB. This works because 32MB is above dlmalloc's internal-heap threshold, so the memory should be allocated using a call to mmap().
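For example, a minimal sketch of copying /proc/self/maps to a file from the native side (the destination path is only an illustration; any writable location works, and external storage requires the permission mentioned above):

#include <stdio.h>

/* Copy /proc/self/maps to dst so that before/after snapshots can be diffed. */
static void dump_maps(const char* dst)
{
    FILE* in = fopen("/proc/self/maps", "r");
    FILE* out = fopen(dst, "w");
    char line[512];

    if (in != NULL && out != NULL) {
        while (fgets(line, sizeof(line), in) != NULL)
            fputs(line, out);
    }
    if (in != NULL) fclose(in);
    if (out != NULL) fclose(out);
}

/* Usage: dump_maps("/sdcard/maps_before.txt"); then allocate and memset;
 * then dump_maps("/sdcard/maps_after.txt"); and diff the two files. */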
There is no fixed limit on the amount of memory you can allocate from native code. However, the more you allocate, the tastier you look to the kernel's low-memory process killer.
I've been porting a cross-platform C++ engine to Android, and noticed that it will inexplicably (and inconsistently) block when calling pthread_mutex_lock. This engine has already been working for many years on several platforms, and the problematic code hasn't changed in years, so I doubt it's a deadlock or otherwise buggy code. It must be my port to Android...
So far there are several places in the code that block on pthread_mutex_lock. It isn't entirely reproducible either. When it hangs, there's no suspicious output in LogCat.
I modified the mutex code like this (edited for brevity... real code checks all return values):
void MutexCreate( Mutex* m )
{
#ifdef WINDOWS
InitializeCriticalSection( m );
#else // ANDROID
pthread_mutex_init( m, NULL );
#endif
}
void MutexDestroy( Mutex* m )
{
#ifdef WINDOWS
DeleteCriticalSection( m );
#else // ANDROID
pthread_mutex_destroy( m, NULL );
#endif
}
void MutexLock( Mutex* m )
{
#ifdef WINDOWS
EnterCriticalSection( m );
#else // ANDROID
pthread_mutex_lock( m );
#endif
}
void MutexUnlock( Mutex* m )
{
#ifdef WINDOWS
LeaveCriticalSection( m );
#else // ANDROID
pthread_mutex_unlock( m );
#endif
}
I tried modifying MutexCreate to create error-checking and recursive mutexes, but it didn't matter. I wasn't getting any errors or log output either, so either my mutex code is just fine, or the errors/logs weren't being shown. How exactly does the OS notify you of bad mutex usage?
The engine makes heavy use of static variables, including mutexes. I can't see how, but is that a problem? I doubt it because I modified lots of mutexes to be allocated on the heap instead, and the same behavior occurred. But that may be because I missed some static mutexes. I'm probably grasping at straws here.
I read several references including:
http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html
http://www.embedded-linux.co.uk/tutorial/mutex_mutandis
http://linux.die.net/man/3/pthread_mutex_init
Android NDK Mutex
Android NDK problem pthread_mutex_unlock issue
The "errorcheck" mutexes will check a couple of things (like attempts to use a non-recursive mutex recursively) but nothing spectacular.
You said "real code checks all return values", so presumably your code explodes if any pthread call returns a nonzero value. (Not sure why your pthread_mutex_destroy takes two args; assuming copy & paste error.)
The pthread code is widely used within Android and has no known hangups, so the issue is not likely in the pthread implementation itself.
The current implementation of mutexes fits in 32 bits, so if you print the mutex value (e.g. *(int *) mut) as an integer you should be able to figure out what state it's in (technically, what state it was in at some point in the past). The definition in bionic/libc/bionic/pthread.c is:
/* a mutex is implemented as a 32-bit integer holding the following fields
 *
 * bits:   name     description
 * 31-16   tid      owner thread's kernel id (recursive and errorcheck only)
 * 15-14   type     mutex type
 * 13      shared   process-shared flag
 * 12-2    counter  counter of recursive mutexes
 * 1-0     state    lock state (0, 1 or 2)
 */
"Fast" mutexes have a type of 0, and don't set the tid field. In fact, a generic mutex will have a value of 0 (not held), 1 (held), or 2 (held, with contention). If you ever see a fast mutex whose value is not one of those, chances are something came along and stomped on it.
It also means that, if you configure your program to use recursive mutexes, you can see which thread holds the mutex by pulling the bits out (either by printing the mutex value when trylock indicates you're about to stall, or dumping state with gdb on a hung process). That, plus the output of ps -t, will let you know if the thread that locked the mutex still exists.
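As an illustration, a small sketch that decodes those fields from a mutex's raw 32-bit value (this assumes the bionic layout quoted above, so it only applies to that implementation):

#include <pthread.h>
#include <stdio.h>

/* Decode the bionic mutex layout quoted above:
 * bits 31-16 tid, 15-14 type, 13 shared, 12-2 counter, 1-0 state. */
static void dump_mutex_state(pthread_mutex_t* m)
{
    unsigned int v = *(unsigned int*) m;   /* the mutex's 32-bit value */

    printf("raw=0x%08x tid=%u type=%u shared=%u counter=%u state=%u\n",
           v,
           (v >> 16) & 0xffff,   /* owner tid (recursive/errorcheck only) */
           (v >> 14) & 0x3,      /* mutex type                            */
           (v >> 13) & 0x1,      /* process-shared flag                   */
           (v >> 2)  & 0x7ff,    /* recursion counter                     */
           v & 0x3);             /* lock state: 0, 1 or 2                 */
}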
We (http://www.mosync.com) have compiled our ARM recompiler with the Android NDK; it takes our internal byte code and generates ARM machine code. When executing recompiled code we see an enormous increase in performance, with one small exception: we can't use any Java Bitmap operations.
The native system uses a function which takes care of all the calls to the Java side that the recompiled code makes. On the Java (Dalvik) side we then have bindings to Android features. There are no problems while recompiling the code or when executing the machine code. The exact same source code works on Symbian and Windows Mobile 6.x, so the recompiler seems to generate correct ARM machine code.
Like I said, the problem we have is that we can't use Java Bitmap objects. We have verified that the parameters which are sent from the Java code are correct, and we have tried following the execution down into Android's own JNI system. The problem is that we get an UnsupportedOperationException with "size must fit in 32 bits.". The problem seems consistent from Android 1.5 to 2.3. We haven't tried the recompiler on any Android 3 devices.
Is this a bug which other people have encountered? I guess other developers have done similar things.
I found the message in dalvik_system_VMRuntime.c:
/*
* public native boolean trackExternalAllocation(long size)
*
* Asks the VM if <size> bytes can be allocated in an external heap.
* This information may be used to limit the amount of memory available
* to Dalvik threads. Returns false if the VM would rather that the caller
* did not allocate that much memory. If the call returns false, the VM
* will not update its internal counts.
*/
static void Dalvik_dalvik_system_VMRuntime_trackExternalAllocation(
    const u4* args, JValue* pResult)
{
    s8 longSize = GET_ARG_LONG(args, 1);

    /* Fit in 32 bits. */
    if (longSize < 0) {
        dvmThrowException("Ljava/lang/IllegalArgumentException;",
            "size must be positive");
        RETURN_VOID();
    } else if (longSize > INT_MAX) {
        dvmThrowException("Ljava/lang/UnsupportedOperationException;",
            "size must fit in 32 bits");
        RETURN_VOID();
    }
    RETURN_BOOLEAN(dvmTrackExternalAllocation((size_t)longSize));
}
This method is called, for example, from GraphicsJNI::setJavaPixelRef:
size_t size = size64.get32();
jlong jsize = size;  // the VM wants longs for the size
if (reportSizeToVM) {
    // SkDebugf("-------------- inform VM we've allocated %d bytes\n", size);
    bool r = env->CallBooleanMethod(gVMRuntime_singleton,
                                    gVMRuntime_trackExternalAllocationMethodID,
                                    jsize);
I would say it seems that the code you're calling is trying to allocate too big a size. If you show the actual Java call which fails, and the values of all the arguments that you pass to it, it might be easier to find the reason.
I managed to find a workaround: when I wrap all the Bitmap.createBitmap calls inside Activity.runOnUiThread(), it works.