Android crashing on specific device - caused by int64 allocation - android

I'm having a problem that only seems to happen on my Lenovo Thinkpad Tablet running Android OS 3.1. I am running a native app using the NDK. The application runs fine in the emulator and on other devices.
Whenever I allocate an int64_t (defined as long long), I get a crash with SIGILL (signal 4). As an example, these lines crash on the device:
int64_t i = 0;
long long j = 0;
I should note, the application runs fine, I can see menus rendering correctly, animating and waiting for input. When I touch, I allocate int64 variables for the timestamps, this is when the crash occurs. Regardless of where I allocate an int64 in this app, I get a crash.
The strange thing is, I loaded up the native-activity sample that comes with the NDK and tried allocating the above data types and it works fine. Both applications have the same Application.mk and very similar Android.mk files. I have also tried cleaning the project.
I am really unsure of what to look at next.

I have solved the problem. This project is a port from an iOS project which has some NEON math classes in it. We use the following flags for NEON support:
-mfpu=neon -mfloat-abi=softfp
We initially used the same flags in the Android project, and that worked. As soon as we got a new test device (the Lenovo Thinkpad Tablet), however, we started getting the crash described above. Since building for armeabi-v5, which doesn't use NEON, worked fine, I knew the crash was related to NEON. It turns out there are better ways to compile for NEON on Android than the flags above. I removed them, so our Android.mk now looks like this:
ifeq ($(TARGET_ARCH_ABI),armeabi-v7a)
LOCAL_CFLAGS += -DHAS_NEON=1
MY_SRC_FILES += myfile.cpp.neon
else
LOCAL_CFLAGS += -DHAS_NEON=0
MY_SRC_FILES += myfile.cpp
endif
This means that only the files that actually need NEON are built with it. The processor inside the Lenovo Thinkpad Tablet (Nvidia Tegra 2) doesn't support NEON, so building every file with NEON enabled evidently generated instructions the processor couldn't execute. Presumably the compiler used NEON registers for the 64-bit integer operations, which would explain why the int64_t lines triggered the SIGILL.
Thanks to Keith for suggesting that I try the other architectures, which led me to my solution.
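As a rough sketch of how a source file might consume the HAS_NEON define from the Android.mk above: the names dot4, dot4Scalar, dot4Neon and cpuHasNeon are hypothetical, and the runtime check assumes the NDK's cpufeatures module (cpu-features.h) is part of the build. Because an armeabi-v7a device such as the Tegra 2 can lack NEON, the sketch checks at run time before taking the NEON path and otherwise falls back to plain scalar code.

```cpp
#include <cassert>

#ifndef HAS_NEON
#define HAS_NEON 0  // the Android.mk above passes -DHAS_NEON=0 or -DHAS_NEON=1
#endif

#if HAS_NEON
#include <arm_neon.h>
#include <cpu-features.h>  // assumption: NDK cpufeatures module is linked in
#endif

// Portable fallback: dot product of two 4-element float vectors.
static float dot4Scalar(const float* a, const float* b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

#if HAS_NEON
// NEON version; only compiled for files built with NEON enabled.
static float dot4Neon(const float* a, const float* b) {
    float32x4_t prod = vmulq_f32(vld1q_f32(a), vld1q_f32(b));
    float32x2_t sum = vadd_f32(vget_low_f32(prod), vget_high_f32(prod));
    sum = vpadd_f32(sum, sum);
    return vget_lane_f32(sum, 0);
}

// Runtime check, since not every ARMv7 CPU implements NEON.
static bool cpuHasNeon() {
    return android_getCpuFamily() == ANDROID_CPU_FAMILY_ARM &&
           (android_getCpuFeatures() & ANDROID_CPU_ARM_FEATURE_NEON) != 0;
}
#endif

// Hypothetical entry point: picks the NEON path only when it is safe.
float dot4(const float* a, const float* b) {
    assert(a != nullptr && b != nullptr);
#if HAS_NEON
    if (cpuHasNeon()) {
        return dot4Neon(a, b);
    }
#endif
    return dot4Scalar(a, b);
}
```

Calling dot4 with {1, 2, 3, 4} and {5, 6, 7, 8} gives 70 on either path; the point of the split is that the rest of the project never needs to be compiled with -mfpu=neon.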

Sounds like your compiler is generating 64-bit instructions that are not implemented by the processor on your machine. Are you cross-compiling? If so, make sure you're targeting the correct version of the ARM (x86?) chip in your tablet.

Related

Opencl can't find gpu on ARM

I'm trying to run an image processing app on Android/ARM using OpenCV's ocl module. In some cases (Android 4.2.2 / Qualcomm Snapdragon MSM8930 / Adreno 305) it runs well.
But in other cases (Android 4.4.2 / Rockchip RK3288 / Mali-T764; Android 4.4 / Samsung Exynos 5410 / PowerVR SGX544MP) there are problems: cv::ocl::getOpenCLDevices() reports that there is no OpenCL platform or device.
I'm sure all three tested systems support OpenCL. Could anyone tell me what the problem is here? Thanks!
Well, I fixed the problem already. Some Android devices don't have the OpenCL library file libOpenCL.so in the file system, or the file has a different name (for example, libGLES_Mali.so). To use OpenCL, we have to set an environment variable first.
Specifically, add
setenv("OPENCV_OPENCL_BINARY", "libGLES_Mali.so", 0);
before calling
cv::ocl::getOpenCLDevices()
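A small sketch of what the third argument to setenv does here, assuming a POSIX environment. The variable name and library name are copied from the answer as written; the helper names hintOpenCLRuntime and currentHint are hypothetical. With overwrite set to 0, a value that is already present in the environment is left alone, so a user-supplied setting still wins over the hard-coded one.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: name the OpenCL library unless a value is already set.
// Returns true when setenv reports success.
bool hintOpenCLRuntime(const char* libraryName) {
    assert(libraryName != nullptr);
    // overwrite == 0: keep any existing OPENCV_OPENCL_BINARY value
    return setenv("OPENCV_OPENCL_BINARY", libraryName, 0) == 0;
}

// Hypothetical helper: read the variable back, or "" when it is unset.
std::string currentHint() {
    const char* value = std::getenv("OPENCV_OPENCL_BINARY");
    return value != nullptr ? std::string(value) : std::string();
}
```

The call has to happen before the first OpenCL query, since the library is looked up when the module first needs it.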

android kernel build (first time)

OK, so first off,
I'm brand new to Android dev. This is my first attempt at any form of kernel anything. I have limited knowledge of Java and Python, but no C.
I have a Galaxy Tab 4 SM-T330NU running 4.4.2. It has a Qualcomm Snapdragon 400 MSM8226 CPU. I'm simply trying to do a test build with a vanilla kernel at this point. (My build environment is the newest Kali 1.1, and I'm loosely following the tutorial at https://github.com/offensive-security/kali-nethunter/wiki/Porting-Nethunter)
So I have all of the required dependencies (I hope), and I've downloaded my source from Samsung Open Source. I unzipped it and went through the available defconfigs. After finding "msm8226-sec_milletwifiue_defconfig" I decided it was the most likely candidate for my tablet. (When doing a custom recovery, I remember it being "philz touch milletwifiue" something.)
I've done my exports (ARCH=, SUBARCH=, CROSS_COMPILE=) and all seems well. When I run a build exactly as the tutorial says (using the defconfig in their example as a test), I receive an error stating "must define variant_defconfig". So I instead run "make VARIANT_DEFCONFIG=msm8974_sec_defconfig" and it builds great.
Now the issue:
When I change "msm8974_sec_defconfig" to my actual msm8226 defconfig, I receive an error on every build that I cannot seem to work around. (Cut down for size.)
CC arch/arm/kernel/armksyms.o
CC arch/arm/kernel/module.o
AS arch/arm/kernel/sleep.o
CC arch/arm/kernel/suspend.o
CC arch/arm/kernel/io.o
arch/arm/kernel/io.c: In function '_memcpy_fromio':
arch/arm/kernel/io.c:14:3: error: implicit declaration of function 'nop' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[1]: *** [arch/arm/kernel/io.o] Error 1
make: *** [arch/arm/kernel] Error 2
My exact bash line reads
make VARIANT_DEFCONFIG=msm8226-sec_milletwifiue_defconfig
Any assistance on clearing this up would be great.
edit
Although I'm not familiar with C, it seems to me that '_memcpy_fromio' is where the error lies, and my Google searches tell me the error means a function is used without being declared. However, I don't know whether memcpy is a function, or whether the function is within a class memcpy (I don't know if C has classes; that's just the closest equivalent I know of). How do I debug this code and declare what needs to be declared? More importantly, if this is a stock kernel that's used by thousands of devices, how can it possibly have an undeclared function?
/edit
Found the answer! I needed:
#include <linux/module.h>
#include <linux/kernel.h>

Android SDL2 App black screen then exits

I have a working SDL2 demo running in Windows, but when I port this code to Android the finished app will display a black screen for few seconds and then quietly exit (no errors, nothing), rather than display the pretty test graphics within a game-loop.
If I add a call to SDL_ShowSimpleMessageBox at the start of my main, nothing happens (I've since learned that it's not implemented for Android yet - grrr), but if I comment-out my main code, ndk-build complains that it's missing, so it's certainly being included in the build, but doesn't appear to get called.
I've followed the steps in README-android.txt. After about 200 hours of solving problems over the past 2 months, I eventually produced an apk. Could one of the following fixes I made have broken something?
The first problem was ndk-build failing to find SDL_config.h, because the instructions at step #2 in README-android.txt are wrong. Fixed by restructuring the directories, or by editing the paths in Android.mk.
The second problem was ndk-build failing to find EGL/eglplatform.h. Fixed by adding APP_PLATFORM := android-9 to Application.mk.
The third problem was ndk-build not recognising C++11, so I added APP_CPPFLAGS += -std=c++11 to Application.mk.
The fourth problem was ndk-build not finding #include <cstdarg> (used for va_list and va_start). Including <SDL.h> instead fixed this.
The fifth problem was the ant build failing at [aapt] Generating resource IDs. I fixed this in the Android SDK Manager by deleting build-tools version 21.1.1 and installing version 20 instead.
This is the first time I've posted for help on here because I'm desperate. I'm a veteran C++ coder but a complete novice when it comes to java. I'm using C++ in eclipse for the Windows SDL2, and I built the apk entirely from the command-line.
Tried on several different devices. Looking at the logcat, there's a signal 7 SIGBUS error:
V/SurfaceView( 3497): Layout: x=0 y=0 w=1280 h=720, frame=Rect(0, 0 - 1280, 720)
F/libc ( 3497): Fatal signal 7 (SIGBUS) at 0x00000000 (code=128)
I/ActivityManager( 162): Displayed org.libsdl.app/.SDLActivity: +416ms
V/SDL ( 3497): onWindowFocusChanged(): true
W/InputManagerService( 162): Starting input on non-focused client
com.android.internal.view.IInputMethodClient$StubProxy@410b72d8 (uid=10021 pid=331)
I've found the problem. Feel free to slap my wrists.
I've always coded with the assumption that I can address memory through pointers that are only byte-aligned (8-bit aligned). This has worked for me for the past 25 years, until now. My code fails when it's targeting Android because whatever CPU it's using refuses to access multi-byte objects through pointers that are only byte-aligned. So as soon as I move a pointer along a byte-aligned buffer, cast that pointer to an object type and dereference it, BAM, Android quietly exits the app.
Now the question is what to do with this question? It logs all the problems and solutions I've battled with over the past 2 months, so must be useful to someone?
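For the unaligned-pointer problem described above, a minimal sketch of the usual portable workaround: copy the bytes into a properly aligned object with memcpy instead of casting the byte pointer and dereferencing it. The helper name readUnaligned is hypothetical, and the sketch assumes the bytes in the buffer are already in the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reads a T from any byte offset. memcpy has no alignment requirement,
// so this is safe where *reinterpret_cast<const T*>(p) may raise SIGBUS.
template <typename T>
T readUnaligned(const void* p) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "readUnaligned only works for trivially copyable types");
    assert(p != nullptr);
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}
```

For example, readUnaligned<std::uint32_t>(buffer + 1) reads four bytes starting at an odd offset. Compilers typically turn the memcpy into a single load on CPUs that allow unaligned access, so the safe form usually costs nothing there.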
I had a similar problem, but quite a different cause.
After starting the app it remained black and seemed to minimize itself after half a second. Switching to the app also minimized it again.
It turns out Makefile syntax highlighting is bugged in SciTE (and also on Stack Overflow, it seems), so that something like this:
# LOCAL_SRC_FILES := $(SDL_PATH)/src/main/android/SDL_android_main.c \
LOCAL_SRC_FILES := $(SDL_PATH)/src/main/android/SDL_android_main.c \
$(PROJECT_PATH)/main.cpp
was in effect commented out completely but not shown as such: the commented line ends with a backslash, so the comment continues onto the lines that follow. I guess this resulted in the main routine from SDL being used instead of mine.
This problem could be spotted by looking into the compiled file with nm -C android-project/obj/local/arm64-v8a/libmain.so and noticing that it contains almost no functions:
00000000000004c4 t atexit
00000000000004ac t __atexit_handler_wrapper
0000000000011008 A __bss_end__
0000000000011008 A _bss_end__
0000000000011008 A __bss_start
0000000000011008 A __bss_start__
U __cxa_atexit@@LIBC
U __cxa_finalize@@LIBC
0000000000010d70 t $d
0000000000011000 d $d
0000000000011000 d __dso_handle
0000000000010d80 a _DYNAMIC
0000000000011008 A _edata
0000000000011008 A _end
0000000000011008 A __end__
0000000000010ff8 a _GLOBAL_OFFSET_TABLE_
00000000000004a0 t __on_dlclose
00000000000004ac t $x
00000000000004a0 t $x
0000000000000480 t $x

Why compile Android kernel module with -fno-pic?

I often read that Android kernel modules have to be compiled with -fno-pic to work. Is this specific to the ARM architecture? If not, why don't kernel modules for x86 need to be compiled with that flag (or when do they)?
According to https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/Code-Gen-Options.html, -fno-pic is the negative form of the -fpic parameter. From the same link:
-fpic Generate position-independent code (PIC) suitable for use in a shared library, if supported for the target machine. Such code
accesses all constant addresses through a global offset table (GOT).
The dynamic loader resolves the GOT entries when the program starts
(the dynamic loader is not part of GCC; it is part of the operating
system). If the GOT size for the linked executable exceeds a
machine-specific maximum size, you get an error message from the
linker indicating that -fpic does not work; in that case, recompile
with -fPIC instead. (These maximums are 8k on the SPARC, 28k on
AArch64 and 32k on the m68k and RS/6000. The x86 has no such limit.)
Position-independent code requires special support, and therefore
works only on certain machines. For the x86, GCC supports PIC for
System V but not for the Sun 386i. Code generated for the IBM RS/6000
is always position-independent.
When this flag is set, the macros __pic__ and __PIC__ are defined to
1.
So -fno-pic means something like "Do not use position-independent code (PIC)."
But why?
Well, by looking at https://developer.arm.com/products/software-development-tools/hpc/documentation/note-about-building-position-independent-code-pic-on-aarch64, we find that:
Using the -fpic compiler flag with GCC compilers on AArch64 causes the
compiler to generate one less instruction per address computation in
the code, and can provide code size and performance benefits. However,
it also sets a limit of 32k for the Global Offset Table (GOT), and the
build can fail at the executable linking stage because the GOT
overflows.
So, in the end, it seems like -fno-pic is more of a precaution than a real need. This, of course, is a guess and there might be more things involved.

Eclipse ADT - Native debug

On my Windows 7 platform, I have the latest version of the ADT bundle (20140321) and the NDK (r9d) installed. The installation is as clean as it gets. The environment variables NDK_ROOT, PATH, etc. are all defined properly.
The application that I am working on has some native code that gets built with the armeabi-v7a ABI. The app platform is android-19. Project settings define the build as "ndk-build NDK_DEBUG=1".
From Eclipse, when I debug my application on my Samsung tablet as "Android Java Application", everything works as expected. However, when I try to debug the app as "Android Native Application", I get the following error:
Attempting to connect debugger to 'com.mycomp.myapp' on port 8604
gdbserver output:
Cannot attach to lwp 28275: Operation not permitted (1)
Exiting
Verify if the application was built with NDK_DEBUG=1
The application runs fine on the device though.
I have looked at various messages on Stack Overflow. However, I could not find any concrete step to fix this problem. I would appreciate it if you could guide me in the right direction. Regards.
I have the same problem. After wasting some time, I found that the problem does not occur when I return from my simple function immediately, without doing any work.
void dmpBuffer(char* msg, unsigned char* buffer, int buffLen)
{
//return;
char szDumpBuffer[128];
for(int i=0; i<buffLen; i++)
sscanf(&szDumpBuffer[i*2], "%02X", buffer[i]);
LOGE("%s:%s", msg, szDumpBuffer);
}
Maybe it is because of memory problems.
Update:
As I had guessed, it was a memory problem.
My BIG mistake was using sscanf instead of sprintf!!!
My problem is solved.
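A sketch of what the corrected dump routine might look like, following the update above: it formats with snprintf rather than sscanf, and it builds a std::string so the output cannot overrun a fixed 128-byte array when buffLen is larger than 64. The name hexDump is hypothetical; LOGE is the poster's own logging macro and is not defined here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats each byte of buffer as two uppercase hex digits.
std::string hexDump(const unsigned char* buffer, std::size_t buffLen) {
    assert(buffer != nullptr || buffLen == 0);
    std::string out;
    out.reserve(buffLen * 2);
    char pair[3];  // two hex digits plus the terminating NUL
    for (std::size_t i = 0; i < buffLen; ++i) {
        // snprintf WRITES formatted text; sscanf would try to PARSE it.
        std::snprintf(pair, sizeof pair, "%02X",
                      static_cast<unsigned>(buffer[i]));
        out += pair;
    }
    return out;
}
```

The original call site would then become something like LOGE("%s:%s", msg, hexDump(buffer, buffLen).c_str());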
