Why compile Android kernel module with -fno-pic?

I often read that Android kernel modules have to be compiled with -fno-pic to work. Is this specific to the ARM architecture, or why don't/(when do) kernel modules for x86 need to be compiled with that flag?

According to https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/Code-Gen-Options.html, -fno-pic is the negative form of the -fpic parameter. From the same link:
-fpic Generate position-independent code (PIC) suitable for use in a shared library, if supported for the target machine. Such code
accesses all constant addresses through a global offset table (GOT).
The dynamic loader resolves the GOT entries when the program starts
(the dynamic loader is not part of GCC; it is part of the operating
system). If the GOT size for the linked executable exceeds a
machine-specific maximum size, you get an error message from the
linker indicating that -fpic does not work; in that case, recompile
with -fPIC instead. (These maximums are 8k on the SPARC, 28k on
AArch64 and 32k on the m68k and RS/6000. The x86 has no such limit.)
Position-independent code requires special support, and therefore
works only on certain machines. For the x86, GCC supports PIC for
System V but not for the Sun 386i. Code generated for the IBM RS/6000
is always position-independent.
When this flag is set, the macros __pic__ and __PIC__ are defined to
1.
So -fno-pic means something like "Do not use position-independent code (PIC)."
But why?
Well, by looking at https://developer.arm.com/products/software-development-tools/hpc/documentation/note-about-building-position-independent-code-pic-on-aarch64, we find that:
Using the -fpic compiler flag with GCC compilers on AArch64 causes the
compiler to generate one less instruction per address computation in
the code, and can provide code size and performance benefits. However,
it also sets a limit of 32k for the Global Offset Table (GOT), and the
build can fail at the executable linking stage because the GOT
overflows.
So, in the end, it seems like -fno-pic is more of a precaution than a real need. This, of course, is a guess and there might be more things involved.
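A quick way to see which mode a given source file was actually built in is the __PIC__ macro mentioned in the GCC excerpt above. The following is a minimal sketch, not part of any kernel build system; the helper name pic_level is hypothetical. GCC defines __PIC__ to 1 for -fpic and to 2 for -fPIC, and leaves it undefined for -fno-pic.

```c
/* Hypothetical helper: reports the PIC level this file was compiled with.
 * Returns 0 when built with -fno-pic (macro undefined),
 * 1 for -fpic and 2 for -fPIC, following GCC's definition of __PIC__. */
static int pic_level(void)
{
#if defined(__PIC__)
    return __PIC__;
#else
    return 0;
#endif
}
```

A module source file could, for instance, turn a non-zero level into a build failure with an #error directive if the build is expected to use -fno-pic.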

Related

Linking errors when using functions from <complex.h> with API level 22

I'm porting a C and C++ library that currently works on iOS to be used on an Android application. I'm down to these last 3 linker errors (obfuscated for privacy reasons):
/Users/fer662/projects/xxx/jni/xxx_preprocessing.c:10184: error: undefined reference to 'cexp'
/Users/fer662/projects/xxx/jni/xxx_preprocessing.c:10184: error: undefined reference to 'cpowf'
/Users/fer662/projects/xxx/jni/xxx_preprocessing.c:10285: error: undefined reference to 'cabs'
Now I understand these normally come from linking with libm.so (-lm), but I'm doing that already. If I go and check the offending .so with nm:
nm -g /Users/fer662/Library/Android/sdk/ndk-bundle/platforms/android-22/arch-x86/usr/lib/libm.so | grep cpow
Nothing comes back. It DOES, however, if I use API 28:
nm -g /Users/fer662/Library/Android/sdk/ndk-bundle/platforms/android-28/arch-x86/usr/lib/libm.so | grep cpow
00003900 T cpow
00003910 T cpowf
00003920 T cpowl
Also, in the static library it does show, even on api 22:
nm -g /Users/fer662/Library/Android/sdk/ndk-bundle/platforms/android-22/arch-x86/usr/lib/libm.a | grep cpow
s_cpow.o:
00000000 T cpow
s_cpowf.o:
00000000 T cpowf
s_cpowl.o:
00000000 T cpowl
The inconsistency is puzzling. Shouldn't it be missing from the header altogether if not supported? Why does the static lib have it and the dylib not?
Would it make sense to statically link against it? And if so, how would I do it, taking into account the right path for the current api version?
My other option seems to go steal an implementation of libm (say http://openlibm.org/) or just these 3 functions I'm using from it.

tl;dr: yes, static linking libm.a should be fine
Check the libm.map.txt file: https://android.googlesource.com/platform/bionic/+/master/libm/libm.map.txt#289
These functions weren't added to Android until O.
Also, in the static library it does show, even on api 22
The static library isn't an API 22 static library. It's actually a ToT (tip-of-tree) build from AOSP. If you're going to static link something, there's no point in using something old.
There's actually only one version of libc.a/libm.a per ABI. It is duplicated into each API directory because build systems made for old NDKs expect it. If you look at the unified toolchain in r19 (toolchains/llvm/prebuilts/$HOST), you'll see that there's only one copy per ABI.
The inconsistency is puzzling. Shouldn't it be missing from the header altogether if not supported? Why does the static lib have it and the dylib not?
The header has an ifdef guard that hides it: https://android.googlesource.com/platform/prebuilts/ndk/+/dev/platform/sysroot/usr/include/complex.h#237
If you had a declaration for these functions and you think you were building for API 22, there's something wrong with your build system.
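The effect of such a guard can be pictured with a small sketch. __ANDROID_API__ is the macro the NDK headers key on; the threshold of 26 (assuming "O" means API level 26) and both helper names are assumptions for illustration, not text copied from complex.h.

```c
/* Assumption: Android O corresponds to API level 26. */
#define COMPLEX_MATH_MIN_API 26

/* Hypothetical helper mirroring the decision a header guard such as
 * "#if __ANDROID_API__ >= 26" makes at compile time. */
static int shared_libm_has_cpow(int api_level)
{
    return api_level >= COMPLEX_MATH_MIN_API;
}

/* API level this translation unit targets, or 0 when not building
 * for Android (the macro is only defined by the NDK toolchain). */
static int target_api_level(void)
{
#ifdef __ANDROID_API__
    return __ANDROID_API__;
#else
    return 0;
#endif
}
```

With these helpers, shared_libm_has_cpow(22) is false and shared_libm_has_cpow(28) is true, which matches the nm output in the question.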
Would it make sense to statically link against it? And if so, how would I do it, taking into account the right path for the current api version?
In general for these sorts of problems this isn't a good solution since the Zygote has already loaded a libc, and loading another one can lead to all sorts of issues since they can conflict. Additionally, much of libc's networking is actually dispatched to netd, and the protocol between libc and netd has changed in the past (and is unfortunately not a versioned protocol).
Building with libc.a is only viable with standalone executables (think strace and gdbserver) rather than apps, and even then only if you don't need networking.
That said, libm.a is much simpler. The complex interactions that make libc.a unusable for apps don't affect libm. The only time you'll end up actually running code in libm is when the compiler somehow failed to inline the operation. Static linking libm.a into your application should be fine.

ffmpeg for Android: neon build has text relocations

Hi, I successfully built the appunite ffmpeg library including arm-v7a neon support. However, when I try to run the libraries on my Marshmallow device I get this error:
01-08 23:42:02.350: E/AndroidRuntime(10144): java.lang.UnsatisfiedLinkError:
dlopen failed: /data/app/com.example.demo-1/lib/arm/libffmpeg-neon.so: has text relocations
When I use the non-neon builds it works without any problems.
So I googled a bit and found out that this is probably a bug in the corresponding C/C++ code, but that it should be fixed when rebuilt with NDK r10e. This is what I did. But I still get these text relocations:
~/Projekte/AndroidFFmpeg$ /usr/Android/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-readelf -a library/src/main/jniLibs/armeabi-v7a/libffmpeg-neon.so | grep TEXTREL
0x00000016 (TEXTREL) 0x0
0x0000001e (FLAGS) SYMBOLIC TEXTREL
These questions seemed relevant, but didn't help:
"ffmpeg has text relocations" error in Android
libavcodec.so: has text relocations
How do I fix that?

This should be fixed already (since commit https://git.libav.org/?p=libav.git;a=commitdiff;h=f963f80399d, December 2014), so make sure you build a new enough version and it should be fine.
arm, aarch64 and x86_64 should all work fine without text relocations, but for 32 bit x86, you can't easily avoid it. (For x86, the simplest way around it is to do --disable-asm, but that does give quite a bit of performance loss.)

Native Android - Can't locate undefined method when loading a shared lib even though previously loaded shared lib contains the definition

I am building an Android app that loads in 2 native shared libraries at runtime: 1 that was built with an unresolved symbol in it and the other which resolves and defines that symbol. In Java, I load the shared library that defines the symbol first, then load the library that has the symbol declared as unresolved, and at this point, the runtime fails with:
"Cannot load library: reloc_library[]: 33 cannot locate 'someMethod'
So here's the one unique difference. The shared library with the undefined symbol obviously doesn't know about the shared library with the definition for the symbol in it.
I just assumed that if I loaded the library with the definition of the method first that when I loaded the 2nd library that called the method, it would be able to find it. Am I wrong on that? It seems in my case, an explicit dependency HAS to be compiled in between the two native libs, which means (I think) making .so's with unresolved symbols is useless.
I have searched vigorously for a similar issue with no luck. I think my problem is due to an architectural limitation, and I am considering approaching it a couple of other ways, but I would like to know if it can be fixed simply.
To be sure it wasn't some complexity of the library itself, I created two very simple C files:
fcn_defined.c:
int someMethod()
{
    return 1;
}
fcn_undefined.c:
extern int someMethod();
int someOtherMethod()
{
    /* forward the result so the function actually returns a value */
    return someMethod();
}
Then build two shared objects where the fcn_undefined.c code creates a .so with someMethod still undefined and fcn_defined.c builds a .so with someMethod defined:
gcc -o libfcn_undefined.so fcn_undefined.c -shared -Wl,--export-dynamic
gcc -o libfcn_defined.so fcn_defined.c -shared -Wl,--export-dynamic
Doing a nm on these produces:
libfcn_undefined.so:
00001f08 d _DYNAMIC
00001fe8 d _GLOBAL_OFFSET_TABLE_
00002004 A __bss_start
U __cxa_atexit
U __cxa_finalize
00002000 d __dso_handle
00000290 t __on_dlclose
00002004 A _edata
00002004 A _end
000002a0 t atexit
000002b4 T someOtherMethod
U someMethod
and libfcn_defined.so:
00001f0c d _DYNAMIC
00001fec d _GLOBAL_OFFSET_TABLE_
00002004 A __bss_start
U __cxa_atexit
U __cxa_finalize
00002000 d __dso_handle
0000025c t __on_dlclose
00002004 A _edata
00002004 A _end
0000026c t atexit
00000280 T someMethod
So you can see someMethod() is defined in libfcn_defined.so (and it appears in the read elf dynsym section) and is undefined in the other lib.
If anyone is interested in the readelf output, I can add that as well.
In the Java side, I have a simple button in the emulator that I click, and it creates a class with the following in it:
static
{
    System.loadLibrary("fcn_defined");
    System.loadLibrary("fcn_undefined");
}
Just out of curiosity, I added "-lfcn_defined" to the fcn_undefined compile line and compared the nm and readelf outputs. The only difference in nm was that "T someOtherMethod" started a few bytes further out, and the readelf difference was the "NEEDED" line for fcn_defined. That's about what I expected, and with this dependency in place it doesn't crash.
That's pretty much the full explanation. I did find some details about how Android forces you to load your libraries in reverse dependency order in Java, because it has (rather it had, has been fixed in API 18) no reference to your app's lib path in the LD_LIBRARY_PATH envvar. Unfortunately, I am requiring a minimum API lvl 10 to be able to use my app because of the market penetration, and secondly I tried API 19 anyway, and it still fails.
If I had to guess, I believe Android just doesn't support finding a symbol if you haven't explicitly told it to look at library X for the symbol. In other words, because I didn't build the library fcn_undefined with an explicit dependency on libfcn_defined.so, Android can't resolve it. Does anyone know if this is a bug or by design? Is this normal? It seems like you wouldn't have the option to create a .so with unresolved symbols if this was the case, and even funnier is that the Android NDK toolchain I'm using to build this has this feature on by default when you use ld (it doesn't complain about unresolved), and I tried turning the feature off but didn't seem to do anything, no warnings or errors generating the library.
So you may ask why I don't just compile the fcn_undefined library with a dependency on the fcn_defined library. Well that gets into a much bigger architectural discussion. The code I'm working with (fcn_undefined.c in this example) is a python extension built with a cross compiled python toolchain for ARM, and I'm calling this library from an NDK library, so now the NDK library depends on the python module which has an unresolved method in Python, which is defined in a static lib. Linking the static lib into the NDK shared lib means that I can't load the native shared libs in the correct order in Java (due to the issue mentioned previously that they fixed in API 18). I'm trying to work with the existing system since a team of others use it, and it is used to build for many platforms. sigh I clearly have other things to figure out, but I was hoping to nail the one above down at least.

The behavior you so beautifully demonstrated is by design (or lack of, if you will). You are right in part: the crazy_linker does resolve some such issues (but not all of them). There is an easy but ugly workaround. Build a dummy libfcn_defined.so which only has T someMethod in its nm output. Use it to link libfcn_undefined.so, using LOCAL_LDLIBS. The NDK will show a warning, but that's OK. Load the real libfcn_defined.so in your Java code.

By Unix/ELF design, you need a NEEDED entry in libfcn_undefined.so that lists libfcn_defined.so as a dependency for the dynamic linker to look into it for missing symbols.
I.e. you should ensure that -lfcn_defined (or /path/to/libfcn_defined.so) appears in the link command that generated libfcn_undefined.so.
If you use ndk-build to generate both libraries, just list libfcn_defined.so as a LOCAL_STATIC_LIBRARIES or LOCAL_SHARED_LIBRARIES entry for libfcn_undefined.so.
If you use another build system, adapt accordingly.

Getting rid of wchar_t size linker warning

I compile my Android NDK library with -fshort-wchar. I know the RTL assumes 4-byte wchar_t, I know what I'm doing, and the library works. However, on every build the linker gives me the following warning for every object file:
ld.exe: warning: MyFile.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail
When I provide
LOCAL_LDLIBS := --no-wchar-size-warning
This gives me an "unrecognized option" error.

Adding APP_LDFLAGS += -Wl,--no-wchar-size-warning (to Application.mk) works fine for me on NDKs at least as early as r7.
I assume it would work just the same as:
LOCAL_LDLIBS := -Wl,--no-wchar-size-warning
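To confirm which wchar_t width a given object file was really compiled with, a tiny check along these lines may help. It is only a sketch; the helper name wchar_width_bytes is hypothetical. Built with -fshort-wchar it should report 2, and with the NDK defaults described in the question it should report 4.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: width of wchar_t, in bytes, as seen by this
 * translation unit. The value depends on the compiler flags used. */
static size_t wchar_width_bytes(void)
{
    return sizeof(wchar_t);
}
```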

Have you seen this? The post explains that the --no-wchar-size-warning option will make the linker treat the mismatch as a warning, not an error. As in the enum case, the authors choose to display the message anyway.
You don't see the effect of setting this flag in your project because as detailed elsewhere, using -fshort-wchar automatically adds -Wl,--no-wchar-size-warning.

Is arm_neon.h missing all float16_t types?

I'm using NEON SIMD instruction to write a part of an Android app, targeting Cortex A8 processors. According to this reference manual, NEON supports 16-bit and 32-bit floats, that is float16_t and float32_t. When I tried using float16_t and all of its associated vector types, I got an error saying that this type is undeclared. When looking through contents of arm_neon.h, I found that this type is indeed undeclared.
Is there a reason for this? ARM's Advanced SIMD obviously supports such data types and instructions. Has anyone encountered / resolved this? Is it documented anywhere?

Cortex-A8 processors do not support 16-bit floats in hardware.
Cortex-A9 processors do have instructions to convert between 16- and 32-bit floating-point, but that's all you get (and that's all that should be provided on an IEEE-754 system -- float16 is not intended for arithmetic, only for compact storage). The usage model is to load float16 data, convert it to float32 to do your arithmetic, and then convert back to float16 before storing.
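The load, convert, compute, convert back pattern can be sketched in portable C. This is an illustration only: the conversions are done in software as a stand-in for the hardware conversion instructions, all function names are hypothetical, and float_to_half is deliberately simplified (results too small for a normal half are flushed to zero, and ties round up rather than to even).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Expand an IEEE-754 binary16 bit pattern to a 32-bit float. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp  = (h >> 10) & 0x1Fu;
    uint32_t mant = h & 0x3FFu;
    uint32_t bits;
    float f;

    if (exp == 0 && mant == 0) {
        bits = sign;                                  /* signed zero */
    } else if (exp == 0) {
        int shift = 0;                                /* subnormal: normalize */
        while ((mant & 0x400u) == 0) { mant <<= 1; shift++; }
        mant &= 0x3FFu;
        bits = sign | ((uint32_t)(113 - shift) << 23) | (mant << 13);
    } else if (exp == 0x1Fu) {
        bits = sign | 0x7F800000u | (mant << 13);     /* infinity or NaN */
    } else {
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Narrow a 32-bit float to a binary16 bit pattern (simplified). */
static uint16_t float_to_half(float f)
{
    uint32_t bits, mant, out;
    uint16_t sign;
    int32_t exp;

    memcpy(&bits, &f, sizeof bits);
    sign = (uint16_t)((bits >> 16) & 0x8000u);
    mant = bits & 0x7FFFFFu;
    if (((bits >> 23) & 0xFFu) == 0xFFu)              /* infinity or NaN */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x200u : 0u));
    exp = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15;
    if (exp >= 0x1F)
        return (uint16_t)(sign | 0x7C00u);            /* overflow: infinity */
    if (exp <= 0)
        return sign;                                  /* flush to zero */
    out = ((uint32_t)exp << 10) | (mant >> 13);
    if (mant & 0x1000u)
        out++;                /* round up; a carry into the exponent is valid */
    return (uint16_t)(sign | out);
}

/* Storage stays float16; the arithmetic happens in float32. */
static void scale_half_array(uint16_t *data, size_t n, float k)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] = float_to_half(half_to_float(data[i]) * k);
}
```

On hardware with the conversion instructions, the two helpers would be replaced by the compiler's own float16 to float32 conversions; only scale_half_array reflects the usage model itself.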

Can you specify an example section in the document you linked where these 16 bit float operations are called out? I see quite a few 16 bit integer operations defined. Are you using ARM's compiler or gcc? And are you talking about SIMD or NEON?
"NEON™ technology builds on the concept of SIMD with a dedicated module to provide 128-bit wide vector operations, compared to the 32bit wide SIMD in the ARMv6 architecture."
EDIT:
I tried this with no compiler complaints:
int myfun(int a)
{
    __fp16 b;
    b = a + 1;
    return (b + 1);
}
using this command line:
arm-none-linux-gnueabi-gcc -S -mcpu=mpcore -mfp16-format=ieee -mfpu=neon-fp16 simd.c
Using codesourcery lite 2011.03
arm-none-linux-gnueabi-gcc --version
arm-none-linux-gnueabi-gcc (Sourcery G++ Lite 2011.03-41) 4.5.2
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Hmm, not too familiar with SIMD instructions. The document you posted does not mention float16_t, but instead uses the number of lanes as well (e.g. float16x4_t)
Also, did you try "Float16_t" instead of "float16_t" ?
This is my home laptop, so I don't have access to the ARM compiler, but I'll try and recheck this tomorrow in the office
