GCC to emit ARM idiv instructions (continued)

I am wondering if this is possible for a Krait 400 CPU. I followed some of the suggestions in the earlier question.
When I compile with -mcpu=cortex-a15, the code compiles and I do see udiv instructions in the assembly dump.
However, I would like to know:
Is it possible to get it to work with -march=armv7-a? (Not specifying a CPU; this is how I had it originally.)
I tried to use -mcpu=krait2, but since I am not using the Snapdragon LLVM (I don't know yet how much effort that would be) the compiler does not recognize it. Is it possible to get the CPU definition from the LLVM and somehow make it available to my compiler?
Is there any other method/patch/trick?
My compiler options are as follows:
/development/android-ndk-r8e/toolchains/arm-linux-androideabi-4.7/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc -DANDROID -DNEON -fexceptions -Wno-psabi --sysroot=/development/android-ndk-r8e/platforms/android-14/arch-arm -fpic -funwind-tables -funswitch-loops -finline-limit=300 -fsigned-char -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp -mfpu=neon -fdata-sections -ffunction-sections -Wa,--noexecstack -marm -fomit-frame-pointer -fstrict-aliasing -O3 -DNDEBUG
The error that I get is:
Error: selected processor does not support ARM mode `udiv r1,r1,r3'
As a side note, I have to say that I am just beginning to understand the whole scheme, so I want to proceed in small steps to understand what I am doing.
Thanks in advance.
EDIT 1:
I tried compiling a separate module containing only the udiv instruction. That module is compiled with the -mcpu=cortex-a15 parameter, while the rest of the application is compiled with the -march=armv7-a parameter. The result was (somewhat expectedly) that the function call overhead hurt the application's run time. I could not get inline code, since trying to inline it resulted in the same error that I originally had. I will switch to the Snapdragon LLVM to see if performance is better before trying to reinvent the wheel. Thanks everybody for their answers and tips.

idiv, used here to mean both sdiv and udiv, is an optional instruction on Cortex-A cores. Whether a given core supports it can be queried via bits [27:24] of the ID_ISAR0 cp15 register.
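/* Get idiv support. ID_ISAR0 is readable only in a privileged mode (PL1 or higher). */
unsigned int ISAR0;
int idiv;
__asm ("mrc p15, 0, %0, c0, c2, 0" : "=r" (ISAR0));
#ifdef __thumb2__
/* Any non-zero value in bits [27:24] means Thumb has sdiv/udiv. */
idiv = (ISAR0 & 0xf000000UL) ? 1 : 0;
#else
/* ARM mode needs the field to be exactly 0b0010. */
idiv = (ISAR0 & 0xf000000UL) == 0x2000000UL ? 1 : 0;
#endif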
Bits [27:24] are 0001, if only thumb2 supports the udiv and sdiv instructions. If the bits [27:24] are 0010, then both modes support the instructions.
As the gcc flags -march=armv7-a, etc mean that the code should work on ALL CPUs of this type and this instruction is optional, it would be an error for gcc to emit this instruction.
You may compile different modules with different flags such as,
gcc -march=armv7-a -o general.o -c general.c
gcc -mcpu=cortex-a15 -D_USE_IDIV_=1 -o fast_idiv.o -c fast_idiv.c
These modules can be linked together and the above code can be used to select at run time an appropriate routine. For example, both files may have,
#include "fir_template.def"
and this file might have,
#ifdef _USE_IDIV_
#define _FUNC(x) idiv_ ## x
#else
#define _FUNC(x) x
#endif
int _FUNC(fir8)(FILTER8 *filter, SAMPLE *data)
{
....
}
If you know your code will only run on a Cortex-a15, then use the -mcpu option. If you want this to run faster IF it can and be generic (support all armv7-a CPUs), then you must ID the CPU as outlined above and dynamically select the code.
Addendum: The files above (general.c and fast_idiv.c) could be put in separate shared libraries with the same API. Then interrogate /proc/cpuinfo and see if idiv is supported. Set the LD_LIBRARY_PATH (or dlopen()) to the appropriate version. The choice will depend on how much code is involved.
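The run-time selection described above could be sketched roughly as follows. This is only an illustration under assumptions: it relies on the kernel listing an "idiva" (ARM mode) or "idivt" (Thumb mode) word on the "Features" line of /proc/cpuinfo, which reasonably recent ARM Linux kernels do but older ones may not, and the names line_has_flag, cpuinfo_has_flag, select_div, generic_div and idiv_div are hypothetical. In a real build, idiv_div would live in the module compiled with -mcpu=cortex-a15.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` appears as a whole whitespace-delimited word in `line`. */
static int line_has_flag(const char *line, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = line;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/* Scan a cpuinfo-style file for a "Features" line containing `flag`.
 * Returns 0 when the file cannot be opened, so callers fall back to
 * the generic code path. */
static int cpuinfo_has_flag(const char *path, const char *flag)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Features", 8) == 0)
            found = line_has_flag(line, flag);
    }
    fclose(f);
    return found;
}

typedef int (*div_fn)(int, int);

/* Hypothetical stand-ins: the first would come from general.c, the
 * second from the module built with -mcpu=cortex-a15. */
static int generic_div(int a, int b) { return a / b; }
static int idiv_div(int a, int b) { return a / b; }

/* Pick the implementation once, at start-up. */
static div_fn select_div(int have_idiv)
{
    return have_idiv ? idiv_div : generic_div;
}
```

A caller would do something like select_div(cpuinfo_has_flag("/proc/cpuinfo", "idiva")) once and keep the returned pointer. The indirect call still carries overhead, so this only pays off when the selected routine does a substantial amount of work per call.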

difference between prebuild tool-chain and custom tool-chain compilers in android NDK

So I'm trying to figure out how to build ICU for Android. Initially I tried to build it with a standalone toolchain, and after some battles I was able to do that, at least for the x86_64 arch (I didn't try the others). However, I don't want a fully custom build system, so I decided to figure out how to build it with the prebuilt toolchains. I found that they behave very differently, which is very strange. This was my command when I configured ICU with the standalone toolchain:
icu/source/configure --disable-shared --enable-static --disable-dyload
--disable-extras --disable-tests --disable-samples --prefix=/icu/build --host=x86_64-linux-android --with-cross-build=/toplay/icu/icu_linux CC=/custom_toolchain/bin/clang
CXX=/custom_toolchain/bin/clang++
LD=/custom_toolchain/bin/x86_64-linux-android-ld
AR=/custom_toolchain/bin/x86_64-linux-android-ar
CFLAGS="-fPIC -DANDROID -fdata-sections -ffunction-sections"
CXXFLAGS="-fPIC -DANDROID -frtti -fno-exceptions -fdata-sections
-ffunction-sections"
Using the same command, but with the compilers and tools swapped for the ones from the prebuilt toolchain, it looks like this:
icu/source/configure --disable-shared --enable-static --disable-dyload
--disable-extras --disable-tests --disable-samples --prefix=/icu/build --host=x86_64-linux-android --with-cross-build=/toplay/icu/icu_linux CC=/ndk-bundle/toolchains/x86_64-4.9/prebuilt/linux-x86_64/bin/clang
CXX=/ndk-bundle/toolchains/x86_64-4.9/prebuilt/linux-x86_64/bin/clang++
LD=/ndk-bundle/toolchains/x86_64-4.9/prebuilt/linux-x86_64/bin/x86_64-linux-android-ld
AR=/ndk-bundle/toolchains/x86_64-4.9/prebuilt/linux-x86_64//bin/x86_64-linux-android-ar
CFLAGS="-fPIC -DANDROID -fdata-sections -ffunction-sections"
CXXFLAGS="-fPIC -DANDROID -frtti -fno-exceptions -fdata-sections
-ffunction-sections"
I get very different results from the configure step. (TL;DR, the main difference: with the prebuilt toolchain the system can't tell that it is in cross-compile mode, and it finds nl_langinfo and strtod_l, which aren't available on Android.) And while the standalone toolchain could build ICU, with the prebuilt toolchain the build eventually broke.
So my question: what is the difference between the compilers and tools in the prebuilt and standalone cases, and what flags/settings do I need to add to make it work in the prebuilt case?

This is the expected behavior. I've answered this on our bugtracker.
Our Clang defaults to targeting x86 Linux, not any flavor of Android. Setting up your target flags is one of the many things standalone toolchains do.
I'm not really sure what problem you're trying to solve. Whatever you get working with autoconf is going to essentially be a cobbled together standalone toolchain. Standalone toolchains exist entirely for dealing with this kind of scenario.
To answer your specific question here:
what is the difference between the compilers and tools in the prebuilt and standalone cases, and what flags/settings do I need to add to make it work in the prebuilt case?
Standalone toolchains are the prebuilt toolchains with a different directory layout (so the compilers can infer the locations of binutils, the sysroot, and the STL) and a few default flags (like -target for Clang). If you were to get this working, you'd have just reinvented this wheel (possibly by using -gcc-toolchain and a bunch of --sysroot, -isystem, -L stuff rather than changing the directory structure).
In case "why doesn't this work out of the box?" is a follow up question, remember that in Android you have many architectures, even more target versions of the OS, and a handful of STLs to choose from. Neither Clang nor GCC can currently be set up in a way that it can deal with all of Android's variations (long term I do expect to change that, but that's quite a ways down the road).

gcc disable -Wall flag for specific files/folders

I have some open source library files in my project (e.g: http://nothings.org/stb_vorbis/stb_vorbis.c). -Wall option is enabled in my Android.mk file. During compilation several warnings are generated in stb_vorbis.c.
warning: unused variable <var>
warning: statement with no effect
warning: <var> defined but not used
warning: <var> may be used uninitialized in this function
For some reason I do not want to modify stb_vorbis.c, but I still want the -Wall option to apply to my own source files. Is there any way to disable the -Wall option for specific files/folders?

Is there any way to disable the -Wall option for specific files/folders?
I do not believe there is any gcc option that could be used to achieve this. You need to change your Makefile that you used to compile the problematic source files.
You could do something like
CFLAGS:=$(filter-out -Wall, $(CFLAGS))
in the Makefile for stb_vorbis, if your make supports filter-out function.
Also you could write a specific rule for that stb_vorbis.c:
STB_VORBIS_CFLAGS:=$(filter-out -Wall, $(CFLAGS))
stb_vorbis.o: stb_vorbis.c ...
	$(CC) $(STB_VORBIS_CFLAGS) -o $@ -c $<

Although there's no way to turn off -Wall with one option, you can turn off specific warnings in GCC, using the -Wno-* for warning *. So, for example, to suppress the unused variable warning you can add -Wno-unused-variable. See http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html for more information on different warning options.
So for example you could use target-specific variables, like:
stb_vorbis.o: CFLAGS += -Wno-unused-variable

Using target-specific variables you can append the -w option to inhibit all warning messages for stb_vorbis.c file:
stb_vorbis.o: CFLAGS += -w
or for an entire directory:
third_party/%.o: CFLAGS += -w
A general rule of thumb is that if you have two or more mutually exclusive options, gcc will generally take later options over earlier ones.
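When changing the build flags is inconvenient, a different technique that is not mentioned in the answers above is to include the third-party source from a small wrapper file and silence selected warnings with GCC's diagnostic pragmas (push/pop needs GCC 4.6 or later). The sketch below is only an illustration: the wrapper idea, third_party_add and wrapped_add are hypothetical, and the small function stands in for what would really be an #include "stb_vorbis.c" line. Warnings that GCC emits late, such as some "defined but not used" diagnostics, may not be covered by the pragmas.

```c
/* Hypothetical wrapper file, compiled with -Wall like the rest of the project. */

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"

/* In real use this region would hold: #include "stb_vorbis.c"
 * The function below is a stand-in that would normally trigger
 * "unused variable" under -Wall. */
static int third_party_add(int a, int b)
{
    int unused;
    return a + b;
}

#pragma GCC diagnostic pop

/* Code after the pop is checked with the full set of warnings again. */
int wrapped_add(int a, int b)
{
    return third_party_add(a, b);
}
```

The advantage is that the suppression travels with the source rather than the Makefile; the drawback is that each warning has to be listed by name.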

using -isystem was the working solution for me.
For example, instead of -IC:\\boost_1_52_0, say -isystem C:\\boost_1_52_0.
Described in detail in the following thread:
How do you disable the unused variable warnings coming out of gcc in 3rd party code I do not wish to edit?
Thank you @lee-duhem for your kind and useful suggestions.

Dead function is not removed from the shared object built from Android NDK

We noticed that some dead functions are not removed from the generated shared object file (.so) that is built as release (via "ndk-build" without any parameter).
To prove that we introduced a dummy function that is definitely not called anywhere (and is also not exported since the default visibility is already set to "hidden" for the whole .so). Somehow the symbol of the dummy function still exists and we can see it by using "nm" against the generated .so.
We are using NDK r8d on Linux 11.10.
Are there any specific compiler/linker flags that we need to apply in Android.mk in order to get the dead code removed?
Thank you!

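Removing dead functions can greatly reduce the binary size too. For this, change the C and C++ compilation flags and the linker flags in Android.mk. The compiler flags place each function and data item in its own section, so the linker's --gc-sections pass can discard the sections that nothing references.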
LOCAL_CPPFLAGS += -ffunction-sections -fdata-sections
LOCAL_CFLAGS += -ffunction-sections -fdata-sections
LOCAL_LDFLAGS += -Wl,--gc-sections
Also, the visibility features in GCC can be of help.
http://gcc.gnu.org/wiki/Visibility
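As a rough illustration of how visibility interacts with the flags above, the sketch below marks one entry point as exported and everything else as hidden. The macro and function names (API_EXPORT, API_LOCAL, api_compute, helper_scale, dead_dummy) are hypothetical. The assumption is that a hidden, unreferenced function such as dead_dummy becomes a candidate for removal once the library is built with -ffunction-sections and linked with -Wl,--gc-sections, whereas exported symbols are treated as roots and kept.

```c
/* GCC/Clang visibility attributes; hypothetical macro names. */
#define API_EXPORT __attribute__((visibility("default")))
#define API_LOCAL  __attribute__((visibility("hidden")))

/* Internal helper: usable across the library's own object files,
 * but not exported from the shared object. */
API_LOCAL int helper_scale(int x)
{
    return x * 2;
}

/* Public entry point: stays in the dynamic symbol table. */
API_EXPORT int api_compute(int x)
{
    return helper_scale(x) + 1;
}

/* Never called and not exported, so nothing references its section. */
API_LOCAL int dead_dummy(void)
{
    return 42;
}
```

Running nm on the resulting .so with and without the linker flag is a simple way to check whether dead_dummy was actually dropped.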

How to enable intrinsics in compiler?

Can anyone explain how to enable intrinsics in C code?
I would like to use the special DSP instructions in ARMv5TE.
Consider the qadd instruction; it works nicely when I use the assembler approach, like this:
inline int function_qadd(int a, int b) {
__asm__ (
"qadd %0, %1, %2" : "=r" (a) : "r" (a), "r" (b));
return a;
}
But when I tried to use the __qadd intrinsic instead of asm, like this:
int add_result = __qadd(5,10);
LOGI("qadd='%d'", add_result);
I got the error:
error: undefined reference to '__qadd'
What am I doing wrong? How do I enable intrinsics in C code?
UPDATE:
I have NDK android-ndk-r8c (Windows version), which has GCC 4.6 as the default:
The GCC 4.6 compiler is still the default,
Besides, I explicitly specify in Android.mk:
NDK_TOOLCHAIN_VERSION=4.6
My compiler flags are:
LOCAL_CFLAGS += -std=c99 -ffast-math -march=armv5te -mfpu=vfp -mfloat-abi=softfp
Also, I checked the asm code generated by gcc with the -S compiler flag; it does generate the qadd instruction:
qadd r3, r3, r2

The intrinsic function __qadd is not available for the GCC compiler. The link to the documentation you've provided is for the (non-free) armcc compiler.
Using the assembler approach is the only practical way to use the qadd instruction if you're using GCC.
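One way to package the assembler approach so that the same source also builds for other targets is to wrap it behind a function with a plain C fallback. This is a minimal sketch, not a drop-in replacement for armcc's __qadd: the names sat_add and sat_add_portable are hypothetical, and the preprocessor guard is an assumption about which predefined macros indicate that qadd is available, so it may need adjusting for a particular toolchain.

```c
#include <limits.h>

/* Portable saturating signed add: clamps to INT_MAX / INT_MIN
 * instead of overflowing. */
static inline int sat_add_portable(int a, int b)
{
    if (b > 0 && a > INT_MAX - b)
        return INT_MAX;
    if (b < 0 && a < INT_MIN - b)
        return INT_MIN;
    return a + b;
}

/* Uses qadd where the guard (an assumption) says it is available,
 * otherwise falls back to the portable version. */
static inline int sat_add(int a, int b)
{
#if defined(__GNUC__) && defined(__arm__) && \
    (defined(__ARM_FEATURE_DSP) || \
     (defined(__ARM_ARCH_5TE__) && !defined(__thumb__)))
    int result;
    __asm__ ("qadd %0, %1, %2" : "=r" (result) : "r" (a), "r" (b));
    return result;
#else
    return sat_add_portable(a, b);
#endif
}
```

With this in place, a call such as sat_add(5, 10) takes the place of __qadd(5, 10), and the fallback keeps host-side unit tests working.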

GCC 4.4.3 offsetof constant expression bug. How should I work around this?

I have a struct that contains a static constant expression that uses the offsetof macro defined in stddef.h
struct SomeType {
int m_member;
};
static const size_t memberOffset = offsetof(SomeType, m_member);
In GCC 4.4.3 (I'm using Android's NDK r7) this generates the following error:
arm-linux-androideabi-g++ -MMD -MP -MF ./obj/local/armeabi-v7a/... -fpic -ffunction-sections -funwind-tables -fstack-protector -D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ -D__ARM_ARCH_5E__ -D__ARM_ARCH_5TE__ -Wno-psabi -march=armv7-a -mfloat-abi=softfp -mfpu=vfp -fno-exceptions -fno-rtti -O2 -fomit-frame-pointer -fstrict-aliasing -funswitch-loops -finline-limit=300 - -I/Users/Byron/bin/android-ndk-r7/sources/cxx-stl/system/include -
-Wa,--noexecstack -O0 -g -w -D_ANDROID -I/blah/bin/android-ndk-r7/platforms/android-14/arch-arm/usr/include -c
/MyFile.h:330: error: '->' cannot appear in a constant-expression
/MyFile.h:330: error: '&' cannot appear in a constant-expression
This seems like a compiler bug. Does anyone have a good workaround for this? I found references to a bug of this nature on GCC 3.4 but not later versions. hmmm

Standards
In the C++98 standard, there's some information in
C.2.4.1 Macro offsetof(type, memberdesignator) [diff.offsetof]
The macro offsetof, defined in <cstddef>, accepts a restricted set of type arguments in this International
Standard. §18.1 describes the change.
(C.2.4.1 showed up with offsetof in the contents, so I went there first.) And:
§18.1 Types 18 Language support library
¶5 The macro offsetof accepts a restricted set of type arguments in this International Standard. type
shall be a POD structure or a POD union (clause 9). The result of applying the offsetof macro to a field that
is a static data member or a function member is undefined.
For comparison, the C99 standard says:
offsetof(type, member-designator)
which expands to an integer constant expression that has type size_t, the value of
which is the offset in bytes, to the structure member (designated by member-designator),
from the beginning of its structure (designated by type). The type and member designator
shall be such that given
static type t;
then the expression &(t.member-designator) evaluates to an address constant. (If the
specified member is a bit-field, the behavior is undefined.)
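The C99 wording above says the macro expands to an integer constant expression, which can be checked at compile time. The sketch below is a C illustration under assumptions: the extra m_first member and the offset_is_constant typedef are hypothetical additions, and the check assumes there is no padding between two adjacent int members. An array size must be a constant expression, so the typedef only compiles if offsetof really is one.

```c
#include <stddef.h>

struct SomeType {
    int m_first;   /* hypothetical extra member so the offset is non-zero */
    int m_member;
};

/* Compile-time check: the array size is 1 when the offset is as
 * expected and -1 (a compile error) otherwise. */
typedef char offset_is_constant[
    offsetof(struct SomeType, m_member) == sizeof(int) ? 1 : -1];

/* The form from the question, written for C. */
static const size_t memberOffset = offsetof(struct SomeType, m_member);
```

If a toolchain rejects the typedef with the same "cannot appear in a constant-expression" error, that points at the header's definition of offsetof rather than at the calling code.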
Your code
Your code meets the requirements of both the C++ and C standards, it seems to me.
When I use G++ 4.1.2 and GCC 4.5.1 on RedHat (RHEL 5), this code compiles without complaint with the -Wall -Wextra options:
#include <cstddef>
struct SomeType {
int m_member;
};
static const int memberOffset = offsetof(SomeType, m_member);
It also compiles without complaint with #include <stddef.h> and with the GCC compilers (if I use struct SomeType in the macro invocation).
I wonder - I got errors until I included <cstddef>...did you include that? I also added the type int to the declaration, of course.
Assuming that you haven't made any bloopers in your code, it seems to me that you probably have found a bug in the <cstddef> (or <stddef.h>) header on your platform. You should not be getting the error, and the Linux-based G++ appears to confirm that.
Workarounds?
You will need to review how offsetof() is defined in your system headers. You will then probably redefine it in such a way as not to run into the problem.
You might be able to use something like this, assuming you identify your broken system somehow and execute #define BROKEN_OFFSETOF_MACRO (or add -DBROKEN_OFFSETOF_MACRO to the command line).
#include <cstddef>
#ifdef BROKEN_OFFSETOF_MACRO
#undef offsetof
#define offsetof(type, member) ((size_t)((char *)&(*(type *)0).member - \
(char *)&(*(type *)0)))
#endif /* BROKEN_OFFSETOF_MACRO */
struct SomeType {
int m_member;
};
static const int memberOffset = offsetof(SomeType, m_member);
The size_t cast is present since the difference between two addresses is a ptrdiff_t and the offsetof() macro is defined to return size_t. The macro is nothing other than ugly, but that's why it is normally hidden in a system header where you don't have to look at it in all its ghastliness. But when all else fails, you must do whatever is necessary.
I know that once, circa 1990, I encountered a C compiler that would not allow 0 but it would allow 1024 instead. The distributed <stddef.h> header, of course, used 0, so I 'fixed' it by changing the 0 to 1024 (twice) for the duration (until I got a better compiler on a better machine).

offsetof() must be defined using pointer arithmetic.
GCC probably doesn't like that in constant expressions because, in theory, the pointers could change, so the result is not constant.
A workaround might be to make it a static int without const.
