I know nothing about ARM instructions and have only just started trying to understand them. I've looked around a bit, and some of the better links were these:
Converting very simple ARM instructions to binary/hex
http://simplemachines.it/doc/arm_inst.pdf
According to the first link, instructions are 32 bits wide and have the following format:
[cond][00][immediate][opcode][alters condition codes?][Rn][Rd][Operand 2]
However, when I disassembled some .so files, I saw that some of the instructions were 16 bits and had a different format.
Why the discrepancy? Is there a spec for this?
An example would be how it encodes mov.
A simple mov r0 #255 is 20ff, only 16 bits as opposed to 32. Strange (to me).
As I understand it, you can't normally specify an entire 32-bit value in one instruction. I don't know the syntax that assemblers expect.
What I've been doing is editing existing .so files with a hex editor and running them.
I tried to AND two values but just couldn't do it, something like:
mov r0, #63 ; 0x003f
mov r1, #16896 ; 0x4200
and r0, r0, r1, lsl #8 ; should be 0x423f at this point
mov r15, r14 ; method returns at this point, right?
Except, I didn't have an assembler and had to do it like this:
Find method that returns int in disassembled .so file
Over-write entries
Run and see output
So, this is what I got:
mov r0 0x003f - 1110 00 1 1101 0 0000 0000 0000 00111111 - e3a0 003f
mov r1 0x0042 - 1110 00 1 1101 0 0000 0001 0000 01000010 - e3a0 1042
and r0, r0, r1, lsl #8 - 1110 00 0 0000 0 0000 0000 00001000 0001 - e000 0081
mov r15, r14 - 1110 00 0 1101 0 0000 1111 0000 0000 1110 - e1a0f00e
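The table-based encoding above can be cross-checked with a minimal Python sketch. It assumes the classic 32-bit ARM data-processing layout quoted earlier; the helper names are hypothetical. Two things stand out. For a register operand with an immediate shift, the 12-bit Operand 2 field is laid out as shift amount (bits 11-7), shift type (bits 6-5), a zero bit (bit 4) and Rm (bits 3-0), so "lsl #8" on r1 comes out as 0x401 rather than 0x081. Independently of the encoding, 0x3f AND 0x4200 is 0 because the two values share no set bits; combining them into 0x423f needs ORR.

```python
AL = 0b1110                  # "always" condition
MOV, AND = 0b1101, 0b0000    # data-processing opcodes
LSL = 0b00                   # shift type

def encode_dp(cond, imm_flag, opcode, s, rn, rd, operand2):
    """Pack the fields of an ARM data-processing instruction into 32 bits."""
    return ((cond << 28) | (imm_flag << 25) | (opcode << 21) | (s << 20)
            | (rn << 16) | (rd << 12) | operand2)

def imm_operand(rotate, imm8):
    """Operand 2 as an 8-bit immediate with a 4-bit rotate field."""
    return (rotate << 8) | imm8

def reg_shift_imm(rm, shift_type, shift_imm):
    """Operand 2 as a register shifted by a 5-bit immediate amount."""
    return (shift_imm << 7) | (shift_type << 5) | rm

mov_r0 = encode_dp(AL, 1, MOV, 0, 0, 0, imm_operand(0, 0x3f))
mov_r1 = encode_dp(AL, 1, MOV, 0, 0, 1, imm_operand(0, 0x42))
and_r0 = encode_dp(AL, 0, AND, 0, 0, 0, reg_shift_imm(1, LSL, 8))
mov_pc = encode_dp(AL, 0, MOV, 0, 0, 15, reg_shift_imm(14, LSL, 0))

for word in (mov_r0, mov_r1, and_r0, mov_pc):
    print(f"{word:08x}")

# What the sequence computes, in plain arithmetic:
print(hex(0x3f & (0x42 << 8)))   # AND of disjoint bit patterns
print(hex(0x3f | (0x42 << 8)))   # ORR merges them
```

Note that a hex editor shows the bytes in memory order, so on a little-endian target each 32-bit word appears byte-reversed in the file.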
So I worked out the instructions, encoded them using a table, converted them to hex with a calculator, patched them in with a hex editor, transferred the file to my phone and ran the app.
And I just kept getting a zero when the method was called. Am I missing something here?
Why does Android's ARM code seem to have 16-bit instructions mixed with 32-bit instructions, and why isn't my little attempt working?
Do realize that 'ARM instructions' is not just one instruction set but a multitude, depending on the architecture version and which extensions are supported. The document you linked to starts off by mentioning architecture v4; if you check your favorite wiki for 'ARM architecture', it will point out that v4 is a 'legacy' architecture.
Android itself started support on ARMv6, with most modern devices running ARMv7 or ARMv7-A. While I don't know for sure, I would think that the 16-bit instructions are Thumb-2 and not the original Thumb extension. Thumb was intended to improve code density, as ARM was designing for the embedded market of 10 to 20 years ago, where a megabyte was a lot of memory.
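To make the 16-bit case concrete, here is a minimal Python sketch of the 16-bit Thumb "move immediate" encoding: bits 15-13 are 001, a 2-bit op field selects MOV with 00, followed by a 3-bit destination register and an 8-bit immediate. The function name is hypothetical, and the sketch covers only this one instruction form; the same 16-bit encoding is available in both original Thumb and Thumb-2 code.

```python
def thumb_mov_imm8(rd, imm8):
    """Encode the 16-bit Thumb 'mov rd, #imm8' (low registers r0-r7 only)."""
    if not 0 <= rd <= 7:
        raise ValueError("this encoding only reaches r0-r7")
    if not 0 <= imm8 <= 0xff:
        raise ValueError("immediate must fit in 8 bits")
    return (0b001 << 13) | (0b00 << 11) | (rd << 8) | imm8

# The 'mov r0, #255' from the question:
print(f"{thumb_mov_imm8(0, 255):04x}")
```

This reproduces the 20ff seen in the disassembly, which suggests the library in question was compiled to Thumb code rather than to 32-bit ARM code.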
If you are learning, I would reference ARM's documentation available at their website:
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0406b/index.html
and possibly look at the ARM options for gcc:
http://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
and set up a cross-compiler for ARM so that you can take a higher-level language like C and generate ARM binaries.
I have custom ESP32WROOM based hardware and a mobile app that communicate using Bluetooth Classic via the BluetoothSerial library in Arduino. Everything works great on a wide variety of test devices except for ONE customer who is using a C5L Max with Android 11.
I have verified that this phone worked up until its most recent system update. Since the update, the moment the app connects to the ESP32, the ESP32 panics:
Core 0 register dump:
PC : 0x4002f7c2 PS : 0x00060330 A0 : 0x800268aa A1 : 0x3ffd26f0
A2 : 0x00000000 A3 : 0x00000000 A4 : 0x3ffcf8b8 A5 : 0x3ffcf8bc
A6 : 0x3ffcf8c0 A7 : 0x3ffcf8c4 A8 : 0x8002f7ad A9 : 0x3ffd26d0
A10 : 0x3cdb3940 A11 : 0x00000007 A12 : 0x00060320 A13 : 0x00000021
A14 : 0x00000074 A15 : 0x00000000 SAR : 0x0000001d EXCCAUSE: 0x00000006
EXCVADDR: 0x00000000 LBEG : 0x4000c46c LEND : 0x4000c477 LCOUNT : 0x00000000
ELF file SHA256: 0000000000000000
Backtrace: 0x4002f7c2:0x3ffd26f0 0x400268a7:0x3ffd2710 0x401ad0be:0x3ffd2750 0x4008869a:0x3ffd2770 0x40019d11:0x3ffd27a0 0x40055b4d:0x3ffd27c0 0x401a8a93:0x3ffd27e0 0x401a90a1:0x3ffd2800 0x40090f4a:0x3ffd2830
Rebooting...
Now for the real head scratcher: I went into our repo and installed a variety of old firmware binaries on the ESP32. Our firmware from before June of 2021 works with the updated phone! So I pulled the source for that working firmware, built it, and still got the same crash behavior. This means it is not a source issue, but something that has changed in the underlying libraries or Arduino core. I have now tried building our old firmware with every version of Arduino and the ESP32 libraries from that era, and still no luck.
What else could have changed that I'm missing? It has to be something on the system / compiler / library level, but I'm stumped as to what it could be.
After reading this interesting article about code obfuscation in Android, I'm trying the technique out for research purposes, but after applying it to a classes.dex file I'm getting a crash.
The following is the code I'm trying to run after applying the technique:
0006e8: |[0006e8] com.example.root.bji.MainActivity.paintGUI:()V
0006f8: 1202 |0000: const/4 v2, #int 0 // #0
0006fa: 1a01 0000 |0001: const-string v1, "" // string#0000
0006fe: 1200 |0003: const/4 v0, #int 0 // #0
000700: 1303 1400 |0004: const/16 v3, #int 20 // #14
000704: 3244 0900 |0006: if-eq v4, v4, 000f // +0009
000708: 2600 0300 0000 |0008: fill-array-data v0, 0000000b // +00000003
00070e: 0003 0100 1600 0000 1212 0000 0000 ... |000b: array-data (15 units)
00072c: 0000 |001a: nop // spacer
00072e: 0000 |001b: nop // spacer
... more NOPs ...
000742: 0000 |0025: nop // spacer
000744: 0000 |0026: nop // spacer
000746: 1503 087f |0027: const/high16 v3, #int 2131230720 // #7f08
...
To give you some context, I want to keep some assignments in plain view, such as the 0 stored into the v2 register at 0x6f8 ("const/4 v2, 0" => 12 02), which is shown in the GUI at the end of this method (at 0x746 and beyond), and use this obfuscation technique to "hide" a modification of v2 that sets it to 1 at 0x716 ("const/4 v2, 1" => 12 12).
If you follow the code, the branch at 0x704 goes to 0x716, where the "const/4 v2, 1" resides, inside the fill-array-data payload.
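The offsets in the dump can be checked with a little arithmetic. The following Python sketch assumes what the listing shows: Dalvik code offsets are counted in 16-bit code units, and code unit 0000 of this method sits at file offset 0x6f8. The helper names are hypothetical; the payload size formula is the one for fill-array-data payloads (four header units plus the data rounded up to whole units).

```python
CODE_START = 0x6f8  # file offset of code unit 0000 in the dump

def unit_to_file_offset(unit):
    """Convert a code-unit offset within the method to a file offset."""
    return CODE_START + 2 * unit

def array_payload_units(element_width, size):
    """Length of a fill-array-data payload, in 16-bit code units."""
    return (size * element_width + 1) // 2 + 4

branch_target = 0x0006 + 0x0009   # if-eq at unit 6, relative offset +9
payload_start = 0x0008 + 0x0003   # fill-array-data at unit 8, offset +3
payload_end = payload_start + array_payload_units(element_width=1, size=0x16)

print(hex(branch_target), hex(unit_to_file_offset(branch_target)))
print(hex(payload_start), hex(payload_end))
print(payload_start < branch_target < payload_end)
```

The branch target (unit 0x0f, file offset 0x716) falls strictly inside the payload, which spans units 0x0b up to but not including 0x1a.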
The problem I'm facing is a crash when I run the code (I've tried Android 4.3 through 5.1), and this is what logcat tells me when the crash happens:
W/dalvikvm(13874): VFY: invalid branch target 9 (-> 0xf) at 0x6
W/dalvikvm(13874): VFY: rejected Lcom/example/root/bji/MainActivity;.paintGUI ()V
W/dalvikvm(13874): VFY: rejecting opcode 0x32 at 0x0006
W/dalvikvm(13874): VFY: rejected Lcom/example/root/bji/MainActivity;.paintGUI ()V
W/dalvikvm(13874): Verifier rejected class Lcom/example/root/bji/MainActivity;
W/dalvikvm(13874): Class init failed in newInstance call (Lcom/example/root/bji/MainActivity;)
D/AndroidRuntime(13874): Shutting down VM
From what I understand of the logs, the OS is rejecting the "if-eq" jump because of the offset it points to (I've tried other branch instructions, but the result is the same). The only way the code works is if I point to an offset outside the fill-array-data payload, but then no obfuscation technique is applied :P.
Has anyone tried something similar to this technique, or fought against this branch verification rejection?
This is not expected to work. The bytecode verifier explicitly checks all branches for validity. The question of whether or not an address is an instruction or data is determined by a linear walk through the method. Data chunks are essentially very large instructions, so they get stepped over.
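The linear walk can be illustrated with a deliberately simplified Python sketch. It is not the verifier's actual code: the instruction widths are taken from the dump in the question, the tuple layout and function names are made up for illustration, and only the "is the target an instruction start?" check is modeled.

```python
# (name, width in 16-bit code units, is_data), in method order, as seen in the dump
METHOD_PREFIX = [
    ("const/4", 1, False),
    ("const-string", 2, False),
    ("const/4", 1, False),
    ("const/16", 2, False),
    ("if-eq", 2, False),
    ("fill-array-data", 3, False),
    ("array-data", 15, True),   # payload: stepped over as one big unit
]

def classify_units(instructions):
    """Walk the method linearly, recording instruction starts and data units."""
    starts, data = set(), set()
    offset = 0
    for _name, width, is_data in instructions:
        if is_data:
            data.update(range(offset, offset + width))
        else:
            starts.add(offset)
        offset += width
    return starts, data

def branch_is_valid(starts, target):
    """A branch is only acceptable if it lands on an instruction start."""
    return target in starts

starts, data = classify_units(METHOD_PREFIX)
print(branch_is_valid(starts, 0x0f), 0x0f in data)
```

Because the payload is stepped over as a single unit, nothing inside it is ever recorded as an instruction start, so the branch to 0x0f is rejected no matter what bytes the payload contains.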
You can make this work if you modify the .odex output, and set the "pre-verified" flag on the class so the verifier doesn't examine it again -- but you can't distribute an APK that way.
This "obfuscation" technique worked due to an issue in Dalvik. The issue was fixed somewhere around the 4.3 timeframe, although I'm not sure which released version was the first to contain the fix. And Lollipop uses ART, which never had this issue.
Here is the change that fixed this issue: https://android-review.googlesource.com/#/c/57985/
Update 2017-05-17. I no longer work for the company where this question originated, and do not have access to Delphi XEx. While I was there, the problem was solved by migrating to mixed FPC+GCC (Pascal+C), with NEON intrinsics for some routines where it made a difference. (FPC+GCC is highly recommended also because it enables using standard tools, particularly Valgrind.) If someone can demonstrate, with credible examples, how they are actually able to produce optimized ARM code from Delphi XEx, I'm happy to accept the answer.
Embarcadero's Delphi compilers use an LLVM backend to produce native ARM code for Android devices. I have large amounts of Pascal code that I need to compile into Android applications and I would like to know how to make Delphi generate more efficient code. Right now, I'm not even talking about advanced features like automatic SIMD optimizations, just about producing reasonable code. Surely there must be a way to pass parameters to the LLVM side, or somehow affect the result? Usually, any compiler will have many options to affect code compilation and optimization, but Delphi's ARM targets seem to be just "optimization on/off" and that's it.
LLVM is supposed to be capable of producing reasonably tight and sensible code, but it seems that Delphi is using its facilities in a weird way. Delphi wants to use the stack very heavily, and it generally only utilizes the processor's registers r0-r3 as temporary variables. Perhaps craziest of all, it seems to be loading normal 32-bit integers as four 1-byte load operations. How can I make Delphi produce better ARM code, without the byte-by-byte hassle it creates on Android?
At first I thought the byte-by-byte loading was for swapping byte order from big-endian, but that was not the case; it really is just loading a 32-bit number with four single-byte loads. It might be doing so to load the full 32 bits without an unaligned word-sized memory load (whether it SHOULD avoid that is another matter, which would hint at the whole thing being a compiler bug).
Let's look at this simple function:
function ReadInteger(APInteger : PInteger) : Integer;
begin
Result := APInteger^;
end;
Even with optimizations switched on, Delphi XE7 with update pack 1, as well as XE6, produce the following ARM assembly code for that function:
Disassembly of section .text._ZN16Uarmcodetestform11ReadIntegerEPi:
00000000 <_ZN16Uarmcodetestform11ReadIntegerEPi>:
0: b580 push {r7, lr}
2: 466f mov r7, sp
4: b083 sub sp, #12
6: 9002 str r0, [sp, #8]
8: 78c1 ldrb r1, [r0, #3]
a: 7882 ldrb r2, [r0, #2]
c: ea42 2101 orr.w r1, r2, r1, lsl #8
10: 7842 ldrb r2, [r0, #1]
12: 7803 ldrb r3, [r0, #0]
14: ea43 2202 orr.w r2, r3, r2, lsl #8
18: ea42 4101 orr.w r1, r2, r1, lsl #16
1c: 9101 str r1, [sp, #4]
1e: 9000 str r0, [sp, #0]
20: 4608 mov r0, r1
22: b003 add sp, #12
24: bd80 pop {r7, pc}
Just count the number of instructions and memory accesses Delphi needs for that. And constructing a 32 bit integer from 4 single-byte loads... If I change the function a little bit and use a var parameter instead of a pointer, it is slightly less convoluted:
Disassembly of section .text._ZN16Uarmcodetestform14ReadIntegerVarERi:
00000000 <_ZN16Uarmcodetestform14ReadIntegerVarERi>:
0: b580 push {r7, lr}
2: 466f mov r7, sp
4: b083 sub sp, #12
6: 9002 str r0, [sp, #8]
8: 6801 ldr r1, [r0, #0]
a: 9101 str r1, [sp, #4]
c: 9000 str r0, [sp, #0]
e: 4608 mov r0, r1
10: b003 add sp, #12
12: bd80 pop {r7, pc}
I won't include the disassembly here, but for iOS, Delphi produces identical code for the pointer and var parameter versions, and they are almost but not exactly the same as the Android var parameter version.
Edit: to clarify, the byte-by-byte loading is only on Android. And only on Android, the pointer and var parameter versions differ from each other. On iOS both versions generate exactly the same code.
For comparison, here's what FPC 2.7.1 (SVN trunk version from March 2014) thinks of the function with optimization level -O2. The pointer and var parameter versions are exactly the same.
Disassembly of section .text.n_p$armcodetest_$$_readinteger$pinteger$$longint:
00000000 <P$ARMCODETEST_$$_READINTEGER$PINTEGER$$LONGINT>:
0: 6800 ldr r0, [r0, #0]
2: 46f7 mov pc, lr
I also tested an equivalent C function with the C compiler that comes with the Android NDK.
int ReadInteger(int *APInteger)
{
return *APInteger;
}
And this compiles into essentially the same thing FPC made:
Disassembly of section .text._Z11ReadIntegerPi:
00000000 <_Z11ReadIntegerPi>:
0: 6800 ldr r0, [r0, #0]
2: 4770 bx lr
We are investigating the issue. In short, it depends on the potential mis-alignment (to 32 boundary) of the Integer referenced by a pointer. Need a little more time to have all of the answers... and a plan to address this.
Marco Cantù, moderator on Delphi Developers
Also see "Why are the Delphi zlib and zip libraries so slow under 64 bit?", as the Win64 libraries are shipped built without optimizations.
In the QP report RSP-9922, "Bad ARM code produced by the compiler, $O directive ignored?", Marco added the following explanation:
There are multiple issues here:
As indicated, optimization settings apply only to entire unit files and not to individual functions. Simply put, turning optimization on and off in the same file will have no effect.
Furthermore, simply having "Debug information" enabled turns off optimization. Thus, when one is debugging, explicitly turning on optimizations will have no effect. Consequently, the CPU view in the IDE will not be able to display a disassembled view of optimized code.
Third, loading non-aligned 64bit data is not safe and does result in errors, hence the separate 4 one byte operations that are needed in given scenarios.
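To illustrate what the four single-byte loads in the Android pointer version compute, here is a small Python model of that instruction sequence. It is only a sketch: the function name is hypothetical and memory is modeled as a bytes object. It shows that the result equals an ordinary little-endian 32-bit load, but is assembled without ever issuing a word-sized access, so it also works at an unaligned address.

```python
import struct

def load_u32_bytewise(mem, addr):
    """Model of the ldrb/orr.w sequence from the Delphi Android listing."""
    b0, b1, b2, b3 = mem[addr:addr + 4]   # four ldrb loads
    high = b2 | (b3 << 8)                 # orr.w r1, r2, r1, lsl #8
    low = b0 | (b1 << 8)                  # orr.w r2, r3, r2, lsl #8
    return low | (high << 16)             # orr.w r1, r2, r1, lsl #16

memory = bytes([0xAA, 0x78, 0x56, 0x34, 0x12, 0xBB])
unaligned_addr = 1  # not a multiple of 4

value = load_u32_bytewise(memory, unaligned_addr)
reference = struct.unpack_from("<I", memory, unaligned_addr)[0]
print(hex(value), value == reference)
```

The byte-wise version trades one load for four loads and three ORs, which is the cost visible in the disassembly.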
Using gdb, I'm trying to debug a shared library for which I have the source code and debugging symbols.
I do not have debugging symbols or code for the process that actually uses this shared library (I compile it myself, so I can have everything, but the resulting binary is stripped, to simulate a situation where I don't have the code).
The process prints the address of the target function foo that I'm trying to debug, to test that gdb knows the right location for symbols from the shared library. foo exists in my shared library.
My method of printing it is adding the following line to the binary that uses my shared library:
printf("%p\n", foo);
...and to add complexity, this is an Android system I'm debugging remotely.
The scenario I'm trying follows:
On target:
root@phone:/proc/23806 # gdbserver --attach :5555 23806
Attached; pid = 23806
Listening on port 5555
Remote debugging from host 127.0.0.1
On host:
[build@build-machine shared]$ /home/build/shared/prebuilts/gcc/linux-x86/arm/arm-eabi-4.7/bin/arm-eabi-gdb
GNU gdb (GDB) 7.3.1-gg2
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=arm-linux-android".
For bug reporting instructions, please see:
(gdb) target remote :5555
Remote debugging using :5555
0xb6f17fa0 in ?? ()
(gdb) add-symbol-file out/target/product/armv7-a-neon/symbols/system/lib/libShared.so
The address where out/target/product/armv7-a-neon/symbols/system/lib/libShared.so has been loaded is missing
Now I know what I need - the relocated .text section of this shared library in the process address space, but I have no idea how to find it.
I tried /proc/23806/maps:
root@phone:/proc/23806 # cat maps | grep Shared
b6ea0000-b6edb000 r-xp 00000000 b3:10 3337 /system/lib/libShared.so
b6edc000-b6ede000 r--p 0003b000 b3:10 3337 /system/lib/libShared.so
b6ede000-b6edf000 rw-p 0003d000 b3:10 3337 /system/lib/libShared.so
And the .text section is located at 0x00003ff0 in the .so file:
[build@build-machine shared]$ objdump -h out/target/product/armv7-a-neon/symbols/system/lib/libShared.so | grep text
7 .text 0002835c 00003ff0 00003ff0 00003ff0 2**3
So now I'm supposed to have the address where my shared library is located:
0xb6ea0000+0x00003ff0=0xb6ea3ff0 (where the library is loaded+.text offset from the beginning)
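The same calculation can be written as a short Python sketch that takes the executable mapping line from /proc/<pid>/maps and the .text file offset reported by objdump. The function name is hypothetical, and the sketch assumes that .text lies inside the given mapping.

```python
def text_load_address(maps_line, text_file_offset):
    """Compute where .text ends up in memory from one /proc/<pid>/maps line."""
    fields = maps_line.split()
    start_str, _end_str = fields[0].split("-")
    mapping_start = int(start_str, 16)
    mapping_file_offset = int(fields[2], 16)  # file offset the mapping begins at
    return mapping_start + (text_file_offset - mapping_file_offset)

line = "b6ea0000-b6edb000 r-xp 00000000 b3:10 3337 /system/lib/libShared.so"
print(hex(text_load_address(line, 0x00003ff0)))
```

Subtracting the mapping's own file offset makes no difference here, since the executable segment is mapped from offset 0, but it keeps the calculation valid for mappings that start further into the file.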
So I did:
(gdb) add-symbol-file out/target/product/armv7-a-neon/symbols/system/lib/libShared.so 0xb6ea3ff0
add symbol table from file "out/target/product/armv7-a-neon/symbols/system/lib/libShared.so" at
.text_addr = 0xb6ea3ff0
(y or n) y
Now I tried setting a breakpoint for the foo function from my shared library:
(gdb) b foo
Breakpoint 1 at 0xb6ea41de: file frameworks/native/test/shared/src/shared, line 122.
And it doesn't match the value printed by my binary, which was 0xb6ea4217.
It appears I did not provide the correct memory location for the shared library, but I have no idea why.
Any help is appreciated!
Okay, so after scratching my head over this one on and off for some time, I finally discovered what went wrong.
The solution came from a different angle. I recently had to debug some code I had partial sources for, so I did hybrid source/assembly debugging and noticed that when debugging at the source level, things start to skew (I can't use "next", as it will crash), but when I debug at the instruction level everything works great!
I then added and compiled the following short code in the AOSP tree:
int main(int argc, char** argv)
{
int first,second;
first=1;
second=2;
return first+second;
}
And, as expected, it would not debug properly (assembly debugging works, source debugging does not).
Then I noticed argc has been OPTIMIZED OUT!
So what really happened here was a compiler optimization that prevents source-level debugging, as there is no 1:1 relation between the generated instructions and the actual source. Since I left the default build flags in the hands of the AOSP build script, I got these weird debugging issues.
Thanks @EmployedRussian for the assistance!
Your best bet is to run (gdb) x/10i 0xb6ea41de and (gdb) x/10i 0xb6ea4217.
I am guessing that either GDB, or your program prints the address of the PLT entry, and not the real address of foo.
P.S. Your method of calling add-symbol-file appears to be correct.
My Android application (which uses a native library) prints this warning on Android 4.4:
linker mylib.so has text relocations. This is wasting memory and is a security risk. Please fix.
Do you have any idea what this means and how to fix it?
Thanks,
This would appear to be a result of two ndk-gcc bugs mentioned at https://code.google.com/p/android/issues/detail?id=23203
and stated there to have been fixed as of ndk-r8c.
It would appear that the check for libraries with the issue has been added only recently.
Note: please do not edit this post to hide the link URL. It is explicit because the destination is what makes it authoritative.
Further note: changing NDK versions is only a fix when the warning is due to the code of your application. It will have no effect if the warning is instead about a system component such as libdvm; that can only be fixed by a system update.
You need to make the code in your library position independent. Add -fpic or -fPIC to your LOCAL_CFLAGS in your Android.mk, and also ensure that you're not linking against any static or shared libraries that contain text relocations themselves. If they do and you can recompile them, use one of the flags mentioned above.
In short, you need to compile your library with one of the -fpic or -fPIC flags, where PIC is an abbreviation for Position Independent Code.
The longer answer is that yourlib.so has been compiled in a manner that does not conform to the Google Android standard for an ELF file, in which this Dynamic Array Tag entry is unexpected. In the best case the library will still run, but it is still an error, and future Android OS versions will probably not allow it to run.
DT_TEXTREL 0x16 (22)
To check what's in your library, use something along the lines of:
# readelf --wide -S yourlib.so
There are 37 section headers, starting at offset 0x40:
Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000000000 002400 068f80 00 AX 0 0 16
[ 2] .rodata PROGBITS 0000000000000000 06b380 05ad00 00 WA 0 0 32
...
[16] .rela.text RELA 0000000000000000 26b8e8 023040 18 14 1 8
...
[36] .rela.debug_frame RELA 0000000000000000 25a608 0112e0 18 14 27 8
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
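As a complement to the section listing, the tag itself lives in the library's dynamic section. The following Python sketch scans the raw bytes of a .dynamic section for DT_TEXTREL (22) or for DT_FLAGS (30) with the DF_TEXTREL bit (0x4) set; those constants come from the ELF specification. The function name is hypothetical, and extracting the .dynamic bytes from the file is assumed to have been done elsewhere.

```python
import struct

DT_NULL, DT_TEXTREL, DT_FLAGS = 0, 22, 30
DF_TEXTREL = 0x4

def has_text_relocations(dynamic, is_64bit=False, little_endian=True):
    """Scan raw .dynamic bytes (an array of d_tag/d_val pairs) for TEXTREL markers."""
    endian = "<" if little_endian else ">"
    entry_format = endian + ("qQ" if is_64bit else "iI")
    for d_tag, d_val in struct.iter_unpack(entry_format, dynamic):
        if d_tag == DT_NULL:          # end of the dynamic array
            break
        if d_tag == DT_TEXTREL:
            return True
        if d_tag == DT_FLAGS and d_val & DF_TEXTREL:
            return True
    return False

# Synthetic 32-bit example: DT_NEEDED (1), DT_TEXTREL (22), DT_NULL (0)
sample = struct.pack("<iIiIiI", 1, 0x10, DT_TEXTREL, 0, DT_NULL, 0)
print(has_text_relocations(sample))
```

A library built entirely from position-independent objects should have neither marker.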
Please see my extensive answer on the topic for more details on DT entries. For details on how to write proper dynamic libraries, this is a must-read.
I got the same error with my application.
The application was using a native daemon that used a native library which was not implementing all the functions in its header file. When I added the required implementations to the native library everything just worked.
I don't know if you have the exact same issue, but it probably just means that your native side has some mismatch.