No reference for instructions strneh, lsreq, streqb - android

I was looking at the strcpy.S file in the Android platform source, under libc/arch-arm/bionic. This file contains many ARM instructions that I am not able to understand, even though I am also referring to the ARM System Developer's Guide.
Except for "tst" and "tstne", I am not able to find any reference for the others in any book or ARM reference manual.
tst r2, #0xff00
iteet ne
strneh r2, [ip], #2
lsreq r2, r2, #8
streqb r2, [ip]
tstne r2, #0xff
Besides these instructions, there are many others in different files as well.
Does anyone have any idea what these instructions are?

The first instruction is the IT (if-then) instruction from the Thumb instruction set.
iteet ne
This instruction makes the following instructions conditional, up to four of them. The first instruction of the block is always a "then" case. Each character after "it" adds one more instruction to the block, either t (then, executed when the condition holds) or e (else, executed when it does not). The operand 'ne' specifies the condition to be evaluated. So iteet ne covers four instructions with the pattern then, else, else, then.
The instructions that follow are ordinary ARM instructions with condition codes:
strneh r2, [ip], #2 ; store halfword if not equal
lsreq r2, r2, #8 ; logical shift right if equal
streqb r2, [ip] ; store byte if equal
tstne r2, #0xff ; test if not equal
These are the four instructions affected by the IT instruction. They carry ne/eq condition codes as well.
As you can see, the condition codes on the four instructions (ne, eq, eq, ne) match the then/else pattern of the IT instruction. The snippet comes from the ARM big-endian code path, and I know of no Android phone that uses ARM in big endian.
Btw, it's worthwhile to know why the conditions are given both in the IT instruction and in the instructions themselves. This is part of the unified ARM assembly syntax. On the ARM you have two instruction sets: Thumb (where conditional execution of most instructions needs an IT instruction) and ARM (where the condition is encoded in each instruction itself).
If you limit yourself to what Thumb can express, it is possible to write code that assembles for both Thumb and ARM. This is done here.
If you assemble for Thumb, the IT instruction is emitted and controls the conditions of the following instructions, and the condition suffixes on those instructions have to agree with it. If you assemble for the ARM instruction set, the IT instruction generates no code and the condition from each instruction's own suffix is encoded in that instruction.
This works as long as the IT instruction and the conditions on the individual instructions match, which is the case here.
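As a rough illustration of how the letters of an IT mnemonic map onto instruction slots, here is a minimal C sketch. The function names expand_it and invert_cond are made up for this example, and only the eq/ne pair is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Inverse of a two-letter condition code; only the eq/ne pair is covered. */
static const char *invert_cond(const char *cond)
{
    if (strcmp(cond, "ne") == 0) return "eq";
    if (strcmp(cond, "eq") == 0) return "ne";
    return "??"; /* other condition pairs are omitted in this sketch */
}

/*
 * Expand an IT mnemonic such as "iteet" into the condition required on each
 * instruction of the block. Slot 0 is always "then"; every letter after "it"
 * adds one slot ('t' = cond, 'e' = inverse of cond).
 * Returns the number of slots (1 to 4).
 */
size_t expand_it(const char *mnemonic, const char *cond, const char *slots[4])
{
    size_t len = strlen(mnemonic);
    assert(len >= 2 && len <= 5 && strncmp(mnemonic, "it", 2) == 0);

    size_t n = 0;
    slots[n++] = cond;                        /* implicit first "then" */
    for (size_t i = 2; i < len; i++)
        slots[n++] = (mnemonic[i] == 't') ? cond : invert_cond(cond);
    return n;
}
```

Calling expand_it("iteet", "ne", slots) fills four slots with ne, eq, eq, ne, which is the order of the condition suffixes one would expect on the instructions inside the block.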

strneh is a store command with some conditional execution/size specifier suffixes. The ARM docs are a good place to start: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/Chdehgih.html
If you google "arm conditional execution", you'll find a number of blogs/articles that may also help: http://blogs.arm.com/software-enablement/258-condition-codes-2-conditional-execution/
As for your "strneh" instruction:
str = store
ne = execute if not equal (Z flag clear)
h = perform a half-word operation
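The same breakdown can be sketched in C. This is only an illustration: split_mnemonic and struct mnemonic_parts are hypothetical names, and the sketch assumes a three-letter base mnemonic followed by a two-letter condition code, which fits strneh, streqb, lsreq and tstne but certainly not every ARM mnemonic.

```c
#include <assert.h>
#include <string.h>

/* Pieces of an old-style conditional mnemonic such as "strneh". */
struct mnemonic_parts {
    char base[4];   /* e.g. "str" */
    char cond[3];   /* e.g. "ne"  */
    char size[2];   /* e.g. "h", or "" when there is no size suffix */
};

/*
 * Split <base><cond>[<size>]. Assumes a 3-letter base and a 2-letter
 * condition. Returns 0 on success, -1 if the length does not fit that layout.
 */
int split_mnemonic(const char *m, struct mnemonic_parts *out)
{
    assert(m != NULL && out != NULL);
    size_t len = strlen(m);
    if (len < 5 || len > 6)
        return -1;
    memcpy(out->base, m, 3);
    out->base[3] = '\0';
    memcpy(out->cond, m + 3, 2);
    out->cond[2] = '\0';
    out->size[0] = (len == 6) ? m[5] : '\0';
    out->size[1] = '\0';
    return 0;
}
```

For "strneh" this gives base "str", condition "ne" and size "h"; for "lsreq" the size part comes back empty.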

Related

ARM64 hooking: insert a jump in memory

I'm trying to hook the SSL_do_handshake function in libssl.so on a 64-bit ARM Android device. So far I have:
address of libssl.so (found in /proc/[pid]/maps)
offset of SSL_do_handshake (found in libssl.so)
absolute address of SSL_do_handshake (address_of_libssl + offset_of_SSL_do_handshake)
What I'm doing is saving the instructions at that absolute address (for recovery) and overwriting them with my instructions (for the jump):
LDR X9,#8
Br X9
address for jump 7f75a5b890
In detail, what I write to memory is 49000058 20011fd6 90b8a5757f. If I look at memory after overwriting, I find my instructions and the address. But if I then call SSL_do_handshake, I get a crash before the jump.
I did the same for ARMv7 using thumb32 with the same approach but different instructions:
ldr pc, [pc, #0]
address for jump (f36e240d)
In memory dff800f0 0d246ef3 and it works.
Any idea of what I'm doing wrong?
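To make the ARM64 byte layout described above concrete, here is a small C sketch that builds the same sequence in a buffer. The helper names (put_le, build_a64_trampoline) are made up for illustration; the two instruction words are the standard A64 encodings of LDR X9, #8 and BR X9, which match the bytes quoted in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define A64_LDR_X9_LITERAL_8 0x58000049u  /* LDR X9, #8 (literal at PC + 8) */
#define A64_BR_X9            0xD61F0120u  /* BR  X9 */

/* Store a value as little-endian bytes regardless of host byte order. */
static void put_le(uint8_t *dst, uint64_t value, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = (uint8_t)(value >> (8 * i));
}

/*
 * Fill `out` with 16 bytes: two 4-byte instructions followed by the
 * 8-byte target address that the LDR reads.
 */
void build_a64_trampoline(uint8_t out[16], uint64_t target)
{
    assert(out != NULL);
    put_le(out + 0, A64_LDR_X9_LITERAL_8, 4);
    put_le(out + 4, A64_BR_X9, 4);
    put_le(out + 8, target, 8);
}
```

Note that LDR X9 reads a full 8-byte literal, so the address 7f75a5b890 occupies eight bytes in memory with the upper three being zero, while the dump in the question shows five. Whether that has anything to do with the crash cannot be told from the post. The sketch only covers the byte layout, not memory protection or instruction-cache maintenance, which patching code at run time normally also involves.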

Getting a physical address from an allocated buf in a module without using virt_to_phys macro

I am trying to write an Android ARM kernel module in which I need a virt_to_phys translation of a memory buffer allocated using _kmalloc.
I know that I can use the virt_to_phys macro for this. However, I don't have the full kernel source for this specific device, and because virt_to_phys is a macro I can't get a function address for it from kallsyms to use in my module, so I would like to find another way to do this.
I've been trying to do it with the MMU (the ATS1Cxx and PAR registers) to perform the V=>P translation, as I am working on an ARMv7 processor, but I couldn't make it work.
This is my test code:
int hello_init_module(void) {
    printk("Virtual MEM:0x%X \n", allocated_buf);
    // Trying to get the physical address
    asm("\t mcr p15, 0, %[value], c7, c8, 2\n"
        "\t isb\n"
        "\t mrc p15, 0, %[result], c7, c4, 0\n" : [result]"=r" (pa) : [value]"r" (allocated_buf));
    printk("Physical using MMU : %x\n", pa );
    // This shows the right address, but I want to do it without calling the macro.
    printk("Physical using virt_to_phys: 0x%X", virt_to_phys((int *) allocated_buf));
    return 0;
}
What I am actually doing is developing a module that is intended to work on two devices with the same 3.4.10 kernel but different memory architectures. I can make the module work on both because they have the same VER_MAGIC and function CRCs, so the module loads perfectly on both devices.
The main problem is that, because of the differences in their architecture, PAGE_OFFSET and PHYS_OFFSET differ between them.
So I've been wondering whether there is a way to do the translation without defining these values as constants in my module. That is why I tried using the MMU to perform V=>P, but it hasn't worked in my case: it always returns 0x1F.
According to cat /proc/cpuinfo, I am working with:
Processor : ARMv7 Processor rev 0 (v7l)
processor : 0
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
CPU implementer : 0x51
CPU architecture: 7
If it's not possible to use the MMU as an alternative to virt_to_phys, does somebody know another way to do it?
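For reading the value that comes back from PAR, a small decoder can help. The sketch below is plain C that can be tried in user space; decode_par and struct par_result are hypothetical names, and it assumes the 32-bit short-descriptor PAR layout (bit 0 is the fault flag, bits [6:1] the fault status, bits [31:12] the physical page) with 4 KB pages.

```c
#include <assert.h>
#include <stdint.h>

/* Result of interpreting a 32-bit PAR value (short-descriptor format assumed). */
struct par_result {
    int      fault;       /* 1 if PAR bit 0 is set: the translation aborted */
    unsigned fault_bits;  /* PAR[6:1], the fault status field, valid if fault */
    uint32_t phys;        /* physical address, valid only if fault == 0 */
};

/* Combine PAR with the low 12 bits of the virtual address (4 KB pages). */
struct par_result decode_par(uint32_t par, uint32_t vaddr)
{
    struct par_result r = {0, 0, 0};
    if (par & 1u) {
        r.fault = 1;
        r.fault_bits = (par >> 1) & 0x3Fu;
    } else {
        r.phys = (par & 0xFFFFF000u) | (vaddr & 0x00000FFFu);
    }
    return r;
}
```

Under these assumptions 0x1F decodes as a failed translation (fault flag set, status field 0x0F) and not as an address. One thing that may be worth checking, though the post does not confirm it as the cause: the operation c7, c8, 2 is the unprivileged-read variant (ATS1CUR), while c7, c8, 0 (ATS1CPR) translates with privileged read permissions, which is what a kernel address would normally need.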

Arm Instructions [closed]

I wanted to ask what is happening in the ARM instructions here. I have some knowledge of assembly, but I am having a hard time understanding ARM. I tried to look up information on the internet that covers the basics, but what I am looking at is a little different. Here is the code I am trying to understand. Can you please explain what these instructions are doing? I have marked the ones I don't understand. It's code from IDA. Can someone please explain the entire function? I will be really grateful. Thanks.
LDR R3, =(unk_E9BFB0 - 0x6B1B4C) //This one, I don't get it, is it subtracting?
LDR R8, [R5]
MOV R1, R6
ADD R3, PC, R3
LDR LR, [R4,#0xC] //This instruction
LDR R12, =(aDraw_debug - 0x6B1B68) //This one
MOV R2, R7
STR R8, [R3,#0x30]
MOV R0, R3
STR R4, [R3]
ADD R12, PC, R12 ; "draw_debug" //This one
STR R3, [R5]
STR R12, [R3,#0x2C]
ADD R12, LR, #1
STR R12, [R4,#0xC]
BL __aeabi_atexit
LDR is the ARM instruction that loads a register with a value from memory. The following, however, use the pseudo-instruction form of LDR:
LDR R3, =(unk_E9BFB0 - 0x6B1B4C) //This one, I don't get it, is it subtracting?
LDR R12, =(aDraw_debug - 0x6B1B68) //This one
I believe it is constructing an offset from a code location to pull data out of the .text section. aDraw_debug - 0x6B1B68, for example, is likely taking the address of the label aDraw_debug and subtracting the location of some instruction, 0x6B1B68.
The end result is that this loads the offset of aDraw_debug (relative to that fixed point in the code) into R12. Likewise, the other instruction loads the offset of unk_E9BFB0 into R3. The unk_ name is one that IDA generates automatically for data it has not identified.
The other LDR instruction:
LDR LR, [R4,#0xC] //This instruction
is a straightforward affair. LDR is still being used to load a register from memory, but the addressing is different. In this case LR, the link register, is loaded with the data at address R4 + 0xC in memory.
The ADD is straightforward as well, and I'm not sure whether you were confused as to what it was doing, but:
ADD R12, PC, R12 ; "draw_debug" //This one
simply adds PC and R12 and stores the result in R12 without updating the processor flags. This takes the offset determined earlier and applies it to the current PC, which yields the absolute address of the string.
Overall, it looks like this is code to store values into some sort of struct or class. The compiler chose to do PC-relative addressing, but the offset is likely outside of the usable range for the instruction. PC-relative LDR/STR instructions can only address +/- 4095 bytes from PC.
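The arithmetic behind the LDR =(label - constant) and ADD Rd, PC, Rd pair can be written out as a small C sketch. The function names are made up for illustration. It relies on the fact that in ARM state reading PC gives the address of the current instruction plus 8; the ADD address 0x6B1B44 used in the note below is inferred from the constant 0x6B1B4C, not shown in the listing, and treating unk_E9BFB0 as address 0xE9BFB0 follows IDA's naming convention.

```c
#include <assert.h>
#include <stdint.h>

/* In ARM (A32) state, reading PC yields the instruction's address plus 8. */
uint32_t arm_pc_read(uint32_t insn_addr)
{
    return insn_addr + 8u;
}

/* Constant stored for the LDR: label minus the PC value seen by the ADD. */
uint32_t pool_constant(uint32_t label_addr, uint32_t add_insn_addr)
{
    return label_addr - arm_pc_read(add_insn_addr);
}

/* What ADD Rd, PC, Rd computes at run time. */
uint32_t add_pc(uint32_t add_insn_addr, uint32_t loaded_constant)
{
    return arm_pc_read(add_insn_addr) + loaded_constant;
}
```

With an ADD assumed to sit at 0x6B1B44, pool_constant(0xE9BFB0, 0x6B1B44) equals 0xE9BFB0 - 0x6B1B4C, the expression IDA displays, and add_pc turns it back into 0xE9BFB0. Because only the distance between code and data is stored, the code keeps working wherever the library is loaded.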
The ldr reg,=something form is a shortcut. ARM is a mostly fixed-length instruction set, so there are not enough bits in an instruction to allow for every possible immediate value. This shortcut also lets you use labels, that is, load the address of some label into the register. When the assembler generates the machine code, it allocates a word somewhere not too far away that is not in the execution path (after an unconditional branch or bx, for example). Then for the ldr reg,= instruction it encodes a PC-relative ldr reg,[pc,#offset] to load that value into the register. This allows any 32-bit value to be loaded, with no restrictions. It also allows labels to be used, where the data value (address) is filled in later by the linker. Your example takes that one step further and does math on a label, so the assembler or linker has to resolve the label, do the math, then put the result in the word location.
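As a sketch of what the assembler ends up emitting for such a shortcut, the following C function encodes the A32 instruction ldr rt, [pc, #offset] for a literal word at a given address. The function name and the addresses in the note are hypothetical. The encoding constants are those of the ARM-state LDR (immediate) form with PC as base register and the "always" condition.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encode LDR Rt, [PC, #offset] (A32, condition AL), where offset is the
 * distance from the PC value (instruction address + 8) to the literal word.
 * Returns 0 if rt is not a valid register or the word is outside the
 * 12-bit offset range.
 */
uint32_t encode_ldr_literal(unsigned rt, uint32_t insn_addr, uint32_t pool_addr)
{
    int64_t offset = (int64_t)pool_addr - ((int64_t)insn_addr + 8);
    if (rt > 15 || offset > 4095 || offset < -4095)
        return 0;
    if (offset >= 0)
        return 0xE59F0000u | (rt << 12) | (uint32_t)offset;   /* U = 1: add */
    return 0xE51F0000u | (rt << 12) | (uint32_t)(-offset);    /* U = 0: subtract */
}
```

For example, an instruction at the made-up address 0x1000 loading r3 from a literal word at 0x1010 needs an offset of 8 and encodes as 0xE59F3008. A word more than 4095 bytes away is rejected, which is why the assembler has to place the literal pool close to the code that uses it.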
ARM.COM Arm instruction Set
Info Center of ARM.com - Quick Reference Cards for ARM instruction set
Doc for ARM instruction set
Indexed for each instruction web page
Arm-instructionset
Embedded Systems Architecture: The ARM Instruction Set Architecture by Mark McDermott, with help from our good friends at ARM

ALSA - unmuting devices?

I have been trying to capture audio within a native Linux program running on an Android device via adb shell.
Since I seemed to be getting only (very quiet) noise, i.e. no actual signal (interestingly, an Android/Java program doing something similar did show there was a signal on that input), I executed alsa_amixer, which had one entry that looked like the right one:
Simple mixer control 'Capture',0
Capabilities: cvolume cswitch penum
Capture channels: Front Left - Front Right
Limits: Capture 0 - 63
Front Left: Capture 31 [49%] [0.00dB] [off]
Front Right: Capture 31 [49%] [0.00dB] [off]
"off". That would explain the noise.
So I looked for examples of how to use alsa_amixer to unmute the channels. I found different suggestions for parameters, like "49% on", "49% unmute", or just "unmute", none of which work. (If the volume percentage is left out, it says "Invalid command!"; otherwise the volume is set, but the on/unmute is ignored.)
I also searched for how to do this programmatically (which I'll ultimately need to do, although the manual approach would be helpful for now), but wasn't too lucky there.
The only ALSA lib function I found that sounds like it could do something like that was "snd_mixer_selem_set_capture_switch_all", but the docs don't say what the parameter does (1/0 is not on/off, I tried that ;) )
The manual approach of setting these things via alsa_amixer does work, but only if Android is built with 'BoardConfigCommon.mk' modified at the entry BOARD_USES_ALSA_AUDIO := false (instead of true).
Yeah, this will probably disable ALSA for Android, which is why it wouldn't meddle with the mixer settings anymore.
To you Android programmers out there: note that this is a very niche use case, as was to be expected from my original post.
This is not what most people would want to do.
I just happen to tinker with an Android device here in unusual ways ;-)
Just posting the code, as the asker suggested; I also don't like external links.
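#include <alsa/asoundlib.h>
int main()
{
    snd_mixer_t *handle;
    snd_mixer_selem_id_t *sid;
    /* Open the mixer on the "default" card and load its simple elements. */
    snd_mixer_open(&handle, 0);
    snd_mixer_attach(handle, "default");
    snd_mixer_selem_register(handle, NULL, NULL);
    snd_mixer_load(handle);
    /* Look up the simple element named "Capture", index 0. */
    snd_mixer_selem_id_alloca(&sid);
    snd_mixer_selem_id_set_index(sid, 0);
    snd_mixer_selem_id_set_name(sid, "Capture");
    snd_mixer_elem_t* elem = snd_mixer_find_selem(handle, sid);
    if (elem == NULL) {
        /* No such control on this device; nothing to change. */
        snd_mixer_close(handle);
        return 1;
    }
    /* Set the capture switch and the capture gain (dB) on all channels. */
    snd_mixer_selem_set_capture_switch_all(elem, 0);
    snd_mixer_selem_set_capture_dB_all(elem, 0, 0);
    snd_mixer_close(handle);
    return 0;
}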

How does mterp (Dalvik VM) organize its byte-code interpreter loop?

I am studying the Android Dalvik VM and ran into a question while reading the mterp code in vm/mterp/out/InterpC-portable.cpp. It is the main interpreter loop of the Dalvik VM, which interprets the bytecode in a dex file. If I wrote this file, I would choose a switch-case structure like this:
while (hasMoreIns()) {
    int ins = getNextIns();
    switch(ins) {
    case MOV:
        //interpret this instruction
        ...
        break;
    case ADD:
        ...
        break;
    ...
    default: break;
    }
}
However, what mterp uses is very different from what I had in mind. It uses some code that looks like magic to me:
FINISH(0);
HANDLE_OPCODE(OP_NOP)
FINISH(1);
OP_END
HANDLE_OPCODE(OP_MOVE)
...
OP_END
...
I googled it and found that it seems to be a modified "threaded" style of execution, which differs from the switch-case style and performs better because it removes the branch operation in the while loop. But I still can't understand this code or why it performs better. How does it find the next code to interpret?
As a brief bit of guidance, the out directory is filled with preprocessed files and is not what I'd call a great thing to read, if you're trying to figure out the code. The source (per se) that corresponds to InterpC-portable.cpp is the contents of the portable and c directories.
In terms of how the code does opcode dispatch, you'll want to look at the definition of the FINISH macro, in portable/stubdefs.cpp:
# define FINISH(_offset) { \
    ADJUST_PC(_offset); \
    inst = FETCH(0); \
    if (self->interpBreak.ctl.subMode) { \
        dvmCheckBefore(pc, fp, self); \
    } \
    goto *handlerTable[INST_INST(inst)]; \
}
This macro is used at the end of each opcode definition and serves as the equivalent of a switch (opcode) statement. Briefly, this reads the instruction code unit pointed at by the PC — inst = FETCH(0) — grabs the opcode out of it — INST_INST(inst) — and uses that opcode as an index into the table of addresses of all the opcodes. The address is directly branched to with the goto statement.
The goto is a "computed goto," which is a non-standard GCC extension. You can read about it in the GCC manual, and you can also find a bit about the topic in the presentation I gave on Dalvik internals at Google IO back in 2008. (Find it at https://sites.google.com/site/io/dalvik-vm-internals.)
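To show the dispatch technique in isolation, here is a toy stack machine using the same computed-goto idea. It is only a sketch: the opcodes, the run function and the DISPATCH macro are made up for this example and are not Dalvik's, and it needs GCC or Clang because of the &&label extension. There is no bounds checking on the stack or the program.

```c
#include <assert.h>

/* Toy opcodes, invented for this sketch (not Dalvik's). */
enum { OP_PUSH, OP_ADD, OP_HALT };

/* Runs a tiny stack-machine program; returns the top of stack at OP_HALT. */
int run(const unsigned char *code)
{
    /* GCC/Clang extension: &&label takes the address of a label. */
    static void *const handler_table[] = { &&op_push, &&op_add, &&op_halt };
    int stack[16];
    int sp = 0;
    const unsigned char *pc = code;

/* Fetch the next opcode and jump straight to its handler. */
#define DISPATCH() goto *handler_table[*pc++]

    DISPATCH();

op_push:
    stack[sp++] = *pc++;          /* operand byte follows the opcode */
    DISPATCH();
op_add:
    sp--;
    stack[sp - 1] += stack[sp];   /* pop two values, push their sum */
    DISPATCH();
op_halt:
    return stack[sp - 1];

#undef DISPATCH
}
```

Running the program { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT } returns 5. Each handler ends with its own indirect jump, so there is no central loop or switch to return to. That plays the same role as FINISH in the mterp code.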
My talk also touches on the topic of the performance characteristics of this technique. Briefly, it saves some amount of branching and plays relatively nice with branch prediction. However, there are better ways to write an interpreter (as I cover in the talk, and as the CPU-specific Dalvik interpreters in fact work).
And for just a bit more of the larger context, compiling bytecode to native CPU instructions is in general going to result in faster execution than even the most well-tuned interpreter, assuming you have sufficient RAM to hold the compiled result. The trace-based Dalvik JIT that was introduced in Froyo was meant to make a tradeoff wherein modest amounts of extra RAM were used to achieve reasonably-fruitful performance gains.
