A question about immediates in the ARM assembly mov instruction


from: http://blog.chinaunix.net/uid-20799298-id-99632.html

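1. There are some ARM assembly instructions I still don't fully understand, especially the differences between instructions with similar functions. I happened to find an “FAQ on ARM assembly” online, and several of the questions it covers are well worth studying.

2. I have gradually found that I am no longer afraid of English documentation; with patience I can at least get the gist. A great many classic articles and books are written in English, so reading English documentation regularly is a very good habit. Just look at GNU's reference manuals: every one of them is a classic worth learning from and studying!

3. Learn from how other people write documentation, paying particular attention to organization. Explaining a problem or a concept clearly takes not only a grasp of the material but also good writing skills and careful, tidy layout. I think these details are what show a person's level and how far they can go.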


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------


This post looks at the mov instruction and the related instruction ldr.

Question on MOV
Does the instruction “Mov” have indirect addressing?
Answer. No, e.g. you cannot use mov r1,[r2] 
“MOV loads a value into the destination register, from another register, a shifted register, or an immediate 8-bit value.”

Examples:
  MOV R0, R1 ; if R1 has 0x00001234, after this R0=R1=0x00001234
  MOV R0, #0x12 ; after this R0 has 0x12
  MOV R0, #300 ; is wrong, the value should not exceed 255
  MOV R0, 200 ; is wrong, # is missing; “mov R0,#200” is correct

Note: the immediate value must be prefixed by #

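From the description above, mov does two things:

1. It copies a value between registers.

2. It loads an immediate into a register. The immediate must be prefixed with “#”, and it is an 8-bit value that cannot exceed 255.

However, head.S in vivi contains many statements like mov r1, #0x53000000.


Let's write a few lines of assembly as a quick test: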

.global _start 
.align 0 
_start: 
        mov r1, #0x12 
        mov r2, #300 
        mov r3, #0x53000000

.end


This assembles without any errors.

The disassembly is as follows:

8000: e3a01012 mov r1, #18 ; 0x12 
8004: e3a02f4b mov r2, #300 ; 0x12c 
8008: e3a03453 mov r3, #1392508928 ; 0x53000000


Why is that? Why does mov work here too? Other assembly code I have seen always writes ldr r1, =0x53000000.


Change the program a little:

.global _start 
.align 0 
_start: 
        mov r1, #0x12 
        mov r2, #300 
        ldr r3, =0x53000000 
.end



This also assembles without errors, and the disassembly shows the following three instructions:

8000: e3a01012 mov r1, #18 ; 0x12 
8004: e3a02f4b mov r2, #300 ; 0x12c 
8008: e3a03453 mov r3, #1392508928 ; 0x53000000



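So after assembly, ldr r3, =0x53000000 has actually become mov r3, #0x53000000. Here ldr is evidently acting as a pseudo-instruction.



The following passage explains ldr very clearly. When analyzing assembly, it is very important to distinguish an “address” from “the contents at that address”. Once you have a firm grip on this, I believe you will also understand pointers in C more deeply.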

Question on the use of LDR,ADR

Is there any difference between statements 1, 2 and 3 in the following program?

Data1p DCD 0, 0 ;just happen the address is at 0x40000000
;DCD (reserve a 32-bit word)is a pseudo instruction to
;allocate a memory location for this data.
align
align
:
:
Statement1 LDR R0, =Data1p ; Put the address Data1p into R0
Statement2 ADR R0, Data1p ; Put the address Data1p into R0
Statement3 LDR R0, =0x40000000 ; Put the value 0x40000000 into R0,
;just happen it is the address of Data1p

Answer: They are all the same. Every statement will generate the same result. Such that the address, not the data content, will be saved into R0. For example, the value in R0 is 0x40000000.


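At this point you are surely a little confused about “mov” and “ldr”. (Note: ldr here means the pseudo-instruction, not the instruction that transfers a value from memory to a register.)

The following passage gives a thorough analysis of the differences between the two.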

Question on differences between LDR and MOV
What is the difference between MOV and LDR, they seem to have the same function of saving information into registers?

Answer part 1: How different are they?
Note: “#” for mov, “=” for ldr. To define an immediate value
MOV can only move an 8-bit value (0x00->0xff=255) into a register while LDR can move a 32-bit value into a register. The immediate value is prefixed by different characters in mov and ldr: “#” for mov, “=” for ldr. E.g.


Mov r1,#255 ; ok, 255 is the biggest number you can mov
Mov r1,255 ; is wrong , missing #
Mov r1,#256 ; is wrong, the number is bigger than 255
Mov r1,#0x12340000 ; is wrong, the number is bigger than 255
Ldr r1,=255; you can do this,
Ldr r1,=256; you can do this,
Ldr r1,=0x12340000; you can do this,


1. MOV can run faster than LDR.
2. LDR can move data from a memory address to a register; MOV can only i) move data between two registers or ii) save an 8-bit immediate value to a register. e.g.

value1 DCD 0; this define an integer variable “value1” with address “=value1”

:

;A standard pair of statements for moving a data into a register
Ldr r0,=value1 ; 1) save the address of the variable value1 in r0
Ldr r1,[r0] ;2)use r0 as the address (pointer) to get value1 to r1

Note: Mov cannot be used to achieve the same result, because mov r1,[r0] is not allowed

Answer part 2 : How similar are they?.
MOV is a real instruction, LDR is a pseudo instruction. If the immediate value is small, the assembler will use “mov” to implement it , otherwise it uses a literal pool to achieve the result.
e.g.
ldr r0,=14; the immediate value 14 <255, so it will be implemented using mov r0,#14 (see the use of # and = ), but if the immediate value is larger than 255, e.g.
ldr r0,=0x55555555; for a large immediate value “mov” does not work. The assembler will generate code to place the constant 0x55555555 in a nearby table in the code area. Then it uses an instruction to load the data from that table, addressed by the program counter plus an offset, to fill up r0. The reason is that there is no way to fit 32-bit data into a 32-bit instruction, and the ARM design always wants to make sure instructions are 32-bit. Details can be found at 
http://www.keil.com/support/man/docs/armasm/armasm_chdcegci.htm

See also the directive “LTORG” for how to set up the table. Usually it is placed in a nearby code area. You don’t need to worry too much because everything is automatic; you only need to place “LTORG” at the end of the code area as shown in the example at Keil.
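
The choice the FAQ describes for the ldr pseudo-instruction can be modelled with a small Python sketch. It is only an approximation under stated assumptions: it assumes the classic ARM operand format in which a mov immediate is an 8-bit value rotated right by an even number of bits (the format this article analyzes further on), and it assumes the assembler tries mov, then mvn with the inverted constant, and only then falls back to the literal pool. The function names fits_arm_immediate and expand_ldr_pseudo are hypothetical, and a real assembler may have further options on newer cores.

```python
MASK32 = 0xFFFFFFFF


def fits_arm_immediate(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= MASK32
    for shift in range(0, 32, 2):
        # Rotating left by `shift` undoes a rotate-right by `shift`.
        rotated = ((value << shift) | (value >> (32 - shift))) & MASK32
        if rotated <= 0xFF:
            return True
    return False


def expand_ldr_pseudo(value):
    """Guess how `ldr rX, =value` is implemented (hypothetical model)."""
    value &= MASK32
    if fits_arm_immediate(value):
        return "mov"
    if fits_arm_immediate(~value & MASK32):
        return "mvn"
    return "literal pool"


for constant in (14, 0x53000000, 0xFFFFFFFF, 0x55555555):
    print(hex(constant), "->", expand_ldr_pseudo(constant))
```

Under this model the threshold is not "larger than 255" but "cannot be expressed as a rotated 8-bit value", which agrees with the test earlier in this article where ldr r3, =0x53000000 was assembled into a mov.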


But after reading the passage above I still had a question: why does “mov r3, #0x53000000” work?

http://www.keil.com/support/man/docs/armasm/armasm_cihcdbca.htm

This page describes mov; the part concerning immediates is excerpted below:


Syntax
MOV{cond} Rd, #imm16
where:
imm16
is any value in the range 0-65535.

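As you can see, this says the immediate is 16 bits.

At this point I had really confused myself. Let me sort things out:

The first source, which says the immediate is 8 bits, is something I found by searching online. It carries no particular authority, and its reliability is questionable.

The second is the description of ARM assembly on Keil's official website.


In addition, the book 《arm体系结构与编程》 does not state the exact range of the immediate, but page 26 contains this line: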

mov r0, #0xfc0

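The immediate here is clearly greater than 255.

Page 144 mentions that “the data read by the ldr pseudo-instruction exceeds the range mov can handle”. This shows that the immediates mov can handle do have a limited range, and that this range is smaller than that of ldr.

Now let's look at how the immediate is formed. The addressing mode works like this: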

11         8 7                              0 
 +----------+-------------------------------+
 | Rotate   |          Imm                  |
 +----------+-------------------------------+
[7:0] Unsigned 8 bit immediate value
[11:8] Shift applied to Imm

The immediate operand rotate field is a 4 bit unsigned integer which specifies a shift operation on the 8 bit immediate value. This value is zero extended to 32 bits, and then subject to a rotate right by twice the value in the rotate field. This enables many common constants to be generated, for example all powers of 2


1. Take the low 8 bits and zero-extend them to a 32-bit value.
2. Rotate the resulting 32-bit value right by 2*Rotate bits, where Rotate is bits [11:8].
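
The two steps above can be sketched in Python as a quick check. This is a minimal illustration rather than a full instruction decoder: the function name decode_mov_immediate is hypothetical, and it only looks at bits [11:0] of the instruction word. It is applied to the three instruction words from the disassembly listing shown earlier.

```python
def decode_mov_immediate(instruction_word):
    """Rebuild the 32-bit immediate from bits [11:0] of an instruction word."""
    imm8 = instruction_word & 0xFF            # bits [7:0], zero-extended
    rotate = (instruction_word >> 8) & 0xF    # bits [11:8]
    shift = 2 * rotate                        # rotate right by twice the field
    # Rotate right within 32 bits.
    return ((imm8 >> shift) | (imm8 << (32 - shift))) & 0xFFFFFFFF


# Instruction words taken from the disassembly earlier in this article.
for word in (0xE3A01012, 0xE3A02F4B, 0xE3A03453):
    print(hex(word), "->", hex(decode_mov_immediate(word)))
```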

Let's analyze one statement: mov r2, #300. Its disassembly is:

8004: e3a02f4b mov r2, #300 ; 0x12c

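The immediate is stored directly inside the instruction.

1. Take the low 8 bits: 0x4b

2. Zero-extend to 32 bits: 0x0000 004b

3. 2*Rotate = 2*15 = 30

4. Rotate right by 30 bits (equivalent to rotating left by 2 bits). That is, 0100 1011 shifted left by 2 bits gives 0001 0010 1100, i.e. 0x12c, which is 300 in decimal.

0x53000000 is computed the same way, so writing mov r1, #0x53000000 really is valid.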
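
Going the other way, one can ask which constants fit this format at all. The Python sketch below, assuming only the 4-bit rotate plus 8-bit immediate layout described above, tries all 16 rotate values and returns the first (rotate, imm8) pair that works, or None if there is none. The name encode_arm_immediate is hypothetical, and a real assembler is not guaranteed to pick the same pair when several are possible.

```python
def encode_arm_immediate(value):
    """Return (rotate, imm8) if value fits the 12-bit operand field, else None."""
    value &= 0xFFFFFFFF
    for rotate in range(16):
        shift = 2 * rotate
        # Rotating left by `shift` undoes the rotate-right the CPU applies.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return rotate, imm8
    return None


# Constants that appear in this article.
for constant in (0x12, 300, 0x53000000, 0xFC0, 256, 0x12340000, 0x55555555):
    print(hex(constant), "->", encode_arm_immediate(constant))
```

With this sketch, 300 and 0x53000000 come out as (15, 0x4b) and (4, 0x53), matching the f4b and 453 fields in the disassembly, while 0x12340000 and 0x55555555 have no encoding. So the limit is not the size of the number but whether its set bits fit in an 8-bit window at an even rotation.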

----------------------------------------------------------------------------------------------------------------------------------------------------

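Summary: I am still not certain of the exact range of immediates that mov can handle. But all of the analysis above, together with the few lines of test code, at least shows that on Linux, when assembling with arm-linux-as, mov can take 32-bit immediates such as 0x53000000 that fit the rotated 8-bit encoding analyzed above; otherwise vivi could not be built. (I suspect this may also depend on the particular assembler.)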