How the GNU Assembler uses labels

I’m trying to understand how labels are used by GNU as. In the exception vector table of a working program, I see:

_vectors:       ldr     PC, Reset_Addr
...
Reset_Addr:     .word   Reset_Handler

Another variation on the syntax, from another program:

_start: ldr     pc,  =reset
...
reset:   ...

I have been assuming so far that Reset_Addr is an absolute address, but according to the ARM System Developer’s Guide, LDR takes a [register, offset] operand. In fact, I have not found any documentation that supports the syntax above.

The “using as” documentation did not elaborate:

“The symbol then represents the current value of the active location counter, and is, for example, a suitable instruction operand.”

On the other hand, the ARM System Developer’s Guide has the following code snippet:

    LDR    r0, [pc, #constant_number-8-{PC}]
...
constant_number
    DCD 0xff00ffff

In this case, the label constant_number seems to be an absolute value.
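One way to read that snippet, as a minimal sketch in Python rather than assembler: in ARM state, reading pc gives the address of the current instruction plus 8, and {PC} in the expression is the address of that instruction, so the offset is built so that everything except the label's address cancels out. The function name and the addresses below are made up for illustration.

```python
def effective_address(instr_addr: int, label_addr: int) -> int:
    """Address accessed by: LDR r0, [pc, #label-8-{PC}] (ARM state)."""
    pc_read = instr_addr + 8               # value the instruction sees in pc
    offset = label_addr - 8 - instr_addr   # the "#constant_number-8-{PC}" expression
    return pc_read + offset                # the +8 and instr_addr terms cancel


# Hypothetical placement: instruction at 0x8000, constant_number at 0x8040.
assert effective_address(0x8000, 0x8040) == 0x8040
```

So the label itself stands for an address, and the "-8-{PC}" part only converts it into the pc-relative offset the instruction needs.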

Could someone please set me straight about using labels with LDR, and/or point me toward some practical as documentation?

That puzzled me also. I can’t say I truly have it straight…

Page 59 of the [manual at gnuarm](http://www.gnuarm.com/pdf/as.pdf) has an explanation of the version with the = prefix. Also, somewhere on the Arm site there is a publication, Arm: Assembly Language Programming by Peter Knaggs and Stephen Welsh of the University of Bournemouth, which has an example with a brief explanation on page 59 (printed) / 73 (.pdf file).

I didn’t find any documentation on the form without the =, but from experimentation I have concluded that, if you don’t specify a base address register, the assembler tries to use an instruction of the form

ldr rd, [pc, #nn]

If that fails, the instruction without an = gives an error:

uartag.a: Assembler messages:
uartag.a:63: Error: bad immediate value for offset (3758145528)
uartag.a:63: Error: internal_relocation (type: OFFSET_IMM) not fixed up

Without the = prefix, the assembler apparently won’t try a mov instruction or put the constant in the literal pool.

With the = prefix, the assembler treats the case where the argument is a label differently from the case where the argument is a constant defined with a .set directive. For a newcomer like me, this distinction can be subtle.

The case where the argument is a label is the one you described. In that case the assembler loads the contents of the address defined by the label into the destination register. The startup vectors disassemble as

00000000 <_startup>:
   0:	e59ff018 	ldr	pc, [pc, #24]	; 20 <Reset_Addr>
   4:	e59ff018 	ldr	pc, [pc, #24]	; 24 <Undef_Addr>
   8:	e59ff018 	ldr	pc, [pc, #24]	; 28 <SWI_Addr>
   c:	e59ff018 	ldr	pc, [pc, #24]	; 2c <PAbt_Addr>
  10:	e59ff018 	ldr	pc, [pc, #24]	; 30 <DAbt_Addr>
  14:	b8a06f58 	stmltia	r0!, {r3, r4, r6, r8, r9, sl, fp, sp, lr}
  18:	e59ff018 	ldr	pc, [pc, #24]	; 38 <IRQ_Addr>
  1c:	e59ff018 	ldr	pc, [pc, #24]	; 3c <FIQ_Addr>

but

.set U0RBR,           0x90000000        /* test */
            ldr   r1, =U0RBR             /* UART0 base                      */

produces

  70:	e3a01209 	mov	r1, #-1879048192	; 0x90000000

In this case, the assembler loaded the constant into the destination register. It did not try to load the value found at the address of the constant into the register.
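The encodings quoted above can be checked by hand. The following is a rough Python sketch of that arithmetic, assuming the standard ARM-state layouts: a 12-bit offset plus an add/subtract bit for the word load, and an 8-bit value rotated right by twice a 4-bit field for the mov immediate. The helper names are mine, not anything from the assembler.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by 'amount' bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def ldr_pc_relative_target(instr_addr: int, word: int) -> int:
    """Address read by an ARM-state 'ldr Rd, [pc, #imm12]' encoding."""
    imm12 = word & 0xFFF
    add = (word >> 23) & 1                 # U bit: 1 = add offset, 0 = subtract
    offset = imm12 if add else -imm12
    return instr_addr + 8 + offset         # pc reads as instruction address + 8


def mov_immediate_value(word: int) -> int:
    """Constant encoded in an ARM-state 'mov Rd, #imm' encoding."""
    imm8 = word & 0xFF
    rot = (word >> 8) & 0xF
    return ror32(imm8, 2 * rot)


def fits_ldr_offset(offset: int) -> bool:
    """True if the magnitude fits the 12-bit offset field of a word load."""
    return abs(offset) <= 0xFFF


assert ldr_pc_relative_target(0x00, 0xE59FF018) == 0x20   # Reset_Addr
assert ldr_pc_relative_target(0x1C, 0xE59FF018) == 0x3C   # FIQ_Addr
assert mov_immediate_value(0xE3A01209) == 0x90000000      # 0x09 rotated right by 4
assert not fits_ldr_offset(3758145528)                    # value from the error message
```

Every vector uses the same encoding because each address word sits the same distance (8 + 24 bytes) after its instruction, and the value in the error message is far too large for a 12-bit offset.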

The two forms do the same thing, except that =symbol automatically puts the reference in a literal pool. This can be an issue with the vectors, since there is no guarantee where the pool is placed; technically, it might end up outside the mappable vector region.
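To make the placement concern concrete, here is a small Python sketch. The 64-byte region size and the pool address 0x100 are assumptions for illustration only (the thread does not name the chip or its remap size); the word addresses 0x20 and 0x3c come from the disassembly quoted earlier.

```python
def word_inside_region(word_addr: int, region_start: int = 0x0,
                       region_size: int = 64) -> bool:
    """True if the whole 4-byte word lies inside the remappable region.

    region_size=64 is an assumed value, not something stated in the thread.
    """
    return region_start <= word_addr and word_addr + 4 <= region_start + region_size


# Explicit "Reset_Addr: .word Reset_Handler" style: the words sit right after the vectors.
assert word_inside_region(0x20)
assert word_inside_region(0x3C)

# "ldr pc, =reset" style: the pool word lands wherever the assembler dumps the pool.
assert not word_inside_region(0x100)   # hypothetical pool address
```

With the explicit .word table you choose where the address words go; with =symbol you would have to check the listing or the disassembly to see where the pool ended up.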

Thanks, both of you.

hsutherl, it’s exactly as you’ve said.

LDR Rd, label

places the value held at address {label} into Rd.

LDR Rd, =label

places the address of the label into Rd. The assembler may add the address to a literal pool to make it available.
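A toy Python model of the two forms, treating memory as a dictionary. All addresses are hypothetical, and the model only covers the case where the = form goes through a literal-pool word.

```python
RESET_HANDLER = 0x00000040   # hypothetical address where the handler code starts
RESET_ADDR = 0x00000020      # hypothetical address of "Reset_Addr: .word Reset_Handler"
POOL_ADDR = 0x00000100       # hypothetical address of a literal-pool word

memory = {RESET_ADDR: RESET_HANDLER}


def ldr_label(mem: dict, label_addr: int) -> int:
    """LDR Rd, label: Rd receives the word stored at the label."""
    return mem[label_addr]


def ldr_equals_label(mem: dict, label_addr: int, pool_addr: int) -> int:
    """LDR Rd, =label: Rd receives the label's address, fetched from a pool word."""
    mem[pool_addr] = label_addr        # the assembler fills this word in at build time
    return ldr_label(mem, pool_addr)   # at run time it is an ordinary pc-relative load


via_table = ldr_label(memory, RESET_ADDR)                       # ldr pc, Reset_Addr
via_pool = ldr_equals_label(memory, RESET_HANDLER, POOL_ADDR)   # ldr pc, =Reset_Handler
assert via_table == via_pool == RESET_HANDLER
```

Both routes end with the handler's address in the register; the difference is who places the word that holds it.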

There’s a reasonable explanation in ARM Assembly Language - an Introduction by J. R. Gibson, and I’ve checked the disassembly as well.