So the lw instruction is in the following format: lw RegDest, Offset(RegSource). Why does the second argument take in both an offset and register source? Why not only one (i.e. only register source)?
Because what else are you going to do with the rest of the 32-bit instruction word? (Assuming you're the CPU architect designing the MIPS instruction set).
It lets LUI + LW load from any arbitrary 32-bit address in 2 instructions instead of 3. It also avoids ADDIU instructions for pointer math in unrolled loops and in struct pointer->member access. In other words, spending that much coding space on LW/SW lets MIPS programs be more efficient. Sometimes you only need 0($reg), but other times it would be a waste of instructions to compute the final address in a register first.
Leaving out the 16-bit immediate displacement can't make the instruction shorter. MIPS is a RISC with fixed-length instruction words. (It could be R-type instead of I-type, but you'd still have unused bits in that format. Classic MIPS had lots of unused coding space, and spending coding space on LW/SW, LB/LBU/SB, and so on, is worth it.)
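To make the I-type layout concrete, here is a minimal C sketch that packs the fields of an lw into a 32-bit instruction word. The helper name encode_lw is made up for illustration; the field positions and the lw opcode (0x23) are the standard MIPS32 encoding as I understand it. The point is that the 16-bit displacement fills bits that would otherwise have nothing to hold.

```c
#include <assert.h>
#include <stdint.h>

/* MIPS I-type layout: opcode[31:26] rs[25:21] rt[20:16] imm16[15:0]. */
enum { OPCODE_LW = 0x23 };  /* major opcode for lw */

/* Encode "lw rt, imm(rs)". Hypothetical helper, not from any real toolchain. */
static uint32_t encode_lw(unsigned rt, int16_t imm, unsigned rs)
{
    assert(rt < 32 && rs < 32);           /* register fields are 5 bits each */
    return ((uint32_t)OPCODE_LW << 26)    /* bits 31..26: opcode */
         | ((uint32_t)rs << 21)           /* bits 25..21: base register */
         | ((uint32_t)rt << 16)           /* bits 20..16: destination register */
         | (uint16_t)imm;                 /* bits 15..0: displacement */
}
```

Under these assumptions, encode_lw(9, 4, 8), i.e. lw $t1, 4($t0), produces 0x8D090004. Dropping the displacement would not shrink the word; it would only leave the low 16 bits unused.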
MIPS doesn't have a lot of different opcodes (especially classic MIPS without any FPU instructions, and without 64-bit instructions). It uses a lot of the instruction coding space to support an immediate form for most instructions, with a large immediate. ARM32, by contrast, uses 4 bits in each instruction for predicated execution, and more bits for a "flexible" source operand (an optional rotate or shift by a constant or another register, or an immediate constant). ARM immediates are encoded as 8 bits with a rotation, allowing lots of useful bit patterns that are common in real life.
MIPS only has one addressing mode, and imm16(reg) can save a significant number of addiu instructions vs. just (reg).
For example, consider a C function that loads or stores to a static (or global) variable. Like
unsigned rng(void) {
static unsigned seed = 1234;
return (seed = seed * 5678 + 0x1234);
}
The compiler-generated (or hand-written) asm needs to load from and store to seed, so you need its address in a register. But that address is a 32-bit constant that doesn't fit in a single instruction. In hand-written asm you'd probably use a pseudo-instruction like la $t0, rng.seed, which will assemble to lui $t0, hi(rng.seed) / ori $t0, $t0, lo(rng.seed). (hi and lo each get one 16-bit half of the 32-bit address.)
But you can do better than that:
lui $t0, hi(rng.seed)
lw $t1, lo(rng.seed)($t0)
i.e. use the low 16 bits of the address as the 16-bit displacement in the load instruction. This is in fact what compilers like gcc do:
rng: # gcc5.4 -O3
lui $5,%hi(seed.1482)
lw $4,%lo(seed.1482)($5)
nop # classic MIPS has a 1-cycle "shadow" for loads before the result is usable, with no pipeline interlock
sll $3,$4,5 # I should have picked a simpler multiply constant (with fewer bits set)
sll $2,$4,3
subu $2,$3,$2
sll $3,$2,3
subu $2,$3,$2
subu $2,$2,$4
sll $3,$2,4
addu $2,$2,$3
sll $2,$2,1
addiu $2,$2,4660
j $31
sw $2,%lo(seed.1482)($5) # branch-delay slot
seed.1482:
.word 1234
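One detail the lui + lw pairing depends on: lw sign-extends its 16-bit displacement, whereas ori zero-extends its immediate. So when the low half of the address has bit 15 set, the high half used with lw has to be one larger to compensate. The C sketch below models that arithmetic. All function names are hypothetical, and the rounding rule is what I understand the assembler's %hi to do rather than something quoted from its documentation.

```c
#include <stdint.h>

/* Address computed by lw/sw: base register plus sign-extended imm16.
   The uint16_t -> int16_t cast assumes the usual two's-complement wraparound. */
static uint32_t effective_address(uint32_t base, uint16_t imm16)
{
    return base + (uint32_t)(int32_t)(int16_t)imm16;
}

/* Low half of the address, used directly as the displacement field. */
static uint16_t lo16(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}

/* High half for lui, bumped by one when lo16 will be sign-extended negative. */
static uint16_t hi16(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Value that "lui reg, imm16" leaves in the register. */
static uint32_t lui(uint16_t imm16)
{
    return (uint32_t)imm16 << 16;
}
```

For any 32-bit addr, effective_address(lui(hi16(addr)), lo16(addr)) gives back addr, which is why two instructions are enough to reach any static variable.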
There are lots of other uses for small immediate displacements from a register. For example:
struct fields
As I said, there isn't much else you could do with those extra 16 bits of the instruction word that would be a good fit for MIPS. You could leave fewer than 16 bits for the displacement, but MIPS isn't PowerPC (where there are lots and lots of opcodes).
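For the struct-field case mentioned above, a small C sketch shows where the displacement comes from. The struct vec3 type and the helper names are invented for this example. With fixed-width 4-byte members, the offsets are small compile-time constants, so a compiler targeting MIPS can typically fold them into the imm16 field of a single lw instead of emitting an addiu first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example type: three 4-byte members, no padding expected. */
struct vec3 { int32_t x, y, z; };

/* Byte offsets of the members: the constants that become displacements. */
static size_t offset_of_y(void) { return offsetof(struct vec3, y); }
static size_t offset_of_z(void) { return offsetof(struct vec3, z); }

/* p->z means "load 4 bytes from p + 8". On MIPS that would typically be
   one instruction of the form lw $2,8($4), with no separate pointer math. */
static int32_t get_z(const struct vec3 *p)
{
    return p->z;
}
```

With only a (reg) addressing mode, every access to y or z would need an extra addiu to form the member's address in a register.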