Commit a6b204b (parent ae2e66b)

[lld][AArch64] Fix handling of SHT_REL relocation addends. (#98291)
Normally, AArch64 ELF objects use the SHT_RELA type of relocation section, with addends stored in each relocation. But some legacy AArch64 object producers still use SHT_REL in some situations, storing the addend in the initial value of the data item or instruction immediate field that the relocation will modify. LLD was mishandling relocations of this type in multiple ways.

Firstly, many of the cases in the `getImplicitAddend` switch statement were apparently based on a misunderstanding. The relocation types that operate on instructions should be expecting to find an instruction of the appropriate type, and should extract its immediate field. But many of them were instead behaving as if they expected to find a raw 64-, 32- or 16-bit value, and wanted to extract the right range of bits. For example, the relocation for R_AARCH64_ADD_ABS_LO12_NC read a 16-bit word and extracted its bottom 12 bits, presumably on the thinking that the relocation writes the low 12 bits of the value it computes. But the input addend for SHT_REL purposes occupies the immediate field of an AArch64 ADD instruction, which means it should have been reading a 32-bit AArch64 instruction encoding and extracting bits 10-21, where the immediate field lives. Worse, the R_AARCH64_MOVW_UABS_G2 relocation was reading 64 bits from the input section, and since it's only relocating a 32-bit instruction, the second half of those bits would have been completely unrelated!

Adding to that confusion, most of the values being read were first sign-extended, and _then_ had a range of bits extracted, which doesn't make much sense. They should have first extracted some bits from the instruction encoding, and then sign-extended that 12-, 19-, or 21-bit result (or whatever else) to a full 64-bit value.

Secondly, after the relocated value was computed, in most cases it was being written into the target instruction field via a bitwise OR operation. This meant that if the instruction field didn't initially contain all zeroes, the wrong result would end up in it. That's not even a 100% reliable strategy for SHT_RELA, which in some situations is used for its repeatability (in the sense that applying the relocation twice should cause the second answer to overwrite the first, so that you can relocate an image in advance to its most likely address and then do it again at load time if that address turns out not to be available). But for SHT_REL, where nonzero immediate fields are expected in normal use, it couldn't possibly work. You could see the effect of this in the existing test, which had a lot of FFFFFF in the expected output for which there wasn't any plausible justification.

Finally, one relocation type was actually missing: there was no support for R_AARCH64_ADR_PREL_LO21 at all.

So I've rewritten most of the cases in `getImplicitAddend`; replaced the bitwise ORs with overwrites; and replaced the previous test with a much more thorough one, obtained by writing an input assembly file with explicitly specified relocations on instructions that also have carefully selected immediate fields, and then doing some yaml2obj seddery to turn the RELA relocation section into a REL one.
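A rough standalone sketch of the corrected extraction order described above: read the 32-bit instruction encoding, extract the immediate field, and only then sign-extend the field-sized result. The helper names (readInstr32, bits, signExtend) and the hand-encoded instructions are invented for illustration; lld itself uses read32, getBits and SignExtend64, as in the diff below.

// Standalone sketch, not lld's code: the corrected order of operations for
// reading an SHT_REL implicit addend is (1) read the 32-bit instruction,
// (2) extract its immediate field, (3) sign-extend that field-sized value.
#include <cassert>
#include <cstdint>
#include <cstring>

// Invented helpers standing in for lld's read32, getBits and SignExtend64.
static uint32_t readInstr32(const uint8_t *buf) {
  uint32_t insn;
  std::memcpy(&insn, buf, sizeof(insn));
  return insn;
}
static uint64_t bits(uint64_t v, unsigned lo, unsigned hi) {
  return (v >> lo) & ((1ULL << (hi - lo + 1)) - 1);
}
static int64_t signExtend(uint64_t v, unsigned width) {
  uint64_t signBit = 1ULL << (width - 1);
  return (int64_t)((v ^ signBit) - signBit);
}

// R_AARCH64_ADD_ABS_LO12_NC: the addend sits in bits 10-21 of an
// ADD (immediate) instruction, a 12-bit field.
static int64_t addendForAddAbsLo12(const uint8_t *buf) {
  return signExtend(bits(readInstr32(buf), 10, 21), 12);
}

// R_AARCH64_MOVW_UABS_G*: the addend sits in the 16-bit immediate field
// (bits 5-20) of a MOVZ/MOVK, with no per-relocation shift applied.
static int64_t addendForMovwUabs(const uint8_t *buf) {
  return signExtend(bits(readInstr32(buf), 5, 20), 16);
}

int main() {
  // Hand-encoded "add x0, x0, #0x123": base opcode 0x91000000, imm12 << 10.
  uint32_t add = 0x91000000u | (0x123u << 10);
  uint8_t buf[4];
  std::memcpy(buf, &add, sizeof(add));
  assert(addendForAddAbsLo12(buf) == 0x123);

  // Hand-encoded "movz x0, #0x2345": base opcode 0xd2800000, imm16 << 5.
  uint32_t movz = 0xd2800000u | (0x2345u << 5);
  std::memcpy(buf, &movz, sizeof(movz));
  assert(addendForMovwUabs(buf) == 0x2345);
  return 0;
}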

File tree

2 files changed (+399, -125 lines)
lld/ELF/Arch/AArch64.cpp

Lines changed: 83 additions & 28 deletions
@@ -239,30 +239,81 @@ int64_t AArch64::getImplicitAddend(const uint8_t *buf, RelType type) const {
   case R_AARCH64_IRELATIVE:
   case R_AARCH64_TLS_TPREL64:
     return read64(buf);
+
+  // The following relocation types all point at instructions, and
+  // relocate an immediate field in the instruction.
+  //
+  // The general rule, from AAELF64 §5.7.2 "Addends and PC-bias",
+  // says: "If the relocation relocates an instruction the immediate
+  // field of the instruction is extracted, scaled as required by
+  // the instruction field encoding, and sign-extended to 64 bits".
+
+  // The R_AARCH64_MOVW family operates on wide MOV/MOVK/MOVZ
+  // instructions, which have a 16-bit immediate field with its low
+  // bit in bit 5 of the instruction encoding. When the immediate
+  // field is used as an implicit addend for REL-type relocations,
+  // it is treated as added to the low bits of the output value, not
+  // shifted depending on the relocation type.
+  //
+  // This allows REL relocations to express the requirement 'please
+  // add 12345 to this symbol value and give me the four 16-bit
+  // chunks of the result', by putting the same addend 12345 in all
+  // four instructions. Carries between the 16-bit chunks are
+  // handled correctly, because the whole 64-bit addition is done
+  // once per relocation.
   case R_AARCH64_MOVW_UABS_G0:
   case R_AARCH64_MOVW_UABS_G0_NC:
-    return getBits(SignExtend64<16>(read16(buf)), 0, 15);
   case R_AARCH64_MOVW_UABS_G1:
   case R_AARCH64_MOVW_UABS_G1_NC:
-    return getBits(SignExtend64<32>(read32(buf)), 16, 31);
   case R_AARCH64_MOVW_UABS_G2:
   case R_AARCH64_MOVW_UABS_G2_NC:
-    return getBits(read64(buf), 32, 47);
   case R_AARCH64_MOVW_UABS_G3:
-    return getBits(read64(buf), 48, 63);
+    return SignExtend64<16>(getBits(read32(buf), 5, 20));
+
+  // R_AARCH64_TSTBR14 points at a TBZ or TBNZ instruction, which
+  // has a 14-bit offset measured in instructions, i.e. shifted left
+  // by 2.
   case R_AARCH64_TSTBR14:
-    return getBits(SignExtend64<32>(read32(buf)), 2, 15);
+    return SignExtend64<16>(getBits(read32(buf), 5, 18) << 2);
+
+  // R_AARCH64_CONDBR19 operates on the ordinary B.cond instruction,
+  // which has a 19-bit offset measured in instructions.
+  //
+  // R_AARCH64_LD_PREL_LO19 operates on the LDR (literal)
+  // instruction, which also has a 19-bit offset, measured in 4-byte
+  // chunks. So the calculation is the same as for
+  // R_AARCH64_CONDBR19.
   case R_AARCH64_CONDBR19:
   case R_AARCH64_LD_PREL_LO19:
-    return getBits(SignExtend64<32>(read32(buf)), 2, 20);
+    return SignExtend64<21>(getBits(read32(buf), 5, 23) << 2);
+
+  // R_AARCH64_ADD_ABS_LO12_NC operates on ADD (immediate). The
+  // immediate can optionally be shifted left by 12 bits, but this
+  // relocation is intended for the case where it is not.
   case R_AARCH64_ADD_ABS_LO12_NC:
-    return getBits(SignExtend64<16>(read16(buf)), 0, 11);
+    return SignExtend64<12>(getBits(read32(buf), 10, 21));
+
+  // R_AARCH64_ADR_PREL_LO21 operates on an ADR instruction, whose
+  // 21-bit immediate is split between two bits high up in the word
+  // (in fact the two _lowest_ order bits of the value) and 19 bits
+  // lower down.
+  //
+  // R_AARCH64_ADR_PREL_PG_HI21[_NC] operate on an ADRP instruction,
+  // which encodes the immediate in the same way, but will shift it
+  // left by 12 bits when the instruction executes. For the same
+  // reason as the MOVW family, we don't apply that left shift here.
+  case R_AARCH64_ADR_PREL_LO21:
   case R_AARCH64_ADR_PREL_PG_HI21:
   case R_AARCH64_ADR_PREL_PG_HI21_NC:
-    return getBits(SignExtend64<32>(read32(buf)), 12, 32);
+    return SignExtend64<21>((getBits(read32(buf), 5, 23) << 2) |
+                            getBits(read32(buf), 29, 30));
+
+  // R_AARCH64_{JUMP,CALL}26 operate on B and BL, which have a
+  // 26-bit offset measured in instructions.
   case R_AARCH64_JUMP26:
   case R_AARCH64_CALL26:
-    return getBits(SignExtend64<32>(read32(buf)), 2, 27);
+    return SignExtend64<28>(getBits(read32(buf), 0, 25) << 2);
+
   default:
     internalLinkerError(getErrorLocation(buf),
                         "cannot read addend for relocation " + toString(type));
@@ -366,11 +417,13 @@ static void write32AArch64Addr(uint8_t *l, uint64_t imm) {
   write32le(l, (read32le(l) & ~mask) | immLo | immHi);
 }
 
-static void or32le(uint8_t *p, int32_t v) { write32le(p, read32le(p) | v); }
+static void writeMaskedBits32le(uint8_t *p, int32_t v, uint32_t mask) {
+  write32le(p, (read32le(p) & ~mask) | v);
+}
 
 // Update the immediate field in a AARCH64 ldr, str, and add instruction.
-static void or32AArch64Imm(uint8_t *l, uint64_t imm) {
-  or32le(l, (imm & 0xFFF) << 10);
+static void write32Imm12(uint8_t *l, uint64_t imm) {
+  writeMaskedBits32le(l, (imm & 0xFFF) << 10, 0xFFF << 10);
 }
 
 // Update the immediate field in an AArch64 movk, movn or movz instruction
@@ -443,7 +496,7 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     write32(loc, val);
     break;
   case R_AARCH64_ADD_ABS_LO12_NC:
-    or32AArch64Imm(loc, val);
+    write32Imm12(loc, val);
     break;
   case R_AARCH64_ADR_GOT_PAGE:
   case R_AARCH64_ADR_PREL_PG_HI21:
@@ -470,66 +523,68 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     [[fallthrough]];
   case R_AARCH64_CALL26:
     checkInt(loc, val, 28, rel);
-    or32le(loc, (val & 0x0FFFFFFC) >> 2);
+    writeMaskedBits32le(loc, (val & 0x0FFFFFFC) >> 2, 0x0FFFFFFC >> 2);
     break;
   case R_AARCH64_CONDBR19:
   case R_AARCH64_LD_PREL_LO19:
   case R_AARCH64_GOT_LD_PREL19:
     checkAlignment(loc, val, 4, rel);
     checkInt(loc, val, 21, rel);
-    or32le(loc, (val & 0x1FFFFC) << 3);
+    writeMaskedBits32le(loc, (val & 0x1FFFFC) << 3, 0x1FFFFC << 3);
     break;
   case R_AARCH64_LDST8_ABS_LO12_NC:
   case R_AARCH64_TLSLE_LDST8_TPREL_LO12_NC:
-    or32AArch64Imm(loc, getBits(val, 0, 11));
+    write32Imm12(loc, getBits(val, 0, 11));
     break;
   case R_AARCH64_LDST16_ABS_LO12_NC:
   case R_AARCH64_TLSLE_LDST16_TPREL_LO12_NC:
     checkAlignment(loc, val, 2, rel);
-    or32AArch64Imm(loc, getBits(val, 1, 11));
+    write32Imm12(loc, getBits(val, 1, 11));
     break;
   case R_AARCH64_LDST32_ABS_LO12_NC:
   case R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC:
     checkAlignment(loc, val, 4, rel);
-    or32AArch64Imm(loc, getBits(val, 2, 11));
+    write32Imm12(loc, getBits(val, 2, 11));
     break;
   case R_AARCH64_LDST64_ABS_LO12_NC:
   case R_AARCH64_LD64_GOT_LO12_NC:
   case R_AARCH64_TLSIE_LD64_GOTTPREL_LO12_NC:
   case R_AARCH64_TLSLE_LDST64_TPREL_LO12_NC:
   case R_AARCH64_TLSDESC_LD64_LO12:
     checkAlignment(loc, val, 8, rel);
-    or32AArch64Imm(loc, getBits(val, 3, 11));
+    write32Imm12(loc, getBits(val, 3, 11));
     break;
   case R_AARCH64_LDST128_ABS_LO12_NC:
   case R_AARCH64_TLSLE_LDST128_TPREL_LO12_NC:
     checkAlignment(loc, val, 16, rel);
-    or32AArch64Imm(loc, getBits(val, 4, 11));
+    write32Imm12(loc, getBits(val, 4, 11));
     break;
   case R_AARCH64_LD64_GOTPAGE_LO15:
     checkAlignment(loc, val, 8, rel);
-    or32AArch64Imm(loc, getBits(val, 3, 14));
+    write32Imm12(loc, getBits(val, 3, 14));
     break;
   case R_AARCH64_MOVW_UABS_G0:
     checkUInt(loc, val, 16, rel);
     [[fallthrough]];
   case R_AARCH64_MOVW_UABS_G0_NC:
-    or32le(loc, (val & 0xFFFF) << 5);
+    writeMaskedBits32le(loc, (val & 0xFFFF) << 5, 0xFFFF << 5);
     break;
   case R_AARCH64_MOVW_UABS_G1:
     checkUInt(loc, val, 32, rel);
     [[fallthrough]];
   case R_AARCH64_MOVW_UABS_G1_NC:
-    or32le(loc, (val & 0xFFFF0000) >> 11);
+    writeMaskedBits32le(loc, (val & 0xFFFF0000) >> 11, 0xFFFF0000 >> 11);
     break;
   case R_AARCH64_MOVW_UABS_G2:
     checkUInt(loc, val, 48, rel);
     [[fallthrough]];
   case R_AARCH64_MOVW_UABS_G2_NC:
-    or32le(loc, (val & 0xFFFF00000000) >> 27);
+    writeMaskedBits32le(loc, (val & 0xFFFF00000000) >> 27,
+                        0xFFFF00000000 >> 27);
     break;
   case R_AARCH64_MOVW_UABS_G3:
-    or32le(loc, (val & 0xFFFF000000000000) >> 43);
+    writeMaskedBits32le(loc, (val & 0xFFFF000000000000) >> 43,
+                        0xFFFF000000000000 >> 43);
     break;
   case R_AARCH64_MOVW_PREL_G0:
   case R_AARCH64_MOVW_SABS_G0:
@@ -562,15 +617,15 @@ void AArch64::relocate(uint8_t *loc, const Relocation &rel,
     break;
   case R_AARCH64_TSTBR14:
     checkInt(loc, val, 16, rel);
-    or32le(loc, (val & 0xFFFC) << 3);
+    writeMaskedBits32le(loc, (val & 0xFFFC) << 3, 0xFFFC << 3);
     break;
   case R_AARCH64_TLSLE_ADD_TPREL_HI12:
     checkUInt(loc, val, 24, rel);
-    or32AArch64Imm(loc, val >> 12);
+    write32Imm12(loc, val >> 12);
     break;
   case R_AARCH64_TLSLE_ADD_TPREL_LO12_NC:
   case R_AARCH64_TLSDESC_ADD_LO12:
-    or32AArch64Imm(loc, val);
+    write32Imm12(loc, val);
     break;
   case R_AARCH64_TLSDESC:
     // For R_AARCH64_TLSDESC the addend is stored in the second 64-bit word.
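All of the `relocate` changes above follow one pattern: the old or32le/or32AArch64Imm helpers OR'd the computed value into the instruction, while the new writeMaskedBits32le/write32Imm12 helpers clear the target field before depositing it. A minimal standalone sketch of the difference; load32le and store32le are invented stand-ins for LLVM's read32le/write32le, and the instruction encoding is hand-made for the example.

// Minimal sketch, not lld's code: contrast OR-style patching with the
// masked overwrite that the patch introduces.
#include <cstdint>
#include <cstdio>
#include <cstring>

static uint32_t load32le(const uint8_t *p) {
  uint32_t v;
  std::memcpy(&v, p, sizeof(v));
  return v;
}
static void store32le(uint8_t *p, uint32_t v) { std::memcpy(p, &v, sizeof(v)); }

// Old behaviour: correct only if the target field already holds zeroes.
static void orBits(uint8_t *p, uint32_t v) { store32le(p, load32le(p) | v); }

// New behaviour: clear the field under the mask, then deposit the new value,
// so a pre-existing SHT_REL implicit addend cannot leak into the result.
static void writeMaskedBits(uint8_t *p, uint32_t v, uint32_t mask) {
  store32le(p, (load32le(p) & ~mask) | v);
}

int main() {
  // Hand-encoded "add x0, x0, #0x123": the #0x123 is the SHT_REL implicit
  // addend sitting in bits 10-21 of the instruction.
  uint8_t insn[4];
  store32le(insn, 0x91000000u | (0x123u << 10));
  uint32_t relocated = 0x456; // made-up value computed for the relocation

  uint8_t orPatched[4];
  std::memcpy(orPatched, insn, sizeof(insn));
  orBits(orPatched, (relocated & 0xFFF) << 10);                   // immediate ends up 0x123|0x456 = 0x577
  writeMaskedBits(insn, (relocated & 0xFFF) << 10, 0xFFFu << 10); // immediate ends up 0x456

  std::printf("OR'd (wrong):        0x%08x\n", load32le(orPatched));
  std::printf("overwritten (right): 0x%08x\n", load32le(insn));
  return 0;
}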
