Commit f0cd709

pa1gupta authored and hansendc committed
x86/its: Align RETs in BHB clear sequence to avoid thunking
The software mitigation for BHI is to execute the BHB clear sequence at syscall entry, and possibly after a cBPF program. The ITS mitigation thunks RETs in the lower half of the cacheline. This causes the RETs in the BHB clear sequence to be thunked as well, adding unnecessary branches to the BHB clear sequence.

Since the sequence is in the hot path, align the RET instructions in the sequence to avoid thunking.

This is how the disassembly of clear_bhb_loop() looks after this change:

   0x44 <+4>:     mov    $0x5,%ecx
   0x49 <+9>:     call   0xffffffff81001d9b <clear_bhb_loop+91>
   0x4e <+14>:    jmp    0xffffffff81001de5 <clear_bhb_loop+165>
   0x53 <+19>:    int3
   ...
   0x9b <+91>:    call   0xffffffff81001dce <clear_bhb_loop+142>
   0xa0 <+96>:    ret
   0xa1 <+97>:    int3
   ...
   0xce <+142>:   mov    $0x5,%eax
   0xd3 <+147>:   jmp    0xffffffff81001dd6 <clear_bhb_loop+150>
   0xd5 <+149>:   nop
   0xd6 <+150>:   sub    $0x1,%eax
   0xd9 <+153>:   jne    0xffffffff81001dd3 <clear_bhb_loop+147>
   0xdb <+155>:   sub    $0x1,%ecx
   0xde <+158>:   jne    0xffffffff81001d9b <clear_bhb_loop+91>
   0xe0 <+160>:   ret
   0xe1 <+161>:   int3
   0xe2 <+162>:   int3
   0xe3 <+163>:   int3
   0xe4 <+164>:   int3
   0xe5 <+165>:   lfence
   0xe8 <+168>:   pop    %rbp
   0xe9 <+169>:   ret

Suggested-by: Andrew Cooper <[email protected]>
Signed-off-by: Pawan Gupta <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Alexandre Chartre <[email protected]>
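[Editor's note, not part of the original commit message: a quick cross-check of the claim above. The offset of an instruction within its 64-byte cacheline is just the low six bits of its address, and ITS only thunks RETs whose offset is in the lower half (offset < 32). All three RETs in the disassembly land in the upper half:

   ret at ...a0: 0xa0 mod 64 = 32  ->  upper half, not thunked
   ret at ...e0: 0xe0 mod 64 = 32  ->  upper half, not thunked
   ret at ...e9: 0xe9 mod 64 = 41  ->  upper half, not thunked]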
1 parent facd226 commit f0cd709

File tree

1 file changed: 17 additions, 3 deletions


arch/x86/entry/entry_64.S

Lines changed: 17 additions & 3 deletions
@@ -1525,7 +1525,9 @@ SYM_CODE_END(rewind_stack_and_make_dead)
  * ORC to unwind properly.
  *
  * The alignment is for performance and not for safety, and may be safely
- * refactored in the future if needed.
+ * refactored in the future if needed. The .skips are for safety, to ensure
+ * that all RETs are in the second half of a cacheline to mitigate Indirect
+ * Target Selection, rather than taking the slowpath via its_return_thunk.
  */
 SYM_FUNC_START(clear_bhb_loop)
 	ANNOTATE_NOENDBR
@@ -1536,18 +1538,30 @@ SYM_FUNC_START(clear_bhb_loop)
 	call	1f
 	jmp	5f
 	.align 64, 0xcc
+	/*
+	 * Shift instructions so that the RET is in the upper half of the
+	 * cacheline and don't take the slowpath to its_return_thunk.
+	 */
+	.skip 32 - (.Lret1 - 1f), 0xcc
 	ANNOTATE_INTRA_FUNCTION_CALL
 1:	call	2f
-	RET
+.Lret1:	RET
 	.align 64, 0xcc
+	/*
+	 * As above shift instructions for RET at .Lret2 as well.
+	 *
+	 * This should be ideally be: .skip 32 - (.Lret2 - 2f), 0xcc
+	 * but some Clang versions (e.g. 18) don't like this.
+	 */
+	.skip 32 - 18, 0xcc
 2:	movl	$5, %eax
 3:	jmp	4f
 	nop
 4:	sub	$1, %eax
 	jnz	3b
 	sub	$1, %ecx
 	jnz	1b
-	RET
+.Lret2:	RET
 5:	lfence
 	pop	%rbp
 	RET
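[Editor's note: below is a minimal standalone sketch of the same padding idiom, under a hypothetical file name; it is not kernel code. After a 64-byte alignment, the code is padded with int3 (0xcc) by 32 bytes minus the size of the code that precedes the RET, so the RET itself lands at offset 32 or later in its cacheline. It is written for GNU as; as the patch comment notes, some Clang integrated-assembler versions reject the label-difference form of .skip.]

	/* its_align_demo.S -- hypothetical sketch, not kernel code.
	 * Same idea as the first .skip above: place the RET in the
	 * upper half (offset >= 32) of a 64-byte cacheline.
	 */
	.text
	.globl	demo
demo:
	.align	64, 0xcc		/* start on a cacheline boundary, pad with int3 */
	.skip	32 - (.Lret - 1f), 0xcc	/* pad so .Lret lands at offset 32 */
1:	xorl	%eax, %eax		/* 2 bytes of work just before the RET */
.Lret:	ret				/* offset 32: upper half, no its_return_thunk */

[Assembling this and inspecting it with objdump -d should show the ret at an address whose low six bits are 0x20, the same check applied to the disassembly in the commit message above.]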
