Commit 8e6bd58

Merge pull request #4202 from mstorsjo/arm64-unwind-pac
Document ARM64 unwind handling for return address signing
2 parents 6b1e5f6 + 4547444 commit 8e6bd58

docs/build/arm64-exception-handling.md

Lines changed: 21 additions & 17 deletions
@@ -1,7 +1,7 @@
11
---
22
title: "ARM64 exception handling"
33
description: Describes the exception handling conventions and data used by Windows on ARM64.
4-
ms.date: 04/07/2022
4+
ms.date: 10/10/2022
55
---
66
# ARM64 exception handling
77

@@ -313,6 +313,7 @@ The unwind codes are encoded according to the table below. All unwind codes are
313313
| | 11101010: Custom stack for MSFT_OP_CONTEXT |
314314
| | 11101100: Custom stack for MSFT_OP_CLEAR_UNWOUND_TO_CALL |
315315
| | 1111xxxx: reserved |
316+
| `pac_sign_return_address` | 11111100: sign the return address in `lr` with `pacibsp` |
316317

317318
In instructions with large values covering multiple bytes, the most significant bits are stored first. This design makes it possible to find the total size in bytes of the unwind code by looking up only the first byte of the code. Since each unwind code is exactly mapped to an instruction in a prolog or epilog, you can compute the size of the prolog or epilog. Walk from the sequence start to the end, and use a lookup table or similar device to determine the length of the corresponding opcode.
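The first-byte lookup described above can be sketched as follows. This is a minimal illustration, not the unwinder's implementation: the function names are hypothetical, and the table covers only a handful of codes. The prefixes for `alloc_s`, `save_fplr`, `save_fplr_x`, `alloc_m`, `alloc_l`, `set_fp`, `nop`, and `end` are taken from the full encoding table, which this hunk doesn't show, so treat them as assumptions here.

```python
def unwind_code_size(first_byte: int) -> int:
    """Return the total size in bytes of the unwind code that starts with first_byte.

    Partial table for illustration; codes not listed raise ValueError.
    """
    if first_byte >> 5 == 0b000:            # 000xxxxx: alloc_s
        return 1
    if first_byte >> 6 in (0b01, 0b10):     # 01zzzzzz: save_fplr, 10zzzzzz: save_fplr_x
        return 1
    if first_byte >> 3 == 0b11000:          # 11000xxx'xxxxxxxx: alloc_m
        return 2
    if first_byte == 0b11100000:            # alloc_l, followed by a 24-bit value
        return 4
    if first_byte in (0xE1, 0xE3, 0xE4):    # set_fp, nop, end
        return 1
    if first_byte == 0b11111100:            # 11111100: sign the return address in lr
        return 1
    raise ValueError(f"unwind code 0x{first_byte:02x} is not covered by this sketch")


def prolog_code_sizes(codes: bytes) -> list[int]:
    """Walk from the sequence start to the end, reading only each code's first byte."""
    sizes = []
    i = 0
    while i < len(codes):
        size = unwind_code_size(codes[i])
        sizes.append(size)
        i += size                           # skip the operand bytes of multi-byte codes
    return sizes
```

For example, `prolog_code_sizes(bytes([0xFC, 0xC0, 0x10, 0x81, 0xE1, 0xE4]))` steps over the second byte of the `alloc_m` code without interpreting it as an opcode.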
318319

@@ -343,9 +344,9 @@ The fields are as follows:
343344
- **Function Length** is an 11-bit field providing the length of the entire function in bytes, divided by 4. If the function is larger than 8k, a full `.xdata` record must be used instead.
344345
- **Frame Size** is a 9-bit field indicating the number of bytes of stack that is allocated for this function, divided by 16. Functions that allocate greater than (8k-16) bytes of stack must use a full `.xdata` record. It includes the local variable area, outgoing parameter area, callee-saved Int and FP area, and home parameter area. It excludes the dynamic allocation area.
345346
- **CR** is a 2-bit flag indicating whether the function includes extra instructions to set up a frame chain and return link:
346-
- 00 = unchained function, \<x29,lr> pair isn't saved in stack.
347+
- 00 = unchained function, \<x29,lr> pair isn't saved in stack
347348
- 01 = unchained function, \<lr> is saved in stack
348-
- 10 = reserved;
349+
- 10 = chained function with signed return address
349350
- 11 = chained function, a store/load pair instruction is used in prolog/epilog \<x29,lr>
350351
- **H** is a 1-bit flag indicating whether the function homes the integer parameter registers (x0-x7) by storing them at the very start of the function. (0 = doesn't home registers, 1 = homes registers).
351352
- **RegI** is a 4-bit field indicating the number of non-volatile INT registers (x19-x28) saved in the canonical stack location.
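As a rough sketch of the scaling rules above, the following converts raw **Function Length** and **Frame Size** field values to bytes and lists the **CR** meanings after this change. The names `scaled_sizes` and `CR_MEANINGS` are hypothetical, and the function takes already-extracted field values, because the bit positions of the fields aren't shown in this hunk.

```python
# CR flag meanings as listed in the updated text (10 was previously reserved).
CR_MEANINGS = {
    0b00: "unchained function, <x29,lr> pair isn't saved in stack",
    0b01: "unchained function, <lr> is saved in stack",
    0b10: "chained function with signed return address",
    0b11: "chained function, a store/load pair instruction is used in prolog/epilog <x29,lr>",
}


def scaled_sizes(function_length_field: int, frame_size_field: int) -> tuple[int, int]:
    """Convert the raw packed fields to (function length, frame size) in bytes."""
    if not 0 <= function_length_field < (1 << 11):
        raise ValueError("Function Length is an 11-bit field")
    if not 0 <= frame_size_field < (1 << 9):
        raise ValueError("Frame Size is a 9-bit field")
    # Function Length is stored divided by 4, Frame Size divided by 16.
    return function_length_field * 4, frame_size_field * 16
```

The largest encodable values, `scaled_sizes(2047, 511)`, give 8188 bytes of code and 8176 bytes of stack, which matches the limits of 8k and (8k-16) stated above.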
@@ -355,28 +356,31 @@ Canonical prologs that fall into categories 1, 2 (without outgoing parameter are
355356

356357
Step 0: Pre-compute the size of each area.
357358

358-
Step 1: Save Int callee-saved registers.
359+
Step 1: Sign the return address.
359360

360-
Step 2: This step is specific for type 4 in early sections. lr is saved at the end of Int area.
361+
Step 2: Save Int callee-saved registers.
361362

362-
Step 3: Save FP callee-saved registers.
363+
Step 3: This step is specific for type 4 in early sections. lr is saved at the end of Int area.
363364

364-
Step 4: Save input arguments in the home parameter area.
365+
Step 4: Save FP callee-saved registers.
365366

366-
Step 5: Allocate remaining stack, including local area, \<x29,lr> pair, and outgoing parameter area. 5a corresponds to canonical type 1. 5b and 5c are for canonical type 2. 5d and 5e are for both type 3 and type 4.
367+
Step 5: Save input arguments in the home parameter area.
368+
369+
Step 6: Allocate remaining stack, including local area, \<x29,lr> pair, and outgoing parameter area. 6a corresponds to canonical type 1. 6b and 6c are for canonical type 2. 6d and 6e are for both type 3 and type 4.
367370

368371
| Step # | Flag values | # of instructions | Opcode | Unwind code |
369372
|--|--|--|--|--|
370373
| 0 | | | `#intsz = RegI * 8;`<br/>`if (CR==01) #intsz += 8; // lr`<br/>`#fpsz = RegF * 8;`<br/>`if(RegF) #fpsz += 8;`<br/>`#savsz=((#intsz+#fpsz+8*8*H)+0xf)&~0xf`<br/>`#locsz = #famsz - #savsz` | |
371-
| 1 | 0 < **RegI** <= 10 | **RegI** / 2 +<br/> **RegI** % 2 | `stp x19,x20,[sp,#savsz]!`<br/>`stp x21,x22,[sp,#16]`<br/>`...` | `save_regp_x`<br/>`save_regp`<br/>`...` |
372-
| 2 | **CR** == 01\* | 1 | `str lr,[sp,#(intsz-8)]`\* | `save_reg` |
373-
| 3 | 0 < **RegF** <= 7 | (**RegF** + 1) / 2 +<br/>(**RegF** + 1) % 2) | `stp d8,d9,[sp,#intsz]`\*\*<br/>`stp d10,d11,[sp,#(intsz+16)]`<br/>`...`<br/>`str d(8+RegF),[sp,#(intsz+fpsz-8)]` | `save_fregp`<br/>`...`<br/>`save_freg` |
374-
| 4 | **H** == 1 | 4 | `stp x0,x1,[sp,#(intsz+fpsz)]`<br/>`stp x2,x3,[sp,#(intsz+fpsz+16)]`<br/>`stp x4,x5,[sp,#(intsz+fpsz+32)]`<br/>`stp x6,x7,[sp,#(intsz+fpsz+48)]` | `nop`<br/>`nop`<br/>`nop`<br/>`nop` |
375-
| 5a | **CR** == 11 &&<br/> `#locsz` <= 512 | 2 | `stp x29,lr,[sp,#-locsz]!`<br/>`mov x29,sp`\*\*\* | `save_fplr_x`<br/>`set_fp` |
376-
| 5b | **CR** == 11 &&<br/>512 < `#locsz` <= 4080 | 3 | `sub sp,sp,#locsz`<br/>`stp x29,lr,[sp,0]`<br/>`add x29,sp,0` | `alloc_m`<br/>`save_fplr`<br/>`set_fp` |
377-
| 5c | **CR** == 11 &&<br/> `#locsz` > 4080 | 4 | `sub sp,sp,4080`<br/>`sub sp,sp,#(locsz-4080)`<br/>`stp x29,lr,[sp,0]`<br/>`add x29,sp,0` | `alloc_m`<br/>`alloc_s`/`alloc_m`<br/>`save_fplr`<br/>`set_fp` |
378-
| 5d | (**CR** == 00 \|\| **CR** == 01) &&<br/>`#locsz` <= 4080 | 1 | `sub sp,sp,#locsz` | `alloc_s`/`alloc_m` |
379-
| 5e | (**CR** == 00 \|\| **CR** == 01) &&<br/>`#locsz` > 4080 | 2 | `sub sp,sp,4080`<br/>`sub sp,sp,#(locsz-4080)` | `alloc_m`<br/>`alloc_s`/`alloc_m` |
374+
| 1 | **CR** == 10 | 1 | `pacibsp` | `pac_sign_return_address` |
375+
| 2 | 0 < **RegI** <= 10 | **RegI** / 2 +<br/> **RegI** % 2 | `stp x19,x20,[sp,#savsz]!`<br/>`stp x21,x22,[sp,#16]`<br/>`...` | `save_regp_x`<br/>`save_regp`<br/>`...` |
376+
| 3 | **CR** == 01\* | 1 | `str lr,[sp,#(intsz-8)]`\* | `save_reg` |
377+
| 4 | 0 < **RegF** <= 7 | (**RegF** + 1) / 2 +<br/>(**RegF** + 1) % 2 | `stp d8,d9,[sp,#intsz]`\*\*<br/>`stp d10,d11,[sp,#(intsz+16)]`<br/>`...`<br/>`str d(8+RegF),[sp,#(intsz+fpsz-8)]` | `save_fregp`<br/>`...`<br/>`save_freg` |
378+
| 5 | **H** == 1 | 4 | `stp x0,x1,[sp,#(intsz+fpsz)]`<br/>`stp x2,x3,[sp,#(intsz+fpsz+16)]`<br/>`stp x4,x5,[sp,#(intsz+fpsz+32)]`<br/>`stp x6,x7,[sp,#(intsz+fpsz+48)]` | `nop`<br/>`nop`<br/>`nop`<br/>`nop` |
379+
| 6a | (**CR** == 10 \|\| **CR** == 11) &&<br/> `#locsz` <= 512 | 2 | `stp x29,lr,[sp,#-locsz]!`<br/>`mov x29,sp`\*\*\* | `save_fplr_x`<br/>`set_fp` |
380+
| 6b | (**CR** == 10 \|\| **CR** == 11) &&<br/>512 < `#locsz` <= 4080 | 3 | `sub sp,sp,#locsz`<br/>`stp x29,lr,[sp,0]`<br/>`add x29,sp,0` | `alloc_m`<br/>`save_fplr`<br/>`set_fp` |
381+
| 6c | (**CR** == 10 \|\| **CR** == 11) &&<br/> `#locsz` > 4080 | 4 | `sub sp,sp,4080`<br/>`sub sp,sp,#(locsz-4080)`<br/>`stp x29,lr,[sp,0]`<br/>`add x29,sp,0` | `alloc_m`<br/>`alloc_s`/`alloc_m`<br/>`save_fplr`<br/>`set_fp` |
382+
| 6d | (**CR** == 00 \|\| **CR** == 01) &&<br/>`#locsz` <= 4080 | 1 | `sub sp,sp,#locsz` | `alloc_s`/`alloc_m` |
383+
| 6e | (**CR** == 00 \|\| **CR** == 01) &&<br/>`#locsz` > 4080 | 2 | `sub sp,sp,4080`<br/>`sub sp,sp,#(locsz-4080)` | `alloc_m`<br/>`alloc_s`/`alloc_m` |
380384

381385
\* If **CR** == 01 and **RegI** is an odd number, Step 3 and the last `save_reg` in Step 2 are merged into one `save_regp`.
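The Step 0 arithmetic and the choice among the Step 6 variants can be sketched as below. This is an illustration under stated assumptions, not the unwinder's code: the function names are hypothetical, and `#famsz` in the table is read here as the total frame size in bytes (the **Frame Size** field times 16).

```python
def precompute_areas(reg_i: int, reg_f: int, h: int, cr: int, frame_size_bytes: int):
    """Step 0: compute (#intsz, #fpsz, #savsz, #locsz) in bytes."""
    intsz = reg_i * 8
    if cr == 0b01:
        intsz += 8                      # room for lr at the end of the Int area
    fpsz = reg_f * 8
    if reg_f:
        fpsz += 8
    # Saved-register area plus the 8 homed parameters, rounded up to 16 bytes.
    savsz = (intsz + fpsz + 8 * 8 * h + 0xF) & ~0xF
    locsz = frame_size_bytes - savsz    # assumption: #famsz is the frame size in bytes
    return intsz, fpsz, savsz, locsz


def allocation_step(cr: int, locsz: int) -> str:
    """Pick the Step 6 row (6a to 6e) from the table for the given CR and #locsz."""
    if cr in (0b10, 0b11):              # chained function
        if locsz <= 512:
            return "6a"
        return "6b" if locsz <= 4080 else "6c"
    return "6d" if locsz <= 4080 else "6e"  # CR == 00 or CR == 01
```

For instance, `precompute_areas(reg_i=2, reg_f=1, h=1, cr=0b11, frame_size_bytes=160)` yields a 96-byte save area and a 64-byte local area, so `allocation_step(0b11, 64)` selects row 6a.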
382386
