Commit 14b4dd0

Merge pull request #11302 from jckarter/key-path-abi-doc
2 parents 4a1b3bb + 548aab5 commit 14b4dd0

1 file changed: +306 -0 lines changed

docs/ABI/KeyPaths.md

Lines changed: 306 additions & 0 deletions
@@ -0,0 +1,306 @@
# Key Path Memory Layout

A **key path object** is laid out at runtime as a heap object with a
variable-sized payload containing a sequence of encoded components describing
how the key path traverses a value. When the compiler sees a key path literal,
it generates a **key path pattern** that can be efficiently interpreted by
the runtime to instantiate a key path object when needed. This document
describes the layout of both. The key path pattern layout is designed so that,
in the common case where the entire path is fully specialized and crosses no
resilience boundaries, a pattern can be transformed in place into a key path
object with a one-time initialization.

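
For orientation, here is a minimal source-level sketch of what a key path literal looks like and how it is applied. The `Point` type and the variable names are made up for illustration; nothing in it depends on the memory layout described in this document.

```swift
// Hypothetical example type; not part of the ABI description.
struct Point {
    var x: Int
    var y: Int
}

// A key path literal. Per the text above, the compiler emits a key path
// pattern for it, and the runtime instantiates a key path object when needed.
let xPath: WritableKeyPath<Point, Int> = \Point.x

var p = Point(x: 3, y: 4)
let before = p[keyPath: xPath]   // read through the key path
p[keyPath: xPath] = 10           // write through the key path
print(before, p.x)
```

The rest of the document is concerned with how a value like `xPath` is represented in memory, not with how it is used.
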
## ABI Concerns For Key Paths

For completeness, this document describes the layout of both key path objects
and patterns. Note, however, that the instantiated runtime layout of key path
objects is an implementation detail of the Swift runtime, and *only key path
patterns* are strictly ABI, since they are emitted by the compiler. The
runtime is free to change the runtime layout of key path objects, but
must maintain the ability to instantiate from key path patterns emitted
by previous ABI-stable versions of the Swift compiler.

## Key Path Objects

### Buffer Header

Key path objects begin with the standard Swift heap object header, followed by a
key path object header. Relative to the end of the heap object header:

Offset          | Description
--------------- | ----------------------------------------------
`0`             | Pointer to KVC compatibility C string, or null
`1*sizeof(Int)` | Key path buffer header (32 bits)

If the key path is Cocoa KVC-compatible, the first word will be a pointer to
the equivalent KVC string as a null-terminated UTF-8 C string. It will be null
otherwise. The **key path buffer header** in the second word contains the
following bit fields:

Bits (LSB zero) | Description
--------------- | -----------
0...23          | **Buffer size** in bytes
24...29         | Reserved. Must be zero in Swift 4 runtime
30              | 1 = Has **reference prefix**, 0 = No reference prefix
31              | 1 = Is **trivial**, 0 = Has destructor

The *buffer size* indicates the total size in bytes of the components following
the key path buffer header. A `ReferenceWritableKeyPath` may have a *reference
prefix* of read-only components that can be projected before initiating
mutation; bit 30 is set if one is present. A key path may capture values that
require cleanup when the key path object is deallocated, but a key path that
does not capture any values with cleanups will have the *trivial* bit 31 set so
that deallocation can take a fast path.

Components are always pointer-aligned, so the first component always starts at
offset `2*sizeof(Int)`. On 64-bit platforms, this leaves four bytes of padding
after the 32-bit buffer header.

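
The bit fields above can be sketched as a small decoder. This is only an illustration of the table: the `BufferHeader` type and its member names are hypothetical and are not the runtime's own declarations.

```swift
/// Hypothetical view of the 32-bit key path buffer header described above.
struct BufferHeader {
    let rawValue: UInt32

    /// Bits 0...23: size in bytes of the components following the header.
    var bufferSize: Int { return Int(rawValue & 0x00FF_FFFF) }
    /// Bits 24...29: reserved, expected to be zero.
    var reservedBits: UInt32 { return (rawValue >> 24) & 0x3F }
    /// Bit 30: the key path has a reference prefix.
    var hasReferencePrefix: Bool { return (rawValue & 0x4000_0000) != 0 }
    /// Bit 31: no captured values need cleanup on deallocation.
    var isTrivial: Bool { return (rawValue & 0x8000_0000) != 0 }
}

// 0xC000_0028 is the buffer header value used in the Examples section.
let header = BufferHeader(rawValue: 0xC000_0028)
print(header.bufferSize, header.hasReferencePrefix, header.isTrivial)

// Offset of the first component from the start of the key path object header:
// one word for the KVC pointer, then the 32-bit buffer header rounded up to
// pointer alignment.
let word = MemoryLayout<Int>.size
let firstComponentOffset = 2 * word
let padding = firstComponentOffset - (word + MemoryLayout<UInt32>.size)
print(firstComponentOffset, padding)
```

On a 64-bit platform the computed padding is four bytes, matching the statement above; on a 32-bit platform the same arithmetic gives zero.
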
### Components

After the buffer header, one or more **key path components** appear in memory
in sequence. Each component begins with a 32-bit **key path component header**
describing the component that follows.

Bits (LSB zero) | Description
--------------- | -----------
0...28          | **Payload** (meaning depends on component kind)
29...30         | **Component kind**
31              | 1 = **End of reference prefix**, 0 = Not end of reference prefix

If the key path has a *reference prefix*, then exactly one component must have
the *end of reference prefix* bit set in its component header. This indicates
that the component after the end of the reference prefix will initiate mutation.

The following *component kinds* are recognized:

Value in bits 29...30 | Description
--------------------- | -----------
0                     | Struct/tuple/self stored property
1                     | Computed
2                     | Class stored property
3                     | Optional chaining/forcing/wrapping

- A **struct stored property** component, when given
  a value of the base type in memory, can project the component value in-place
  at a fixed offset within the base value. This applies to struct stored
  properties, tuple fields, and the `.self` identity component (which trivially
  projects at offset zero). The
  *payload* contains the offset in bytes of the projected field in the
  aggregate, or the special value `0x1FFF_FFFF`, which indicates that the
  offset is too large to pack into the payload and is stored in the next 32 bits
  after the header.
- A **class stored property** component, when given a reference to a class
  instance, can project the component value inside the class instance at
  a fixed offset. The
  *payload* contains the offset in bytes of the projected field from the
  address point of the object, or the special value `0x1FFF_FFFF`, which
  indicates that the offset is too large to pack into the payload and is stored
  in the next 32 bits after the header.
- An **optional** component performs an operation involving `Optional` values.
  The *payload* contains one of the following values:

  Value in payload | Description
  ---------------- | -----------
  0                | **Optional chaining**
  1                | **Optional wrapping**
  2                | **Optional force-unwrapping**

  A *chaining* component behaves like the postfix `?` operator, immediately
  ending the key path application and returning nil when the base value is nil,
  or unwrapping the base value and continuing projection on the non-optional
  payload when non-nil. If an optional chain ends in a non-optional value,
  an implicit *wrapping* component is inserted to wrap it up in an
  optional value. A *force-unwrapping* component behaves like the postfix
  `!` operator, trapping if the base value is nil, or unwrapping the value
  inside the optional if not.

- A **computed** component uses the conservative access pattern of
  `get`/`set`/`materializeForSet` to project from the base value. This is used
  as a general fallback component for any key path component without a more
  specialized representation, including not only computed properties but
  also subscripts, stored properties that require reabstraction, properties
  with behaviors or custom key path components (when we get those), and weak or
  unowned properties. The payload contains additional bitfields describing the
  component:

  Bits (LSB zero) | Description
  --------------- | -----------
  24              | 1 = **Has captured arguments**, 0 = no captures
  25...26         | **Identifier kind**
  27              | 1 = **Settable**, 0 = **Get-Only**
  28              | 1 = **Mutating** (implies settable), 0 = Nonmutating

  The component can *capture* context which is stored after the component in
  the key path object, such as generic arguments from its original context,
  subscript index arguments, and so on. Bit 24 is set if there are any such
  captures. Bits 25 and 26 discriminate the *identifier* which is used to
  determine equality of key paths referring to the same components. If
  bit 27 is set, then the component is **settable** and can be written through,
  and bit 28 indicates whether the set operation **is mutating** to the base
  value, that is, whether setting through the component changes the base value
  like a value-semantics property or modifies state indirectly like a class
  property or `UnsafePointer.pointee`.

  After the header, the component contains the following word-aligned fields:

  Offset from header | Description
  ------------------ | -----------
  `1*sizeof(Int)`    | The **identifier** of the component.
  `2*sizeof(Int)`    | The **getter function** for the component.
  `3*sizeof(Int)`    | (if settable) The **setter function** for the component.

  The combination of the identifier kind bits and the identifier word is
  compared by the `==` operation on two key paths to determine whether they
  are equivalent. Neither the kind bits nor the identifier word
  have any stable semantic meaning other than as unique identifiers.
  In practice, the compiler picks a stable unique artifact of the
  underlying declaration, such as the naturally-abstracted getter entry point
  for a computed property, the offset of a reabstracted stored property, or
  an Objective-C selector for an imported ObjC property, to identify the
  component. The identifier kind bits are used to discriminate
  possibly-overlapping domains.

  The getter function is a pointer to a Swift function with the signature
  `@convention(thin) (@in Base, UnsafeRawPointer) -> @out Value`. When
  the component is applied, the getter is invoked with a copy of the base
  value and is passed a pointer to the captured arguments of the
  component. If the component has no captures, the second argument is
  undefined.

  The setter function is also a pointer to a Swift function. This field is
  only present if the *settable* bit of the header is set. If the
  component is nonmutating, then the function has signature
  `@convention(thin) (@in Base, @in Value, UnsafeRawPointer) -> ()`,
  or if it is mutating, then the function has signature
  `@convention(thin) (@inout Base, @in Value, UnsafeRawPointer) -> ()`.
  When a mutating application of the key path is completed, the setter is
  invoked with a copy of the base value (if nonmutating) or a reference to
  the base value (if mutating), along with a copy of the updated component
  value, and a pointer to the captured arguments of the component. If
  the component has no captures, the third argument is undefined.

  TODO: Make getter/nonmutating setter take base borrowed,
  yield borrowed result (materializeForGet); use materializeForSet

  If the component has captures, the capture area appears after the other
  fields, at offset `3*sizeof(Int)` for a get-only component or
  `4*sizeof(Int)` for a settable component. The area begins with a two-word
  header:

  Offset from start | Description
  ----------------- | -----------
  `0`               | Size of captures in bytes
  `1*sizeof(Int)`   | Pointer to **argument witness table**

  followed by the captures themselves. The *argument witness table* contains
  pointers to functions needed for maintaining the captures:

  Offset          | Description
  --------------- | -----------
  `0`             | **Destroy**, or null if trivial
  `1*sizeof(Int)` | **Copy**
  `2*sizeof(Int)` | **Is Equal**
  `3*sizeof(Int)` | **Hash**

  The *destroy* function, if not null, has signature
  `@convention(thin) (UnsafeMutableRawPointer) -> ()` and is invoked to
  destroy the captures when the key path object is deallocated.

  The *copy* function has signature
  `@convention(thin) (_ src: UnsafeRawPointer, _ dest: UnsafeMutableRawPointer) -> ()`
  and is invoked when the captures need to be copied into a new key path
  object, for example when two key paths are appended.

  The *is equal* function has signature
  `@convention(thin) (UnsafeRawPointer, UnsafeRawPointer) -> Bool`
  and is invoked when the component is compared for equality with another
  computed component with the same identifier.

  The *hash* function has signature
  `@convention(thin) (UnsafeRawPointer, UnsafeRawPointer) -> Int`
  and is invoked when the key path containing the component is hashed.
  The implementation understands a return value of zero to mean that the
  captures should have no effect on the hash value of the key path.

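
Stepping back to the source level, the component kinds in the list above correspond to the kinds of key path expressions below. This is a hedged sketch: the `Car` and `Engine` types are invented for illustration, and the comments describe which component kinds the text above suggests each expression would use, not what any particular compiler version is guaranteed to emit.

```swift
// Hypothetical types used only to illustrate the component kinds.
final class Engine {
    var horsepower = 150                 // class stored property
}

struct Car {
    var wheels = 4                       // struct stored property
    var engine: Engine? = Engine()
    var summary: String {                // get-only computed property
        return "car with \(wheels) wheels"
    }
}

let car = Car()

// Struct stored property component.
let stored = \Car.wheels
// Optional chaining, then a class stored property, then implicit wrapping,
// so the value type is `Int?`.
let chained = \Car.engine?.horsepower
// Optional force-unwrapping followed by a class stored property.
let forced = \Car.engine!.horsepower
// Computed component (get-only).
let computed = \Car.summary

print(car[keyPath: stored])
print(String(describing: car[keyPath: chained]))
print(car[keyPath: forced])
print(car[keyPath: computed])
```

Note that the chained key path yields an optional even though `horsepower` itself is not optional, which is the effect of the implicit *wrapping* component described above.
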
After every component except for the final component, a pointer-aligned
pointer to the metadata for the type of the projected component is stored.
(The type of the final component can be found from the `Value` generic
argument of the `KeyPath<Root, Value>` type.)

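
With the per-component layouts and the interleaved metadata pointers in hand, the *buffer size* of a key path object can be estimated. The sketch below is an approximation under stated assumptions: `ComponentShape`, `roundUp`, and `bufferSize` are hypothetical names, each component is assumed to be padded to pointer alignment, an optional component is assumed to consist of its header only, and captures are assumed to be rounded up to a whole number of words.

```swift
/// Hypothetical description of a component, just detailed enough to size it.
enum ComponentShape {
    /// Struct or class stored property; `overflowOffset` is true when the
    /// offset does not fit in the payload and occupies the next 32 bits.
    case stored(overflowOffset: Bool)
    /// Optional chaining, wrapping, or force-unwrapping (assumed header only).
    case optional
    /// Computed component; `captureBytes` is nil when nothing is captured.
    case computed(settable: Bool, captureBytes: Int?)
}

/// Rounds `n` up to a multiple of `alignment`, which must be a power of two.
func roundUp(_ n: Int, to alignment: Int) -> Int {
    return (n + alignment - 1) & ~(alignment - 1)
}

/// Estimated size in bytes of the component area for a pointer size of `word`.
func bufferSize(of components: [ComponentShape], word: Int = 8) -> Int {
    var size = 0
    for (index, component) in components.enumerated() {
        switch component {
        case .stored(let overflow):
            // 32-bit header, optionally followed by a 32-bit offset.
            size += roundUp(4 + (overflow ? 4 : 0), to: word)
        case .optional:
            size += roundUp(4, to: word)
        case .computed(let settable, let captureBytes):
            // Header word, identifier, getter, and optionally the setter.
            size += word * (settable ? 4 : 3)
            if let bytes = captureBytes {
                // Two-word capture header followed by the captures.
                size += 2 * word + roundUp(bytes, to: word)
            }
        }
        // Every component except the last is followed by a metadata pointer.
        if index < components.count - 1 {
            size += word
        }
    }
    return size
}

// Three stored components, as in the first example in the Examples section.
let storedOnly: [ComponentShape] = Array(repeating: .stored(overflowOffset: false), count: 3)
print(bufferSize(of: storedOnly))

// A settable computed component followed by a get-only one with 8 bytes of
// captures, as in the second example.
let computedPair: [ComponentShape] = [
    .computed(settable: true, captureBytes: nil),
    .computed(settable: false, captureBytes: 8),
]
print(bufferSize(of: computedPair))
```

Under these assumptions the two calls give 40 and 88 bytes on a 64-bit platform, which agrees with the buffer sizes shown in the Examples section.
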
### Examples

Given:

```swift
// Pseudocode: `(N x UInt8)` stands for N bytes of padding, not real Swift syntax.
struct A {
  var padding: (128 x UInt8)
  var b: B
}

class B {
  var padding: (240 x UInt8)
  var c: C
}

struct C {
  var padding: (384 x UInt8)
  var d: D
}
```

On a 64-bit platform, a key path object representing `\A.b.c.d` might look like
this in memory:

Word | Contents
---- | --------
0    | isa pointer to `ReferenceWritableKeyPath<A, D>`
1    | reference counts
`-`  | `-`
2    | buffer header 0xC000_0028 - trivial, reference prefix, buffer size 40
`-`  | `-`
3    | component header 0x8000_0080 - struct component, offset 128, end of prefix
4    | type metadata pointer for `B`
`-`  | `-`
5    | component header 0x4000_0100 - class component, offset 256
6    | type metadata pointer for `C`
`-`  | `-`
7    | component header 0x0000_0180 - struct component, offset 384

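
As a cross-check, the component header words in the table above can be decoded by hand or with a few lines of code. The `ComponentKind` and `ComponentHeader` types below are hypothetical helpers written for this example; they follow the bit layout given in the Components section.

```swift
/// Component kinds, numbered as in the component kinds table.
enum ComponentKind: UInt32 {
    case structStored = 0
    case computed = 1
    case classStored = 2
    case optional = 3
}

/// Hypothetical view of a 32-bit key path component header.
struct ComponentHeader {
    let rawValue: UInt32

    /// Bits 0...28.
    var payload: UInt32 { return rawValue & 0x1FFF_FFFF }
    /// Bits 29...30. All four two-bit values are valid kinds, so the
    /// force-unwrap cannot fail.
    var kind: ComponentKind { return ComponentKind(rawValue: (rawValue >> 29) & 0x3)! }
    /// Bit 31.
    var endsReferencePrefix: Bool { return (rawValue >> 31) == 1 }
    /// For stored components: the inline offset, or nil when the payload is
    /// the special value meaning the offset is stored in the next 32 bits.
    var inlineOffset: Int? {
        return payload == 0x1FFF_FFFF ? nil : Int(payload)
    }
}

let rawWords: [UInt32] = [0x8000_0080, 0x4000_0100, 0x0000_0180]
let headers = rawWords.map { ComponentHeader(rawValue: $0) }

for h in headers {
    print(h.kind, h.inlineOffset ?? -1, h.endsReferencePrefix)
}
```

The decoded values are a struct component at offset 128 that ends the reference prefix, a class component at offset 256, and a struct component at offset 384, as listed in the table.
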
If we add:

```swift
// Pseudocode: accessor bodies are omitted.
struct D {
  var computed: E { get set }
}

struct E {
  subscript(b: B) -> F { get }
}
```

then `\D.computed[B()]` would look like:

Word | Contents
---- | --------
0    | isa pointer to `KeyPath<D, F>`
1    | reference counts
`-`  | `-`
2    | buffer header 0x0000_0058 - buffer size 88
`-`  | `-`
3    | component header 0x3800_0000 - computed, settable, mutating
4    | identifier pointer
5    | getter
6    | setter
7    | type metadata pointer for `E`
`-`  | `-`
8    | component header 0x2100_0000 - computed, has captures
9    | identifier pointer
10   | getter
11   | argument size 8
12   | pointer to argument witnesses for releasing/retaining/equating/hashing `B`
13   | value of `B()`

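
The two computed component headers in this table can be checked against the computed payload bit fields in the same way. `ComputedComponentHeader` is a hypothetical helper for this example only.

```swift
/// Hypothetical view of a component header whose kind is "computed".
struct ComputedComponentHeader {
    let rawValue: UInt32

    /// Bits 29...30 equal 1 for a computed component.
    var isComputed: Bool { return ((rawValue >> 29) & 0x3) == 1 }
    /// Bit 24.
    var hasCaptures: Bool { return (rawValue & (1 << 24)) != 0 }
    /// Bits 25...26.
    var identifierKind: UInt32 { return (rawValue >> 25) & 0x3 }
    /// Bit 27.
    var isSettable: Bool { return (rawValue & (1 << 27)) != 0 }
    /// Bit 28.
    var isMutating: Bool { return (rawValue & (1 << 28)) != 0 }
    /// Offset of the capture area from the header, in words, or nil when the
    /// component has no captures.
    var captureAreaOffsetInWords: Int? {
        guard hasCaptures else { return nil }
        return isSettable ? 4 : 3
    }
}

let first = ComputedComponentHeader(rawValue: 0x3800_0000)
let second = ComputedComponentHeader(rawValue: 0x2100_0000)

print(first.isComputed, first.isSettable, first.isMutating, first.hasCaptures)
print(second.isComputed, second.isSettable, second.hasCaptures)
```

The second component is get-only with captures, so its capture area starts three words after its header, which is where the table places the argument size (word 11, following the header at word 8).
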
## Key Path Patterns

(to be written)
