@@ -2043,6 +2043,177 @@ parts::
return %1 : $Klass
}

+ Borrowed Object based Safe Interior Pointers
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+ What is an "Unsafe Interior Pointer"
+ ````````````````````````````````````
+
+ An unsafe interior pointer is a bare pointer into the innards of an object. A
+ simple example of this in C++ is using the method ``std::vector::data()`` to
+ get at the innards of a ``std::vector``. In general, interior pointers are
+ unsafe to use since the language does not guarantee that the interior pointer
+ will not be used after the underlying object has been deallocated. To see this,
+ consider the following C++ example::
+
+     #include <cstdio>
+     #include <vector>
+
+     int unfortunateFunction() {
+       int *unsafeInteriorPointer = nullptr;
+       {
+         std::vector<int> vector;
+         vector.push_back(5);
+         unsafeInteriorPointer = vector.data();
+         printf("%d\n", *unsafeInteriorPointer); // Prints "5".
+       } // vector destroyed and its storage deallocated here
+       return *unsafeInteriorPointer; // Kaboom: use-after-free
+     }
+
+ In other words, C++ lets us get an interior pointer into the vector, but then
+ lets us do whatever we want with that pointer, including using it after the
+ underlying memory has been invalidated.
+
+ From a user's perspective, interior pointers are really useful: one can use
+ them to pass data to APIs that expect only a pointer, and one can sometimes use
+ them to get better performance. But from a language designer's perspective,
+ this sort of API is verboten since it leads to bugs, crashes, and security
+ vulnerabilities. That being said, users clearly have a need for such
+ functionality, so we, as language designers, should find ways to express these
+ sorts of patterns in our various languages in a safe way that prevents users
+ from foot-gunning themselves. In SIL, we have solved this problem by directly
+ modeling interior pointer instructions as a high level concept in our IR.
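
For comparison, here is a minimal C++ sketch of one common way to narrow the hazard at the API level: hand the interior pointer to a callback instead of returning it. The helper ``withInteriorPointer`` and the function ``sumElements`` are hypothetical names made up for this illustration; they are not part of the standard library or of SIL. Unlike the SIL approach, nothing here stops the callback from stashing the pointer somewhere, so this is a convention, not a verified guarantee.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not a standard library API): passes the interior
// pointer to `body` and returns only what `body` returns, so the pointer is
// handed out only while `vec` is borrowed by this call.
template <typename T, typename Body>
auto withInteriorPointer(const std::vector<T> &vec, Body body)
    -> decltype(body(vec.data(), vec.size())) {
  return body(vec.data(), vec.size());
}

// Example client: uses the pointer inside the callback and returns a plain
// value, never the pointer itself.
int sumElements(const std::vector<int> &values) {
  return withInteriorPointer(
      values, [](const int *ptr, std::size_t count) -> int {
        int total = 0;
        for (std::size_t i = 0; i < count; ++i)
          total += ptr[i];
        return total;
      });
}
```

The pointer's useful lifetime is nested inside the lifetime of the borrowed vector, which is the same shape of guarantee that SIL enforces statically for compiler generated code.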
+
+ Safe Interior Pointers in SIL
+ `````````````````````````````
+
+ In contrast to LLVM-IR, SIL provides mechanisms that language designers can use
+ to express concepts like the above in a manner that allows the language to
+ define away compiler-generated unsafe interior pointer usage using "Safe
+ Interior Pointers". This is implemented in SIL by:
+
+ 1. Classifying a set of instructions as being "interior pointer" instructions.
+ 2. Enforcing in the SILVerifier that all "interior pointer" instructions can
+    only have operands with `Guaranteed`_ ownership.
+ 3. Enforcing in the SILVerifier that any transitive address use of the interior
+    pointer is a liveness requirement of the "interior pointer"'s
+    operand.
+
+ Note that the transitive address use verifier from (3) does not attempt to
+ classify uses directly. Instead the verifier:
+
+ 1. Has an explicit list of instructions that it understands as requiring
+    liveness of the base object.
+
+ 2. Has a second list of instructions that require liveness and produce an
+    address whose transitive uses need to be recursively processed.
+
+ 3. Asserts on any instructions that are not known to the verifier. This ensures
+    that the verifier is kept up to date with new instructions.
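
The three-way classification above can be sketched as a small worklist algorithm. The following C++ is a toy model only: the ``Inst`` type, the ``UseKind`` enum, ``classify``, and ``collectLivenessUses`` are hypothetical names invented for this illustration and do not mirror the real SILVerifier data structures. Which opcodes belong in which list is also an assumption made for the example.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Toy model: hypothetical stand-ins, not the real SILVerifier types.
enum class UseKind {
  Leaf,       // Requires liveness of the base; its result is not followed.
  Projection, // Requires liveness and yields an address to follow.
  Unknown     // Not recognized: the verifier must be taught about it.
};

struct Inst {
  std::string opcode;
  std::vector<const Inst *> users; // Instructions using this one's result.
};

// Assumed example lists; the real verifier's lists are larger.
UseKind classify(const Inst &inst) {
  if (inst.opcode == "load" || inst.opcode == "store")
    return UseKind::Leaf;
  if (inst.opcode == "struct_element_addr" ||
      inst.opcode == "tuple_element_addr")
    return UseKind::Projection;
  return UseKind::Unknown;
}

// Collects every transitive use that must lie within the borrow scope of the
// interior pointer's operand. Throws on an unrecognized instruction, which
// plays the role of the assert in rule (3).
std::vector<const Inst *> collectLivenessUses(const Inst &interiorPointer) {
  std::vector<const Inst *> livenessUses;
  std::vector<const Inst *> worklist(interiorPointer.users.begin(),
                                     interiorPointer.users.end());
  while (!worklist.empty()) {
    const Inst *user = worklist.back();
    worklist.pop_back();
    switch (classify(*user)) {
    case UseKind::Leaf:
      livenessUses.push_back(user);
      break;
    case UseKind::Projection:
      livenessUses.push_back(user);
      worklist.insert(worklist.end(), user->users.begin(), user->users.end());
      break;
    case UseKind::Unknown:
      throw std::logic_error("unhandled address use: " + user->opcode);
    }
  }
  return livenessUses;
}
```

Failing loudly on unknown instructions, instead of silently ignoring them, is what forces the lists to be revisited whenever a new address-consuming instruction is added.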
+
+ Note that most instructions in category (1) are instructions whose uses do
+ not propagate the pointer value, so they are safe. In contrast, some other
+ instructions in category (1) are escaping uses of the address such as
+ `pointer_to_address`_. Those uses are unsafe: the user is responsible for
+ managing unsafe pointer lifetimes and the compiler must not extend those pointer
+ lifetimes.
+
+ These rules ensure statically that any uses of the address that are not escaped
+ explicitly by an instruction like `pointer_to_address`_ are within the
+ guaranteed value's scope, where the guaranteed value is statically known to be
+ live. As a result, in SIL it is impossible to express such a bug in
+ compiler-generated code. As an example, consider the following unsafe interior
+ pointer SIL::
+
+     class Klass { var k: KlassWrapper }
+     struct KlassWrapper { var k: Klass }
+
+     // ...
+
+     // Today SIL restricts interior pointer instructions to only have operands
+     // with guaranteed ownership.
+     %1 = begin_borrow %0 : $Klass
+
+     // %2 is an interior pointer into %1. Since %2 is an address, its uses are
+     // not treated as uses of the underlying borrowed object %1 in the ownership
+     // system. This is because at the ownership level values with None
+     // ownership are not verified and do not have any constraints on how they
+     // are used from the ownership system.
+     //
+     // Instead the ownership verifier gathers up all such uses and treats them
+     // transitively as uses of the object from which the interior pointer was
+     // projected. This means that this is a constraint on the guaranteed
+     // object's uses, not on the addresses.
+     %2 = ref_element_addr %1 : $Klass, #Klass.k // %2 is a $*KlassWrapper
+     %3 = struct_element_addr %2 : $*KlassWrapper, #KlassWrapper.k // %3 is a $*Klass
+
+     // If we end the borrow %1 at this point, we invalidate the addresses
+     // %2 and %3.
+     end_borrow %1 : $Klass
+
+     // Here we would be loading from an invalidated address. This causes a
+     // verifier error since %3's use here is a regular use that is treated as
+     // a use of %1.
+     %4 = load [copy] %3 : $*Klass
+
+     // ...
+
+ Notice how, due to a possible bug in the compiler, the load that defines
+ ``%4`` reads from the potentially invalidated address ``%3``. This would cause
+ a verifier error stating that the load was an interior pointer based
+ use-after-free of ``%1``, implying that this is malformed SIL.
+
+ NOTE: This is a constraint on the base object, not on the addresses themselves
+ which are viewed as outside of the ownership system since they have `None`_
+ ownership.
+
+ In contrast to the previous example, the following example follows ownership
+ invariants and is valid SIL::
+
+     class Klass { var k: KlassWrapper }
+     struct KlassWrapper { var k: Klass }
+
+     // ...
+
+     %1 = begin_borrow %0 : $Klass
+     // %2 is an interior pointer to the field k of %1. Since %2 is an address
+     // and addresses have None ownership, its uses are not directly treated as
+     // uses of the underlying object %1.
+     %2 = ref_element_addr %1 : $Klass, #Klass.k // %2 is a $*KlassWrapper
+
+     // Destroying %1 at this location would result in a verifier error since
+     // %2's uses are considered to be uses of %1.
+     //
+     // end_lifetime %1 : $Klass
+
+     // We are statically not loading from an invalidated address here since we
+     // are within the lifetime of %1.
+     %3 = struct_element_addr %2 : $*KlassWrapper, #KlassWrapper.k
+     %4 = load [copy] %3 : $*Klass // %1 must be live here transitively
+
+     // %1's lifetime ends. Importantly we know that within the lifetime of
+     // %1, %0's lifetime cannot shrink past this point, implying
+     // transitive static safety.
+     end_borrow %1 : $Klass
+
+ In the second example, we show a well-formed SIL program showing off SIL's Safe
+ Interior Pointers. All of the uses of ``%2``, the interior pointer, are
+ transitively uses of the base underlying object, ``%0``.
+
+ The current list of interior pointer SIL instructions is:
+
+ * `project_box`_ - projects a pointer out of a reference counted box. (*)
+ * `ref_element_addr`_ - projects a field out of a reference counted class.
+ * `ref_tail_addr`_ - projects out a pointer to a class's tail allocated array
+   memory (assuming the class was initialized to have such an array).
+ * `open_existential_box`_ - projects the address of the value out of a boxed
+   existential container using the current function context/protocol conformance
+   to create an "opened archetype".
+ * `project_existential_box`_ - projects a pointer to the value inside a boxed
+   existential container. The type must be the one for which the box was
+   initially allocated, not an "opened" archetype.
+
+ (*) We still need to finish adding support for project_box, but all other
+ interior pointers are guarded already.
+

Runtime Failure
---------------