Commit 4aa1ef3

Merge pull request #34618 from gottesmm/pr-1639094f806f1130063c975fee6b1f74f2babb6a
[ownership] Add a section to SIL.rst that describes the semantics of safe interior pointers in Ownership SIL.
2 parents c2b13cd + 9712e45 commit 4aa1ef3

1 file changed

+171
-0
lines changed

docs/SIL.rst

Lines changed: 171 additions & 0 deletions
@@ -2043,6 +2043,177 @@ parts::
20432043
return %1 : $Klass
20442044
}
20452045

2046+
Borrowed Object based Safe Interior Pointers
2047+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2048+
2049+
What is an "Unsafe Interior Pointer"
2050+
````````````````````````````````````
2051+
2052+
An unsafe interior pointer is a bare pointer into the innards of an object. A
2053+
simple example of this in C++ would be using the method std::vector::data() to
2054+
get to the innards of a std::vector. In general interior pointers are unsafe to
2055+
use since languages do not provide any guarantees that the interior pointer will
2056+
not be used after the underlying object has been deallocated. To see this,
2057+
consider the following C++ example::
2058+
2059+
int unfortunateFunction() {
2060+
int *unsafeInteriorPointer = nullptr;
2061+
{
2062+
std::vector<int> vector;
2063+
vector.push_back(5);
2064+
unsafeInteriorPointer = vector.data();
2065+
printf("%d\n", *unsafeInteriorPointer); // Prints "5".
2066+
} // vector deallocated here
2067+
return *unsafeInteriorPointer; // Kaboom
2068+
}
2069+
2070+
In words, C++ allows us to get the interior pointer into the vector, but
2071+
then lets us do whatever we want with the pointer, including use it after the
2072+
underlying memory has been invalidated.
2073+
2074+
From a user's perspective, interior pointers are really useful since one can use
2075+
them to pass data to other APIs that are only expecting a pointer and also since
2076+
one can sometimes use them to get better performance. But from a language designer's
2077+
perspective, this sort of API is verboten and leads to bugs, crashes, and security
2078+
vulnerabilities. That being said, clearly users have a need for such
2079+
functionality, so we, as language designers, should figure out ways to
2080+
express these sorts of patterns in our various languages in a safe way that
2081+
prevents users from foot-gunning themselves. In SIL, we have solved this
2082+
problem via the direct modeling of interior pointer instructions as a high level
2083+
concept in our IR.
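
As a rough source-level analogy (not how SIL itself works), the same idea can be sketched in C++ by only lending out the interior pointer for the duration of a callback, while the owning object is borrowed. The helper name ``withInteriorPointer`` and the function ``sumElements`` are hypothetical and exist only for this illustration. Note that, unlike the SIL verifier, C++ does not stop the callback from stashing the pointer somewhere, so this is a convention rather than a guarantee:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a standard library API): lends out the vector's
// interior pointer only while `body` runs.
template <typename Body>
auto withInteriorPointer(const std::vector<int> &vec, Body body) {
  // `vec` is borrowed by reference for this whole call, so its storage
  // stays alive until `body` has returned.
  return body(vec.data(), vec.size());
}

// Sums the elements through the interior pointer while the vector is live.
int sumElements(const std::vector<int> &vec) {
  return withInteriorPointer(vec, [](const int *ptr, std::size_t count) {
    // An empty vector may hand back a null data pointer.
    assert(count == 0 || ptr != nullptr);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
      total += ptr[i];
    return total;
  });
}
```

The pointer never outlives the call to ``withInteriorPointer``, which plays a role loosely comparable to a borrow scope.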
2084+
2085+
Safe Interior Pointers in SIL
2086+
`````````````````````````````
2087+
2088+
In contrast to LLVM-IR, SIL provides mechanisms that language designers can use
2089+
to express concepts like the above in a manner that allows the language to
2090+
define away compiler generated unsafe interior pointer usage using "Safe
2091+
Interior Pointers". This is implemented in SIL by:
2092+
2093+
1. Classifying a set of instructions as being "interior pointer" instructions.
2094+
2. Enforcing in the SILVerifier that all "interior pointer" instructions can
2095+
only have operands with `Guaranteed`_ ownership.
2096+
3. Enforcing in the SILVerifier that any transitive address use of the interior
2097+
pointer is a liveness requirement of the "interior pointer"'s
2098+
operand.
2099+
2100+
Note that the transitive address use verifier from (3) does not attempt to
2101+
classify uses directly. Instead the verifier:
2102+
2103+
1. Has an explicit list of instructions that it understands as requiring
2104+
liveness of the base object.
2105+
2106+
2. Has a second list of instructions that require liveness and produce an address
2107+
whose transitive uses need to be recursively processed.
2108+
2109+
3. Asserts on any instructions that are not known to the verifier. This ensures
2110+
that the verifier is kept up to date with new instructions.
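
The three categories above can be sketched as a small C++ model. This is a toy under stated assumptions, not the SILVerifier's actual code: the ``Inst`` struct, the ``Kind`` enum, the straight-line ``position`` numbering, and the choice of which kinds fall into which category are all hypothetical, and the sketch throws an exception where the real verifier asserts:

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Toy instruction kinds; stand-ins for real SIL instructions.
enum class Kind { RefElementAddr, StructElementAddr, TupleElementAddr,
                  Load, Store, Unknown };

struct Inst {
  Kind kind;
  int position;                     // index in a straight-line block
  std::vector<const Inst *> users;  // instructions that use this result
};

// Gathers the positions of all transitive address uses of `address`.
void collectLivenessUses(const Inst &address, std::vector<int> &out) {
  for (const Inst *user : address.users) {
    switch (user->kind) {
    case Kind::Load:
    case Kind::Store:
      out.push_back(user->position);    // category 1: requires liveness
      break;
    case Kind::StructElementAddr:
    case Kind::TupleElementAddr:
      out.push_back(user->position);    // category 2: requires liveness...
      collectLivenessUses(*user, out);  // ...and its own uses are visited
      break;
    default:
      // Category 3: anything the walk does not know about is rejected.
      throw std::logic_error("unhandled address use");
    }
  }
}

// True when every transitive use comes before the end of the borrow scope.
bool usesWithinBorrow(const Inst &interiorPointer, int endBorrowPosition) {
  assert(interiorPointer.kind == Kind::RefElementAddr &&
         "root must be an interior pointer projection");
  std::vector<int> uses;
  collectLivenessUses(interiorPointer, uses);
  for (int position : uses)
    if (position >= endBorrowPosition)
      return false;
  return true;
}
```

In this model, a load placed after the position of the ``end_borrow`` makes ``usesWithinBorrow`` return false, which corresponds to the verifier rejecting a use outside the borrow scope.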
2111+
2112+
Note that typically instructions in category (1) are instructions whose uses do
2113+
not propagate the pointer value, so they are safe. In contrast, some other
2114+
instructions in category (1) are escaping uses of the address such as
2115+
`pointer_to_address`_. Those uses are unsafe--the user is responsible for
2116+
managing unsafe pointer lifetimes and the compiler must not extend those pointer
2117+
lifetimes.
2118+
2119+
These rules ensure statically that any uses of the address that are not escaped
2120+
explicitly by an instruction like `pointer_to_address`_ are within the
2121+
guaranteed pointer's scope where the guaranteed value is statically known to be
2122+
live. As a result, in SIL it is impossible to express such a bug in compiler
2123+
generated code. As an example, consider the following unsafe interior pointer
2124+
SIL::
2125+
2126+
class Klass { var k: KlassWrapper }
2127+
struct KlassWrapper { var k: Klass }
2128+
2129+
// ...
2130+
2131+
// Today SIL restricts interior pointer instructions to only have operands
2132+
// with guaranteed ownership.
2133+
%1 = begin_borrow %0 : $Klass
2134+
2135+
// %2 is an interior pointer into %1. Since %2 is an address, its uses are
2136+
// not treated as uses of underlying borrowed object %1 in the ownership
2137+
// system. This is because at the ownership level objects with None
2138+
// ownership are not verified and do not have any constraints on how they
2139+
// are used from the ownership system.
2140+
//
2141+
// Instead the ownership verifier gathers up all such uses and treats them
2142+
// as uses of the object from which the interior pointer was projected
2143+
// transitively. This means that this is a constraint on the guaranteed
2144+
// object's use, not on the trivial values.
2145+
%2 = ref_element_addr %1 : $Klass, #Klass.k // %2 is a $*KlassWrapper
2146+
%3 = struct_element_addr %2 : $*KlassWrapper, #KlassWrapper.k // %3 is a $*Klass
2147+
2148+
// So if we end the borrow %1 at this point, we invalidate the addresses
2149+
// ``%2`` and ``%3``.
2150+
end_borrow %1 : $Klass
2151+
2152+
// We would here be loading from an invalidated address. This would cause a
2153+
// verifier error since %3's use here is a regular use that is treated as a
2154+
// use of %1.
2155+
%4 = load [copy] %3 : $*Klass
2156+
2157+
// ...
2158+
2159+
Notice how, due to a possible bug in the compiler, we are loading from
2160+
potentially invalidated memory ``%3``. This would have caused a verifier error
2161+
stating that ``%4`` was an interior pointer based use-after-free of ``%1``,
2162+
implying this is malformed SIL.
2163+
2164+
NOTE: This is a constraint on the base object, not on the addresses themselves
2165+
which are viewed as outside of the ownership system since they have `None`_
2166+
ownership.
2167+
2168+
In contrast to the previous example, the following example follows ownership
2169+
invariants and is valid SIL::
2170+
2171+
class Klass { var k: KlassWrapper }
2172+
struct KlassWrapper { var k: Klass }
2173+
2174+
// ...
2175+
2176+
%1 = begin_borrow %0 : $Klass
2177+
// %2 is an interior pointer into the Klass k. Since %2 is an address and
2178+
// addresses have None ownership, its uses are not treated as uses of the
2179+
// underlying object %1.
2180+
%2 = ref_element_addr %1 : $Klass, #Klass.k // %2 is a $*KlassWrapper
2181+
2182+
// Destroying %1 at this location would result in a verifier error since
2183+
// %2's uses are considered to be uses of %1.
2184+
//
2185+
// end_lifetime %1 : $Klass
2186+
2187+
// We are statically not loading from an invalidated address here since we
2188+
// are within the lifetime of ``%1``.
2189+
%3 = struct_element_addr %2 : $*KlassWrapper, #KlassWrapper.k
2190+
%4 = load [copy] %3 : $*Klass // %1 must be live here transitively
2191+
2192+
// ``%1``'s lifetime ends. Importantly we know that within the lifetime of
2193+
// ``%1``, ``%0``'s lifetime cannot shrink past this point, implying
2194+
// transitive static safety.
2195+
end_borrow %1 : $Klass
2196+
2197+
In the second example, we show a well-formed SIL program showing off SIL's Safe
2198+
Interior Pointers. All of the uses of ``%2``, the interior pointer, are
2199+
transitively uses of the base underlying object, ``%0``.
2200+
2201+
The current list of interior pointer SIL instructions is:
2202+
2203+
* `project_box`_ - projects a pointer out of a reference counted box. (*)
2204+
* `ref_element_addr`_ - projects a field out of a reference counted class.
2205+
* `ref_tail_addr`_ - projects out a pointer to a class’s tail allocated array
2206+
memory (assuming the class was initialized to have such an array).
2207+
* `open_existential_box`_ - projects the address of the value out of a boxed
2208+
existential container using the current function context/protocol conformance
2209+
to create an "opened archetype".
2210+
* `project_existential_box`_ - projects a pointer to the value inside a boxed
2211+
existential container. Must be the type for which the box was initially
2212+
allocated and not an "opened" archetype.
2213+
2214+
(*) We still need to finish adding support for project_box, but all other
2215+
interior pointers are guarded already.
2216+
20462217
Runtime Failure
20472218
---------------
20482219
