Commit 5cc6799

[docs] Clarify SIL-level TBAA rules. (#2147)
This documentation now reflects the optimizer's reality. I'm still working on formalizing language level rules for strict aliasing. Those will be introduced in a separate doc file. There is some language about alias-introducing operations. This concept existed in the old documentation but was never really implemented. This all makes sense now that we have a formal model for binding memory to a type along with specific variants of pointer_to_address that either enforce strict aliasing or permit type punning. The detailed explanation of TBAA should probably be moved into a separate optimizer document, but there isn't a good place for it yet.
1 parent 93a22f1 commit 5cc6799

1 file changed: +92 -11 lines changed

docs/SIL.rst

Lines changed: 92 additions & 11 deletions
@@ -1538,15 +1538,16 @@ Class TBAA
15381538

15391539
Class instances and other *heap object references* are pointers at the
15401540
implementation level, but unlike SIL addresses, they are first class values and
1541-
can be ``capture``-d and alias. Swift, however, is memory-safe and statically
1541+
can be ``capture``-d and aliased. Swift, however, is memory-safe and statically
15421542
typed, so aliasing of classes is constrained by the type system as follows:
15431543

15441544
* A ``Builtin.NativeObject`` may alias any native Swift heap object,
15451545
including a Swift class instance, a box allocated by ``alloc_box``,
15461546
or a thick function's closure context.
15471547
It may not alias natively Objective-C class instances.
1548-
* A ``Builtin.UnknownObject`` may alias any class instance, whether Swift or
1549-
Objective-C, but may not alias non-class-instance heap objects.
1548+
* A ``Builtin.UnknownObject`` or ``Builtin.BridgeObject`` may alias
1549+
any class instance, whether Swift or Objective-C, but may not alias
1550+
non-class-instance heap objects.
15501551
* Two values of the same class type ``$C`` may alias. Two values of related
15511552
class type ``$B`` and ``$D``, where there is a subclass relationship between
15521553
``$B`` and ``$D``, may alias. Two values of unrelated class types may not
@@ -1560,6 +1561,15 @@ typed, so aliasing of classes is constrained by the type system as follows:
15601561
potentially alias concrete instances of the generic type, such as
15611562
``$C<Int>``, because ``Int`` is a potential substitution for ``T``.
15621563

1564+
A violation of the above aliasing rules only results in undefined
1565+
behavior if the aliasing references are dereferenced within Swift code.
1566+
For example,
1567+
``_SwiftNativeNS[Array|Dictionary|String]`` classes alias with
1568+
``NS[Array|Dictionary|String]`` classes even though they are not
1569+
statically related. Since Swift never directly accesses stored
1570+
properties on the Foundation classes, this aliasing does not pose a
1571+
danger.
1572+
15631573
Typed Access TBAA
15641574
~~~~~~~~~~~~~~~~~
15651575

@@ -1571,15 +1581,86 @@ Define a *typed access* of an address or reference as one of the following:
15711581
typed projection operation (e.g. ``ref_element_addr``,
15721582
``tuple_element_addr``).
15731583

1574-
It is undefined behavior to perform a typed access to an address or reference if
1575-
the stored object or referent is not an allocated object of the relevant type.
1584+
With limited exceptions, it is undefined behavior to perform a typed access to
1585+
an address or reference if the addressed memory is not bound to the relevant type.
1586+
1587+
This allows the optimizer to assume that two addresses cannot alias if
1588+
there does not exist a substitution of archetypes that could cause one
1589+
of the types to be the type of a subobject of the other. Additionally,
1590+
this applies to the types of the values from which the addresses were
1591+
derived via a typed projection.
15761592

1577-
This allows the optimizer to assume that two addresses cannot alias if there
1578-
does not exist a substitution of archetypes that could cause one of the types to
1579-
be the type of a subobject of the other. Additionally, this applies to the types
1580-
of the values from which the addresses were derived, ignoring "blessed"
1581-
alias-introducing operations such as ``pointer_to_address``, the ``bitcast``
1582-
intrinsic, and the ``inttoptr`` intrinsic.
1593+
Consider the following SIL::
1594+
1595+
struct Element {
1596+
var i: Int
1597+
}
1598+
struct S1 {
1599+
var elt: Element
1600+
}
1601+
struct S2 {
1602+
var elt: Element
1603+
}
1604+
%adr1 = struct_element_addr %ptr1 : $*S1, #S1.elt
1605+
%adr2 = struct_element_addr %ptr2 : $*S2, #S2.elt
1606+
1607+
The optimizer may assume that ``%adr1`` does not alias with ``%adr2``
1608+
because the values that the addresses are derived from (``%ptr1`` and
1609+
``%ptr2``) have unrelated types. However, in the following example,
1610+
the optimizer cannot assume that ``%adr1`` does not alias with
1611+
``%adr2`` because ``%adr2`` is derived from a cast, and any subsequent
1612+
typed operations on the address will refer to the common ``Element`` type::
1613+
1614+
%adr1 = struct_element_addr %ptr1 : $*S1, #S1.elt
1615+
%adr2 = pointer_to_address %ptr2 : $Builtin.RawPointer to $*Element
1616+
1617+
Exceptions to typed access TBAA rules are only allowed for blessed
1618+
alias-introducing operations. This permits limited type-punning. The only
1619+
current exception is the non-strict ``pointer_to_address`` variant. The
1620+
optimizer must be able to defensively determine that none of the *roots* of an
1621+
address are alias-introducing operations. An address root is the operation that
1622+
produces the address prior to applying any typed projections, indexing, or
1623+
casts. The following are valid address roots:
1624+
1625+
* Object allocation that generates an address, such as ``alloc_stack``
1626+
and ``alloc_box``.
1627+
1628+
* Address-type function arguments. These are crucially *not* considered
1629+
alias-introducing operations. It is illegal for the SIL optimizer to
1630+
form a new function argument from an arbitrary address-type
1631+
value. Doing so would require the optimizer to guarantee that the
1632+
new argument both has a non-alias-introducing address root and
1633+
can be properly represented by the calling convention (address types
1634+
do not have a fixed representation).
1635+
1636+
* A strict cast from an untyped pointer, ``pointer_to_address [strict]``. It is
1637+
illegal for ``pointer_to_address [strict]`` to derive its address from an
1638+
alias-introducing operation's value. A type punned address may only be
1639+
produced from an opaque pointer via a non-strict ``pointer_to_address`` at the
1640+
point of conversion.
1641+
1642+
Address-to-address casts, via ``unchecked_addr_cast``, transparently
1643+
forward their source's address root, just like typed projections.
1644+
1645+
Address-type basic block arguments can be conservatively considered
1646+
alias-introducing operations; they are uncommon enough not to
1647+
matter and may eventually be prohibited altogether.
1648+
1649+
Although some pointer-producing intrinsics exist, they do not need to be
1650+
considered alias-introducing exceptions to TBAA rules. ``Builtin.inttoptr``
1651+
produces a ``Builtin.RawPointer`` which is not interesting because by definition
1652+
it may alias with everything. Similarly, the LLVM builtins ``Builtin.bitcast``
1653+
and ``Builtin.trunc|sext|zextBitCast`` cannot produce typed pointers. These
1654+
pointer values must be converted to an address via ``pointer_to_address`` before
1655+
typed access can occur. Whether the ``pointer_to_address`` is strict determines
1656+
whether aliasing may occur.
1657+
1658+
Memory may be rebound to an unrelated type. Addresses to unrelated types may
1659+
alias as long as typed access only occurs while memory is bound to the relevant
1660+
type. Consequently, the optimizer cannot outright assume that addresses accessed
1661+
as unrelated types are nonaliasing. For example, pointer comparison cannot be
1662+
eliminated simply because the two addresses derived from those pointers are
1663+
accessed as unrelated types at different program points.
15831664

15841665
Value Dependence
15851666
----------------
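
The last paragraph added by this diff says that memory may be rebound to an unrelated type, and that typed access is only valid while the memory is bound to the relevant type. A minimal Swift-level sketch of that rule follows. It assumes the standard library's `bindMemory(to:capacity:)` is the user-facing way to bind memory; the commit itself only discusses SIL and does not show this code. `Element`, `S1` and `S2` mirror the structs in the diff, and `rebindDemo` is a hypothetical name used for illustration.

```swift
// Illustrative types mirroring the structs in the diff's SIL example.
struct Element { var i: Int }
struct S1 { var elt: Element }
struct S2 { var elt: Element }

// Hypothetical helper: binds one allocation to S1, then rebinds it to the
// unrelated type S2, performing typed access only while the matching
// binding is in effect.
func rebindDemo() -> (first: Int, second: Int, sameAddress: Bool) {
    // S1 and S2 have identical layout, so one allocation can hold either.
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: MemoryLayout<S1>.size,
        alignment: MemoryLayout<S1>.alignment)
    defer { raw.deallocate() }

    // Memory is bound to S1: typed access as S1 is valid here.
    let p1 = raw.bindMemory(to: S1.self, capacity: 1)
    p1.initialize(to: S1(elt: Element(i: 1)))
    let first = p1.pointee.elt.i
    p1.deinitialize(count: 1)

    // Rebind to S2: from this point on, only access as S2 is valid.
    let p2 = raw.bindMemory(to: S2.self, capacity: 1)
    p2.initialize(to: S2(elt: Element(i: 2)))
    let second = p2.pointee.elt.i
    p2.deinitialize(count: 1)

    // The two typed pointers compare equal even though they are accessed as
    // unrelated types at different program points, which is why such a
    // comparison cannot be folded away on type grounds alone.
    let sameAddress = UnsafeRawPointer(p1) == UnsafeRawPointer(p2)
    return (first, second, sameAddress)
}
```

Calling `rebindDemo()` reads `1` through the `S1` binding and `2` through the `S2` binding, and reports that both typed pointers refer to the same address.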
