@@ -8,7 +8,7 @@ Swift Intermediate Language (SIL)
 Abstract
 --------
 
-SIL is a SSA-form IR with high-level semantic information designed to implement
+SIL is an SSA-form IR with high-level semantic information designed to implement
 the Swift programming language. SIL accommodates the following use cases:
 
 - High-level optimization passes, including retain/release optimization,
@@ -23,6 +23,10 @@ the Swift programming language. SIL accommodates the following use cases:
 inlineable or generic code with Swift library modules, to be optimized into
 client binaries.
 
+In contrast to LLVM IR, SIL is a generally target-independent format
+representation that can be used for code distribution, but can also express
+target-specific concepts as well as Swift can.
+
 SIL in the Swift Compiler
 -------------------------
 
@@ -31,10 +35,16 @@ At a high level, the Swift compiler follows a strict pipeline architecture:
 - The *Parse* module constructs an AST from Swift source code.
 - The *Sema* module type-checks the AST and annotates it with type information.
 - The *SILGen* module generates "raw" SIL from an AST.
-- SIL *Passes* run over the raw SIL to emit diagnostics and apply optimizations
-to produce canonical SIL.
+- A series of *Guaranteed Optimization Passes* and *Diagnostic Passes* are run
+over the "raw" SIL both to perform optimizations and to emit
+language-specific diagnostics. These are always run, even at -O0, and produce
+"canonical" SIL.
+- General SIL *Optimization Passes* optionally run over the canonical SIL to
+improve performance of the resultant executable. These are enabled and
+controlled by the optimization level and are not run at -O0.
 - *IRGen* lowers optimized SIL to LLVM IR.
-- The LLVM backend applies LLVM optimizations and emits binary code.
+- The LLVM backend (optionally) applies LLVM optimizations, runs the LLVM code
+generator and emits binary code.
 
 The stages pertaining to SIL processing in particular are as follows:
 
@@ -50,7 +60,7 @@ emitted by SILGen has the following properties:
 represents variables as reference-counted "boxes" in the most general case,
 which can be retained, released, and shared.
 - Dataflow requirements, such as definitive assignment, function returns,
-switch coverage, etc. have not yet been enforced.
+switch coverage (TBD), etc. have not yet been enforced.
 - ``always_inline``, ``always_instantiate``, and other function optimization
 attributes have not yet been honored.
 
@@ -62,7 +72,9 @@ Guaranteed Optimization Passes
 
 After SILGen, a deterministic sequence of optimization passes is run over the
 raw SIL. These passes are more concerned with predictability and exposing
-dataflow for diagnostic passes than performance.
+dataflow for diagnostic passes than performance. Notably, we do not want the
+diagnostics produced by the compiler to change as the compiler evolves, so these
+passes are intended to be simple and predictable.
 
 - Memory promotion: this is implemented as two optimization phases, the first
 of which performs capture analysis to promote alloc_box instructions to
@@ -74,6 +86,7 @@
 - Always inline
 - Constant folding/guaranteed simplifications (including constant overflow
 warnings)
+- Basic ARC optimization for acceptable performance at -O0.
 
 Diagnostic Passes
 ~~~~~~~~~~~~~~~~~
@@ -91,11 +104,18 @@ TODO:
 - Dead code detection/elimination. Non-implicit dead code is an error.
 - Definitive assignment of local variables, and of instance variables in
 constructors.
-- Basic ARC optimization for decent performance at -O0.
 
 If the diagnostic passes all succeed, the final result is the *canonical SIL*
-for the program. Performance optimization, native code generation, and module
-distribution are derived from this form.
+for the program. Performance optimization and native code generation are
+derived from this form, and a module can be built from this (or a later) form.
+
+General Optimization Passes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+SIL captures language-specific type information, which makes it possible to
+perform high-level optimizations such as generics specialization that are
+difficult to perform on LLVM IR. The details of these have not been fully
+nailed down, but we expect this to be important.
 
 Syntax
 ------
@@ -131,7 +151,7 @@ Here is an example of a ``.sil`` file::
 %1 = function_ref @_TSsoi1pfTSdSd_Sd
 %2 = struct_extract %0 : $Point, #Point.x
 %3 = struct_extract %0 : $Point, #Point.y
-%4 = apply %1(%2 : $Double, %3 : $Double) : $(Double, Double) -> Double
+%4 = apply %1(%2, %3) : $(Double, Double) -> Double
 %5 = return %4 : Double
 }
 
@@ -152,7 +172,7 @@ type grammar. SIL adds some additional kinds of type of its own:
 
 Addresses of address-only types (see below) can only be used with
 instructions that manipulate their operands indirectly by address, such
-as ``copy_addr``, ``destroy_addr``, and ``dealloc_var``, or as arguments
+as ``copy_addr``, ``destroy_addr``, and ``dealloc_stack``, or as arguments
 to functions. For an address-only type ``T``, only the SIL address ``$*T``
 can be formed. ``$T`` for address-only ``T`` is an invalid SIL type.
 
@@ -251,7 +271,7 @@ is always followed by a ``:`` and the SIL type of the value. For example::
 %negate = builtin_function_ref #Builtin.neg_Int64
 %five = integer_literal 5 : $Builtin.Int64
 // Use the values as operands
-%neg_five = apply %negate(%five : $Builtin.Int64) : (Builtin.Int64) -> Builtin.Int64
+%neg_five = apply %negate(%five) : (Builtin.Int64) -> Builtin.Int64
 
 In SIL, a single instruction may produce multiple values. Operands that refer
 to multiple-value instructions choose the value by following the ``%name`` with
@@ -326,8 +346,8 @@ received from the function caller::
 sil @bar : $(Int, Int) -> () {
 bb0(%x : $Int, %y : $Int):
 %foo = function_ref @foo
-%1 = apply %foo(%x : $Int) : $(Int) -> Int
-%2 = apply %foo(%y : $Int) : $(Int) -> Int
+%1 = apply %foo(%x) : $(Int) -> Int
+%2 = apply %foo(%y) : $(Int) -> Int
 %3 = tuple ()
 %4 = return %3 : $()
 }
@@ -497,11 +517,10 @@ in the ``apply`` instructions used by callers::
 entry:
 ...
 %foo = function_ref @foo : $(x:Int, y:Int) -> ()
-%foo_result = apply %foo(%1 : $Int, %2 : $Int) : $(x:Int, y:Int) -> ()
+%foo_result = apply %foo(%1, %2) : $(x:Int, y:Int) -> ()
 ...
 %bar = function_ref @bar : $(x:Int, y:(Int, Int)) -> ()
-%bar_result = apply %bar(%4 : $Int, %5 : $Int, %6 : $Int) \
-: $(x:Int, y:(Int, Int)) -> ()
+%bar_result = apply %bar(%4, %5, %6) : $(x:Int, y:(Int, Int)) -> ()
 }
 
 Calling a function with trivial value types as inputs and outputs
@@ -514,7 +533,7 @@ simply passes the arguments by value. This Swift function::
 gets called in SIL as::
 
 %foo = constant_ref $(Int, Float) -> Char, @foo
-%z = apply %foo(%x, %y)
+%z = apply %foo(%x, %y) : $(Int, Float) -> Char
 
 Reference Counts
 ````````````````
@@ -526,15 +545,15 @@ type components each retained and released the same way. This Swift function::
 
 class A {}
 
-func bar(x:A) -> (Int, A)
+func bar(x:A) -> (Int, A) { ... }
 
 bar(x)
 
 gets called in SIL as::
 
 %bar = function_ref @bar : $(A) -> (Int, A)
 retain %x : $A
-%z = apply %bar(%x : $A) : $(A) -> (Int, A)
+%z = apply %bar(%x) : $(A) -> (Int, A)
 // ... use %z ...
 %z_1 = tuple_extract %z : $(Int, A), 1
 release %z_1
@@ -563,11 +582,11 @@ gets called in SIL as::
 %z = alloc_stack $A
 %x_arg = alloc_stack $A
 copy_addr %x : $*A to [initialize] %x_arg : $*A
-apply %bas(%z : $*A, %x_arg : $*A, %y : $Int) : $(A, Int) -> A
+apply %bas(%z, %x_arg, %y) : $(A, Int) -> A
 dealloc_stack %x_arg : $*A // callee consumes %x.arg, caller deallocs
 // ... use %z ...
 destroy_addr %z : $*A
-dealloc_var stack %z : $*A
+dealloc_stack %z : $*A
 
 The implementation of ``@bas`` is then responsible for consuming ``%x_arg`` and
 initializing ``%z``.
@@ -588,12 +607,11 @@ gets called in SIL as::
 %y_arg = alloc_stack $A
 copy_addr %y : $*A to [initialize] %y_arg : $*A
 %w_0_addr = element_addr %w : $*(A, Int), 0
-%w_0_arg = alloc_var stack $A
+%w_0_arg = alloc_stack $A
 copy_addr %w_0_addr : $*A to [initialize] %w_0_arg : $*A
 %w_1_addr = element_addr %w : $*(A, Int), 1
 %w_1 = load %w_1_addr : $*Int
-apply %zim(%x : $Int, %y_arg : $*A, %z : $Int, %w_0_arg : $A, %w_1 : $Int) \
-: $(x:Int, y:A, (z:Int, w:(A, Int))) -> ()
+apply %zim(%x, %y_arg, %z, %w_0_arg, %w_1) : $(x:Int, y:A, (z:Int, w:(A, Int))) -> ()
 dealloc_stack %w_0_arg
 dealloc_stack %y_arg
 
@@ -612,8 +630,7 @@ gets called in SIL as::
 %zang = function_ref @zang : $(x:Int, (y:Int, z:Int...), v:Int, w:Int...) -> ()
 %zs = <<make array from %z1, %z2>>
 %ws = <<make array from %w0, %w1, %w2>>
-apply %zang(%x : $Int, %y : $Int, %zs : $Int[], %v : $Int, %ws : $Int[]) \
-: $(x:Int, (y:Int, z:Int...), v:Int, w:Int...) -> ()
+apply %zang(%x, %y, %zs, %v, %ws) : $(x:Int, (y:Int, z:Int...), v:Int, w:Int...) -> ()
 
 Function Currying
 `````````````````
@@ -664,8 +681,8 @@ alloc_stack
 %1 = alloc_stack $T
 // %1 has type $*T
 
-Allocates enough uninitialized memory on the stack to contain a value of type
-``T``. The result of the instruction is the address
+Allocates uninitialized memory on the stack that is large enough and sufficiently
+aligned to contain a value of type ``T``. The result of the instruction is the address
 of the allocated memory. ``alloc_stack`` marks the start of the lifetime of
 the value; the allocation must be balanced with a ``dealloc_stack``
 instruction to mark the end of its lifetime. The memory is not retainable;
@@ -696,7 +713,8 @@ alloc_box
 // %1#1 has type $*T
 
 Allocates a reference-counted "box" on the heap large enough to hold a value of
-type ``T``. The result of the instruction is a two-value operand;
+type ``T``, along with a retain count and any other metadata required by the
+runtime. The result of the instruction is a two-value operand;
 the first value is the reference-counted ``ObjectPointer`` that owns the box,
 and the second value is the address of the value inside the box.
 
@@ -785,12 +803,13 @@ store
 `````
 ::
 
-store %0 : $T to %1 : $*T
+store %0 to %1 : $*T
 // $T must be a loadable type
 
-Stores the value ``%0`` to memory at address ``%1``. ``%0`` must be of a
-loadable type. This will overwrite the memory at ``%1``; ``%1`` must reference
-uninitialized or destroyed memory.
+Stores the value ``%0`` to memory at address ``%1``. The type of ``%1`` is ``*T``
+and the type of ``%0`` is ``T``, which must be a loadable type. This will
+overwrite the memory at ``%1``; ``%1`` must reference uninitialized or destroyed
+memory.
 
 initialize_var
 ``````````````
@@ -1245,7 +1264,7 @@ apply
 '(' (sil-operand (',' sil-operand)?)? ')'
 ':' sil-type
 
-%r = apply %0(%1 : $A, %2 : $B, ...) : $(A, B, ...) -> R
+%r = apply %0(%1, %2, ...) : $(A, B, ...) -> R
 // Note that the type of the callee '%0' is specified *after* the arguments
 // %0 must be of a concrete function type $(A, B, ...) -> R
 // %1, %2, etc. must be of the argument types $A, $B, etc.
@@ -1270,7 +1289,7 @@ partial_apply
 '(' (sil-operand (',' sil-operand)?)? ')'
 ':' sil-type
 
-%c = partial_apply %0(%1 : $A, %2 : $B, ...) : $[thin] (T..., A, B, ...) -> R
+%c = partial_apply %0(%1, %2, ...) : $[thin] (T..., A, B, ...) -> R
 // Note that the type of the callee '%0' is specified *after* the arguments
 // %0 must be of a thin concrete function type $[thin] (T..., A, B, ...) -> R
 // %1, %2, etc. must be of the argument types $A, $B, etc.,
@@ -1295,21 +1314,21 @@ clarity)::
 func @foo : $[thin] A -> B -> C -> D -> E {
 entry(%a : $A):
 %foo_1 = function_ref @foo_1 : $[thin] (B, A) -> C -> D -> E
-%thunk = partial_apply %foo_1(%a : $A) : $[thin] (B, A) -> C -> D -> E
+%thunk = partial_apply %foo_1(%a) : $[thin] (B, A) -> C -> D -> E
 return %thunk : $B -> C -> D -> E
 }
 
 func @foo_1 : $[thin] (B, A) -> C -> D -> E {
 entry(%b : $B, %a : $A):
 %foo_2 = function_ref @foo_2 : $[thin] (C, B, A) -> D -> E
-%thunk = partial_apply %foo_2(%b : $B, %a : $A) : $[thin] (C, B, A) -> D -> E
+%thunk = partial_apply %foo_2(%b, %a) : $[thin] (C, B, A) -> D -> E
 return %thunk : $(B, A) -> C -> D -> E
 }
 
 func @foo_2 : $[thin] (C, B, A) -> D -> E {
 entry(%c : $C, %b : $B, %a : $A):
 %foo_3 = function_ref @foo_3 : $[thin] (D, C, B, A) -> E
-%thunk = partial_apply %foo_3(%c : $C, %b : $B, %a : $A) : $[thin] (D, C, B, A) -> E
+%thunk = partial_apply %foo_3(%c, %b, %a) : $[thin] (D, C, B, A) -> E
 return %thunk : $(C, B, A) -> D -> E
 }
 
@@ -1339,11 +1358,11 @@ lowers to an uncurried entry point and is curried in the enclosing function::
 entry(%x : $Int):
 // Create the bar closure
 %bar_uncurried = function_ref @bar : $(Int, Int) -> Int
-%bar = partial_apply %bar_uncurried(%x : $Int) : $(Int, Int) -> Int
+%bar = partial_apply %bar_uncurried(%x) : $(Int, Int) -> Int
 
 // Apply it
 %1 = integer_literal $Int, 1
-%ret = apply %bar(%1 : $Int) : $(Int) -> Int
+%ret = apply %bar(%1) : $(Int) -> Int
 
 // Clean up
 release %bar : $(Int) -> Int
@@ -1676,7 +1695,7 @@ compiles to this SIL sequence::
 %bar = protocol_method %foo : $*Foo, #Foo.bar!1
 %foo_p = project_existential %foo : $*Foo
 %one_two_three = integer_literal $Int, 123
-%_ = apply %bar(%one_two_three : $Int, %foo_p : $Builtin.OpaquePointer) : $(Int, Builtin.OpaquePointer) -> ()
+%_ = apply %bar(%one_two_three, %foo_p) : $(Int, Builtin.OpaquePointer) -> ()
 
 It is undefined behavior if the result of ``project_existential`` is used as
 anything other than the "this" argument of an instance method reference
@@ -1739,7 +1758,7 @@ compiles to this SIL sequence::
 %bar = protocol_method %foo : $Foo, #Foo.bar!1
 %foo_p = project_existential_ref %foo : $Foo
 %one_two_three = integer_literal $Int, 123
-%_ = apply %bar(%one_two_three : $Int, %foo_p : $Builtin.ObjCPointer) : $(Int, Builtin.ObjCPointer) -> ()
+%_ = apply %bar(%one_two_three, %foo_p) : $(Int, Builtin.ObjCPointer) -> ()
 
 It is undefined behavior if the result of ``project_existential_ref`` is used
 as anything other than the "this" argument of an instance method reference
@@ -2222,7 +2241,7 @@ For example::
 %a = tuple_extract %ab : $(Int, Int), 0
 %b = tuple_extract %ab : $(Int, Int), 1
 %add = function_ref @add : $(Int, Int) -> Int
-%result = apply %add(%a : $Int, %b : $Int) : $(Int, Int) -> Int
+%result = apply %add(%a, %b) : $(Int, Int) -> Int
 return %result : $Int
 }
 