
Commit 20344d3

[mlir] Add a document detailing the design of the SymbolTable.
Summary: This document provides insight on the rationale and the design of Symbols in MLIR, and why they are necessary. Differential Revision: https://reviews.llvm.org/D73590
1 parent eeb6394 commit 20344d3

3 files changed, +222 -9 lines changed


mlir/docs/LangRef.md

Lines changed: 4 additions & 3 deletions
@@ -1457,15 +1457,16 @@ This attribute can only be held internally by
14571457
[array attributes](#array-attribute) and
14581458
[dictionary attributes](#dictionary-attribute) (including the top-level operation
14591459
attribute dictionary), i.e. no other attribute kinds such as Locations or
1460-
extended attribute kinds. If a reference to a symbol is necessary from outside
1461-
of the symbol table that the symbol is defined in, a
1462-
[string attribute](#string-attribute) can be used to refer to the symbol name.
1460+
extended attribute kinds.
14631461

14641462
**Rationale:** Given that MLIR models global accesses with symbol references, to
14651463
enable efficient multi-threading, it becomes difficult to effectively reason
14661464
about their uses. By restricting the places that can legally hold a symbol
14671465
reference, we can always opaquely reason about a symbol's usage characteristics.
14681466

1467+
See [`Symbols And SymbolTables`](SymbolsAndSymbolTables.md) for more
1468+
information.
1469+
14691470
#### Type Attribute
14701471

14711472
Syntax:

mlir/docs/SymbolsAndSymbolTables.md

Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
1+
# Symbols and Symbol Tables
2+
3+
[TOC]
4+
5+
MLIR is a multi-level representation; with [Regions](LangRef.md#regions), the
6+
multi-level aspect is structural in the IR. A lot of infrastructure within the
7+
compiler is built around this nesting structure, including the processing of
8+
operations within the [pass manager](WritingAPass.md#pass-manager). One
9+
advantage of the MLIR design is that it is able to process operations in
10+
parallel, utilizing multiple threads. This is possible due to a property of the
11+
IR known as [`IsolatedFromAbove`](Traits.md#isolatedfromabove).
12+
13+
Without this property, any operation could affect or mutate the use-list of
14+
operations defined above. Making this thread-safe requires expensive locking in
15+
some of the core IR data structures, which becomes quite inefficient. To enable
16+
multi-threaded compilation without this locking, MLIR uses local pools for
17+
constant values as well as `Symbol` accesses for global values and variables.
18+
This document details the design of `Symbol`s, what they are and how they fit
19+
into the system.
20+
21+
The `Symbol` infrastructure essentially provides a non-SSA mechanism with which to
22+
refer to an operation symbolically with a name. This allows for referring to
23+
operations defined above regions that were defined as `IsolatedFromAbove` in a
24+
safe way. It also allows for symbolically referencing operations defined below
25+
other regions as well.
26+
27+
## Symbol
28+
29+
A `Symbol` is a named operation that resides immediately within a region that
30+
defines a [`SymbolTable`](#symbol-table). The name of a symbol *must* be unique
31+
within the parent `SymbolTable`. This name is semantically similar to an SSA
32+
result value, and may be referred to by other operations to provide a symbolic
33+
link, or use, to the symbol. An example of a `Symbol` operation is
34+
[`func`](LangRef.md#functions). `func` defines a symbol name, which is
35+
[referred to](#referencing-a-symbol) by operations like
36+
[`std.call`](Dialects/Standard.md#call).
37+
38+
### Defining a Symbol
39+
40+
A `Symbol` operation may use the `OpTrait::Symbol` trait, but must have the following
41+
properties:
42+
43+
* A `StringAttr` attribute named
44+
`SymbolTable::getSymbolAttrName()` (`sym_name`).
45+
- This attribute defines the symbolic 'name' of the operation.
46+
* An optional `StringAttr` attribute named
47+
`SymbolTable::getVisibilityAttrName()` (`sym_visibility`).
48+
- This attribute defines the [visibility](#symbol-visibility) of the
49+
symbol, or more specifically in which scopes it may be accessed.
50+
* No SSA results
51+
- Intermixing the different ways to `use` an operation quickly becomes
52+
unwieldy and difficult to analyze.
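The three properties above can be sketched as a small check. This is a toy model in Python, not MLIR's verifier: `verify_symbol` and its arguments are hypothetical, and the accepted visibility strings are assumed from the Symbol Visibility section (including an explicitly spelled `"public"`).

```python
# Assumed spellings of the three visibilities described in this document.
VISIBILITIES = {"public", "private", "nested"}


def verify_symbol(attrs, num_results):
    """Return a list of problems; an empty list means the checks passed.

    `attrs` is a plain dict standing in for the operation's attribute
    dictionary, and `num_results` is the number of SSA results.
    """
    errors = []
    # A string attribute named `sym_name` is required.
    if not isinstance(attrs.get("sym_name"), str):
        errors.append("requires string attribute 'sym_name'")
    # `sym_visibility` is optional, but must be a known value when present.
    visibility = attrs.get("sym_visibility")
    if visibility is not None and visibility not in VISIBILITIES:
        errors.append("'sym_visibility' must be one of public, private, nested")
    # Symbol operations must not define SSA results.
    if num_results != 0:
        errors.append("symbol operations must not produce SSA results")
    return errors
```

Calling `verify_symbol({"sym_name": "f"}, 0)` returns an empty list, while an operation with no name, an unknown visibility, and one result reports all three problems.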
53+
54+
## Symbol Table
55+
56+
Described above are `Symbol`s, which reside within a region of an operation
57+
defining a `SymbolTable`. A `SymbolTable` operation provides the container for
58+
the [`Symbol`](#symbol) operations. It verifies that all `Symbol` operations
59+
have a unique name, and provides facilities for looking up symbols by name.
60+
Operations defining a `SymbolTable` may use the `OpTrait::SymbolTable` trait.
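As a rough illustration of the two responsibilities just described (unique names and lookup by name), here is a toy table in Python. The class and method names are hypothetical and do not mirror the C++ `SymbolTable` API; operations are represented by plain strings.

```python
class ToySymbolTable:
    """Toy stand-in for an operation that defines a symbol table."""

    def __init__(self):
        self._symbols = {}  # symbol name -> operation (any Python object here)

    def insert(self, name, op):
        # Mirror the uniqueness rule: a name may be defined only once per table.
        if name in self._symbols:
            raise ValueError(f"redefinition of symbol '{name}'")
        self._symbols[name] = op

    def lookup(self, name):
        # Return the operation for `name`, or None if it is not defined here.
        return self._symbols.get(name)


table = ToySymbolTable()
table.insert("symbol", "func @symbol")
table.insert("other_symbol", "func @other_symbol")
```

Inserting a second operation named `symbol` into `table` raises `ValueError`, which plays the role of the verification failure in this sketch.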
61+
62+
### Referencing a Symbol
63+
64+
`Symbol`s are referenced symbolically by name via the
65+
[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) attribute. A symbol
66+
reference attribute contains a named reference to an operation that is nested
67+
within a symbol table. It may optionally contain a set of nested references that
68+
further resolve to a symbol nested within a different symbol table. When
69+
resolving a nested reference, each non-leaf reference must refer to a symbol
70+
operation that is also a [symbol table](#symbol-table).
71+
72+
Below is an example of how an operation may reference a symbol operation:
73+
74+
```mlir
75+
// This `func` operation defines a symbol named `symbol`.
76+
func @symbol()
77+
78+
// Our `foo.user` operation contains a SymbolRefAttr with the name of the
79+
// `symbol` func.
80+
"foo.user"() {uses = [@symbol]} : () -> ()
81+
82+
// Symbol references resolve to the nearest parent operation that defines a
83+
// symbol table, so we can have references with arbitrary nesting levels.
84+
func @other_symbol() {
85+
affine.for %i0 = 0 to 10 {
86+
// Our `foo.user` operation resolves to the same `symbol` func as defined
87+
// above.
88+
"foo.user"() {uses = [@symbol]} : () -> ()
89+
}
90+
return
91+
}
92+
93+
// Here we define a nested symbol table. References within this operation will
94+
// not resolve to any symbols defined above.
95+
module {
96+
// Error. We resolve references with respect to the closest parent symbol
97+
// table, so this reference can't be resolved.
98+
"foo.user"() {uses = [@symbol]} : () -> ()
99+
}
100+
101+
// Here we define another nested symbol table, except this time it also defines
102+
// a symbol.
103+
module @module_symbol {
104+
// This `func` operation defines a symbol named `nested_symbol`.
105+
func @nested_symbol()
106+
}
107+
108+
// Our `foo.user` operation may refer to the nested symbol, by resolving through
109+
// the parent.
110+
"foo.user"() {uses = [@module_symbol::@nested_symbol]} : () -> ()
111+
```
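The resolution rules illustrated in the example can be sketched with a toy model in Python. Everything here (`Op`, `nearest_symbol_table`, `resolve`) is hypothetical and only approximates the behaviour described in this section: start at the nearest parent symbol table, look up the first name among its immediate children, and require every non-leaf name to be a symbol table itself.

```python
class Op:
    """Toy operation: an optional symbol name, nested ops, and a table flag."""

    def __init__(self, kind, sym_name=None, is_symbol_table=False):
        self.kind = kind
        self.sym_name = sym_name
        self.is_symbol_table = is_symbol_table
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def nearest_symbol_table(op):
    # Walk up the parent chain until an op that defines a symbol table is found.
    current = op.parent
    while current is not None and not current.is_symbol_table:
        current = current.parent
    return current


def lookup_in(table, name):
    # Symbols live immediately inside the table's region, so only direct
    # children are candidates.
    for child in table.children:
        if child.sym_name == name:
            return child
    return None


def resolve(anchor, path):
    """Resolve a reference such as ["module_symbol", "nested_symbol"]."""
    table = nearest_symbol_table(anchor)
    if table is None:
        return None
    found = lookup_in(table, path[0])
    for name in path[1:]:
        # Every non-leaf component must itself be a symbol table.
        if found is None or not found.is_symbol_table:
            return None
        found = lookup_in(found, name)
    return found


# Rebuild the shape of the IR from the example.
top = Op("module", is_symbol_table=True)
symbol = top.add(Op("func", sym_name="symbol"))
other = top.add(Op("func", sym_name="other_symbol"))
inner_user = other.add(Op("affine.for")).add(Op("foo.user"))
isolated_user = top.add(Op("module", is_symbol_table=True)).add(Op("foo.user"))
named = top.add(Op("module", sym_name="module_symbol", is_symbol_table=True))
nested = named.add(Op("func", sym_name="nested_symbol"))
outer_user = top.add(Op("foo.user"))
```

In this model `resolve(inner_user, ["symbol"])` finds the top-level function, `resolve(isolated_user, ["symbol"])` returns `None` because the anonymous module is the nearest table, and `resolve(outer_user, ["module_symbol", "nested_symbol"])` reaches the nested function through its parent.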
112+
113+
Using an attribute, as opposed to an SSA value, has several benefits:
114+
115+
* References may appear in more places than the operand list; including
116+
[nested attribute dictionaries](LangRef.md#dictionary-attribute),
117+
[array attributes](LangRef.md#array-attribute), etc.
118+
119+
* Handling of SSA dominance remains unchanged.
120+
121+
- If we were to use SSA values, we would need to create some mechanism by
122+
which to opt out of certain SSA properties, such as dominance.
123+
Attributes allow for referencing the operations regardless of the
124+
order in which they were defined.
125+
- Attributes simplify referencing operations within nested symbol tables,
126+
which are traditionally not visible outside of the parent region.
127+
128+
The impact of this choice to use attributes as opposed to SSA values is that we
129+
now have two mechanisms with which to reference operations. This means that some dialects
130+
must either support both `SymbolRefs` and SSA value references, or provide
131+
operations that materialize SSA values from a symbol reference. Each has
132+
different trade-offs depending on the situation. A function call may directly
133+
use a `SymbolRef` as the callee, whereas a reference to a global variable might
134+
use a materialization operation so that the variable can be used in other
135+
operations like `std.addi`.
136+
[`llvm.mlir.addressof`](Dialects/LLVM.md#llvmmliraddressof) is one example of
137+
such an operation.
138+
139+
See the `LangRef` definition of the
140+
[`SymbolRefAttr`](LangRef.md#symbol-reference-attribute) for more information
141+
about the structure of this attribute.
142+
143+
### Manipulating a Symbol
144+
145+
As described above, `SymbolRefs` act as an auxiliary way of defining uses of
146+
operations, alongside the traditional SSA use-list. As such, it is imperative to provide
147+
similar functionality to manipulate and inspect the list of uses and the users.
148+
The following are a few of the utilities provided by the `SymbolTable`:
149+
150+
* `SymbolTable::getSymbolUses`
151+
152+
- Access an iterator range over all of the uses on and nested within a
153+
particular operation.
154+
155+
* `SymbolTable::symbolKnownUseEmpty`
156+
157+
- Check if a particular symbol is known to be unused within a specific
158+
section of the IR.
159+
160+
* `SymbolTable::replaceAllSymbolUses`
161+
162+
- Replace all of the uses of one symbol with a new one within a specific
163+
section of the IR.
164+
165+
* `SymbolTable::lookupNearestSymbolFrom`
166+
167+
- Look up the definition of a symbol in the nearest symbol table from some
168+
anchor operation.
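To make the intent of the use-inspection utilities concrete, here is a toy model in Python. The function names echo the utilities listed above, but the signatures and the `Op` class are hypothetical, and references are simplified to flat symbol names (an assumption: no nested references and no scoping between tables).

```python
class Op:
    """Toy operation holding symbol names in `refs` and nested operations."""

    def __init__(self, kind, refs=(), children=()):
        self.kind = kind
        self.refs = list(refs)          # symbol names held in attributes
        self.children = list(children)  # nested operations


def walk(op):
    # Yield `op` and every operation nested within it, pre-order.
    yield op
    for child in op.children:
        yield from walk(child)


def get_symbol_uses(from_op):
    # All (user, symbol name) pairs on and nested within `from_op`.
    return [(user, ref) for user in walk(from_op) for ref in user.refs]


def symbol_known_use_empty(symbol, from_op):
    # True if no operation on or within `from_op` refers to `symbol`.
    return all(ref != symbol for _, ref in get_symbol_uses(from_op))


def replace_all_symbol_uses(old, new, from_op):
    # Rewrite references only within the section of IR rooted at `from_op`.
    for user in walk(from_op):
        user.refs = [new if ref == old else ref for ref in user.refs]


user_a = Op("foo.user", refs=["symbol"])
user_b = Op("foo.user", refs=["symbol", "other"])
func = Op("func", children=[Op("affine.for", children=[user_b])])
module = Op("module", children=[user_a, func])

# Rename uses of `symbol`, but only inside `func`.
replace_all_symbol_uses("symbol", "renamed", func)
```

After the replacement, `symbol` has no uses within `func`, while the use held by `user_a` elsewhere in `module` is untouched. This is why each utility is scoped to a specific section of the IR.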
169+
170+
## Symbol Visibility
171+
172+
Along with a name, a `Symbol` also has a `visibility` attached to it. The
173+
`visibility` of a symbol defines its structural reachability within the IR. A
174+
symbol may have one of the following visibilities:
175+
176+
* Public
177+
178+
- The symbol may be referenced from outside of the visible IR. We cannot
179+
assume that all of the uses of this symbol are observable.
180+
181+
* Private
182+
183+
- The symbol may only be referenced from within the current symbol table.
184+
185+
* Nested
186+
187+
- The symbol may be referenced by operations outside of the current symbol
188+
table, but not outside of the visible IR, as long as each symbol table
189+
parent also defines a non-private symbol.
190+
191+
A few examples of what this looks like in the IR are shown below:
192+
193+
```mlir
194+
module @public_module {
195+
// This function can be accessed by 'live.user', but cannot be referenced
196+
// externally; all uses are known to reside within parent regions.
197+
func @nested_function() attributes { sym_visibility = "nested" }
198+
199+
// This function cannot be accessed outside of 'public_module'
200+
func @private_function() attributes { sym_visibility = "private" }
201+
}
202+
203+
// This function can only be accessed from within the top-level module
204+
func @private_function() attributes { sym_visibility = "private" }
205+
206+
// This function may be referenced externally
207+
func @public_function()
208+
209+
"live.user"() {uses = [
210+
@public_module::@nested_function,
211+
@private_function,
212+
@public_function
213+
]} : () -> ()
214+
```
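One simplified reading of these visibility rules can be written as a small predicate. This is a sketch in Python under stated assumptions, not MLIR's implementation: `reference_allowed` and `externally_visible` are hypothetical helpers, and the rule encoded is that the leading symbol of a reference is looked up in the user's own symbol table, while every later component is reached from outside the table that defines it and so must not be private.

```python
def reference_allowed(visibilities):
    """Check a reference given the visibility of each symbol it names.

    `visibilities` lists the visibility of every component of the reference,
    outermost first, e.g. ["public", "nested"] for
    @public_module::@nested_function.
    """
    # The first component lives in the user's own table, so any visibility
    # is reachable. Later components are accessed from outside their table.
    return all(v != "private" for v in visibilities[1:])


def externally_visible(visibility):
    # Only public symbols may have uses outside of the visible IR.
    return visibility == "public"


# The three references held by 'live.user' in the example.
nested_ref = reference_allowed(["public", "nested"])   # @public_module::@nested_function
private_ref = reference_allowed(["private"])           # @private_function
public_ref = reference_allowed(["public"])             # @public_function

# A hypothetical reference that is not in the example:
# @public_module::@private_function, reaching into another table.
rejected_ref = reference_allowed(["public", "private"])
```

Under this reading the three references in the example are accepted, and the hypothetical reference into `public_module`'s private function is rejected.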

mlir/docs/Traits.md

Lines changed: 4 additions & 6 deletions
@@ -226,17 +226,15 @@ single block that must terminate with `TerminatorOpType`.
226226

227227
* `OpTrait::Symbol` -- `Symbol`
228228

229-
This trait is used for operations that define a `Symbol`.
230-
231-
TODO(riverriddle) Link to the proper document detailing the design of symbols.
229+
This trait is used for operations that define a
230+
[`Symbol`](SymbolsAndSymbolTables.md#symbol).
232231

233232
### SymbolTable
234233

235234
* `OpTrait::SymbolTable` -- `SymbolTable`
236235

237-
This trait is used for operations that define a `SymbolTable`.
238-
239-
TODO(riverriddle) Link to the proper document detailing the design of symbols.
236+
This trait is used for operations that define a
237+
[`SymbolTable`](SymbolsAndSymbolTables.md#symbol-table).
240238

241239
### Terminator
242240
