Region inference module.

+ # Terminology
+
+ Note that we use the terms region and lifetime interchangeably,
+ though the term `lifetime` is preferred.
+
# Introduction

Region inference uses a somewhat more involved algorithm than type
@@ -50,10 +55,7 @@ Variables and constraints are created using the following methods:
the greatest region that is smaller than both R_i and R_j

The actual region resolution algorithm is not entirely
- obvious, though it is also not overly complex. I'll explain
- the algorithm as it currently works, then explain a somewhat
- more complex variant that would probably scale better for
- large graphs (and possibly all graphs).
+ obvious, though it is also not overly complex.

## Snapshotting
@@ -68,10 +70,9 @@ is in progress, but only the root snapshot can "commit".

The constraint resolution algorithm is not super complex but also not
entirely obvious. Here I describe the problem somewhat abstractly,
- then describe how the current code works, and finally describe a
- better solution that is as of yet unimplemented. There may be other,
- smarter ways of doing this with which I am unfamiliar and can't be
- bothered to research at the moment. - NDM
+ then describe how the current code works. There may be other, smarter
+ ways of doing this with which I am unfamiliar and can't be bothered to
+ research at the moment. - NDM

## The problem
@@ -120,19 +121,254 @@ its value as the GLB of all its successors. Basically contracting
nodes ensure that there is overlap between their successors; we will
ultimately infer the largest overlap possible.
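To make the shape of that loop concrete, here is a minimal, self-contained sketch of the expansion/contraction fixed point. It is purely illustrative: regions are modeled as plain integers where a larger value stands for a larger region (so LUB is `max` and GLB is `min`), none of the names below come from this module, and the error detection that the real code performs is omitted.

    // Toy setup: a region is just an integer, where a larger value stands
    // for a larger region, so LUB is `max` and GLB is `min`.
    type Region = u32;

    fn lub(a: Region, b: Region) -> Region { a.max(b) }
    fn glb(a: Region, b: Region) -> Region { a.min(b) }

    #[derive(Clone, Copy)]
    enum Node { Var(usize), Concrete(Region) }

    /// A constraint `sub <= sup`: "sub must be a subregion of sup".
    struct Constraint { sub: Node, sup: Node }

    fn resolve(vars: &mut [Region], constraints: &[Constraint]) {
        // Expansion: grow each variable until it is at least as large as
        // everything constrained to lie below it (the LUB of its
        // predecessors). Every constraint is revisited on every pass.
        let mut changed = true;
        while changed {
            changed = false;
            for c in constraints {
                if let Node::Var(v) = c.sup {
                    let below = match c.sub {
                        Node::Var(u) => vars[u],
                        Node::Concrete(r) => r,
                    };
                    let grown = lub(vars[v], below);
                    if grown != vars[v] {
                        vars[v] = grown;
                        changed = true;
                    }
                }
            }
        }
        // Contraction: shrink each variable so that it fits inside
        // everything constrained to lie above it (the GLB of its
        // successors).
        changed = true;
        while changed {
            changed = false;
            for c in constraints {
                if let Node::Var(v) = c.sub {
                    let above = match c.sup {
                        Node::Var(u) => vars[u],
                        Node::Concrete(r) => r,
                    };
                    let shrunk = glb(vars[v], above);
                    if shrunk != vars[v] {
                        vars[v] = shrunk;
                        changed = true;
                    }
                }
            }
        }
    }

Note that each pass re-examines every constraint rather than keeping a worklist; the discussion of closure bounds further below explains why that full re-scan is in fact required.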
- ### A better algorithm
-
- Fixed-point iteration is not necessary. What we ought to do is first
+ # The Region Hierarchy
+
+ ## Without closures
+
+ Let's first consider the region hierarchy without thinking about
+ closures, because they add a lot of complications. The region
+ hierarchy *basically* mirrors the lexical structure of the code.
+ There is a region for every piece of 'evaluation' that occurs, meaning
+ every expression, block, and pattern (patterns are considered to
+ "execute" by testing the value they are applied to and creating any
+ relevant bindings). So, for example:
+
+     fn foo(x: int, y: int) { // -+
+     // +------------+ // |
+     // | +-----+ // |
+     // | +-+ +-+ +-+ // |
+     // | | | | | | | // |
+     // v v v v v v v // |
+         let z = x + y; // |
+         ... // |
+     } // -+
+
+     fn bar() { ... }
+
+ In this example, there is a region for the fn body block as a whole,
+ and then a subregion for the declaration of the local variable.
+ Within that, there are sublifetimes for the assignment pattern and
+ also the expression `x + y`. The expression itself has sublifetimes
+ for evaluating `x` and `y`.
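To make the hierarchy itself concrete, here is a small, self-contained model of it. The names (`ScopeId`, `ScopeTree`) are invented for illustration and are not this module's actual types; the point is only that every scope records the scope that encloses it, and "is a subregion of" amounts to an ancestor walk.

    use std::collections::HashMap;

    #[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
    struct ScopeId(u32);

    #[derive(Default)]
    struct ScopeTree {
        parent: HashMap<ScopeId, ScopeId>, // child scope -> enclosing scope
    }

    impl ScopeTree {
        fn record(&mut self, child: ScopeId, parent: ScopeId) {
            self.parent.insert(child, parent);
        }

        // `a` is a subregion of `b` if walking `a`'s enclosing scopes
        // eventually reaches `b`.
        fn is_subscope_of(&self, mut a: ScopeId, b: ScopeId) -> bool {
            while a != b {
                match self.parent.get(&a) {
                    Some(&p) => a = p,
                    None => return false, // hit the fn body without meeting `b`
                }
            }
            true
        }
    }

    fn main() {
        // Mirror the `fn foo` example: the fn body encloses the `let`, which
        // encloses the pattern `z` and the expression `x + y`, which in turn
        // encloses the reads of `x` and `y`.
        let (body, let_stmt, pat_z, add, read_x, read_y) =
            (ScopeId(0), ScopeId(1), ScopeId(2), ScopeId(3), ScopeId(4), ScopeId(5));
        let mut tree = ScopeTree::default();
        tree.record(let_stmt, body);
        tree.record(pat_z, let_stmt);
        tree.record(add, let_stmt);
        tree.record(read_x, add);
        tree.record(read_y, add);
        assert!(tree.is_subscope_of(read_x, body));
        assert!(!tree.is_subscope_of(body, add));
    }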
+
+ ## Function calls
+
+ Function calls are a bit tricky. I will describe how we handle them
+ *now* and then a bit about how we can improve them (Issue #6268).
+
+ Consider a function call like `func(expr1, expr2)`, where `func`,
+ `expr1`, and `expr2` are all arbitrary expressions. Currently,
+ we construct a region hierarchy like:
+
+      +----------------+
+      |                |
+      +--+ +---+  +---+|
+      v  v v   v  v   vv
+      func(expr1, expr2)
+
+ Here you can see that the call as a whole has a region and the
+ function plus arguments are subregions of that. As a side-effect of
+ this, we get a lot of spurious errors around nested calls, in
+ particular when combined with `&mut` functions. For example, a call
+ like this one
+
+     self.foo(self.bar())
+
+ where both `foo` and `bar` are `&mut self` functions will always yield
+ an error.
+
+ Here is a more involved example (which is safe) so we can see what's
+ going on:
+
+     struct Foo { f: uint, g: uint }
+     ...
+     fn add(p: &mut uint, v: uint) {
+         *p += v;
+     }
+     ...
+     fn inc(p: &mut uint) -> uint {
+         *p += 1; *p
+     }
+     fn weird() {
+         let mut x: ~Foo = ~Foo { ... };
+         'a: add(&mut (*x).f,
+                 'b: inc(&mut (*x).f)) // (*)
+     }
+
+ The important part is the line marked `(*)` which contains a call to
+ `add()`. The first argument is a mutable borrow of the field `f`. The
+ second argument also borrows the field `f`. Now, in the current borrow
+ checker, the first borrow is given the lifetime of the call to
+ `add()`, `'a`. The second borrow is given the lifetime `'b` of the
+ call to `inc()`. Because `'b` is considered to be a sublifetime of
+ `'a`, an error is reported since there are two co-existing mutable
+ borrows of the same data.
+
+ However, if we examine the lifetimes a bit more carefully, we can see
+ that this error is unnecessary. Let's examine the lifetimes involved
+ with `'a` in detail. We'll break apart all the steps involved in a
+ call expression:
+
+     'a: {
+          'a_arg1: let a_temp1: ... = add;
+          'a_arg2: let a_temp2: &'a mut uint = &'a mut (*x).f;
+          'a_arg3: let a_temp3: uint = {
+              let b_temp1: ... = inc;
+              let b_temp2: &'b mut uint = &'b mut (*x).f;
+              'b_call: b_temp1(b_temp2)
+          };
+          'a_call: a_temp1(a_temp2, a_temp3) // (**)
+     }
+
+ Here we see that the lifetime `'a` includes a number of substatements.
+ In particular, there is this lifetime I've called `'a_call` that
+ corresponds to the *actual execution of the function `add()`*, after
+ all arguments have been evaluated. There is a corresponding lifetime
+ `'b_call` for the execution of `inc()`. If we wanted to be precise
+ about it, the lifetime of the two borrows should be `'a_call` and
+ `'b_call` respectively, since the borrowed pointers that were created
+ will not be dereferenced except during the execution itself.
+
+ However, this model by itself is not sound. The reason is that
+ while the two borrowed pointers that are created will never be used
+ simultaneously, it is still true that the first borrowed pointer is
+ *created* before the second argument is evaluated, and so even though
+ it will not be *dereferenced* during the evaluation of the second
+ argument, it can still be *invalidated* by that evaluation. Consider
+ this similar but unsound example:
+
+     struct Foo { f: uint, g: uint }
+     ...
+     fn add(p: &mut uint, v: uint) {
+         *p += v;
+     }
+     ...
+     fn consume(x: ~Foo) -> uint {
+         x.f + x.g
+     }
+     fn weird() {
+         let mut x: ~Foo = ~Foo { ... };
+         'a: add(&mut (*x).f, consume(x)) // (*)
+     }
+
+ In this case, the second argument to `add` actually consumes `x`, thus
+ invalidating the first argument.
+
+ So, for now, we exclude the `call` lifetimes from our model.
+ Eventually I would like to include them, but we will have to make the
+ borrow checker handle this situation correctly. In particular, if
+ there is a borrowed pointer created whose lifetime does not enclose
+ the borrow expression, we must issue sufficient restrictions to ensure
+ that the pointee remains valid.
+
+ ## Adding closures
+
+ The other significant complication to the region hierarchy is
+ closures. I will describe here how closures should work, though some
+ of the work to implement this model is ongoing at the time of this
+ writing.
+
+ The bodies of closures are type-checked along with the function that
+ creates them. However, unlike other expressions that appear within the
+ function body, it is not entirely obvious when a closure body executes
+ with respect to the other expressions. This is because the closure
+ body will execute whenever the closure is called; however, we can
+ never know precisely when the closure will be called, especially
+ without some sort of alias analysis.
+
+ However, we can place some sort of limits on when the closure
+ executes. In particular, the type of every closure `fn:'r K` includes
+ a region bound `'r`. This bound indicates the maximum lifetime of that
+ closure; once we exit that region, the closure cannot be called
+ anymore. Therefore, we say that the lifetime of the closure body is a
+ sublifetime of the closure bound, but the closure body itself is
+ unordered with respect to other parts of the code.
+
+ For example, consider the following fragment of code:
+
+     'a: {
+          let closure: fn:'a() = || 'b: {
+              'c: ...
+          };
+          'd: ...
+     }
+
+ Here we have four lifetimes, `'a`, `'b`, `'c`, and `'d`. The closure
+ `closure` is bounded by the lifetime `'a`. The lifetime `'b` is the
+ lifetime of the closure body, and `'c` is some statement within the
+ closure body. Finally, `'d` is a statement within the outer block that
+ created the closure.
+
+ We can say that the closure body `'b` is a sublifetime of `'a` due to
+ the closure bound. By the usual lexical scoping conventions, the
+ statement `'c` is clearly a sublifetime of `'b`, and `'d` is a
+ sublifetime of `'a`. However, there is no ordering between `'c` and
+ `'d` per se (this kind of ordering between statements is actually only
+ an issue for dataflow; passes like the borrow checker must assume that
+ closures could execute at any time from the moment they are created
+ until they go out of scope).
+
+ ### Complications due to closure bound inference
+
+ There is only one problem with the above model: in general, we do not
+ actually *know* the closure bounds during region inference! In fact,
+ closure bounds are almost always region variables! This is very tricky
+ because the inference system implicitly assumes that we can do things
+ like compute the LUB of two scoped lifetimes without needing to know
+ the values of any variables.
+
+ Here is an example to illustrate the problem:
+
+     fn identity<T>(x: T) -> T { x }
+
+     fn foo() { // 'foo is the function body
+         'a: {
+              let closure = identity(|| 'b: {
+                  'c: ...
+              });
+              'd: closure();
+         }
+         'e: ...;
+     }
+
+ In this example, the closure bound is not explicit. At compile time,
+ we will create a region variable (let's call it `V0`) to represent the
+ closure bound.
+
+ The primary difficulty arises during the constraint propagation phase.
+ Imagine there is some variable with incoming edges from `'c` and `'d`.
+ This means that the value of the variable must be `LUB('c,
+ 'd)`. However, without knowing what the closure bound `V0` is, we
+ can't compute the LUB of `'c` and `'d`! And we don't know the closure
+ bound until inference is done.
+
+ The solution is to rely on the fixed point nature of inference.
+ Basically, when we must compute `LUB('c, 'd)`, we just use the current
+ value for `V0` as the closure's bound. If `V0`'s binding should
+ change, then we will do another round of inference, and the result of
+ `LUB('c, 'd)` will change.
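Here is a toy, self-contained illustration of that trick. Everything in it (the `Scope`, `Parent`, and `Hierarchy` names, the tree encoding, LUB computed as a nearest common ancestor) is invented for the example and is much simpler than the real inference code; the only point is the `ClosureBound` arm, which substitutes the variable's current binding.

    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    struct Scope(usize);

    #[derive(Clone, Copy)]
    enum Parent {
        Scope(Scope),        // ordinary lexical nesting
        ClosureBound(usize), // closure body: its parent is the bound, a region *variable*
        Root,
    }

    struct Hierarchy {
        parents: Vec<Parent>, // indexed by Scope
    }

    impl Hierarchy {
        // Walk from `s` toward the root, substituting the *current* value of
        // any closure-bound variable we pass through.
        fn ancestors(&self, mut s: Scope, current: &[Scope]) -> Vec<Scope> {
            let mut out = vec![s];
            loop {
                s = match self.parents[s.0] {
                    Parent::Scope(p) => p,
                    Parent::ClosureBound(vid) => current[vid],
                    Parent::Root => return out,
                };
                out.push(s);
            }
        }

        // LUB of two scopes = nearest common ancestor. If the path crosses a
        // closure body, we just use the bound variable's current binding; if
        // that binding later changes, another round of iteration recomputes
        // this LUB and may produce a different (larger) answer.
        fn lub(&self, a: Scope, b: Scope, current: &[Scope]) -> Scope {
            let up_a = self.ancestors(a, current);
            let up_b = self.ancestors(b, current);
            up_a.into_iter()
                .find(|s| up_b.contains(s))
                .expect("all scopes share the fn body as an ancestor")
        }
    }

    fn main() {
        // Scopes from the example above: 0 = 'foo (the fn body), 1 = 'a,
        // 2 = 'b (the closure body), 3 = 'c, 4 = 'd. V0 is variable 0.
        let h = Hierarchy {
            parents: vec![
                Parent::Root,            // 'foo
                Parent::Scope(Scope(0)), // 'a is inside 'foo
                Parent::ClosureBound(0), // 'b hangs off the closure bound V0
                Parent::Scope(Scope(2)), // 'c is inside 'b
                Parent::Scope(Scope(1)), // 'd is inside 'a
            ],
        };
        // With V0 currently bound to 'a, LUB('c, 'd) comes out as 'a. If a
        // later round of inference grows V0, the same LUB is simply
        // recomputed with the new binding.
        let current = vec![Scope(1)]; // current value of V0
        println!("LUB('c, 'd) = {:?}", h.lub(Scope(3), Scope(4), &current));
    }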
+
+ One minor implication of this is that the graph does not in fact track
+ the full set of dependencies between edges. We cannot easily know
+ whether the result of a LUB computation will change, since there may
+ be indirect dependencies on other variables that are not reflected on
+ the graph. Therefore, we must *always* iterate over all edges when
+ doing the fixed point calculation, not just those adjacent to nodes
+ whose values have changed.
+
+ Were it not for this requirement, we could in fact avoid fixed-point
+ iteration altogether. In that universe, we could instead first
identify and remove strongly connected components (SCC) in the graph.
Note that such components must consist solely of region variables; all
of these variables can effectively be unified into a single variable.
-
- Once SCCs are removed, we are left with a DAG. At this point, we can
- walk the DAG in toplogical order once to compute the expanding nodes,
- and again in reverse topological order to compute the contracting
- nodes. The main reason I did not write it this way is that I did not
- feel like implementing the SCC and toplogical sort algorithms at the
- moment.
+ Once SCCs are removed, we are left with a DAG. At this point, we
+ could walk the DAG in topological order once to compute the expanding
+ nodes, and again in reverse topological order to compute the
+ contracting nodes. However, as I said, this does not work given the
+ current treatment of closure bounds, but perhaps in the future we can
+ address this problem somehow and make region inference somewhat more
+ efficient. Note that this is solely a matter of performance, not
+ expressiveness.
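For concreteness, here is a rough sketch of what those two passes might look like once the SCCs have been collapsed into single nodes. It is a toy only: regions are plain integers (LUB is `max`, GLB is `min`), the expanding/contracting classification of each node is taken as given, and the quadratic edge scans stand in for proper adjacency lists.

    // `edges` holds `(pred, succ)` pairs meaning `pred <= succ`, and
    // `expanding[v]` says whether node `v` grows from its predecessors or
    // shrinks toward its successors.
    fn resolve_dag(
        n: usize,
        edges: &[(usize, usize)],
        expanding: &[bool],
        values: &mut [u32],
    ) {
        // Topological order of the DAG via Kahn's algorithm.
        let mut indegree = vec![0usize; n];
        for &(_, succ) in edges {
            indegree[succ] += 1;
        }
        let mut ready: Vec<usize> = (0..n).filter(|&v| indegree[v] == 0).collect();
        let mut order = Vec::with_capacity(n);
        while let Some(v) = ready.pop() {
            order.push(v);
            for &(pred, succ) in edges {
                if pred == v {
                    indegree[succ] -= 1;
                    if indegree[succ] == 0 {
                        ready.push(succ);
                    }
                }
            }
        }

        // One forward pass: an expanding node becomes the LUB of its
        // predecessors, which were visited earlier in the order.
        for &v in &order {
            if expanding[v] {
                for &(pred, succ) in edges {
                    if succ == v {
                        values[v] = values[v].max(values[pred]);
                    }
                }
            }
        }

        // One reverse pass: a contracting node becomes the GLB of its
        // successors, which were visited earlier in the reverse order.
        for &v in order.iter().rev() {
            if !expanding[v] {
                for &(pred, succ) in edges {
                    if pred == v {
                        values[v] = values[v].min(values[succ]);
                    }
                }
            }
        }
    }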
# Skolemization and functions