Skip to content

Commit 36dbd8c

Browse files
committed
---
yaml --- r: 195534 b: refs/heads/master c: a53a22e h: refs/heads/master v: v3
1 parent 7341380 commit 36dbd8c

File tree

4 files changed

+111
-141
lines changed

4 files changed

+111
-141
lines changed

[refs]

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
refs/heads/master: bc959fdb28c0b1456e9db431f68570a552a8ea50
2+
refs/heads/master: a53a22eab94f31df31cb936755c9c47784771759
33
refs/heads/snap-stage1: e33de59e47c5076a89eadeb38f4934f58a3618a6
44
refs/heads/snap-stage3: b3317d68910900f135f9f38e43a7a699bc736b4a
55
refs/heads/try: 961e0358e1a5c0faaef606e31e9965742c1643bf

trunk/src/librustc/middle/infer/region_inference/README.md

Lines changed: 55 additions & 108 deletions
Original file line numberDiff line numberDiff line change
@@ -249,114 +249,61 @@ there is a reference created whose lifetime does not enclose
249249
the borrow expression, we must issue sufficient restrictions to ensure
250250
that the pointee remains valid.
251251

252-
## Adding closures
253-
254-
The other significant complication to the region hierarchy is
255-
closures. I will describe here how closures should work, though some
256-
of the work to implement this model is ongoing at the time of this
257-
writing.
258-
259-
The body of closures are type-checked along with the function that
260-
creates them. However, unlike other expressions that appear within the
261-
function body, it is not entirely obvious when a closure body executes
262-
with respect to the other expressions. This is because the closure
263-
body will execute whenever the closure is called; however, we can
264-
never know precisely when the closure will be called, especially
265-
without some sort of alias analysis.
266-
267-
However, we can place some sort of limits on when the closure
268-
executes. In particular, the type of every closure `fn:'r K` includes
269-
a region bound `'r`. This bound indicates the maximum lifetime of that
270-
closure; once we exit that region, the closure cannot be called
271-
anymore. Therefore, we say that the lifetime of the closure body is a
272-
sublifetime of the closure bound, but the closure body itself is unordered
273-
with respect to other parts of the code.
274-
275-
For example, consider the following fragment of code:
276-
277-
'a: {
278-
let closure: fn:'a() = || 'b: {
279-
'c: ...
280-
};
281-
'd: ...
282-
}
283-
284-
Here we have four lifetimes, `'a`, `'b`, `'c`, and `'d`. The closure
285-
`closure` is bounded by the lifetime `'a`. The lifetime `'b` is the
286-
lifetime of the closure body, and `'c` is some statement within the
287-
closure body. Finally, `'d` is a statement within the outer block that
288-
created the closure.
289-
290-
We can say that the closure body `'b` is a sublifetime of `'a` due to
291-
the closure bound. By the usual lexical scoping conventions, the
292-
statement `'c` is clearly a sublifetime of `'b`, and `'d` is a
293-
sublifetime of `'d`. However, there is no ordering between `'c` and
294-
`'d` per se (this kind of ordering between statements is actually only
295-
an issue for dataflow; passes like the borrow checker must assume that
296-
closures could execute at any time from the moment they are created
297-
until they go out of scope).
298-
299-
### Complications due to closure bound inference
300-
301-
There is only one problem with the above model: in general, we do not
302-
actually *know* the closure bounds during region inference! In fact,
303-
closure bounds are almost always region variables! This is very tricky
304-
because the inference system implicitly assumes that we can do things
305-
like compute the LUB of two scoped lifetimes without needing to know
306-
the values of any variables.
307-
308-
Here is an example to illustrate the problem:
309-
310-
fn identify<T>(x: T) -> T { x }
311-
312-
fn foo() { // 'foo is the function body
313-
'a: {
314-
let closure = identity(|| 'b: {
315-
'c: ...
316-
});
317-
'd: closure();
318-
}
319-
'e: ...;
320-
}
321-
322-
In this example, the closure bound is not explicit. At compile time,
323-
we will create a region variable (let's call it `V0`) to represent the
324-
closure bound.
325-
326-
The primary difficulty arises during the constraint propagation phase.
327-
Imagine there is some variable with incoming edges from `'c` and `'d`.
328-
This means that the value of the variable must be `LUB('c,
329-
'd)`. However, without knowing what the closure bound `V0` is, we
330-
can't compute the LUB of `'c` and `'d`! Any we don't know the closure
331-
bound until inference is done.
332-
333-
The solution is to rely on the fixed point nature of inference.
334-
Basically, when we must compute `LUB('c, 'd)`, we just use the current
335-
value for `V0` as the closure's bound. If `V0`'s binding should
336-
change, then we will do another round of inference, and the result of
337-
`LUB('c, 'd)` will change.
338-
339-
One minor implication of this is that the graph does not in fact track
340-
the full set of dependencies between edges. We cannot easily know
341-
whether the result of a LUB computation will change, since there may
342-
be indirect dependencies on other variables that are not reflected on
343-
the graph. Therefore, we must *always* iterate over all edges when
344-
doing the fixed point calculation, not just those adjacent to nodes
345-
whose values have changed.
346-
347-
Were it not for this requirement, we could in fact avoid fixed-point
348-
iteration altogether. In that universe, we could instead first
349-
identify and remove strongly connected components (SCC) in the graph.
350-
Note that such components must consist solely of region variables; all
351-
of these variables can effectively be unified into a single variable.
352-
Once SCCs are removed, we are left with a DAG. At this point, we
353-
could walk the DAG in topological order once to compute the expanding
354-
nodes, and again in reverse topological order to compute the
355-
contracting nodes. However, as I said, this does not work given the
356-
current treatment of closure bounds, but perhaps in the future we can
357-
address this problem somehow and make region inference somewhat more
358-
efficient. Note that this is solely a matter of performance, not
359-
expressiveness.
252+
## Modeling closures
253+
254+
Integrating closures properly into the model is a bit of
255+
work-in-progress. In an ideal world, we would model closures as
256+
closely as possible after their desugared equivalents. That is, a
257+
closure type would be modeled as a struct, and the region hierarchy of
258+
different closure bodies would be completely distinct from all other
259+
fns. We are generally moving in that direction but there are
260+
complications in terms of the implementation.
261+
262+
In practice what we currently do is somewhat different. The basis for
263+
the current approach is the observation that the only time that
264+
regions from distinct fn bodies interact with one another is through
265+
an upvar or the type of a fn parameter (since closures live in the fn
266+
body namespace, they can in fact have fn parameters whose types
267+
include regions from the surrounding fn body). For these cases, there
268+
are separate mechanisms which ensure that the regions that appear in
269+
upvars/parameters outlive the dynamic extent of each call to the
270+
closure:
271+
272+
1. Types must outlive the region of any expression where they are used.
273+
For a closure type `C` to outlive a region `'r`, that implies that the
274+
types of all its upvars must outlive `'r`.
275+
2. Parameters must outlive the region of any fn that they are passed to.
276+
277+
Therefore, we can -- sort of -- assume that when we are asked to
278+
compare a region `'a` from a closure with a region `'b` from the fn
279+
that encloses it, in fact `'b` is the larger region. And that is
280+
precisely what we do: when building the region hierarchy, each region
281+
lives in its own distinct subtree, but if we are asked to compute the
282+
`LUB(r1, r2)` of two regions, and those regions are in disjoint
283+
subtrees, we compare the lexical nesting of the two regions.
284+
285+
*Ideas for improving the situation:* The correct argument here is
286+
subtle and a bit hand-wavy. The ideal, as stated earlier, would be to
287+
model things in such a way that it corresponds more closely to the
288+
desugared code. The best approach for doing this is a bit unclear: it
289+
may in fact be possible to *actually* desugar before we start, but I
290+
don't think so. The main option that I've been thinking through is
291+
imposing a "view shift" as we enter the fn body, so that regions
292+
appearing in the types of fn parameters and upvars are translated from
293+
being regions in the outer fn into free region parameters, just as
294+
they would be if we applied the desugaring. The challenge here is that
295+
type inference may not have fully run, so the types may not be fully
296+
known: we could probably do this translation lazilly, as type
297+
variables are instantiated. We would also have to apply a kind of
298+
inverse translation to the return value. This would be a good idea
299+
anyway, as right now it is possible for free regions instantiated
300+
within the closure to leak into the parent: this currently leads to
301+
type errors, since those regions cannot outlive any expressions within
302+
the parent hierarchy. Much like the current handling of closures,
303+
there are no known cases where this leads to a type-checking accepting
304+
incorrect code (though it sometimes rejects what might be considered
305+
correct code; see rust-lang/rust#22557), but it still doesn't feel
306+
like the right approach.
360307

361308
### Skolemization
362309

trunk/src/librustc/middle/region.rs

Lines changed: 54 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -262,7 +262,9 @@ pub struct RegionMaps {
262262
/// fn, if any. Thus the map structures the fn bodies into a
263263
/// hierarchy based on their lexical mapping. This is used to
264264
/// handle the relationships between regions in a fn and in a
265-
/// closure defined by that fn.
265+
/// closure defined by that fn. See the "Modeling closures"
266+
/// section of the README in middle::infer::region_inference for
267+
/// more details.
266268
fn_tree: RefCell<NodeMap<ast::NodeId>>,
267269
}
268270

@@ -337,7 +339,9 @@ pub struct Context {
337339
/// the root of the current region tree. This is typically the id
338340
/// of the innermost fn body. Each fn forms its own disjoint tree
339341
/// in the region hierarchy. These fn bodies are themselves
340-
/// arranged into a tree.
342+
/// arranged into a tree. See the "Modeling closures" section of
343+
/// the README in middle::infer::region_inference for more
344+
/// details.
341345
root_id: Option<ast::NodeId>,
342346

343347
/// the scope that contains any new variables declared
@@ -411,7 +415,18 @@ impl RegionMaps {
411415
assert!(previous.is_none());
412416
}
413417

414-
fn record_encl_scope(&self, sub: CodeExtent, sup: CodeExtent) {
418+
fn fn_is_enclosed_by(&self, mut sub_fn: ast::NodeId, sup_fn: ast::NodeId) -> bool {
419+
let fn_tree = self.fn_tree.borrow();
420+
loop {
421+
if sub_fn == sup_fn { return true; }
422+
match fn_tree.get(&sub_fn) {
423+
Some(&s) => { sub_fn = s; }
424+
None => { return false; }
425+
}
426+
}
427+
}
428+
429+
pub fn record_encl_scope(&self, sub: CodeExtent, sup: CodeExtent) {
415430
debug!("record_encl_scope(sub={:?}, sup={:?})", sub, sup);
416431
assert!(sub != sup);
417432
self.scope_map.borrow_mut().insert(sub, sup);
@@ -600,7 +615,7 @@ impl RegionMaps {
600615
let mut a_index = a_ancestors.len() - 1;
601616
let mut b_index = b_ancestors.len() - 1;
602617

603-
// Here, ~[ab]_ancestors is a vector going from narrow to broad.
618+
// Here, [ab]_ancestors is a vector going from narrow to broad.
604619
// The end of each vector will be the item where the scope is
605620
// defined; if there are any common ancestors, then the tails of
606621
// the vector will be the same. So basically we want to walk
@@ -609,7 +624,32 @@ impl RegionMaps {
609624
// then the corresponding scope is a superscope of the other.
610625

611626
if a_ancestors[a_index] != b_ancestors[b_index] {
612-
return None;
627+
// In this case, the two regions belong to completely
628+
// different functions. Compare those fn for lexical
629+
// nesting. The reasoning behind this is subtle. See the
630+
// "Modeling closures" section of the README in
631+
// middle::infer::region_inference for more details.
632+
let a_root_scope = a_ancestors[a_index];
633+
let b_root_scope = a_ancestors[a_index];
634+
return match (a_root_scope, b_root_scope) {
635+
(CodeExtent::DestructionScope(a_root_id),
636+
CodeExtent::DestructionScope(b_root_id)) => {
637+
if self.fn_is_enclosed_by(a_root_id, b_root_id) {
638+
// `a` is enclosed by `b`, hence `b` is the ancestor of everything in `a`
639+
Some(scope_b)
640+
} else if self.fn_is_enclosed_by(b_root_id, a_root_id) {
641+
// `b` is enclosed by `a`, hence `a` is the ancestor of everything in `b`
642+
Some(scope_a)
643+
} else {
644+
// neither fn encloses the other
645+
None
646+
}
647+
}
648+
_ => {
649+
// root ids are always Misc right now
650+
unreachable!()
651+
}
652+
};
613653
}
614654

615655
loop {
@@ -675,6 +715,7 @@ fn resolve_block(visitor: &mut RegionResolutionVisitor, blk: &ast::Block) {
675715
let prev_cx = visitor.cx;
676716

677717
let blk_scope = CodeExtent::Misc(blk.id);
718+
678719
// If block was previously marked as a terminating scope during
679720
// the recursive visit of its parent node in the AST, then we need
680721
// to account for the destruction scope representing the extent of
@@ -1144,7 +1185,7 @@ fn resolve_item(visitor: &mut RegionResolutionVisitor, item: &ast::Item) {
11441185
}
11451186

11461187
fn resolve_fn(visitor: &mut RegionResolutionVisitor,
1147-
fk: FnKind,
1188+
_: FnKind,
11481189
decl: &ast::FnDecl,
11491190
body: &ast::Block,
11501191
sp: Span,
@@ -1180,32 +1221,13 @@ fn resolve_fn(visitor: &mut RegionResolutionVisitor,
11801221
};
11811222
visit::walk_fn_decl(visitor, decl);
11821223

1183-
// The body of the fn itself is either a root scope (top-level fn)
1184-
// or it continues with the inherited scope (closures).
1185-
match fk {
1186-
visit::FkItemFn(..) | visit::FkMethod(..) => {
1187-
visitor.cx = Context {
1188-
root_id: Some(body.id),
1189-
parent: InnermostEnclosingExpr::None,
1190-
var_parent: InnermostDeclaringBlock::None
1191-
};
1192-
visitor.visit_block(body);
1193-
}
1194-
visit::FkFnBlock(..) => {
1195-
// FIXME(#3696) -- at present we are place the closure body
1196-
// within the region hierarchy exactly where it appears lexically.
1197-
// This is wrong because the closure may live longer
1198-
// than the enclosing expression. We should probably fix this,
1199-
// but the correct fix is a bit subtle, and I am also not sure
1200-
// that the present approach is unsound -- it may not permit
1201-
// any illegal programs. See issue for more details.
1202-
visitor.cx = Context {
1203-
root_id: Some(body.id),
1204-
..outer_cx
1205-
};
1206-
visitor.visit_block(body);
1207-
}
1208-
}
1224+
// The body of the every fn is a root scope.
1225+
visitor.cx = Context {
1226+
root_id: Some(body.id),
1227+
parent: InnermostEnclosingExpr::None,
1228+
var_parent: InnermostDeclaringBlock::None
1229+
};
1230+
visitor.visit_block(body);
12091231

12101232
// Restore context we had at the start.
12111233
visitor.cx = outer_cx;

trunk/src/librustc_driver/test.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -588,6 +588,7 @@ fn lub_free_free() {
588588
fn lub_returning_scope() {
589589
test_env(EMPTY_SOURCE_STR,
590590
errors(&["cannot infer an appropriate lifetime"]), |env| {
591+
env.create_simple_region_hierarchy();
591592
let t_rptr_scope10 = env.t_rptr_scope(10);
592593
let t_rptr_scope11 = env.t_rptr_scope(11);
593594

0 commit comments

Comments
 (0)