# Subtyping and Variance

+ Subtyping is a relationship between types that allows statically typed
+ languages to be a bit more flexible and permissive.
+
+ The most common and easy to understand example of this can be found in
+ languages with inheritance. Consider an Animal type which has an `eat()`
+ method, and a Cat type which extends Animal, adding a `meow()` method.
+ Without subtyping, if someone were to write a `feed(Animal)` function, they
+ wouldn't be able to pass a Cat to this function, because a Cat isn't *exactly*
+ an Animal. But being able to pass a Cat where an Animal is expected seems
+ fairly reasonable. After all, a Cat is just an Animal *and more*. Something
+ having extra features that can be ignored shouldn't be any impediment to
+ using it!
+
+ This is exactly what subtyping lets us do. Because a Cat is an Animal *and more*,
+ we say that Cat is a *subtype* of Animal. We then say that anywhere a value of
+ a certain type is expected, a value with a subtype can also be supplied. OK,
+ actually it's a lot more complicated and subtle than that, but that's the
+ basic intuition that gets you by in 99% of the cases. We'll cover why it's
+ *only* 99% later in this section.
+
Although Rust doesn't have any notion of structural inheritance, it *does*
include subtyping. In Rust, subtyping derives entirely from lifetimes. Since
- lifetimes are scopes, we can partially order them based on the *contains*
- (outlives) relationship. We can even express this as a generic bound.
-
- Subtyping on lifetimes is in terms of that relationship: if `'a: 'b` ("a contains
- b" or "a outlives b"), then `'a` is a subtype of `'b`. This is a large source of
- confusion, because it seems intuitively backwards to many: the bigger scope is a
- *subtype* of the smaller scope.
-
- This does in fact make sense, though. The intuitive reason for this is that if
- you expect an `&'a u8` (for some concrete `'a` that you have already chosen),
- then it's totally fine for me to hand you an `&'static u8` even if `'static !=
- 'a`, in the same way that if you expect an Animal in Java, it's totally fine
- for me to hand you a Cat. Cats are just Animals *and more*, just as `'static`
- is just `'a` *and more*.
-
- (Note, the subtyping relationship and typed-ness of lifetimes is a fairly
- arbitrary construct that some disagree with. However it simplifies our analysis
- to treat lifetimes and types uniformly.)
+ lifetimes are regions of code, we can partially order them based on the
+ *contains* (outlives) relationship.
+
+ Subtyping on lifetimes is in terms of that relationship: if `'big: 'small`
+ ("big contains small" or "big outlives small"), then `'big` is a subtype
+ of `'small`. This is a large source of confusion, because it seems backwards
+ to many: the bigger region is a *subtype* of the smaller region. But it makes
+ sense if you consider our Animal example: *Cat* is an Animal *and more*,
+ just as `'big` is `'small` *and more*.
+
+ Put another way, if someone wants a reference that lives for `'small`,
+ usually what they actually mean is that they want a reference that lives
+ for *at least* `'small`. They don't actually care if the lifetimes match
+ exactly. For this reason `'static`, the forever lifetime, is a subtype
+ of every lifetime.
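
As a minimal sketch of the "at least `'small`" intuition, the following assumes a hypothetical helper `longer` (not part of the original text) whose two arguments must share one lifetime. The call only type-checks because the `'static` reference can be treated as living for the shorter lifetime of `local`.

```rust
// Hypothetical helper for illustration: both arguments must share one lifetime `'a`.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let forever: &'static str = "static text";
    let local = String::from("local");
    // `'a` is limited by the borrow of `local`; `forever` is still accepted
    // because `&'static str` is a subtype of `&'a str`.
    let picked = longer(forever, &local);
    println!("{}", picked);
}
```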

Higher-ranked lifetimes are also subtypes of every concrete lifetime. This is
because taking an arbitrary lifetime is strictly more general than taking a
specific one.

+ (The typed-ness of lifetimes is a fairly arbitrary construct that some
+ disagree with. However it simplifies our analysis to treat lifetimes
+ and types uniformly.)
+
+ However you can't write a function that takes a value of type `'a`! Lifetimes
+ are always just part of another type, so we need a way of handling that.
+ To handle it, we need to talk about *variance*.
+
+
+


# Variance
@@ -38,41 +65,39 @@ For instance `Vec` is a type constructor that takes a `T` and returns a
lifetime, and a type to point to.

A type constructor's *variance* is how the subtyping of its inputs affects the
- subtyping of its outputs. There are two kinds of variance in Rust:
+ subtyping of its outputs. There are three kinds of variance in Rust:

- * F is *variant* over `T` if `T` being a subtype of `U` implies
+ * F is *covariant* over `T` if `T` being a subtype of `U` implies
  `F<T>` is a subtype of `F<U>` (subtyping "passes through")
+ * F is *contravariant* over `T` if `T` being a subtype of `U` implies
+   `F<U>` is a subtype of `F<T>` (subtyping is "inverted")
* F is *invariant* over `T` otherwise (no subtyping relation can be derived)

- (For those of you who are familiar with variance from other languages, what we
- refer to as "just" variance is in fact *covariance*. Rust has *contravariance*
- for functions. The future of contravariance is uncertain and it may be
- scrapped. For now, `fn(T)` is contravariant in `T`, which is used in matching
- methods in trait implementations to the trait definition. Traits don't have
- inferred variance, so `Fn(T)` is invariant in `T`).
+ It should be noted that covariance is *far* more common and important than
+ contravariance in Rust. The existence of contravariance in Rust can mostly
+ be ignored.

- Some important variances:
+ Some important variances (which we will explain in detail below):

- * `&'a T` is variant over `'a` and `T` (as is `*const T` by metaphor)
- * `&'a mut T` is variant over `'a` but invariant over `T`
- * `Fn(T) -> U` is invariant over `T`, but variant over `U`
- * `Box`, `Vec`, and all other collections are variant over the types of
+ * `&'a T` is covariant over `'a` and `T` (as is `*const T` by metaphor)
+ * `&'a mut T` is covariant over `'a` but invariant over `T`
+ * `fn(T) -> U` is **contra**variant over `T`, but covariant over `U`
+ * `Box`, `Vec`, and all other collections are covariant over the types of
  their contents
* `UnsafeCell<T>`, `Cell<T>`, `RefCell<T>`, `Mutex<T>` and all other
  interior mutability types are invariant over T (as is `*mut T` by metaphor)

To understand why these variances are correct and desirable, we will consider
several examples.

-
- We have already covered why `&'a T` should be variant over `'a` when
+ We have already covered why `&'a T` should be covariant over `'a` when
introducing subtyping: it's desirable to be able to pass longer-lived things
where shorter-lived things are needed.

- Similar reasoning applies to why it should be variant over T. It is reasonable
+ Similar reasoning applies to why it should be covariant over T: it's reasonable
to be able to pass `&&'static str` where an `&&'a str` is expected. The
- additional level of indirection does not change the desire to be able to pass
- longer lived things where shorted lived things are expected.
+ additional level of indirection doesn't change the desire to be able to pass
+ longer-lived things where shorter-lived things are expected.

However this logic doesn't apply to `&mut`. To see why `&mut` should
be invariant over T, consider the following code:
@@ -94,66 +119,75 @@ fn main() {
```

The signature of `overwrite` is clearly valid: it takes mutable references to
- two values of the same type, and overwrites one with the other. If `&mut T` was
- variant over T, then `&mut &'static str` would be a subtype of `&mut &'a str`,
- since `&'static str` is a subtype of `&'a str`. Therefore the lifetime of
- `forever_str` would successfully be "shrunk" down to the shorter lifetime of
- `string`, and `overwrite` would be called successfully. `string` would
- subsequently be dropped, and `forever_str` would point to freed memory when we
- print it! Therefore `&mut` should be invariant.
+ two values of the same type, and overwrites one with the other.
+
+ But, if `&mut T` was covariant over T, then `&mut &'static str` would be a
+ subtype of `&mut &'a str`, since `&'static str` is a subtype of `&'a str`.
+ Therefore the lifetime of `forever_str` would successfully be "shrunk" down
+ to the shorter lifetime of `string`, and `overwrite` would be called successfully.
+ `string` would subsequently be dropped, and `forever_str` would point to
+ freed memory when we print it! Therefore `&mut` should be invariant.

This is the general theme of variance vs invariance: if variance would allow you
- to store a short-lived value into a longer-lived slot, then you must be
- invariant.
+ to store a short-lived value in a longer-lived slot, then invariance must be used.

- However it *is* sound for `&'a mut T` to be variant over `'a`. The key difference
+ More generally, the soundness of subtyping and variance is based on the idea that it's OK to
+ forget details, but with mutable references there's always someone (the original
+ value being referenced) that remembers the forgotten details and will assume
+ that those details haven't changed. If we do something to invalidate those details,
+ the original location can behave unsoundly.
+
+ However it *is* sound for `&'a mut T` to be covariant over `'a`. The key difference
between `'a` and T is that `'a` is a property of the reference itself,
while T is something the reference is borrowing. If you change T's type, then
the source still remembers the original type. However if you change the
lifetime's type, no one but the reference knows this information, so it's fine.
Put another way: `&'a mut T` owns `'a`, but only *borrows* T.

- `Box` and `Vec` are interesting cases because they're variant, but you can
- definitely store values in them! This is where Rust gets really clever: it's
- fine for them to be variant because you can only store values
- in them *via a mutable reference*! The mutable reference makes the whole type
- invariant, and therefore prevents you from smuggling a short-lived type into
- them.
-
- Being variant allows `Box` and `Vec` to be weakened when shared
- immutably. So you can pass a `&Box<&'static str>` where a `&Box<&'a str>` is
- expected.
+ `Box` and `Vec` are interesting cases because they're covariant, but you can
+ definitely store values in them! This is where Rust's type system allows it to
+ be a bit more clever than others. To understand why it's sound for owning
+ containers to be covariant over their contents, we must consider
+ the two ways in which a mutation may occur: by-value or by-reference.

- However what should happen when passing *by-value* is less obvious. It turns out
- that, yes, you can use subtyping when passing by-value. That is, this works:
+ If mutation is by-value, then the old location that remembers extra details is
+ moved out of, meaning it can't use the value anymore. So we simply don't need to
+ worry about anyone remembering dangerous details. Put another way, applying
+ subtyping when passing by-value *destroys details forever*. For example, this
+ compiles and is fine:

```rust
fn get_box<'a>(str: &'a str) -> Box<&'a str> {
-     // string literals are `&'static str`s
+     // String literals are `&'static str`s, but it's fine for us to
+     // "forget" this and let the caller think the string won't live that long.
    Box::new("hello")
}
```

- Weakening when you pass by-value is fine because there's no one else who
- "remembers" the old lifetime in the Box. The reason a variant `&mut` was
- trouble was because there's always someone else who remembers the original
- subtype: the actual owner.
+ If mutation is by-reference, then our container is passed as `&mut Vec<T>`. But
+ `&mut` is invariant over its value, so `&mut Vec<T>` is actually invariant over `T`.
+ So the fact that `Vec<T>` is covariant over `T` doesn't matter at all when
+ mutating by-reference.
+
+ But being covariant still allows `Box` and `Vec` to be weakened when shared
+ immutably. So you can pass a `&Vec<&'static str>` where a `&Vec<&'a str>` is
+ expected.
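
The shared-immutably case can be sketched as follows. The function `total_len` and its `extra` parameter are hypothetical names chosen for illustration; `extra` exists only to pin `'a` to a lifetime shorter than `'static`, so that the call relies on `&Vec<&'static str>` being accepted where `&Vec<&'a str>` is expected.

```rust
// Hypothetical function for illustration: `extra` forces `'a` to be no longer
// than the borrow of a local value.
fn total_len<'a>(words: &Vec<&'a str>, extra: &'a str) -> usize {
    words.iter().map(|w| w.len()).sum::<usize>() + extra.len()
}

fn main() {
    let statics: Vec<&'static str> = vec!["a", "bc"];
    let local = String::from("def");
    // `&Vec<&'static str>` is weakened to `&Vec<&'a str>`, where `'a` is
    // the (shorter) borrow of `local`.
    let n = total_len(&statics, &local);
    println!("{}", n);
}
```

In idiomatic code the parameter would usually be a slice (`&[&'a str]`); `&Vec` is kept here only to mirror the types named in the text.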

The invariance of the cell types can be seen as follows: `&` is like an `&mut`
for a cell, because you can still store values in them through an `&`. Therefore
cells must be invariant to avoid lifetime smuggling.

- `Fn` is the most subtle case because it has mixed variance. To see why
- `Fn(T) -> U` should be invariant over T, consider the following function
- signature:
+ `fn` types are the most subtle case because they have mixed variance, and in fact are
+ the only source of **contra**variance. To see why `fn(T) -> U` should be contravariant
+ over T, consider the following function signature:

```rust,ignore
// 'a is derived from some parent scope
fn foo(&'a str) -> usize;
```

This signature claims that it can handle any `&str` that lives at least as
- long as `'a`. Now if this signature was variant over `&'a str`, that
+ long as `'a`. Now if this signature was **co**variant over `&'a str`, that
would mean

``` rust,ignore
@@ -163,10 +197,27 @@ fn foo(&'static str) -> usize;
could be provided in its place, as it would be a subtype. However this function
has a stronger requirement: it says that it can only handle `&'static str`s,
and nothing else. Giving `&'a str`s to it would be unsound, as it's free to
- assume that what it's given lives forever. Therefore functions are not variant
- over their arguments.
+ assume that what it's given lives forever. Therefore functions definitely shouldn't
+ be **co**variant over their arguments.
+
+ However if we flip it around and use **contra**variance, it *does* work! If
+ something expects a function which can handle strings that live forever,
+ it makes perfect sense to instead provide a function that can handle
+ strings that live for *less* than forever. So

- To see why `Fn(T) -> U` should be variant over U, consider the following
+ ```rust,ignore
+ fn foo(&'a str) -> usize;
+ ```
+
+ can be passed where
+
+ ```rust,ignore
+ fn foo(&'static str) -> usize;
+ ```
+
+ is expected.
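
A compilable sketch of this argument contravariance is given below. The names `narrow` and `count_bytes` are hypothetical and used only for illustration; the body of `narrow` returns its argument unchanged, so it type-checks only because `fn(&'a str) -> usize` is a subtype of `fn(&'static str) -> usize`.

```rust
// Hypothetical helper for illustration: a function pointer that accepts
// `&'a str` is returned as one that only promises to accept `&'static str`.
fn narrow<'a>(f: fn(&'a str) -> usize) -> fn(&'static str) -> usize {
    f
}

// Hypothetical function that works for strings of any lifetime.
fn count_bytes(s: &str) -> usize {
    s.len()
}

fn main() {
    let g = narrow(count_bytes);
    println!("{}", g("forever"));
}
```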
+
+ To see why `fn(T) -> U` should be **co**variant over U, consider the following
function signature:

``` rust,ignore
@@ -181,7 +232,8 @@ therefore completely reasonable to provide
fn foo(usize) -> &'static str;
```

- in its place. Therefore functions are variant over their return type.
+ in its place, as it does indeed return things that outlive `'a`. Therefore
+ functions are covariant over their return type.
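
The return-type case can be sketched the same way. Here `widen` and `static_name` are hypothetical names introduced for illustration; `widen` returns its argument unchanged, which is accepted because a function returning `&'static str` also satisfies a signature that only promises `&'a str`.

```rust
// Hypothetical function for illustration: always returns a string literal.
fn static_name(_id: usize) -> &'static str {
    "name"
}

// Hypothetical helper: the returned pointer promises less about its result
// than the pointer that was passed in.
fn widen<'a>(f: fn(usize) -> &'static str) -> fn(usize) -> &'a str {
    f
}

fn main() {
    let g = widen(static_name);
    println!("{}", g(0));
}
```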

`*const` has the exact same semantics as `&`, so variance follows. `*mut` on the
other hand can dereference to an `&mut` whether shared or not, so it is marked
@@ -191,24 +243,32 @@ This is all well and good for the types the standard library provides, but
how is variance determined for types that *you* define? A struct, informally
speaking, inherits the variance of its fields. If a struct `Foo`
has a generic argument `A` that is used in a field `a`, then Foo's variance
- over `A` is exactly `a`'s variance. However this is complicated if `A` is used
- in multiple fields.
+ over `A` is exactly `a`'s variance. However if `A` is used in multiple fields:

- * If all uses of A are variant, then Foo is variant over A
+ * If all uses of A are covariant, then Foo is covariant over A
+ * If all uses of A are contravariant, then Foo is contravariant over A
* Otherwise, Foo is invariant over A

```rust
use std::cell::Cell;

- struct Foo<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H> {
-     a: &'a A,     // variant over 'a and A
-     b: &'b mut B, // variant over 'b and invariant over B
-     c: *const C,  // variant over C
+ struct Foo<'a, 'b, A: 'a, B: 'b, C, D, E, F, G, H, In, Out, Mixed> {
+     a: &'a A,     // covariant over 'a and A
+     b: &'b mut B, // covariant over 'b and invariant over B
+
+     c: *const C,  // covariant over C
    d: *mut D,    // invariant over D
-     e: Vec<E>,    // variant over E
-     f: Cell<F>,   // invariant over F
-     g: G,         // variant over G
+
+     e: E,         // covariant over E
+     f: Vec<F>,    // covariant over F
+     g: Cell<G>,   // invariant over G
+
    h1: H,        // would also be covariant over H except...
-     h2: Cell<H>,  // invariant over H, because invariance wins
+     h2: Cell<H>,  // invariant over H, because invariance wins all conflicts
+
+     i: fn(In) -> Out,       // contravariant over In, covariant over Out
+
+     k1: fn(Mixed) -> usize, // would be contravariant over Mixed except...
+     k2: Mixed,              // invariant over Mixed, because invariance wins all conflicts
}
```
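
As a small sketch of a struct inheriting variance from its fields, consider a hypothetical `Label` type (not part of the original text) with a single `&'a str` field. Because that field is covariant over `'a`, so is `Label<'a>`, and a `Label<'static>` can be returned where a shorter-lived `Label<'a>` is expected.

```rust
// Hypothetical struct for illustration: covariant over `'a` because its only
// field, `&'a str`, is covariant over `'a`.
struct Label<'a> {
    text: &'a str,
}

// Returns its argument unchanged; this relies on `Label<'static>` being a
// subtype of `Label<'a>`.
fn shorten<'a>(label: Label<'static>) -> Label<'a> {
    label
}

fn main() {
    let label = shorten(Label { text: "hi" });
    println!("{}", label.text);
}
```

If the field were a `Cell<&'a str>` instead, the struct would be invariant over `'a`, and `shorten` would presumably be rejected by the compiler.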