Commit a2f64d6

---
yaml --- r: 236012 b: refs/heads/stable c: cc5b4d3 h: refs/heads/master v: v3
1 parent d46f065 commit a2f64d6

4 files changed, +372 -1 lines changed

[refs]

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ refs/heads/tmp: afae2ff723393b3ab4ccffef6ac7c6d1809e2da0
 refs/tags/1.0.0-alpha.2: 4c705f6bc559886632d3871b04f58aab093bfa2f
 refs/tags/homu-tmp: f859507de8c410b648d934d8f5ec1c52daac971d
 refs/tags/1.0.0-beta: 8cbb92b53468ee2b0c2d3eeb8567005953d40828
-refs/heads/stable: 89df25f400844092e2f667f954e84f839e3731bc
+refs/heads/stable: cc5b4d314cef29aa94c5ab508f13726d934ba142
 refs/tags/1.0.0: 55bd4f8ff2b323f317ae89e254ce87162d52a375
 refs/tags/1.1.0: bc3c16f09287e5545c1d3f76b7abd54f2eca868b
 refs/tags/1.2.0: f557861f822c34f07270347b94b5280de20a597e

branches/stable/lifetimes.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
% Advanced Lifetimes

Lifetimes are the breakout feature of Rust.

# Safe Rust

* no aliasing of &mut

# Unsafe Rust

* Splitting lifetimes into disjoint regions
* Creating lifetimes from raw pointers
*

branches/stable/raii.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
% The Perils Of RAII

Ownership-Based Resource Management (AKA RAII: Resource Acquisition Is Initialization) is
something you'll interact with a lot in Rust, especially if you use the standard library.

Roughly speaking, the pattern is as follows: to acquire a resource, you create an object that
manages it. To release the resource, you simply destroy the object, and it cleans up the
resource for you. The most common "resource" this pattern manages is simply *memory*. `Box`,
`Rc`, and basically everything in `std::collections` exist to make managing memory correctly
convenient. This is particularly important in Rust because we have no pervasive GC to rely on
for memory management. Which is the point, really: Rust is about control. However, we are not
limited to just memory. Pretty much every other system resource, like a thread, file, or
socket, is exposed through this kind of API.

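As a minimal sketch of the acquire-by-creating, release-by-destroying pattern, the example below uses a hypothetical `Guard` type that is not part of any library. It counts how many guards are alive, so the "resource" is only a counter standing in for a file or socket.

```rust
use std::cell::Cell;

/// Hypothetical resource handle: acquiring it bumps a counter,
/// destroying it releases the "resource" again.
struct Guard<'a> {
    open: &'a Cell<u32>,
}

impl<'a> Guard<'a> {
    fn acquire(open: &'a Cell<u32>) -> Guard<'a> {
        open.set(open.get() + 1);
        Guard { open }
    }
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.open.set(self.open.get() - 1);
    }
}

/// Returns the number of live guards (inside the scope, after the scope).
fn live_guards() -> (u32, u32) {
    let open = Cell::new(0);
    let inside;
    {
        let _guard = Guard::acquire(&open);
        inside = open.get();
    } // `_guard` is destroyed here, which releases the resource
    (inside, open.get())
}

fn main() {
    let (inside, after) = live_guards();
    println!("live guards inside scope: {}, after scope: {}", inside, after);
}
```

Nothing in `live_guards` calls a release function explicitly; leaving the inner scope is what gives the resource back.
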
So, how does RAII work in Rust? Unlike C++, Rust does not come with a slew of built-in
kinds of constructor. There are no Copy, Default, Assignment, Move, or whatever constructors.
This largely has to do with Rust's philosophy of being explicit.

Move constructors are meaningless in Rust because we don't enable types to "care" about their
location in memory. Every type must be ready to be blindly memcopied to somewhere else
in memory. This means pure on-the-stack-but-still-movable intrusive linked lists are simply
not happening in Rust (safely).

Assignment and copy constructors similarly don't exist because move semantics are the *default*
in Rust. At most, `x = y` just moves the bits of `y` into the `x` variable. Rust does provide two
facilities for going back to C++'s copy-oriented semantics: `Copy` and `Clone`. `Clone` is our
moral equivalent of a copy constructor, but it's never implicitly invoked. You have to explicitly
call `clone` on an element you want to be cloned. `Copy` is a special case of `Clone` where the
implementation is just "duplicate the bitwise representation". `Copy` types *are* implicitly
cloned whenever they're moved, but because of the definition of `Copy` this just means *not*
treating the old copy as uninitialized; a no-op.

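The difference between moving, copying, and cloning can be sketched as follows. `Point` and `Name` are hypothetical types made up for this illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point { x: i32, y: i32 }

#[derive(Clone, Debug, PartialEq)]
struct Name(String);

fn copy_clone_move() -> (Point, Point, Name, Name) {
    let p = Point { x: 1, y: 2 };
    let q = p; // `Point` is Copy: `p` stays usable after this line

    let a = Name(String::from("ferris"));
    let b = a.clone(); // explicit clone: `a` stays usable
    let c = a;         // plain move: `a` may not be used after this line

    (p, q, b, c)
}

fn main() {
    println!("{:?}", copy_clone_move());
}
```

Using `a` after the last `let` would be rejected at compile time, whereas `p` remains valid because its type is `Copy`.
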
While Rust provides a `Default` trait for specifying the moral equivalent of a default
constructor, it's incredibly rare for this trait to be used. This is because variables
aren't implicitly initialized (see [working with uninitialized memory][uninit] for details).
`Default` is basically only useful for generic programming.

More often than not, a concrete type will provide a static `new` method for any
kind of "default" constructor. This has no relation to `new` in other languages and has no
special meaning. It's just a naming convention.

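To illustrate the two conventions side by side, here is a small sketch with a hypothetical `Config` type and a hypothetical generic helper `make`. Neither comes from the standard library.

```rust
#[derive(Debug, Default, PartialEq)]
struct Config {
    retries: u32,
    verbose: bool,
}

impl Config {
    // `new` is an ordinary associated function; the name is only a convention.
    fn new(retries: u32) -> Config {
        Config { retries, verbose: false }
    }
}

// `Default` is mostly useful in generic code that needs "some value of T".
fn make<T: Default>() -> T {
    T::default()
}

fn main() {
    let a: Config = make();
    let b = Config::new(3);
    println!("{:?} {:?}", a, b);
}
```
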
What the language *does* provide is full-blown automatic destructors through the `Drop` trait,
which provides the following method:

```rust
fn drop(&mut self);
```

This method gives the type time to somehow finish what it was doing. **After `drop` is run,
Rust will recursively try to drop all of the fields of the `self` struct**. This is a
convenience feature so that you don't have to write "destructor boilerplate" to drop
children. **There is no way to prevent this in Rust 1.0**. Also note that `&mut self` means
that even if you *could* suppress recursive drop, Rust will prevent you from e.g. moving fields
out of `self`. For most types, this is totally fine: they own all their data, there's no
additional state passed into `drop` to try to send it to, and `self` is about to be marked as
uninitialized (and therefore inaccessible).

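The recursive behaviour can be observed with a small sketch. The `Field`, `Outer`, and `Log` names are hypothetical and exist only to record the order in which destructors run.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

struct Field { name: &'static str, log: Log }

impl Drop for Field {
    fn drop(&mut self) { self.log.borrow_mut().push(self.name); }
}

#[allow(dead_code)]
struct Outer { first: Field, second: Field, log: Log }

impl Drop for Outer {
    // Only records its own name; it never touches `first` or `second`.
    fn drop(&mut self) { self.log.borrow_mut().push("outer"); }
}

fn drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let outer = Outer {
        first: Field { name: "first", log: log.clone() },
        second: Field { name: "second", log: log.clone() },
        log: log.clone(),
    };
    drop(outer);
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order());
}
```

`Outer::drop` runs first, and the fields are then dropped in declaration order even though `Outer` never asks for that.
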
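For instance, a custom implementation of `Box` might write `Drop` like this:

```rust
use std::alloc::{dealloc, Layout};
use std::ptr;

struct Box<T> { ptr: *mut T }

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe {
            // Run the destructor of the pointed-to value, then free its memory.
            // Assumes `ptr` was allocated with `Layout::new::<T>()` and `T` is not zero-sized.
            ptr::drop_in_place(self.ptr);
            dealloc(self.ptr as *mut u8, Layout::new::<T>());
        }
    }
}
```
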
and this works fine because when Rust goes to drop the `ptr` field, it just sees a `*mut T` that
has no actual `Drop` implementation. Similarly, nothing can use-after-free the `ptr` because
the Box is completely gone.

However this wouldn't work:

```rust
use std::alloc::{dealloc, Layout};
use std::ptr;

struct Box<T> { ptr: *mut T }

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe {
            // Assumes `ptr` was allocated with `Layout::new::<T>()` and `T` is not zero-sized.
            ptr::drop_in_place(self.ptr);
            dealloc(self.ptr as *mut u8, Layout::new::<T>());
        }
    }
}

// `box` is a reserved word in Rust, so the field is called `my_box`.
struct SuperBox<T> { my_box: Box<T> }

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Hyper-optimized: deallocate the box's contents for it
            // without `drop`ing the contents
            dealloc(self.my_box.ptr as *mut u8, Layout::new::<T>());
        }
    }
}
```

because after we deallocate the box's `ptr` in SuperBox's destructor, Rust will
happily proceed to tell the box to drop itself, and everything will blow up with
use-after-frees and double-frees.

Note that the recursive drop behaviour applies to *all* structs and enums,
regardless of whether they implement `Drop`. Therefore something like

```rust
struct Boxy<T> {
    data1: Box<T>,
    data2: Box<T>,
    info: u32,
}
```

will have the destructors of its `data1` and `data2` fields called whenever it "would" be
dropped, even though it itself doesn't implement `Drop`. We say that such a type
*needs Drop*, even though it is not itself `Drop`.

Similarly,

```rust
enum Link {
    Next(Box<Link>),
    None,
}
```

will have its inner `Box` field dropped *if and only if* a value stores the `Next` variant.

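Later Rust releases expose this "needs Drop" notion through `std::mem::needs_drop`, which is not available in Rust 1.0. The sketch below queries it for the two types above and for a hypothetical `Plain` struct that holds only plain data.

```rust
use std::mem;

#[allow(dead_code)]
struct Boxy<T> {
    data1: Box<T>,
    data2: Box<T>,
    info: u32,
}

#[allow(dead_code)]
enum Link {
    Next(Box<Link>),
    None,
}

// Hypothetical type with no destructors anywhere inside it.
#[allow(dead_code)]
struct Plain {
    a: u32,
    b: bool,
}

fn main() {
    // Neither `Boxy` nor `Link` implements `Drop`, but both contain a `Box`.
    println!("Boxy<u8> needs drop: {}", mem::needs_drop::<Boxy<u8>>());
    println!("Link needs drop:     {}", mem::needs_drop::<Link>());
    println!("Plain needs drop:    {}", mem::needs_drop::<Plain>());
}
```
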
In general this works really nicely because you don't need to worry about adding or removing
destructors when you refactor your data layout. Still, there are certainly many valid use cases
for needing to do trickier things with destructors.

The classic safe solution to blocking recursive drop semantics and allowing moving out
of `self` is to use an `Option`:

```rust
use std::alloc::{dealloc, Layout};
use std::{mem, ptr};

struct Box<T> { ptr: *mut T }

impl<T> Drop for Box<T> {
    fn drop(&mut self) {
        unsafe {
            // Assumes `ptr` was allocated with `Layout::new::<T>()` and `T` is not zero-sized.
            ptr::drop_in_place(self.ptr);
            dealloc(self.ptr as *mut u8, Layout::new::<T>());
        }
    }
}

// `box` is a reserved word in Rust, so the field is called `my_box`.
struct SuperBox<T> { my_box: Option<Box<T>> }

impl<T> Drop for SuperBox<T> {
    fn drop(&mut self) {
        unsafe {
            // Hyper-optimized: deallocate the box's contents for it
            // without `drop`ing the contents. Need to set the `my_box`
            // field to `None` to prevent Rust from trying to drop it.
            let inner = self.my_box.take().unwrap();
            dealloc(inner.ptr as *mut u8, Layout::new::<T>());
            // The moved-out box must not run its own destructor either.
            mem::forget(inner);
        }
    }
}
```

However, this has fairly odd semantics: you're saying that a field that *should* always be `Some`
may be `None`, just because that happens in the destructor. Of course, this conversely makes a lot
of sense: you can call arbitrary methods on `self` during the destructor, and this should prevent
you from ever doing so after deinitializing the field. Not that it will prevent you from producing
any other arbitrarily invalid state in there.

On balance this is an OK choice, certainly if you're just getting started.

In the future, we expect there to be a first-class way to announce that a field
shouldn't be automatically dropped.

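Later Rust releases added `std::mem::ManuallyDrop`, which can play this role; it does not exist in Rust 1.0. The following is a minimal sketch, assuming hypothetical `Noisy` and `Wrapper` types, of a field that Rust leaves alone unless the code drops it explicitly.

```rust
use std::cell::Cell;
use std::mem::ManuallyDrop;

/// Hypothetical type that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) { self.0.set(self.0.get() + 1); }
}

#[allow(dead_code)]
struct Wrapper<'a> {
    // Rust will not drop this field automatically.
    inner: ManuallyDrop<Noisy<'a>>,
}

fn drops_without_manual_call() -> u32 {
    let count = Cell::new(0);
    {
        let _w = Wrapper { inner: ManuallyDrop::new(Noisy(&count)) };
    } // `_w` goes away here, but `inner`'s destructor is not run
    count.get()
}

fn drops_with_manual_call() -> u32 {
    let count = Cell::new(0);
    {
        let mut w = Wrapper { inner: ManuallyDrop::new(Noisy(&count)) };
        // Safety: `w.inner` is never used again after this call.
        unsafe { ManuallyDrop::drop(&mut w.inner); }
    }
    count.get()
}

fn main() {
    println!("automatic: {} drop(s)", drops_without_manual_call());
    println!("manual:    {} drop(s)", drops_with_manual_call());
}
```
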
[uninit]:

branches/stable/uninitialized.md

Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
% Working With Uninitialized Memory

All runtime-allocated memory in a Rust program begins its life as *uninitialized*. In this state the value of the memory is an indeterminate pile of bits that may or may not even reflect a valid state for the type that is supposed to inhabit that location of memory. Attempting to interpret this memory as a value of *any* type will cause Undefined Behaviour. Do Not Do This.

Like C, all stack variables in Rust begin their life as uninitialized until a value is explicitly assigned to them. Unlike C, Rust statically prevents you from ever reading them until you do:

```rust
fn main() {
    let x: i32;
    println!("{}", x);
}
```

```text
src/main.rs:3:20: 3:21 error: use of possibly uninitialized variable: `x`
src/main.rs:3     println!("{}", x);
                                 ^
```

This is based on a basic branch analysis: every branch must assign a value to `x` before it
is first used. Interestingly, Rust doesn't require the variable to be mutable to perform a
delayed initialization if every branch assigns exactly once. However, the analysis does not
take advantage of constant analysis or anything like that. So this compiles:

```rust
fn main() {
    let x: i32;
    let y: i32;

    y = 1;

    if true {
        x = 1;
    } else {
        x = 2;
    }

    println!("{} {}", x, y);
}
```

but this doesn't:

```rust
fn main() {
    let x: i32;
    if true {
        x = 1;
    }
    println!("{}", x);
}
```

```text
src/main.rs:6:17: 6:18 error: use of possibly uninitialized variable: `x`
src/main.rs:6   println!("{}", x);
```

while this does:

```rust
fn main() {
    let x: i32;
    if true {
        x = 1;
        println!("{}", x);
    }
    // Don't care that there are branches where it's not initialized
    // since we don't use the value in those branches
}
```

If a value is moved out of a variable, that variable becomes logically uninitialized if the type
of the value isn't `Copy`. That is:

```rust
fn main() {
    let x = 0;
    let y = Box::new(0);
    let z1 = x; // x is still valid because i32 is Copy
    let z2 = y; // y has once more become logically uninitialized, since Box is not Copy
}
```

However, reassigning `y` in this example *would* require `y` to be marked as mutable, as a
Safe Rust program could observe that the value of `y` changed. Otherwise the variable is
exactly like new.

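A minimal sketch of reassigning a moved-from variable; the function name `reassign_after_move` is made up for the illustration.

```rust
fn reassign_after_move() -> i32 {
    let mut y = Box::new(0);
    let z = y;       // `y` is now logically uninitialized
    y = Box::new(1); // allowed because `y` is `mut`; there is no old value to drop
    *y + *z
}

fn main() {
    println!("{}", reassign_after_move());
}
```

Without the `mut`, the second assignment to `y` is rejected by the compiler.
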
This raises an interesting question with respect to `Drop`: where does Rust
try to call the destructor of a variable that is conditionally initialized?
It turns out that Rust actually tracks whether a variable should be dropped or not *at runtime*.
As a variable becomes initialized and uninitialized, a *drop flag* for that variable is set and
unset. When a variable goes out of scope or is assigned to, Rust checks the flag to decide whether
the current value of the variable should be dropped. Of course, static analysis can remove these
checks. If the compiler can prove that a value is guaranteed to be either initialized or not, then
it can theoretically generate more efficient code! As such it may be desirable to structure code
to have *static drop semantics* when possible.

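The observable effect of this runtime tracking can be sketched with a hypothetical `Noisy` type that counts destructor calls. Whether `_x` holds a value at the end of its scope depends on a runtime condition, and the destructor runs only if it does.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) { self.0.set(self.0.get() + 1); }
}

/// Returns how many destructors ran for a conditionally initialized variable.
fn drops_when(condition: bool) -> u32 {
    let count = Cell::new(0);
    {
        let _x: Noisy;
        if condition {
            _x = Noisy(&count);
        }
        // At the end of this scope Rust has to decide, at runtime,
        // whether `_x` currently holds a value that must be dropped.
    }
    count.get()
}

fn main() {
    println!("initialized:   {} drop(s)", drops_when(true));
    println!("uninitialized: {} drop(s)", drops_when(false));
}
```
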
As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a hidden field of any type
that implements `Drop`. The language sets the drop flag by overwriting the entire struct with a
particular value. This is pretty obviously Not The Fastest and causes a bunch of trouble with
optimizing code. As such, work is currently under way to move the flags out onto the stack frame
where they more reasonably belong. Unfortunately this work will take some time as it requires
fairly substantial changes to the compiler.

So in general, Rust programs don't need to worry about uninitialized values on the stack for
correctness, although they might care for performance. Thankfully, Rust makes it easy to take
control here! Uninitialized values are there, and Safe Rust lets you work with them, but you're
never in trouble.

One interesting exception to this rule is working with arrays. Safe Rust doesn't permit you to
partially initialize an array. When you initialize an array, you can either set every value to the
same thing with `let x = [val; N]`, or you can specify each member individually with
`let x = [val1, val2, val3]`. Unfortunately this is pretty rigid, especially if you need
to initialize your array in a more incremental or dynamic way.

Unsafe Rust gives us a powerful tool to handle this problem: `std::mem::uninitialized`.
This function pretends to return a value when really it does nothing at all. Using it, we can
convince Rust that we have initialized a variable, allowing us to do trickier things with
conditional and incremental initialization.

Unfortunately, this raises a tricky problem. Assignment has a different meaning to Rust based on
whether it believes that a variable is initialized or not. If it's uninitialized, then Rust will
semantically just memcopy the bits over the uninitialized ones, and do nothing else. However, if
Rust believes a value to be initialized, it will try to `Drop` the old value! Since we've tricked
Rust into believing that the value is initialized, we can no longer safely use normal assignment.

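The "assignment drops the old value" half of this can be seen in safe code. This sketch again uses a hypothetical `Noisy` counter type and reports the drop count before and after assigning over an initialized variable.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) { self.0.set(self.0.get() + 1); }
}

/// Returns the drop count before and after overwriting an initialized variable.
#[allow(unused_assignments)]
fn assign_over_initialized() -> (u32, u32) {
    let count = Cell::new(0);
    let mut x = Noisy(&count);
    let before = count.get();
    x = Noisy(&count); // the old value of `x` is dropped by this assignment
    let after = count.get();
    drop(x);
    (before, after)
}

fn main() {
    println!("{:?}", assign_over_initialized());
}
```

If the old contents of `x` had really been uninitialized bits, that implicit destructor call is exactly what would go wrong.
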
This is also a problem if you're working with a raw system allocator, which of course returns a
pointer to uninitialized memory.

To handle this, we must use the `std::ptr` module. In particular, it provides three functions that
allow us to assign bytes to a location in memory without evaluating the old value: `write`, `copy`,
and `copy_nonoverlapping`.

* `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed to by `ptr`.
* `ptr::copy(src, dest, count)` copies the bits that `count` T's would occupy from `src` to `dest`.
  (This is equivalent to `memmove`; note that the argument order is reversed!)
* `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a little faster on the
  assumption that the two ranges of memory don't overlap. (This is equivalent to `memcpy`; note
  that the argument order is reversed!)

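The three functions listed above can be sketched on plain `u32` arrays, where nothing needs dropping, so only the copying behaviour is on display. The function name `ptr_demo` and the buffer contents are made up for the illustration.

```rust
use std::ptr;

fn ptr_demo() -> ([u32; 4], [u32; 4]) {
    let src = [1u32, 2, 3, 4];
    let mut dst = [0u32; 4];
    let mut buf = [1u32, 2, 3, 4];

    unsafe {
        // memcpy-like: the ranges must not overlap. Argument order is (src, dest, count).
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 4);

        // Overwrite one slot without reading or dropping what was there before.
        ptr::write(dst.as_mut_ptr().add(3), 40);

        // memmove-like: shift the first three elements right by one (the ranges overlap).
        let p = buf.as_mut_ptr();
        ptr::copy(p, p.add(1), 3);
    }

    (dst, buf)
}

fn main() {
    let (dst, buf) = ptr_demo();
    println!("{:?} {:?}", dst, buf);
}
```
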
It should go without saying that these functions, if misused, will cause serious havoc or just
straight up Undefined Behaviour. The only thing that these functions *themselves* require is that
the locations you want to read and write are allocated and properly aligned. However, the ways
writing arbitrary bit patterns to arbitrary locations of memory can break things are basically
uncountable!

Putting this all together, we get the following:

```rust
fn main() {
    use std::mem;
    use std::ptr;

    // size of the array is hard-coded but easy to change. This means we can't
    // use [a, b, c] syntax to initialize the array, though!
    const SIZE: usize = 10;

    let mut x: [Box<u32>; SIZE];

    unsafe {
        // convince Rust that x is Totally Initialized
        // (later Rust releases deprecate `mem::uninitialized` in favour of `MaybeUninit`)
        x = mem::uninitialized();
        for i in 0..SIZE {
            // very carefully overwrite each index without reading it
            ptr::write(&mut x[i], Box::new(i as u32));
        }
    }

    // arrays implement `Debug`, not `Display`
    println!("{:?}", x);
}
```

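In later Rust releases `std::mem::uninitialized` is deprecated in favour of `std::mem::MaybeUninit`. The sketch below shows how the same incremental array initialization might look with that type. It is an alternative technique, not what this chapter describes for Rust 1.0, and `build` is a hypothetical helper name.

```rust
use std::mem::{self, MaybeUninit};

const SIZE: usize = 10;

fn build() -> [Box<u32>; SIZE] {
    // An array of `MaybeUninit` slots does not itself require initialization.
    let mut slots: [MaybeUninit<Box<u32>>; SIZE] =
        unsafe { MaybeUninit::uninit().assume_init() };

    for (i, slot) in slots.iter_mut().enumerate() {
        // `write` stores the value without reading or dropping the old contents.
        slot.write(Box::new(i as u32));
    }

    // Safety: every slot was written above, and `MaybeUninit<T>` has the
    // same layout as `T`.
    unsafe { mem::transmute::<_, [Box<u32>; SIZE]>(slots) }
}

fn main() {
    println!("{:?}", build());
}
```

If `Box::new` were to panic halfway through, the boxes already written would be leaked rather than dropped, because `MaybeUninit` never runs destructors on its own.
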
It's worth noting that you don't need to worry about `ptr::write`-style shenanigans with
Plain Old Data (POD; types which don't implement `Drop`, nor contain `Drop` types),
because Rust knows not to try to drop them. Similarly, you should be able to assign the POD
fields of partially initialized structs directly.

However, when working with uninitialized memory you need to be ever vigilant for Rust trying to
drop values you make like this before they're fully initialized. So every control path through
that variable's scope must initialize the value before it ends. *This includes code panicking*.
Again, POD types need not worry.

And that's about it for working with uninitialized memory! Basically nothing anywhere expects
to be handed uninitialized memory, so if you're going to pass it around at all, be sure to be
*really* careful.
