Commit 7588957

rewrap uninit
1 parent 14d11b3 commit 7588957

1 file changed

+98
-76
lines changed

uninitialized.md

Lines changed: 98 additions & 76 deletions
@@ -1,8 +1,14 @@
11
% Working With Uninitialized Memory
22

3-
All runtime-allocated memory in a Rust program begins its life as *uninitialized*. In this state the value of the memory is an indeterminate pile of bits that may or may not even reflect a valid state for the type that is supposed to inhabit that location of memory. Attempting to interpret this memory as a value of *any* type will cause Undefined Behaviour. Do Not Do This.
3+
All runtime-allocated memory in a Rust program begins its life as
4+
*uninitialized*. In this state the value of the memory is an indeterminate pile
5+
of bits that may or may not even reflect a valid state for the type that is
6+
supposed to inhabit that location of memory. Attempting to interpret this memory
7+
as a value of *any* type will cause Undefined Behaviour. Do Not Do This.
48

5-
Like C, all stack variables in Rust begin their life as uninitialized until a value is explicitly assigned to them. Unlike C, Rust statically prevents you from ever reading them until you do:
9+
Like C, all stack variables in Rust begin their life as uninitialized until a
10+
value is explicitly assigned to them. Unlike C, Rust statically prevents you
11+
from ever reading them until you do:
612

713
```rust
814
fn main() {
@@ -17,8 +23,11 @@ src/main.rs:3 println!("{}", x);
1723
^
1824
```
1925

20-
This is based off of a basic branch analysis: every branch must assign a value to `x` before it
21-
is first used. Interestingly, Rust doesn't require the variable to be mutable to perform a delayed initialization if every branch assigns exactly once. However the analysis does not take advantage of constant analysis or anything like that. So this compiles:
26+
This is based off of a basic branch analysis: every branch must assign a value
27+
to `x` before it is first used. Interestingly, Rust doesn't require the variable
28+
to be mutable to perform a delayed initialization if every branch assigns
29+
exactly once. However the analysis does not take advantage of constant analysis
30+
or anything like that. So this compiles:
2231

2332
```rust
2433
fn main() {
@@ -68,76 +77,88 @@ fn main() {
6877
}
6978
```
7079

71-
If a value is moved out of a variable, that variable becomes logically uninitialized if the type
72-
of the value isn't Copy. That is:
80+
If a value is moved out of a variable, that variable becomes logically
81+
uninitialized if the type of the value isn't Copy. That is:
7382

7483
```rust
7584
fn main() {
7685
let x = 0;
7786
let y = Box::new(0);
7887
let z1 = x; // x is still valid because i32 is Copy
79-
let z2 = y; // y has once more become logically uninitialized, since Box is not Copy
88+
let z2 = y; // y is now logically uninitialized because Box isn't Copy
8089
}
8190
```
8291

83-
However reassigning `y` in this example *would* require `y` to be marked as mutable, as a
84-
Safe Rust program could observe that the value of `y` changed. Otherwise the variable is
85-
exactly like new.
86-
87-
This raises an interesting question with respect to `Drop`: where does Rust
88-
try to call the destructor of a variable that is conditionally initialized?
89-
It turns out that Rust actually tracks whether a type should be dropped or not *at runtime*. As a
90-
variable becomes initialized and uninitialized, a *drop flag* for that variable is set and unset.
91-
When a variable goes out of scope or is assigned it evaluates whether the current value of the
92-
variable should be dropped. Of course, static analysis can remove these checks. If the compiler
93-
can prove that a value is guaranteed to be either initialized or not, then it can theoretically
94-
generate more efficient code! As such it may be desirable to structure code to have *static drop
95-
semantics* when possible.
96-
97-
As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a secret field of any type
98-
that implements Drop. The language sets the drop flag by overwriting the entire struct with a
99-
particular value. This is pretty obviously Not The Fastest and causes a bunch of trouble with
100-
optimizing code. As such work is currently under way to move the flags out onto the stack frame
101-
where they more reasonably belong. Unfortunately this work will take some time as it requires
102-
fairly substantial changes to the compiler.
103-
104-
So in general, Rust programs don't need to worry about uninitialized values on the stack for
105-
correctness. Although they might care for performance. Thankfully, Rust makes it easy to take
106-
control here! Uninitialized values are there, and Safe Rust lets you work with them, but you're
107-
never in trouble.
108-
109-
One interesting exception to this rule is working with arrays. Safe Rust doesn't permit you to
110-
partially initialize an array. When you initialize an array, you can either set every value to the
111-
same thing with `let x = [val; N]`, or you can specify each member individually with
112-
`let x = [val1, val2, val3]`. Unfortunately this is pretty rigid, especially if you need
113-
to initialize your array in a more incremental or dynamic way.
114-
115-
Unsafe Rust gives us a powerful tool to handle this problem: `std::mem::uninitialized`.
116-
This function pretends to return a value when really it does nothing at all. Using it, we can
117-
convince Rust that we have initialized a variable, allowing us to do trickier things with
118-
conditional and incremental initialization.
119-
120-
Unfortunately, this raises a tricky problem. Assignment has a different meaning to Rust based on
121-
whether it believes that a variable is initialized or not. If it's uninitialized, then Rust will
122-
semantically just memcopy the bits over the uninit ones, and do nothing else. However if Rust
123-
believes a value to be initialized, it will try to `Drop` the old value! Since we've tricked Rust
124-
into believing that the value is initialized, we can no longer safely use normal assignment.
125-
126-
This is also a problem if you're working with a raw system allocator, which of course returns a
127-
pointer to uninitialized memory.
128-
129-
To handle this, we must use the `std::ptr` module. In particular, it provides three functions that
130-
allow us to assign bytes to a location in memory without evaluating the old value: `write`, `copy`, and `copy_nonoverlapping`.
131-
132-
* `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed to by `ptr`.
133-
* `ptr::copy(src, dest, count)` copies the bits that `count` T's would occupy from src to dest. (this is equivalent to memmove -- note that the argument order is reversed!)
134-
* `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a little faster on the
135-
assumption that the two ranges of memory don't overlap. (this is equivalent to memcopy -- note that the argument order is reversed!)
136-
137-
It should go without saying that these functions, if misused, will cause serious havoc or just
138-
straight up Undefined Behaviour. The only things that these functions *themselves* require is that
139-
the locations you want to read and write are allocated. However the ways writing arbitrary bit
140-
patterns to arbitrary locations of memory can break things are basically uncountable!
92+
However reassigning `y` in this example *would* require `y` to be marked as
93+
mutable, as a Safe Rust program could observe that the value of `y` changed.
94+
Otherwise the variable is exactly like new.
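
A minimal sketch of the reassignment case, assuming the same `Box` value as the example above. The helper name `reinit` is hypothetical and exists only to make the behaviour easy to check:

```rust
// Hypothetical helper: moves out of `y`, then gives it a fresh value.
fn reinit() -> (Box<i32>, Box<i32>) {
    let mut y = Box::new(0); // `mut` is required only because `y` is assigned twice
    let z = y;               // `y` is now logically uninitialized
    y = Box::new(1);         // reinitialize `y`; there is no old value left to drop
    (y, z)
}

fn main() {
    let (y, z) = reinit();
    println!("{} {}", y, z);
}
```

Removing `mut` from the declaration of `y` should make the second assignment a compile error.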
95+
96+
This raises an interesting question with respect to `Drop`: where does Rust try
97+
to call the destructor of a variable that is conditionally initialized? It turns
98+
out that Rust actually tracks whether a type should be dropped or not *at
99+
runtime*. As a variable becomes initialized and uninitialized, a *drop flag* for
100+
that variable is set and unset. When a variable goes out of scope or is assigned
101+
it evaluates whether the current value of the variable should be dropped. Of
102+
course, static analysis can remove these checks. If the compiler can prove that
103+
a value is guaranteed to be either initialized or not, then it can theoretically
104+
generate more efficient code! As such it may be desirable to structure code to
105+
have *static drop semantics* when possible.
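
One way to observe conditional dropping is to log destructor calls. The sketch below is illustrative only: the `Noisy` type and the `run` helper are hypothetical names, and nothing here depends on how the compiler stores its drop flags.

```rust
use std::cell::RefCell;

// Hypothetical type that records its own drop in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Hypothetical helper: returns the names that were dropped, in order.
fn run(condition: bool) -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let x; // declared, but not yet initialized
        if condition {
            x = Noisy { name: "x", log: &log };
            let _ = &x; // read `x` so the assignment is not flagged as unused
        }
        // At the end of this scope the destructor runs only if `x` was
        // actually initialized, which is not known until runtime.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", run(true));
    println!("{:?}", run(false));
}
```

If the `if` is replaced with an unconditional assignment, the initialization state of `x` is known statically at every point, which is the situation the text calls static drop semantics.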
106+
107+
As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a secret
108+
field of any type that implements Drop. The language sets the drop flag by
109+
overwriting the entire struct with a particular value. This is pretty obviously
110+
Not The Fastest and causes a bunch of trouble with optimizing code. As such work
111+
is currently under way to move the flags out onto the stack frame where they
112+
more reasonably belong. Unfortunately this work will take some time as it
113+
requires fairly substantial changes to the compiler.
114+
115+
So in general, Rust programs don't need to worry about uninitialized values on
116+
the stack for correctness. Although they might care for performance. Thankfully,
117+
Rust makes it easy to take control here! Uninitialized values are there, and
118+
Safe Rust lets you work with them, but you're never in trouble.
119+
120+
One interesting exception to this rule is working with arrays. Safe Rust doesn't
121+
permit you to partially initialize an array. When you initialize an array, you
122+
can either set every value to the same thing with `let x = [val; N]`, or you can
123+
specify each member individually with `let x = [val1, val2, val3]`.
124+
Unfortunately this is pretty rigid, especially if you need to initialize your
125+
array in a more incremental or dynamic way.
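
The two safe forms look like this. The sketch also shows a common safe workaround, assuming the element type has a cheap placeholder value: fill the array first, then overwrite it incrementally. The helper name `squares` is hypothetical.

```rust
// Hypothetical helper: fills an array with a placeholder, then overwrites it.
fn squares() -> [u32; 4] {
    let mut arr = [0u32; 4]; // every element set to the same value
    for (i, slot) in arr.iter_mut().enumerate() {
        *slot = (i * i) as u32; // each element is then replaced in turn
    }
    arr
}

fn main() {
    let listed = [1, 2, 3]; // each member specified individually
    println!("{:?} {:?}", listed, squares());
}
```

The workaround writes every element twice and needs a placeholder value to exist at all, which is part of why the text calls the safe options rigid.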
126+
127+
Unsafe Rust gives us a powerful tool to handle this problem:
128+
`std::mem::uninitialized`. This function pretends to return a value when really
129+
it does nothing at all. Using it, we can convince Rust that we have initialized
130+
a variable, allowing us to do trickier things with conditional and incremental
131+
initialization.
132+
133+
Unfortunately, this raises a tricky problem. Assignment has a different meaning
134+
to Rust based on whether it believes that a variable is initialized or not. If
135+
it's uninitialized, then Rust will semantically just memcopy the bits over the
136+
uninit ones, and do nothing else. However if Rust believes a value to be
137+
initialized, it will try to `Drop` the old value! Since we've tricked Rust into
138+
believing that the value is initialized, we can no longer safely use normal
139+
assignment.
140+
141+
This is also a problem if you're working with a raw system allocator, which of
142+
course returns a pointer to uninitialized memory.
143+
144+
To handle this, we must use the `std::ptr` module. In particular, it provides
145+
three functions that allow us to assign bytes to a location in memory without
146+
evaluating the old value: `write`, `copy`, and `copy_nonoverlapping`.
147+
148+
* `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed
149+
to by `ptr`.
150+
* `ptr::copy(src, dest, count)` copies the bits that `count` T's would occupy
151+
from src to dest. (this is equivalent to memmove -- note that the argument
152+
order is reversed!)
153+
* `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a
154+
little faster on the assumption that the two ranges of memory don't overlap.
155+
(this is equivalent to memcopy -- note that the argument order is reversed!)
156+
157+
It should go without saying that these functions, if misused, will cause serious
158+
havoc or just straight up Undefined Behaviour. The only things that these
159+
functions *themselves* require is that the locations you want to read and write
160+
are allocated. However the ways writing arbitrary bit patterns to arbitrary
161+
locations of memory can break things are basically uncountable!
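
A hedged sketch of the three functions in use. Note one substitution: `std::mem::uninitialized` has since been deprecated in favour of `std::mem::MaybeUninit`, so this sketch uses `MaybeUninit` rather than the function named in the text. The helper names `build_boxes` and `copy_demo` are hypothetical.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

// Hypothetical helper: builds an array of boxes one element at a time.
fn build_boxes() -> [Box<u32>; 4] {
    // An array of uninitialized slots. `MaybeUninit` never drops its
    // contents, so leaving a slot unwritten cannot trigger a bogus Drop.
    let mut slots: [MaybeUninit<Box<u32>>; 4] =
        unsafe { MaybeUninit::uninit().assume_init() };

    for (i, slot) in slots.iter_mut().enumerate() {
        // `ptr::write` moves the value in without reading or dropping
        // whatever bits were there before.
        unsafe { ptr::write(slot.as_mut_ptr(), Box::new(i as u32)) };
    }

    // Every slot has been written, so the array can now be reinterpreted.
    unsafe { mem::transmute::<_, [Box<u32>; 4]>(slots) }
}

// Hypothetical helper: contrasts `copy` and `copy_nonoverlapping`.
fn copy_demo() -> ([u8; 5], [u8; 3]) {
    let mut buf = [1u8, 2, 3, 4, 5];
    let src = [7u8, 8, 9];
    let mut dst = [0u8; 3];
    unsafe {
        // Overlapping ranges: shift the first four bytes right by one.
        let p = buf.as_mut_ptr();
        ptr::copy(p, p.add(1), 4);
        // Disjoint ranges: the faster variant is allowed here.
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 3);
    }
    (buf, dst)
}

fn main() {
    println!("{:?}", build_boxes());
    println!("{:?}", copy_demo());
}
```

Both copy functions take the source first and the destination second, the opposite of C's `memmove` and `memcpy`.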
141162

142163
Putting this all together, we get the following:
143164

@@ -164,16 +185,17 @@ fn main() {
164185
}
165186
```
166187

167-
It's worth noting that you don't need to worry about ptr::write-style shenanigans with
168-
Plain Old Data (POD; types which don't implement Drop, nor contain Drop types),
169-
because Rust knows not to try to Drop them. Similarly you should be able to assign the POD
170-
fields of partially initialized structs directly.
188+
It's worth noting that you don't need to worry about ptr::write-style
189+
shenanigans with Plain Old Data (POD; types which don't implement Drop, nor
190+
contain Drop types), because Rust knows not to try to Drop them. Similarly you
191+
should be able to assign the POD fields of partially initialized structs
192+
directly.
171193

172-
However when working with uninitialized memory you need to be ever vigilant for Rust trying to
173-
Drop values you make like this before they're fully initialized. So every control path through
174-
that variable's scope must initialize the value before it ends. *This includes code panicking*.
175-
Again, POD types need not worry.
194+
However when working with uninitialized memory you need to be ever vigilant for
195+
Rust trying to Drop values you make like this before they're fully initialized.
196+
So every control path through that variable's scope must initialize the value
197+
before it ends. *This includes code panicking*. Again, POD types need not worry.
176198

177-
And that's about it for working with uninitialized memory! Basically nothing anywhere expects
178-
to be handed uninitialized memory, so if you're going to pass it around at all, be sure to be
179-
*really* careful.
199+
And that's about it for working with uninitialized memory! Basically nothing
200+
anywhere expects to be handed uninitialized memory, so if you're going to pass
201+
it around at all, be sure to be *really* careful.
