| 1 | +% Working With Uninitialized Memory |
| 2 | + |
| 3 | +All runtime-allocated memory in a Rust program begins its life as *uninitialized*. In this state the value of the memory is an indeterminate pile of bits that may or may not even reflect a valid state for the type that is supposed to inhabit that location of memory. Attempting to interpret this memory as a value of *any* type will cause Undefined Behaviour. Do Not Do This. |
| 4 | + |
| 5 | +Like C, all stack variables in Rust begin their life as uninitialized until a value is explicitly assigned to them. Unlike C, Rust statically prevents you from ever reading them until you do: |
| 6 | + |
| 7 | +```rust |
| 8 | +fn main() { |
| 9 | +    let x: i32; |
| 10 | +    println!("{}", x); |
| 11 | +} |
| 12 | +``` |
| 13 | + |
| 14 | +```text |
| 15 | +src/main.rs:3:20: 3:21 error: use of possibly uninitialized variable: `x` |
| 16 | +src/main.rs:3 println!("{}", x); |
| 17 | + ^ |
| 18 | +``` |
| 19 | + |
| 20 | +This is based on a basic branch analysis: every branch must assign a value to `x` before it |
| 21 | +is first used. Interestingly, Rust doesn't require the variable to be mutable to perform a delayed initialization if every branch assigns exactly once. However, the analysis does not take advantage of constant analysis or anything like that (it will not notice that an `if true` branch always runs). So this compiles: |
| 22 | + |
| 23 | +```rust |
| 24 | +fn main() { |
| 25 | +    let x: i32; |
| 26 | +    let y: i32; |
| 27 | + |
| 28 | +    y = 1; |
| 29 | + |
| 30 | +    if true { |
| 31 | +        x = 1; |
| 32 | +    } else { |
| 33 | +        x = 2; |
| 34 | +    } |
| 35 | + |
| 36 | +    println!("{} {}", x, y); |
| 37 | +} |
| 38 | +``` |
| 39 | + |
| 40 | +but this doesn't: |
| 41 | + |
| 42 | +```rust |
| 43 | +fn main() { |
| 44 | +    let x: i32; |
| 45 | +    if true { |
| 46 | +        x = 1; |
| 47 | +    } |
| 48 | +    println!("{}", x); |
| 49 | +} |
| 50 | +``` |
| 51 | + |
| 52 | +```text |
| 53 | +src/main.rs:6:17: 6:18 error: use of possibly uninitialized variable: `x` |
| 54 | +src/main.rs:6 println!("{}", x); |
| 55 | +``` |
| 56 | + |
| 57 | +while this does: |
| 58 | + |
| 59 | +```rust |
| 60 | +fn main() { |
| 61 | +    let x: i32; |
| 62 | +    if true { |
| 63 | +        x = 1; |
| 64 | +        println!("{}", x); |
| 65 | +    } |
| 66 | +    // Don't care that there are branches where it's not initialized |
| 67 | +    // since we don't use the value in those branches |
| 68 | +} |
| 69 | +``` |
| 70 | + |
| 71 | +If a value is moved out of a variable, that variable becomes logically uninitialized if the type |
| 72 | +of the value isn't Copy. That is: |
| 73 | + |
| 74 | +```rust |
| 75 | +fn main() { |
| 76 | +    let x = 0; |
| 77 | +    let y = Box::new(0); |
| 78 | +    let z1 = x; // x is still valid because i32 is Copy |
| 79 | +    let z2 = y; // y has once more become logically uninitialized, since Box is not Copy |
| 80 | +} |
| 81 | +``` |
| 82 | + |
| 83 | +However reassigning `y` in this example *would* require `y` to be marked as mutable, as a |
| 84 | +Safe Rust program could observe that the value of `y` changed. Otherwise the variable is |
| 85 | +exactly like new. |
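
To make the point about re-initialization concrete, here is a minimal sketch. The function name `reinit_after_move` is made up for illustration; it moves a `Box` out of a variable and then assigns a fresh value to that same variable, which is why the binding has to be `mut`.

```rust
// Moves a Box out of `y`, then re-initializes `y` with a fresh value.
fn reinit_after_move() -> (i32, i32) {
    let mut y = Box::new(1); // `mut` is required because `y` is assigned twice
    let moved = y;           // `y` is now logically uninitialized
    y = Box::new(2);         // re-initialize `y`; there is no old value to drop
    (*moved, *y)
}
```

Removing `mut` from the declaration of `y` should make the second assignment a compile error, since Safe Rust could otherwise observe an immutable variable changing value.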
| 86 | + |
| 87 | +This raises an interesting question with respect to `Drop`: where does Rust |
| 88 | +try to call the destructor of a variable that is conditionally initialized? |
| 89 | +It turns out that Rust actually tracks whether a variable should be dropped or not *at runtime*. As a |
| 90 | +variable becomes initialized and uninitialized, a *drop flag* for that variable is set and unset. |
| 91 | +When a variable goes out of scope or is assigned to, Rust checks the flag to decide whether the current |
| 92 | +value of the variable should be dropped. Of course, static analysis can remove these checks. If the compiler |
| 93 | +can prove that a value is guaranteed to be either initialized or not, then it can theoretically |
| 94 | +generate more efficient code! As such it may be desirable to structure code to have *static drop |
| 95 | +semantics* when possible. |
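
The observable effect of this tracking can be sketched as follows. `Noisy` and `drops_observed` are hypothetical names invented for this example; the type simply counts how many times its destructor runs, so we can see that a conditionally initialized variable is only dropped on the paths where it was actually assigned.

```rust
use std::cell::Cell;

// Hypothetical helper type: bumps a shared counter when it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Returns how many destructors ran for a conditionally initialized variable.
fn drops_observed(init: bool) -> u32 {
    let drops = Cell::new(0);
    {
        let _x: Noisy;
        if init {
            _x = Noisy(&drops);
        }
        // End of scope: `_x` is dropped only if the branch above ran.
    }
    drops.get()
}
```

Because `init` is only known at runtime, the compiler cannot decide statically whether to drop `_x` here, which is exactly the situation the drop flag exists for.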
| 96 | + |
| 97 | +As of Rust 1.0, the drop flags are actually not-so-secretly stashed in a secret field of any type |
| 98 | +that implements Drop. The language sets the drop flag by overwriting the entire struct with a |
| 99 | +particular value. This is pretty obviously Not The Fastest and causes a bunch of trouble with |
| 100 | +optimizing code. As such work is currently under way to move the flags out onto the stack frame |
| 101 | +where they more reasonably belong. Unfortunately this work will take some time as it requires |
| 102 | +fairly substantial changes to the compiler. |
| 103 | + |
| 104 | +So in general, Rust programs don't need to worry about uninitialized values on the stack for |
| 105 | +correctness, although they might care for performance. Thankfully, Rust makes it easy to take |
| 106 | +control here! Uninitialized values are there, and Safe Rust lets you work with them, but you're |
| 107 | +never in trouble. |
| 108 | + |
| 109 | +One interesting exception to this rule is working with arrays. Safe Rust doesn't permit you to |
| 110 | +partially initialize an array. When you initialize an array, you can either set every value to the |
| 111 | +same thing with `let x = [val; N]`, or you can specify each member individually with |
| 112 | +`let x = [val1, val2, val3]`. Unfortunately this is pretty rigid, especially if you need |
| 113 | +to initialize your array in a more incremental or dynamic way. |
| 114 | + |
| 115 | +Unsafe Rust gives us a powerful tool to handle this problem: `std::mem::uninitialized`. |
| 116 | +This function pretends to return a value when really it does nothing at all. Using it, we can |
| 117 | +convince Rust that we have initialized a variable, allowing us to do trickier things with |
| 118 | +conditional and incremental initialization. |
| 119 | + |
| 120 | +Unfortunately, this raises a tricky problem. Assignment has a different meaning to Rust based on |
| 121 | +whether it believes that a variable is initialized or not. If it's uninitialized, then Rust will |
| 122 | +semantically just memcopy the bits over the uninit ones, and do nothing else. However if Rust |
| 123 | +believes a value to be initialized, it will try to `Drop` the old value! Since we've tricked Rust |
| 124 | +into believing that the value is initialized, we can no longer safely use normal assignment. |
| 125 | + |
| 126 | +This is also a problem if you're working with a raw system allocator, which of course returns a |
| 127 | +pointer to uninitialized memory. |
| 128 | + |
| 129 | +To handle this, we must use the `std::ptr` module. In particular, it provides three functions that |
| 130 | +allow us to assign bytes to a location in memory without evaluating the old value: `write`, `copy`, and `copy_nonoverlapping`. |
| 131 | + |
| 132 | +* `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed to by `ptr`. |
| 133 | +* `ptr::copy(src, dest, count)` copies the bits that `count` T's would occupy from src to dest. (this is equivalent to memmove -- note that the argument order is reversed!) |
| 134 | +* `ptr::copy_nonoverlapping(src, dest, count)` does what `copy` does, but a little faster on the |
| 135 | +assumption that the two ranges of memory don't overlap. (this is equivalent to memcpy -- note that the argument order is reversed!) |
| 136 | + |
| 137 | +It should go without saying that these functions, if misused, will cause serious havoc or just |
| 138 | +straight up Undefined Behaviour. The only thing that these functions *themselves* require is that |
| 139 | +the locations you want to read and write are allocated. However, the ways in which writing arbitrary bit |
| 140 | +patterns to arbitrary locations of memory can break things are basically uncountable! |
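
As a rough illustration of the three functions, the sketch below uses them on memory from the system allocator and on plain byte arrays. The function names (`heap_roundtrip`, `shift_right`, `duplicate`) are invented for this example, and `ptr::read`, which is not discussed above, is assumed here only as a way to move the value back out before freeing the allocation.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

// Stores `value` in freshly allocated (uninitialized) memory and reads it back.
fn heap_roundtrip(value: String) -> String {
    let layout = Layout::new::<String>();
    unsafe {
        let p = alloc(layout) as *mut String;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        // `*p = value` would first try to drop the garbage already at `p`.
        ptr::write(p, value);
        // Move the value back out, leaving `*p` logically uninitialized again.
        let out = ptr::read(p);
        dealloc(p as *mut u8, layout);
        out
    }
}

// Shifts the first four bytes one slot to the right. The source and
// destination ranges overlap, so this needs `copy` (memmove semantics).
fn shift_right(mut buf: [u8; 5]) -> [u8; 5] {
    unsafe {
        let p = buf.as_mut_ptr();
        ptr::copy(p, p.add(1), 4);
    }
    buf
}

// Copies between two distinct arrays, which cannot overlap.
fn duplicate(src: [u8; 4]) -> [u8; 4] {
    let mut dst = [0u8; 4];
    unsafe {
        ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
    dst
}
```

Note that in all three cases the source comes first and the destination second, the opposite of C's `memmove` and `memcpy`.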
| 141 | + |
| 142 | +Putting this all together, we get the following: |
| 143 | + |
| 144 | +```rust |
| 145 | +fn main() { |
| 146 | +    use std::{mem, ptr}; |
| 147 | + |
| 148 | +    // size of the array is hard-coded but easy to change. This means we can't |
| 149 | +    // use [a, b, c] syntax to initialize the array, though! |
| 150 | +    const SIZE: usize = 10; |
| 151 | + |
| 152 | +    // `mut` is needed because we write through `&mut x[i]` below |
| 153 | +    let mut x: [Box<u32>; SIZE]; |
| 154 | +    unsafe { |
| 155 | +        // convince Rust that x is Totally Initialized |
| 156 | +        x = mem::uninitialized(); |
| 157 | +        for i in 0..SIZE { |
| 158 | +            // very carefully overwrite each index without reading it |
| 159 | +            ptr::write(&mut x[i], Box::new(i as u32)); |
| 160 | +        } |
| 161 | +    } |
| 162 | + |
| 163 | +    // arrays implement Debug, not Display |
| 164 | +    println!("{:?}", x); |
| 164 | +} |
| 165 | +``` |
| 166 | + |
| 167 | +It's worth noting that you don't need to worry about ptr::write-style shenanigans with |
| 168 | +Plain Old Data (POD; types which don't implement Drop, nor contain Drop types), |
| 169 | +because Rust knows not to try to Drop them. Similarly you should be able to assign the POD |
| 170 | +fields of partially initialized structs directly. |
| 171 | + |
| 172 | +However when working with uninitialized memory you need to be ever vigilant for Rust trying to |
| 173 | +Drop values you make like this before they're fully initialized. So every control path through |
| 174 | +that variable's scope must initialize the value before it ends. *This includes code panicking*. |
| 175 | +Again, POD types need not worry. |
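
On current toolchains `std::mem::uninitialized` is deprecated in favour of `std::mem::MaybeUninit`, so the same incremental array initialization can be sketched with that type instead. This swaps in a different technique from the one shown above; the function name `boxed_indices` is made up for illustration, and the sketch assumes a compiler recent enough to provide `MaybeUninit::write`.

```rust
use std::mem::{self, MaybeUninit};

const SIZE: usize = 10;

// Builds `[Box<u32>; SIZE]` one element at a time.
fn boxed_indices() -> [Box<u32>; SIZE] {
    // An array of `MaybeUninit` slots is allowed to start out uninitialized,
    // because `MaybeUninit` itself requires no initialization.
    let mut slots: [MaybeUninit<Box<u32>>; SIZE] =
        unsafe { MaybeUninit::uninit().assume_init() };

    for (i, slot) in slots.iter_mut().enumerate() {
        // `write` overwrites the slot without dropping its old contents.
        slot.write(Box::new(i as u32));
    }

    // Every slot is now initialized, so the array can be reinterpreted
    // as `[Box<u32>; SIZE]`.
    unsafe { mem::transmute::<_, [Box<u32>; SIZE]>(slots) }
}
```

One reason to prefer this shape is the panic problem described above: `MaybeUninit` never runs destructors on its contents, so if `Box::new` panicked halfway through the loop, the boxes written so far would be leaked rather than having garbage slots dropped.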
| 176 | + |
| 177 | +And that's about it for working with uninitialized memory! Basically nothing anywhere expects |
| 178 | +to be handed uninitialized memory, so if you're going to pass it around at all, be sure to be |
| 179 | +*really* careful. |