Commit d4268f9

shard out and clean up unwinding
1 parent bdc62e0 commit d4268f9

4 files changed

+264
-256
lines changed

SUMMARY.md

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@
30 30
* [Destructors](destructors.md)
31 31
* [Leaking](leaking.md)
32 32
* [Unwinding](unwinding.md)
33+
* [Exception Safety](exception-safety.md)
34+
* [Poisoning](poisoning.md)
33 35
* [Concurrency](concurrency.md)
34 36
* [Races](races.md)
35 37
* [Send and Sync](send-and-sync.md)

exception-safety.md

Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
1+
% Exception Safety
2+
3+
Although programs should use unwinding sparingly, there's *a lot* of code that
4+
*can* panic. If you unwrap a None, index out of bounds, or divide by 0, your
5+
program *will* panic. On debug builds, *every* arithmetic operation can panic
6+
if it overflows. Unless you are very careful and tightly control what code runs,
7+
pretty much everything can unwind, and you need to be ready for it.
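A minimal sketch of how ordinary these panics are, using `std::panic::catch_unwind` to observe each one. The helper names (`panics`, `unwrap_it`, `index_it`, `divide`) are hypothetical and exist only for this illustration; the overflow case is left out because it depends on the build profile.

```rust
use std::panic;

/// Returns true if running `f` unwinds (hypothetical helper).
fn panics<R>(f: impl FnOnce() -> R + panic::UnwindSafe) -> bool {
    panic::catch_unwind(f).is_err()
}

// Each of these looks harmless but panics for some inputs.
fn unwrap_it(x: Option<u32>) -> u32 { x.unwrap() }
fn index_it(v: &[u32], i: usize) -> u32 { v[i] }
fn divide(a: u32, b: u32) -> u32 { a / b }

fn main() {
    // Silence the default panic message so the demo output stays readable.
    panic::set_hook(Box::new(|_| {}));

    println!("unwrap on None panics: {}", panics(|| unwrap_it(None)));
    println!("index out of bounds panics: {}", panics(|| index_it(&[1, 2, 3], 3)));
    println!("divide by zero panics: {}", panics(|| divide(1, 0)));
    println!("ordinary division panics: {}", panics(|| divide(6, 3)));
}
```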
8+
9+
Being ready for unwinding is often referred to as *exception safety*
10+
in the broader programming world. In Rust, there are two levels of exception
11+
safety that one may concern themselves with:
12+
13+
* In unsafe code, we *must* be exception safe to the point of not violating
14+
memory safety. We'll call this *minimal* exception safety.
15+
16+
* In safe code, it is *good* to be exception safe to the point of your program
17+
doing the right thing. We'll call this *maximal* exception safety.
18+
19+
As is the case in many places in Rust, Unsafe code must be ready to deal with
20+
bad Safe code when it comes to unwinding. Code that transiently creates
21+
unsound states must be careful that a panic does not cause that state to be
22+
used. Generally this means ensuring that only non-panicking code is run while
23+
these states exist, or making a guard that cleans up the state in the case of
24+
a panic. This does not necessarily mean that the state a panic witnesses is a
25+
fully *coherent* state. We need only guarantee that it's a *safe* state.
26+
27+
Most Unsafe code is leaf-like, and therefore fairly easy to make exception-safe.
28+
It controls all the code that runs, and most of that code can't panic. However
29+
it is not uncommon for Unsafe code to work with arrays of temporarily
30+
uninitialized data while repeatedly invoking caller-provided code. Such code
31+
needs to be careful and consider exception safety.
32+
33+
34+
35+
36+
37+
## Vec::push_all
38+
39+
`Vec::push_all` is a temporary hack to get extending a Vec by a slice reliably
40+
efficient without specialization. Here's a simple implementation:
41+
42+
```rust,ignore
impl<T: Clone> Vec<T> {
    fn push_all(&mut self, to_push: &[T]) {
        self.reserve(to_push.len());
        unsafe {
            let old_len = self.len();
            // can't overflow because we just reserved this
            self.set_len(old_len + to_push.len());

            for (i, x) in to_push.iter().enumerate() {
                // write past the existing elements, not over them
                self.ptr().offset((old_len + i) as isize).write(x.clone());
            }
        }
    }
}
```
57+
58+
We bypass `push` in order to avoid redundant capacity and `len` checks on the
59+
Vec that we definitely know has capacity. The logic is totally correct, except
60+
there's a subtle problem with our code: it's not exception-safe! `set_len`,
61+
`offset`, and `write` are all fine, but *clone* is the panic bomb we overlooked.
62+
63+
Clone is completely out of our control, and is totally free to panic. If it does,
64+
our function will exit early with the length of the Vec set too large. If
65+
the Vec is looked at or dropped, uninitialized memory will be read!
66+
67+
The fix in this case is fairly simple. If we want to guarantee that the values
68+
we *did* clone are dropped we can set the len *in* the loop. If we just want to
69+
guarantee that uninitialized memory can't be observed, we can set the len *after*
70+
the loop.
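As a rough sketch of the first option, here is a free-function variant that bumps the length inside the loop. It is written against the real `Vec` API (`as_mut_ptr`, `set_len`) rather than the `ptr()` accessor used above, and the `Bomb` type is a hypothetical element whose `clone` panics on demand.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical exception-safe `push_all` as a free function.
fn push_all<T: Clone>(v: &mut Vec<T>, to_push: &[T]) {
    v.reserve(to_push.len());
    let old_len = v.len();
    unsafe {
        // Points at the first uninitialized slot of the reserved capacity.
        let dst = v.as_mut_ptr().add(old_len);
        for (i, x) in to_push.iter().enumerate() {
            // `clone` may panic here; at that point `len` still covers
            // only initialized elements.
            dst.add(i).write(x.clone());
            // Publish the new element so it is dropped if a later clone panics.
            v.set_len(old_len + i + 1);
        }
    }
}

/// Hypothetical element type: cloning panics when the flag is set.
struct Bomb(bool);

impl Clone for Bomb {
    fn clone(&self) -> Self {
        if self.0 {
            panic!("clone failed");
        }
        Bomb(false)
    }
}

fn main() {
    let src = [Bomb(false), Bomb(false), Bomb(true)];
    let mut v = Vec::new();
    let result = panic::catch_unwind(AssertUnwindSafe(|| push_all(&mut v, &src)));
    assert!(result.is_err());
    // Only the two successfully cloned elements are visible.
    assert_eq!(v.len(), 2);
}
```

Setting the length once after the loop would also be sound, but the elements cloned before the panic would then be leaked instead of dropped.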
71+
72+
73+
74+
75+
76+
## BinaryHeap::sift_up
77+
78+
Bubbling an element up a heap is a bit more complicated than extending a Vec.
79+
The pseudocode is as follows:
80+
81+
```text
bubble_up(heap, index):
    while index != 0 && heap[index] < heap[parent(index)]:
        heap.swap(index, parent(index))
        index = parent(index)
```
88+
89+
A literal transcription of this code to Rust is totally fine, but has an annoying
90+
performance characteristic: the element being sifted is swapped over and over again
91+
uselessly. We would *rather* have the following:
92+
93+
```text
bubble_up(heap, index):
    let elem = heap[index]
    while index != 0 && elem < heap[parent(index)]:
        heap[index] = heap[parent(index)]
        index = parent(index)
    heap[index] = elem
```
101+
102+
This code ensures that each element is copied as little as possible (it is in
103+
fact necessary that elem be copied twice in general). However it now exposes
104+
some exception safety trouble! At all times, there exist two copies of one
105+
value. If we panic in this function something will be double-dropped.
106+
Unfortunately, we also don't have full control of the code: that comparison is
107+
user-defined!
108+
109+
Unlike Vec, the fix isn't as easy here. One option is to break the user-defined
110+
code and the unsafe code into two separate phases:
111+
112+
```text
bubble_up(heap, index):
    let end_index = index;
    while end_index != 0 && heap[index] < heap[parent(end_index)]:
        end_index = parent(end_index)

    let elem = heap[index]
    while index != end_index:
        heap[index] = heap[parent(index)]
        index = parent(index)
    heap[index] = elem
```
124+
125+
If the user-defined code blows up, that's no problem anymore, because we haven't
126+
actually touched the state of the heap yet. Once we do start messing with the
127+
heap, we're working with only data and functions that we trust, so there's no
128+
concern of panics.
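A minimal Rust sketch of the two-phase approach, assuming a min-heap stored in a slice. The `parent` helper and the free-function signature are assumptions made for illustration; this is not the standard library's implementation. Phase one compares the sifted element against each ancestor without modifying anything, and phase two performs only raw moves.

```rust
use std::ptr;

/// Parent index in an implicit binary heap (hypothetical helper).
fn parent(i: usize) -> usize {
    (i - 1) / 2
}

/// Two-phase sift-up on a slice treated as a min-heap.
fn sift_up<T: Ord>(heap: &mut [T], mut index: usize) {
    assert!(index < heap.len());

    // Phase 1: only the untrusted comparison runs. The heap is untouched,
    // so a panic here leaves it exactly as it was.
    let mut end_index = index;
    while end_index != 0 && heap[index] < heap[parent(end_index)] {
        end_index = parent(end_index);
    }

    // Phase 2: only raw moves that cannot panic, so the transient
    // duplicate is never observed.
    unsafe {
        let base = heap.as_mut_ptr();
        let elem = ptr::read(base.add(index));
        while index != end_index {
            let p = parent(index);
            // Shift the ancestor down into the current hole.
            ptr::copy_nonoverlapping(base.add(p), base.add(index), 1);
            index = p;
        }
        ptr::write(base.add(index), elem);
    }
}

fn main() {
    let mut heap = vec![1, 3, 5, 7, 0];
    sift_up(&mut heap, 4);
    println!("{:?}", heap);
}
```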
129+
130+
Perhaps you're not happy with this design. Surely, it's cheating! And we have
131+
to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's
132+
intermix untrusted and unsafe code *for reals*.
133+
134+
If Rust had `try` and `finally` like in Java, we could do the following:
135+
136+
```text
bubble_up(heap, index):
    let elem = heap[index]
    try:
        while index != 0 && elem < heap[parent(index)]:
            heap[index] = heap[parent(index)]
            index = parent(index)
    finally:
        heap[index] = elem
```
146+
147+
The basic idea is simple: if the comparison panics, we just toss the loose
148+
element in the logically uninitialized index and bail out. Anyone who observes
149+
the heap will see a potentially *inconsistent* heap, but at least it won't
150+
cause any double-drops! If the algorithm terminates normally, then this
151+
operation happens to coincide precisely with how we finish up regardless.
152+
153+
Sadly, Rust has no such construct, so we're going to need to roll our own! The
154+
way to do this is to store the algorithm's state in a separate struct with a
155+
destructor for the "finally" logic. Whether we panic or not, that destructor
156+
will run and clean up after us.
157+
158+
```rust,ignore
use std::ptr;

// `BinaryHeap<T>` (with a `data: Vec<T>` field) and `parent(index)` are
// assumed to be defined elsewhere.

struct Hole<'a, T: 'a> {
    data: &'a mut [T],
    /// `elt` is always `Some` from new until drop.
    elt: Option<T>,
    pos: usize,
}

impl<'a, T> Hole<'a, T> {
    fn new(data: &'a mut [T], pos: usize) -> Self {
        unsafe {
            let elt = ptr::read(&data[pos]);
            Hole {
                data: data,
                elt: Some(elt),
                pos: pos,
            }
        }
    }

    fn pos(&self) -> usize { self.pos }

    fn removed(&self) -> &T { self.elt.as_ref().unwrap() }

    unsafe fn get(&self, index: usize) -> &T { &self.data[index] }

    unsafe fn move_to(&mut self, index: usize) {
        let index_ptr: *const _ = &self.data[index];
        let hole_ptr = &mut self.data[self.pos];
        ptr::copy_nonoverlapping(index_ptr, hole_ptr, 1);
        self.pos = index;
    }
}

impl<'a, T> Drop for Hole<'a, T> {
    fn drop(&mut self) {
        // fill the hole again
        unsafe {
            let pos = self.pos;
            ptr::write(&mut self.data[pos], self.elt.take().unwrap());
        }
    }
}

impl<T: Ord> BinaryHeap<T> {
    fn sift_up(&mut self, pos: usize) {
        unsafe {
            // Take out the value at `pos` and create a hole.
            let mut hole = Hole::new(&mut self.data, pos);

            while hole.pos() != 0 {
                let parent = parent(hole.pos());
                if hole.removed() <= hole.get(parent) { break }
                hole.move_to(parent);
            }
            // Hole will be unconditionally filled here; panic or not!
        }
    }
}
```

poisoning.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
1+
% Poisoning
2+
3+
Although all unsafe code *must* ensure it has minimal exception safety, not all
4+
types ensure *maximal* exception safety. Even if the type does, your code may
5+
ascribe additional meaning to it. For instance, an integer is certainly
6+
exception-safe, but has no semantics on its own. It's possible that code that
7+
panics could fail to correctly update the integer, producing an inconsistent
8+
program state.
9+
10+
This is *usually* fine, because anything that witnesses an exception is about
11+
to get destroyed. For instance, if you send a Vec to another thread and that
12+
thread panics, it doesn't matter if the Vec is in a weird state. It will be
13+
dropped and go away forever. However some types are especially good at smuggling
14+
values across the panic boundary.
15+
16+
These types may choose to explicitly *poison* themselves if they witness a panic.
17+
Poisoning doesn't entail anything in particular. Generally it just means
18+
preventing normal usage from proceeding. The most notable example of this is the
19+
standard library's Mutex type. A Mutex will poison itself if one of its
20+
MutexGuards (the thing it returns when a lock is obtained) is dropped during a
21+
panic. Any future attempts to lock the Mutex will return an `Err` or panic.
22+
23+
Mutex poisons not for *true* safety in the sense that Rust normally cares about. It
24+
poisons as a safety-guard against blindly using the data that comes out of a Mutex
25+
that has witnessed a panic while locked. The data in such a Mutex was likely in the
26+
middle of being modified, and as such may be in an inconsistent or incomplete state.
27+
It is important to note that one cannot violate memory safety with such a type
28+
if it is correctly written. After all, it must be minimally exception-safe!
29+
30+
However if the Mutex contained, say, a BinaryHeap that does not actually have the
31+
heap property, it's unlikely that any code that uses it will do
32+
what the author intended. As such, the program should not proceed normally.
33+
Still, if you're double-plus-sure that you can do *something* with the value,
34+
the Mutex exposes a method to get the lock anyway. It *is* safe, after all.
35+
Just maybe nonsense.
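A small sketch of poisoning in practice, using the standard library's `Mutex`. The `poison` helper is hypothetical; it just panics on another thread while the lock is held. Recovery here goes through `PoisonError::into_inner`, which hands back the guard from the `Err` that `lock` returns.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Panics on another thread while holding the lock (hypothetical helper).
fn poison(m: &Arc<Mutex<Vec<i32>>>) {
    let m2 = Arc::clone(m);
    let _ = thread::spawn(move || {
        let mut guard = m2.lock().unwrap();
        guard.push(1);
        // The guard is dropped during unwinding, which poisons the Mutex.
        panic!("interrupted mid-update");
    })
    .join();
}

fn main() {
    let m = Arc::new(Mutex::new(Vec::new()));
    poison(&m);

    // Normal locking now reports the poison as an `Err`.
    assert!(m.is_poisoned());
    assert!(m.lock().is_err());

    // If we are sure the data is still usable, we can take the guard anyway.
    let guard = m.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    assert_eq!(*guard, vec![1]);
}
```

The recovered value is memory-safe to use; whether it is *meaningful* is up to the caller to decide.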
