Commit 5d4f854

committed
so much unwinding
1 parent 10af239 commit 5d4f854

1 file changed

+228
-16
lines changed


unwinding.md

Lines changed: 228 additions & 16 deletions
@@ -28,10 +28,9 @@ Rust very poor for long-running systems!
2828
As the Rust we know today came to be, this style of programming grew out of
2929
fashion in the push for less-and-less abstraction. Light-weight tasks were
3030
killed in the name of heavy-weight OS threads. Still, panics could only be
31-
caught by the parent thread. This meant catching a panic required spinning up
32-
an entire OS thread! Although Rust maintains the philosophy that panics should
33-
not be used for "basic" error-handling like C++ or Java, it is still desirable
34-
to not have the entire program crash in the face of a panic.
31+
caught by the parent thread. This means catching a panic requires spinning up
32+
an entire OS thread! This unfortunately conflicts with Rust's philosophy
33+
of zero-cost abstractions.
3534

3635
In the near future there will be a stable interface for catching panics in an
3736
arbitrary location, though we would encourage you to still only do this
@@ -40,14 +39,14 @@ optimized for the "doesn't unwind" case. If a program doesn't unwind, there
4039
should be no runtime cost for the program being *ready* to unwind. As a
4140
consequence, *actually* unwinding will be more expensive than in e.g. Java.
4241
Don't build your programs to unwind under normal circumstances. Ideally, you
43-
should only panic for programming errors.
42+
should only panic for programming errors or *extreme* problems.
4443

4544

4645

4746

4847
# Exception Safety
4948

50-
Being ready for unwinding is often referred to as "exception safety"
49+
Being ready for unwinding is often referred to as *exception safety*
5150
in the broader programming world. In Rust, there are two levels of exception
5251
safety that one may concern themselves with:
5352

@@ -60,23 +59,236 @@ safety that one may concern themselves with:
6059
As is the case in many places in Rust, unsafe code must be ready to deal with
6160
bad safe code, and that includes code that panics. Code that transiently creates
6261
unsound states must be careful that a panic does not cause that state to be
63-
used. Generally this means ensuring that only non-panicing code is run while
62+
used. Generally this means ensuring that only non-panicking code is run while
6463
these states exist, or making a guard that cleans up the state in the case of
6564
a panic. This does not necessarily mean that the state a panic witnesses is a
6665
fully *coherent* state. We need only guarantee that it's a *safe* state.
6766

68-
For instance, consider extending a Vec:
67+
Most unsafe code is leaf-like, and therefore fairly easy to make exception-safe.
68+
It controls all the code that runs, and most of that code can't panic. However,
69+
it is often the case that code that works with arrays works with temporarily
70+
uninitialized data while repeatedly invoking caller-provided code. Such code
71+
needs to be careful, and consider exception-safety.
72+
73+
74+
75+
76+
77+
## Vec::push_all
78+
79+
`Vec::push_all` is a temporary hack to get extending a Vec by a slice reliably
80+
efficient without specialization. Here's a simple implementation:
81+
82+
```rust,ignore
83+
impl<T: Clone> Vec<T> {
84+
fn push_all(&mut self, to_push: &[T]) {
85+
self.reserve(to_push.len());
86+
unsafe {
87+
// can't overflow because we just reserved this
88+
self.set_len(self.len() + to_push.len());
89+
90+
for (i, x) in to_push.iter().enumerate() {
91+
self.ptr().offset((self.len() - to_push.len() + i) as isize).write(x.clone());
92+
}
93+
}
94+
}
95+
}
96+
```
97+
98+
We bypass `push` in order to avoid redundant capacity and `len` checks on the
99+
Vec that we definitely know has capacity. The logic is totally correct, except
100+
there's a subtle problem with our code: it's not exception-safe! `set_len`,
101+
`offset`, and `write` are all fine, but *clone* is the panic bomb we overlooked.
102+
103+
Clone is completely out of our control, and is totally free to panic. If it does,
104+
our function will exit early with the length of the Vec set too large. If
105+
the Vec is looked at or dropped, uninitialized memory will be read!
106+
107+
The fix in this case is fairly simple. If we want to guarantee that the values
108+
we *did* clone are dropped we can set the len *in* the loop. If we just want to
109+
guarantee that uninitialized memory can't be observed, we can set the len *after*
110+
the loop.
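
As a rough sketch of the "set the len *in* the loop" variant, the following free function works on a standard `Vec`. The name `push_all` as a free function and the use of `as_mut_ptr` are assumptions made for illustration; this is not the standard library's implementation.

```rust
use std::ptr;

/// Hypothetical free-function stand-in for `Vec::push_all`.
fn push_all<T: Clone>(vec: &mut Vec<T>, to_push: &[T]) {
    vec.reserve(to_push.len());
    for x in to_push {
        // `clone` is the only call here that can panic. If it does, `len`
        // still counts only initialized elements, so nothing uninitialized
        // is ever observed or dropped.
        let value = x.clone();
        unsafe {
            let len = vec.len();
            // In bounds: `reserve` guaranteed room for every element.
            ptr::write(vec.as_mut_ptr().add(len), value);
            // Publish the element only after it has been written.
            vec.set_len(len + 1);
        }
    }
}
```

Because the length is bumped one element at a time, a panic part-way through leaves the Vec holding exactly the elements that were successfully cloned, and those are dropped normally.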
111+
112+
113+
114+
115+
116+
## BinaryHeap::sift_up
117+
118+
Bubbling an element up a heap is a bit more complicated than extending a Vec.
119+
The pseudocode is as follows:
120+
121+
```text
122+
bubble_up(heap, index):
123+
while index != 0 && heap[index] < heap[parent(index)]:
124+
heap.swap(index, parent(index))
125+
index = parent(index)
126+
127+
```
128+
129+
A literal transcription of this code to Rust is totally fine, but has an annoying
130+
performance characteristic: the `self` element is swapped over and over again
131+
uselessly. We would *rather* have the following:
132+
133+
```text
134+
bubble_up(heap, index):
135+
let elem = heap[index]
136+
while index != 0 && elem < heap[parent(index)]:
137+
heap[index] = heap[parent(index)]
138+
index = parent(index)
139+
heap[index] = elem
140+
```
141+
142+
This code ensures that each element is copied as little as possible (it is in
143+
fact necessary that elem be copied twice in general). However it now exposes
144+
some exception-safety trouble! At all times, there exist two copies of one
145+
value. If we panic in this function something will be double-dropped.
146+
Unfortunately, we also don't have full control of the code: that comparison is
147+
user-defined!
148+
149+
Unlike Vec, the fix isn't as easy here. One option is to break the user-defined
150+
code and the unsafe code into two separate phases:
151+
152+
```text
153+
bubble_up(heap, index):
154+
let end_index = index;
155+
while end_index != 0 && heap[index] < heap[parent(end_index)]:
156+
end_index = parent(end_index)
157+
158+
let elem = heap[index]
159+
while index != end_index:
160+
heap[index] = heap[parent(index)]
161+
index = parent(index)
162+
heap[index] = elem
163+
```
164+
165+
If the user-defined code blows up, that's no problem anymore, because we haven't
166+
actually touched the state of the heap yet. Once we do start messing with the
167+
heap, we're working with only data and functions that we trust, so there's no
168+
concern of panics.
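
A minimal Rust sketch of this two-phase idea, assuming a min-heap stored in a plain slice. The function name `sift_up_two_phase` and the helper `parent` are hypothetical and exist only for this example. The first phase compares the element being moved against its successive ancestors.

```rust
use std::ptr;

/// Hypothetical helper: index of the parent of a non-root node.
fn parent(index: usize) -> usize {
    (index - 1) / 2
}

/// Hypothetical two-phase sift-up on a min-heap stored in a slice.
fn sift_up_two_phase<T: Ord>(heap: &mut [T], mut index: usize) {
    assert!(index < heap.len());

    // Phase 1: run only the user-defined comparison. If it panics, the
    // heap has not been modified yet.
    let mut end_index = index;
    while end_index != 0 && heap[index] < heap[parent(end_index)] {
        end_index = parent(end_index);
    }

    // Phase 2: move elements using only raw copies, which cannot panic.
    unsafe {
        let base = heap.as_mut_ptr();
        // Take the element out, leaving a logically uninitialized slot.
        let elem = ptr::read(base.add(index));
        while index != end_index {
            let p = parent(index);
            // Shift the parent down into the current hole.
            ptr::copy_nonoverlapping(base.add(p), base.add(index), 1);
            index = p;
        }
        // Fill the final hole with the element we took out.
        ptr::write(base.add(index), elem);
    }
}
```

The duplicated value only exists during the second phase, where no caller-provided code runs.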
169+
170+
Perhaps you're not happy with this design. Surely, it's cheating! And we have
171+
to do the complex heap traversal *twice*! Alright, let's bite the bullet. Let's
172+
intermix untrusted and unsafe code *for reals*.
173+
174+
If Rust had `try` and `finally` like in Java, we could do the following:
175+
176+
```text
177+
bubble_up(heap, index):
178+
let elem = heap[index]
179+
try:
180+
while index != 0 && elem < heap[parent(index)]:
181+
heap[index] = heap[parent(index)]
182+
index = parent(index)
183+
finally:
184+
heap[index] = elem
185+
```
186+
187+
The basic idea is simple: if the comparison panics, we just toss the loose
188+
element in the logically uninitialized index and bail out. Anyone who observes
189+
the heap will see a potentially *inconsistent* heap, but at least it won't
190+
cause any double-drops! If the algorithm terminates normally, then this
191+
operation happens to coincide precisely with how we finish up regardless.
192+
193+
Sadly, Rust has no such construct, so we're going to need to roll our own! The
194+
way to do this is to store the algorithm's state in a separate struct with a
195+
destructor for the "finally" logic. Whether we panic or not, that destructor
196+
will run and clean up after us.
69197

70198
```rust
199+
struct Hole<'a, T: 'a> {
200+
data: &'a mut [T],
201+
/// `elt` is always `Some` from new until drop.
202+
elt: Option<T>,
203+
pos: usize,
204+
}
71205

72-
impl Extend<T> for Vec<T> {
73-
fn extend<I: IntoIter<Item=T>>(&mut self, iterable: I) {
74-
let mut iter = iterable.into_iter();
75-
let size_hint = iter.size_hint().0;
76-
self.reserve(size_hint);
77-
self.set_len(self.len() + size_hint());
206+
impl<'a, T> Hole<'a, T> {
207+
fn new(data: &'a mut [T], pos: usize) -> Self {
208+
unsafe {
209+
let elt = ptr::read(&data[pos]);
210+
Hole {
211+
data: data,
212+
elt: Some(elt),
213+
pos: pos,
214+
}
215+
}
216+
}
78217

79-
for
80-
}
218+
fn pos(&self) -> usize { self.pos }
219+
220+
fn removed(&self) -> &T { self.elt.as_ref().unwrap() }
221+
222+
unsafe fn get(&self, index: usize) -> &T { &self.data[index] }
223+
224+
unsafe fn move_to(&mut self, index: usize) {
225+
let index_ptr: *const _ = &self.data[index];
226+
let hole_ptr = &mut self.data[self.pos];
227+
ptr::copy_nonoverlapping(index_ptr, hole_ptr, 1);
228+
self.pos = index;
229+
}
81230
}
82231

232+
impl<'a, T> Drop for Hole<'a, T> {
233+
fn drop(&mut self) {
234+
// fill the hole again
235+
unsafe {
236+
let pos = self.pos;
237+
ptr::write(&mut self.data[pos], self.elt.take().unwrap());
238+
}
239+
}
240+
}
241+
242+
impl<T: Ord> BinaryHeap<T> {
243+
fn sift_up(&mut self, pos: usize) {
244+
unsafe {
245+
// Take out the value at `pos` and create a hole.
246+
let mut hole = Hole::new(&mut self.data, pos);
247+
248+
while hole.pos() != 0 {
249+
let parent = parent(hole.pos());
250+
if hole.removed() <= hole.get(parent) { break }
251+
hole.move_to(parent);
252+
}
253+
// Hole will be unconditionally filled here; panic or not!
254+
}
255+
}
256+
}
257+
```
258+
259+
260+
261+
262+
## Poisoning
263+
264+
Although all unsafe code *must* ensure some minimal level of exception safety,
265+
some types may choose to explicitly *poison* themselves if they witness a panic.
266+
The most notable example of this is the standard library's Mutex type. A Mutex
267+
will poison itself if one of its MutexGuards (the thing it returns when a lock
268+
is obtained) is dropped during a panic. Any future attempts to lock the Mutex
269+
will return an `Err`.
270+
271+
Mutex poisons not for *true* safety in the sense that Rust normally cares about. It
272+
poisons as a safety-guard against blindly using the data that comes out of a Mutex
273+
that has witnessed a panic while locked. The data in such a Mutex was likely in the
274+
middle of being modified, and as such may be in an inconsistent or incomplete state.
275+
It is important to note that one cannot violate memory safety with such a type
276+
if it is correctly written. After all, it must be exception safe!
277+
278+
However if the Mutex contained, say, a BinaryHeap that does not actually have the
279+
heap property, it's unlikely that any code that uses it will do
280+
what the author intended. As such, the program should not proceed normally.
281+
Still, if you're double-plus-sure that you can do *something* with the value,
282+
the Err exposes a method to get the lock anyway. It *is* safe, after all.
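
To make the behaviour concrete, here is a small sketch using `std::sync::Mutex`. The function name `poison_and_recover` is made up for this example. It panics on a second thread while the lock is held, then recovers the data through the error value.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical demo: poison a Mutex, then recover the data anyway.
/// Returns whether the Mutex was poisoned, plus a copy of its contents.
fn poison_and_recover() -> (bool, Vec<i32>) {
    let shared = Arc::new(Mutex::new(vec![1, 2, 3]));

    let worker = Arc::clone(&shared);
    let result = thread::spawn(move || {
        let mut guard = worker.lock().unwrap();
        guard.push(4);
        // The guard is dropped during unwinding, which poisons the Mutex.
        panic!("panicking while the lock is held");
    })
    .join();
    assert!(result.is_err());

    let was_poisoned = shared.is_poisoned();
    let data = match shared.lock() {
        Ok(guard) => guard.clone(),
        // `PoisonError::into_inner` hands back the guard regardless.
        Err(poisoned) => poisoned.into_inner().clone(),
    };
    (was_poisoned, data)
}
```

Whether the recovered data is *meaningful* is up to the caller; here the push completed before the panic, so the contents happen to be intact.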
283+
284+
285+
286+
# FFI
287+
288+
Rust's unwinding strategy is not specified to be fundamentally compatible
289+
with any other language's unwinding. As such, unwinding into Rust from another
290+
language, or unwinding into another language from Rust, is Undefined Behaviour.
291+
What you do at that point is up to you, but you must *absolutely* catch any
292+
panics at the FFI boundary! At best, your application will crash and burn. At
293+
worst, your application *won't* crash and burn, and will proceed with completely
294+
clobbered state.
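
One way to catch panics at the boundary is sketched below, assuming a toolchain on which `std::panic::catch_unwind` is available (it was stabilized after this text was written). The function `checked_div` and its error-code convention are hypothetical; a function really exported to C would also need an unmangled symbol name.

```rust
use std::panic;

/// Hypothetical function meant to be called from C.
/// Returns 0 on success and -1 if a panic was caught.
pub extern "C" fn checked_div(a: i32, b: i32, out: *mut i32) -> i32 {
    // Integer division panics when `b` is zero; the closure keeps that
    // panic from unwinding out of this function.
    let result = panic::catch_unwind(|| a / b);
    match result {
        Ok(value) => {
            if !out.is_null() {
                // The caller promises `out` is valid when it is non-null.
                unsafe { *out = value };
            }
            0
        }
        // Report the failure as an error code instead of unwinding.
        Err(_) => -1,
    }
}
```

Note that `catch_unwind` only catches panics that unwind; if the program is built to abort on panic, the process ends before the error branch is reached.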
