% Concurrency and Parallelism



# Data Races and Race Conditions

Safe Rust guarantees an absence of data races, which are defined as:

* two or more threads concurrently accessing a location of memory
* one of them is a write
* one of them is unsynchronized

A data race has Undefined Behaviour, and is therefore impossible to perform
in Safe Rust. Data races are *mostly* prevented through Rust's ownership system:
it's impossible to alias a mutable reference, so it's impossible to perform a
data race. Interior mutability makes this more complicated, which is largely why
we have the Send and Sync traits (see below).

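For a concrete picture of what "synchronized" access looks like, here is a minimal
sketch (not from the original text) that mutates shared state from several threads
without a data race, because every access goes through a lock:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Shared, mutable state behind a Mutex: aliasing plus mutation is fine here
// because the lock synchronizes every access.
let data = Arc::new(Mutex::new(vec![1, 2, 3]));

let handles: Vec<_> = (0..3).map(|i| {
    let data = data.clone();
    thread::spawn(move || {
        // The guard gives us exclusive access for the duration of the push.
        data.lock().unwrap().push(i);
    })
}).collect();

for handle in handles {
    handle.join().unwrap();
}

println!("{:?}", data.lock().unwrap());
```
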
However Rust *does not* prevent general race conditions. This is
pretty fundamentally impossible, and probably honestly undesirable. Your hardware
is racy, your OS is racy, the other programs on your computer are racy, and the
world this all runs in is racy. Any system that could genuinely claim to prevent
*all* race conditions would be pretty awful to use, if not just incorrect.

So it's perfectly "fine" for a Safe Rust program to get deadlocked or do
something incredibly stupid with incorrect synchronization. Obviously such a
program isn't very good, but Rust can only hold your hand so far. Still, a
race condition can't violate memory safety in a Rust program on
its own. Only in conjunction with some other unsafe code can a race condition
actually violate memory safety. For instance:

```rust
use std::thread;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

let data = vec![1, 2, 3, 4];
// Arc so that the memory the AtomicUsize is stored in still exists for
// the other thread to increment, even if we completely finish executing
// before it. Rust won't compile the program without it, because of the
// lifetime requirements of thread::spawn!
let idx = Arc::new(AtomicUsize::new(0));
let other_idx = idx.clone();

// `move` captures other_idx by-value, moving it into this thread
thread::spawn(move || {
    // It's ok to mutate idx because this value
    // is an atomic, so it can't cause a Data Race.
    other_idx.fetch_add(10, Ordering::SeqCst);
});

// Index with the value loaded from the atomic. This is safe because we
// read the atomic memory only once, and then pass a *copy* of that value
// to the Vec's indexing implementation. This indexing will be correctly
// bounds checked, and there's no chance of the value getting changed
// in the middle. However our program may panic if the thread we spawned
// managed to increment before this ran. A race condition because correct
// program execution (panicking is rarely correct) depends on order of
// thread execution.
println!("{}", data[idx.load(Ordering::SeqCst)]);

if idx.load(Ordering::SeqCst) < data.len() {
    unsafe {
        // Incorrectly loading the idx *after* we did the bounds check.
        // It could have changed. This is a race condition, *and dangerous*
        // because we decided to do `get_unchecked`, which is `unsafe`.
        println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst)));
    }
}
```

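For contrast, a minimal sketch (not from the original text) of the non-racy way to
write that last check: load the atomic exactly once into a local copy, and do both
the bounds check and the access against that copy.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

let data = vec![1, 2, 3, 4];
let idx = AtomicUsize::new(0);

// `i` is a plain local copy; no other thread can change it between the
// bounds check and the access, so the `get_unchecked` is sound.
let i = idx.load(Ordering::SeqCst);
if i < data.len() {
    unsafe {
        println!("{}", data.get_unchecked(i));
    }
}
```
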



# Send and Sync

Not everything obeys inherited mutability, though. Some types allow you to multiply
alias a location in memory while mutating it. Unless these types use synchronization
to manage this access, they are absolutely not thread safe. Rust captures this
through the `Send` and `Sync` traits.

* A type is Send if it is safe to send it to another thread.
* A type is Sync if it is safe to share between threads (`&T` is Send).

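As a rough illustration (an assumed example, not from the original text) of how these
bounds surface in practice: `thread::spawn` requires everything its closure captures
to be Send, so a generic helper that hands values to another thread just forwards
that bound.

```rust
use std::thread;

// Hypothetical helper: it may move `value` to another thread only because
// the `T: Send + 'static` bound promises that doing so is safe.
fn process_elsewhere<T: Send + std::fmt::Debug + 'static>(value: T) {
    thread::spawn(move || {
        println!("processing {:?} on another thread", value);
    })
    .join()
    .unwrap();
}

process_elsewhere(vec![1, 2, 3]); // Vec<i32> is Send, so this compiles
// process_elsewhere(std::rc::Rc::new(1)); // Rc isn't Send, so this would not
```
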
Send and Sync are *very* fundamental to Rust's concurrency story. As such, a
substantial amount of special tooling exists to make them work right. First and
foremost, they're *unsafe traits*. This means that they are unsafe *to implement*,
and other unsafe code can *trust* that they are correctly implemented. Since
they're *marker traits* (they have no associated items like methods), correctly
implemented simply means that they have the intrinsic properties an implementor
should have. Incorrectly implementing Send or Sync can cause Undefined Behaviour.

Send and Sync are also what Rust calls *opt-in builtin traits*.
This means that, unlike every other trait, they are *automatically* derived:
if a type is composed entirely of Send or Sync types, then it is Send or Sync.
Almost all primitives are Send and Sync, and as a consequence pretty much
all types you'll ever interact with are Send and Sync.

Major exceptions include:

* raw pointers are neither Send nor Sync (because they have no safety guards)
* `UnsafeCell` isn't Sync (and therefore `Cell` and `RefCell` aren't)
* `Rc` isn't Send or Sync (because the refcount is shared and unsynchronized)

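One way to see these rules at compile time (an illustrative trick, not from the
original text) is to write empty functions whose only job is to demand the bound:

```rust
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// These compile because the standard collections are Send and Sync.
assert_send::<Vec<i32>>();
assert_sync::<Vec<i32>>();

// These would fail to compile if uncommented:
// assert_send::<std::rc::Rc<i32>>();        // Rc is not Send
// assert_sync::<std::cell::RefCell<i32>>(); // RefCell is not Sync
```
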
`Rc` and `UnsafeCell` are very fundamentally not thread-safe: they enable
unsynchronized shared mutable state. However raw pointers are, strictly speaking,
marked as thread-unsafe as more of a *lint*. Doing anything useful
with a raw pointer requires dereferencing it, which is already unsafe. In that
sense, one could argue that it would be "fine" for them to be marked as thread safe.

However it's important that they aren't thread safe to prevent types that
*contain them* from being automatically marked as thread safe. These types have
non-trivial untracked ownership, and it's unlikely that their author was
necessarily thinking hard about thread safety. In the case of `Rc`, we have a nice
example of a type that contains a `*mut` that is *definitely* not thread safe.

Types that aren't automatically derived can *opt-in* to Send and Sync by simply
implementing them:

```rust
struct MyBox(*mut u8);

// We assert that MyBox is safe to send and share across threads; the compiler
// takes our word for it, which is exactly why these impls are `unsafe`.
unsafe impl Send for MyBox {}
unsafe impl Sync for MyBox {}
```

In the *incredibly rare* case that a type is *inappropriately* automatically
derived to be Send or Sync, then one can also *unimplement* Send and Sync:

```rust
struct SpecialThreadToken(u8);

impl !Send for SpecialThreadToken {}
impl !Sync for SpecialThreadToken {}
```

Note that *in and of itself* it is impossible to incorrectly derive Send and Sync.
Only types that are ascribed special meaning by other unsafe code can possibly cause
trouble by being incorrectly Send or Sync.

Most uses of raw pointers should be encapsulated behind a sufficient abstraction
that Send and Sync can be derived. For instance, all of Rust's standard
collections are Send and Sync (when they contain Send and Sync types)
in spite of their pervasive use of raw pointers to
manage allocations and complex ownership. Similarly, most iterators into these
collections are Send and Sync because they largely behave like an `&` or `&mut`
into the collection.

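As a rough sketch (a hypothetical type, not from the original text) of what such an
encapsulation can look like: a tiny owning pointer that behaves like `Box<T>`, so it
is sound to forward Send and Sync from the `T` it owns.

```rust
use std::marker::PhantomData;

// A toy owning pointer, for illustration only. It uniquely owns the T it
// points to, just like Box<T> does.
struct Carton<T> {
    ptr: *mut T,
    _marker: PhantomData<T>,
}

impl<T> Carton<T> {
    fn new(value: T) -> Self {
        Carton {
            ptr: Box::into_raw(Box::new(value)),
            _marker: PhantomData,
        }
    }
}

impl<T> Drop for Carton<T> {
    fn drop(&mut self) {
        // Sound because we are the unique owner of this allocation.
        unsafe { drop(Box::from_raw(self.ptr)); }
    }
}

// Ownership is unique and fully tracked by Carton, so sending or sharing a
// Carton is exactly as safe as sending or sharing the T it owns.
unsafe impl<T: Send> Send for Carton<T> {}
unsafe impl<T: Sync> Sync for Carton<T> {}
```
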
TODO: better explain what can or can't be Send or Sync. Sufficient to appeal
only to data races?




# Atomics

Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed, this
model is quite complex and known to have [several flaws][C11-busted]. Rather,
it is a pragmatic concession to the fact that *everyone* is pretty bad at modeling
atomics. At the very least, we can benefit from existing tooling and research around
C.

Trying to fully explain the model is fairly hopeless. If you want all the
nitty-gritty details, you should check out [C's specification][C11-model].
Still, we'll try to cover the basics and some of the problems Rust developers
face.

The C11 memory model is fundamentally about trying to bridge the gap between C's
single-threaded semantics, common compiler optimizations, and hardware peculiarities
in the face of a multi-threaded environment. It does this by splitting memory
accesses into two worlds: data accesses, and atomic accesses.

Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler
on the assumption that the program is single-threaded. The hardware is also free
to propagate the changes made in data accesses as lazily and inconsistently as
it wants to other threads. Most critically, data accesses are where we get data
races. These are pretty clearly awful semantics to try to write a multi-threaded
program with.

Atomic accesses are the answer to this. Each atomic access can be marked with
an *ordering*. The set of orderings Rust exposes are:

* Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed

(Note: We explicitly do not expose the C11 *consume* ordering)

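As a rough sketch of how these orderings tend to be used (an illustration, not from
the original text): a Release store paired with an Acquire load of the same atomic
lets one thread publish work to another, because everything written before the
Release store is guaranteed to be visible after the Acquire load observes it.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

let payload = Arc::new(AtomicUsize::new(0));
let ready = Arc::new(AtomicBool::new(false));

let (payload2, ready2) = (payload.clone(), ready.clone());
let producer = thread::spawn(move || {
    payload2.store(42, Ordering::Relaxed); // write the data...
    ready2.store(true, Ordering::Release); // ...then publish it
});

let consumer = thread::spawn(move || {
    // Spin until the flag is observed; Acquire makes the earlier write visible.
    while !ready.load(Ordering::Acquire) {}
    assert_eq!(payload.load(Ordering::Relaxed), 42);
});

producer.join().unwrap();
consumer.join().unwrap();
```
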
TODO: give simple "basic" explanation of these
TODO: implementing Arc example (why does Drop need the trailing barrier?)




# Actually Doing Things Concurrently

Rust as a language doesn't *really* have an opinion on how to do concurrency or
parallelism. The standard library exposes OS threads and blocking sys-calls
because *everyone* has those, and they're uniform enough that you can provide
an abstraction over them in a relatively uncontroversial way. Message passing,
green threads, and async APIs are all diverse enough that any abstraction over
them tends to involve trade-offs that we weren't willing to commit to for 1.0.

However the way Rust models concurrency makes it relatively easy to design your own
concurrency paradigm as a library and have *everyone else's* code Just Work
with yours. Just require the right lifetimes and Send and Sync where appropriate
and you're off to the races. Or rather, not having races. Races are bad.




[C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
[C11-model]: http://en.cppreference.com/w/c/atomic/memory_order