|
| 1 | +% Concurrency and Parallelism |
| 2 | + |
| 3 | +```Not sure if I want this |
| 4 | +Safe Rust features *a ton* of tooling to make concurrency and parallelism totally |
| 5 | +safe, easy, and fearless. This is a case where we'll really just |
| 6 | +[defer to TRPL][trpl-conc] for the basics. |
| 7 | + |
| 8 | +TL;DR: The `Send` and `Sync` traits in conjunction with Rust's ownership model and |
| 9 | +normal generic bounds make using concurrent APIs really easy and painless for |
| 10 | +a user of Safe Rust. |
| 11 | +``` |
| 12 | + |
| 13 | +## Data Races and Race Conditions |
| 14 | + |
| 15 | +Safe Rust guarantees an absence of data races, which are defined as: |
| 16 | + |
| 17 | +* two or more threads concurrently accessing a location of memory |
| 18 | +* at least one of them is a write |
| 19 | +* at least one of them is unsynchronized |
| 20 | + |
| 21 | +A data race has Undefined Behaviour, and is therefore impossible to perform |
| 22 | +in Safe Rust. Data races are *mostly* prevented through Rust's ownership system: |
| 23 | +it's impossible to alias a mutable reference, so it's impossible to perform a |
| 24 | +data race. Interior mutability makes this more complicated, which is largely why |
| 25 | +we have the Send and Sync traits (see below). |
| 26 | + |
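As a rough illustration of the ownership point above, here is a minimal sketch of several threads updating one counter. The function name `parallel_count` and its parameters are hypothetical, chosen only for this example. Capturing a plain `&mut usize` in more than one spawned closure would be rejected by the compiler, so the sketch routes every write through an `Arc<Mutex<_>>` instead:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `threads` workers that each add 1 to a
// shared counter `per_thread` times, then returns the final count.
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    // The Mutex synchronizes every access; the Arc keeps the counter
    // alive for as long as any thread still holds a handle to it.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```

Calling `parallel_count(4, 1000)` should yield `4000` on every run, because each increment happens while the lock is held.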
| 27 | +However, Rust *does not* prevent general race conditions. This is |
| 28 | +pretty fundamentally impossible, and probably honestly undesirable. Your hardware |
| 29 | +is racy, your OS is racy, the other programs on your computer are racy, and the |
| 30 | +world this all runs in is racy. Any system that could genuinely claim to prevent |
| 31 | +*all* race conditions would be pretty awful to use, if not just incorrect. |
| 32 | + |
| 33 | +So it's perfectly "fine" for a Safe Rust program to get deadlocked or do |
| 34 | +something incredibly stupid with incorrect synchronization. Obviously such a |
| 35 | +program isn't very good, but Rust can only hold your hand so far. Still, a |
| 36 | +race condition can't violate memory safety in a Rust program on |
| 37 | +its own. Only in conjunction with some other unsafe code can a race condition |
| 38 | +actually violate memory safety. For instance: |
| 39 | + |
| 40 | +```rust |
| 41 | +use std::thread; |
| 42 | +use std::sync::atomic::{AtomicUsize, Ordering}; |
| 43 | +use std::sync::Arc; |
| 44 | + |
| 45 | +let data = vec![1, 2, 3, 4]; |
| 46 | +// Arc so that the memory the AtomicUsize is stored in still exists for |
| 47 | +// the other thread to increment, even if we completely finish executing |
| 48 | +// before it. Rust won't compile the program without it, because of the |
| 49 | +// lifetime requirements of thread::spawn! |
| 50 | +let idx = Arc::new(AtomicUsize::new(0)); |
| 51 | +let other_idx = idx.clone(); |
| 52 | + |
| 53 | +// `move` captures other_idx by-value, moving it into this thread |
| 54 | +thread::spawn(move || { |
| 55 | + // It's ok to mutate idx because this value |
| 56 | + // is an atomic, so it can't cause a Data Race. |
| 57 | + other_idx.fetch_add(10, Ordering::SeqCst); |
| 58 | +}); |
| 59 | + |
| 60 | +// Index with the value loaded from the atomic. This is safe because we |
| 61 | +// read the atomic memory only once, and then pass a *copy* of that value |
| 62 | +// to the Vec's indexing implementation. This indexing will be correctly |
| 63 | +// bounds checked, and there's no chance of the value getting changed |
| 64 | +// in the middle. However our program may panic if the thread we spawned |
| 65 | +// managed to increment before this ran. This is a race condition because |
| 66 | +// correct program execution (panicking is rarely correct) depends on the |
| 67 | +// order of thread execution. |
| 68 | +println!("{}", data[idx.load(Ordering::SeqCst)]); |
| 69 | + |
| 70 | +if idx.load(Ordering::SeqCst) < data.len() { |
| 71 | + unsafe { |
| 72 | + // Incorrectly loading the idx *after* we did the bounds check. |
| 73 | + // It could have changed. This is a race condition, *and dangerous* |
| 74 | + // because we decided to do `get_unchecked`, which is `unsafe`. |
| 75 | + println!("{}", data.get_unchecked(idx.load(Ordering::SeqCst))); |
| 76 | + } |
| 77 | +} |
| 78 | +``` |
| 79 | + |
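One way the unchecked access above could be made sound is to load the index exactly once and reuse that local copy for both the bounds check and the access. The sketch below assumes a hypothetical helper named `read_at`; it is not part of the original example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical helper: reads `data[idx]` without a second bounds check,
// returning None when the index is out of range.
fn read_at(data: &[i32], idx: &AtomicUsize) -> Option<i32> {
    // Load exactly once. Other threads may keep changing `idx`, but they
    // cannot change this local copy.
    let i = idx.load(Ordering::SeqCst);
    if i < data.len() {
        // SAFETY: `i` was compared against `data.len()` just above, and
        // it is a local value that no other thread can modify.
        Some(unsafe { *data.get_unchecked(i) })
    } else {
        None
    }
}
```

The result still depends on thread timing (the caller may see `Some` or `None`), so the race condition remains, but it can no longer lead to an out-of-bounds read.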
| 80 | +## Send and Sync |
| 81 | + |
| 82 | +Not everything obeys inherited mutability, though. Some types allow you to multiply |
| 83 | +alias a location in memory while mutating it. Unless these types use synchronization |
| 84 | +to manage this access, they are absolutely not thread safe. Rust captures this |
| 85 | +through the `Send` and `Sync` traits. |
| 86 | + |
| 87 | +* A type is Send if it is safe to send it to another thread. |
| 88 | +* A type is Sync if it is safe to share between threads (`T` is Sync if and only if `&T` is Send). |
| 89 | + |
| 90 | +Send and Sync are *very* fundamental to Rust's concurrency story. As such, a |
| 91 | +substantial amount of special tooling exists to make them work right. First and |
| 92 | +foremost, they're *unsafe traits*. This means that they are unsafe *to implement*, |
| 93 | +and other unsafe code can *trust* that they are correctly implemented. Since |
| 94 | +they're *marker traits* (they have no associated items like methods), correctly |
| 95 | +implemented simply means that they have the intrinsic properties an implementor |
| 96 | +should have. Incorrectly implementing Send or Sync can cause Undefined Behaviour. |
| 97 | + |
| 98 | +Send and Sync are also what Rust calls *opt-in builtin traits*. |
| 99 | +This means that, unlike every other trait, they are *automatically* derived: |
| 100 | +if a type is composed entirely of Send or Sync types, then it is Send or Sync. |
| 101 | +Almost all primitives are Send and Sync, and as a consequence pretty much |
| 102 | +all types you'll ever interact with are Send and Sync. |
| 103 | + |
| 104 | +Major exceptions include: |
| 105 | +* raw pointers are neither Send nor Sync (because they have no safety guards) |
| 106 | +* `UnsafeCell` isn't Sync (and therefore `Cell` and `RefCell` aren't) |
| 107 | +* `Rc` isn't Send or Sync (because the refcount is shared and unsynchronized) |
| 108 | + |
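A common way to see which types are Send or Sync is to ask the compiler through a generic bound. The helpers `assert_send`, `assert_sync`, and `check_thread_safety` below are hypothetical names for this sketch; they do nothing at run time, and each call compiles only if the type satisfies the bound:

```rust
use std::cell::Cell;
use std::sync::atomic::AtomicUsize;
use std::sync::Arc;

// Hypothetical helpers: empty functions whose only job is to carry a bound.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// Returns true if it compiles at all; the interesting work happens at
// compile time.
fn check_thread_safety() -> bool {
    // Composed entirely of Send + Sync parts, so automatically both.
    assert_send::<Vec<u32>>();
    assert_sync::<Vec<u32>>();
    assert_send::<Arc<AtomicUsize>>();
    assert_sync::<Arc<AtomicUsize>>();

    // Cell<u32> can be moved to another thread, but not shared.
    assert_send::<Cell<u32>>();

    // Each of these lines is rejected by the compiler if uncommented:
    // assert_sync::<Cell<u32>>();
    // assert_send::<std::rc::Rc<u32>>();
    // assert_send::<*mut u8>();
    true
}
```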
| 109 | +`Rc` and `UnsafeCell` are very fundamentally not thread-safe: they enable |
| 110 | +unsynchronized shared mutable state. However raw pointers are, strictly speaking, |
| 111 | +marked as thread-unsafe as more of a *lint*. Doing anything useful |
| 112 | +with a raw pointer requires dereferencing it, which is already unsafe. In that |
| 113 | +sense, one could argue that it would be "fine" for them to be marked as thread safe. |
| 114 | + |
| 115 | +However it's important that they aren't thread safe to prevent types that |
| 116 | +*contain them* from being automatically marked as thread safe. These types have |
| 117 | +non-trivial untracked ownership, and it's unlikely that their author was |
| 118 | +necessarily thinking hard about thread safety. In the case of Rc, we have a nice |
| 119 | +example of a type that contains a `*mut` that is *definitely* not thread safe. |
| 120 | + |
| 121 | +Types that aren't automatically derived can *opt-in* to Send and Sync by simply |
| 122 | +implementing them: |
| 123 | + |
| 124 | +```rust |
| 125 | +struct MyBox(*mut u8); |
| 126 | + |
| 127 | +unsafe impl Send for MyBox {} |
| 128 | +unsafe impl Sync for MyBox {} |
| 129 | +``` |
| 130 | + |
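To show what the `unsafe impl` buys, here is a minimal sketch that moves a `MyBox` into another thread. It assumes, purely for illustration, that `MyBox` uniquely owns a heap allocation created with `Box::into_raw`; the methods `new` and `into_inner` and the function `send_to_thread` are hypothetical additions, and the sketch omits a `Drop` impl, so a `MyBox` that is never consumed leaks its allocation:

```rust
use std::thread;

struct MyBox(*mut u8);

// Assumption of this sketch: MyBox is the sole owner of its allocation,
// so handing it to another thread cannot create unsynchronized sharing.
unsafe impl Send for MyBox {}

impl MyBox {
    fn new(value: u8) -> MyBox {
        MyBox(Box::into_raw(Box::new(value)))
    }

    fn into_inner(self) -> u8 {
        // SAFETY: the pointer came from Box::into_raw in `new`, and `self`
        // is consumed here, so the allocation is freed exactly once.
        unsafe { *Box::from_raw(self.0) }
    }
}

// Moves a MyBox into a new thread and reads the value back out there.
fn send_to_thread(value: u8) -> u8 {
    let boxed = MyBox::new(value);
    // Without the `unsafe impl Send` above, this spawn fails to compile
    // because `*mut u8` is not Send.
    thread::spawn(move || boxed.into_inner()).join().unwrap()
}
```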
| 131 | +In the *incredibly rare* case that a type is *inappropriately* automatically |
| 132 | +derived to be Send or Sync, one can also *unimplement* Send and Sync (negative impls currently require a nightly compiler): |
| 133 | + |
| 134 | +```rust |
| 135 | +struct SpecialThreadToken(u8); |
| 136 | + |
| 137 | +impl !Send for SpecialThreadToken {} |
| 138 | +impl !Sync for SpecialThreadToken {} |
| 139 | +``` |
| 140 | + |
| 141 | +Note that *in and of itself* it is impossible to incorrectly derive Send and Sync. |
| 142 | +Only types that are ascribed special meaning by other unsafe code can possibly cause |
| 143 | +trouble by being incorrectly Send or Sync. |
| 144 | + |
| 145 | +Most uses of raw pointers should be encapsulated behind a sufficient abstraction |
| 146 | +that Send and Sync can be derived. For instance all of Rust's standard |
| 147 | +collections are Send and Sync (when they contain Send and Sync types) |
| 148 | +in spite of their pervasive use of raw pointers to |
| 149 | +manage allocations and complex ownership. Similarly, most iterators into these |
| 150 | +collections are Send and Sync because they largely behave like an `&` or `&mut` |
| 151 | +into the collection. |
| 152 | + |
| 153 | +TODO: better explain what can or can't be Send or Sync. Sufficient to appeal |
| 154 | +only to data races? |
| 155 | + |
| 156 | +## Atomics |
| 157 | + |
| 158 | +Rust pretty blatantly just inherits LLVM's model for atomics, which in turn is |
| 159 | +largely based off of the C11 model for atomics. This is not due to these models |
| 160 | +being particularly excellent or easy to understand. Indeed, these models are |
| 161 | +quite complex and are known to have several flaws. Rather, it is a pragmatic |
| 162 | +concession to the fact that *everyone* is pretty bad at modeling atomics. At the very |
| 163 | +least, we can benefit from existing tooling and research around C's model. |
| 164 | + |
| 165 | +Trying to fully explain these models is fairly hopeless, so we're just going to |
| 166 | +drop that problem in [LLVM's lap][llvm-conc]. |
| 167 | + |
| 168 | +## Actually Doing Things Concurrently |
| 169 | + |
| 170 | +Rust as a language doesn't *really* have an opinion on how to do concurrency or |
| 171 | +parallelism. The standard library exposes OS threads and blocking sys-calls |
| 172 | +because *everyone* has those and they're uniform enough that you can provide |
| 173 | +an abstraction over them in a relatively uncontroversial way. Message passing, |
| 174 | +green threads, and async APIs are all diverse enough that any abstraction over |
| 175 | +them tends to involve trade-offs that we weren't willing to commit to for 1.0. |
| 176 | + |
| 177 | +However, Rust's current design is set up so that you can build your own |
| 178 | +concurrent paradigm or library as you see fit. Just require the right |
| 179 | +lifetimes and Send and Sync where appropriate and everything should Just Work |
| 180 | +with everyone else's stuff. |
| 181 | + |
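As a sketch of what "requiring the right lifetimes and Send and Sync" might look like in a library API, consider a hypothetical `run_in_background` function. The name and shape are assumptions made for this example; the bounds simply mirror what `std::thread::spawn` itself asks for:

```rust
use std::thread;

// Hypothetical library entry point: runs `job` on a fresh OS thread and
// blocks until its result is available.
fn run_in_background<F, T>(job: F) -> T
where
    // The closure crosses a thread boundary, so it must be Send, and it
    // may outlive the caller's stack frame, so it must be 'static.
    F: FnOnce() -> T + Send + 'static,
    // The result travels back across the same boundary.
    T: Send + 'static,
{
    thread::spawn(job).join().expect("background job panicked")
}
```

With these bounds in place, a caller can pass `move || v.iter().sum::<i32>()` for an owned `Vec<i32>`, while a closure that captures an `Rc` is rejected at compile time.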
| 182 | + |
| 183 | + |
| 184 | + |
| 185 | +[llvm-conc]: http://llvm.org/docs/Atomics.html |
| 186 | +[trpl-conc]: https://doc.rust-lang.org/book/concurrency.html |