
Commit e2b5f4f

move everything into the Rust tree
2 parents dd46cf8 + a54e64b commit e2b5f4f


49 files changed: +4697 -0 lines changed

src/doc/tarpl/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
% The Advanced Rust Programming Language

# NOTE: This is a draft document, and may contain serious errors

So you've played around with Rust a bit. You've written a few simple programs and
you think you grok the basics. Maybe you've even read through
*[The Rust Programming Language][trpl]*. Now you want to get neck-deep in all the
nitty-gritty details of the language. You want to know those weird corner-cases.
You want to know what the heck `unsafe` really means, and how to properly use it.
This is the book for you.

To be clear, this book goes into *serious* detail. We're going to dig into
exception-safety and pointer aliasing. We're going to talk about memory
models. We're even going to do some type-theory. This is stuff that you
absolutely *don't* need to know to write fast and safe Rust programs.
You could probably close this book *right now* and still have a productive
and happy career in Rust.

However if you intend to write unsafe code -- or just *really* want to dig into
the guts of the language -- this book contains *invaluable* information.

Unlike *The Rust Programming Language* we *will* be assuming considerable prior
knowledge. In particular, you should be comfortable with:

* Basic Systems Programming:
    * Pointers
    * [The stack and heap][]
    * The memory hierarchy (caches)
    * Threads

* [Basic Rust][]

Due to the nature of advanced Rust programming, we will be spending a lot of time
talking about *safety* and *guarantees*. In particular, a significant portion of
the book will be dedicated to correctly writing and understanding Unsafe Rust.

[trpl]: https://doc.rust-lang.org/book/
[The stack and heap]: https://doc.rust-lang.org/book/the-stack-and-the-heap.html
[Basic Rust]: https://doc.rust-lang.org/book/syntax-and-semantics.html

src/doc/tarpl/SUMMARY.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
# Summary

* [Meet Safe and Unsafe](meet-safe-and-unsafe.md)
    * [How Safe and Unsafe Interact](safe-unsafe-meaning.md)
    * [Working with Unsafe](working-with-unsafe.md)
* [Data Layout](data.md)
    * [repr(Rust)](repr-rust.md)
    * [Exotically Sized Types](exotic-sizes.md)
    * [Other reprs](other-reprs.md)
* [Ownership](ownership.md)
    * [References](references.md)
    * [Lifetimes](lifetimes.md)
    * [Limits of lifetimes](lifetime-mismatch.md)
    * [Lifetime Elision](lifetime-elision.md)
    * [Unbounded Lifetimes](unbounded-lifetimes.md)
    * [Higher-Rank Trait Bounds](hrtb.md)
    * [Subtyping and Variance](subtyping.md)
    * [Misc](lifetime-misc.md)
* [Type Conversions](conversions.md)
    * [Coercions](coercions.md)
    * [The Dot Operator](dot-operator.md)
    * [Casts](casts.md)
    * [Transmutes](transmutes.md)
* [Uninitialized Memory](uninitialized.md)
    * [Checked](checked-uninit.md)
    * [Drop Flags](drop-flags.md)
    * [Unchecked](unchecked-uninit.md)
* [Ownership-Oriented Resource Management](raii.md)
    * [Constructors](constructors.md)
    * [Destructors](destructors.md)
    * [Leaking](leaking.md)
* [Unwinding](unwinding.md)
    * [Exception Safety](exception-safety.md)
    * [Poisoning](poisoning.md)
* [Concurrency](concurrency.md)
    * [Races](races.md)
    * [Send and Sync](send-and-sync.md)
    * [Atomics](atomics.md)
* [Implementing Vec](vec.md)
    * [Layout](vec-layout.md)
    * [Allocating](vec-alloc.md)
    * [Push and Pop](vec-push-pop.md)
    * [Deallocating](vec-dealloc.md)
    * [Deref](vec-deref.md)
    * [Insert and Remove](vec-insert-remove.md)
    * [IntoIter](vec-into-iter.md)
    * [Drain](vec-drain.md)
    * [Final Code](vec-final.md)
* [Implementing Arc and Mutex](arc-and-mutex.md)

src/doc/tarpl/arc-and-mutex.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
% Implementing Arc and Mutex

Knowing the theory is all fine and good, but the *best* way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.

TODO: ALL OF THIS OMG

src/doc/tarpl/atomics.md

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
% Atomics

Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed,
this model is quite complex and known to have [several flaws][C11-busted].
Rather, it is a pragmatic concession to the fact that *everyone* is pretty bad
at modeling atomics. At the very least, we can benefit from existing tooling
and research around C.

Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to
properly understand in a practical way. If you want all the nitty-gritty
details, you should check out [C's specification (Section 7.17)][C11-model].
Still, we'll try to cover the basics and some of the problems Rust developers
face.

The C11 memory model is fundamentally about trying to bridge the gap between
the semantics we want, the optimizations compilers want, and the inconsistent
chaos our hardware wants. *We* would like to just write programs and have them
do exactly what we said but, you know, *fast*. Wouldn't that be great?
# Compiler Reordering

Compilers fundamentally want to be able to do all sorts of crazy transformations
to reduce data dependencies and eliminate dead code. In particular, they may
radically change the actual order of events, or make events never occur! If we
write something like

```rust,ignore
x = 1;
y = 3;
x = 2;
```

the compiler may conclude that it would *really* be best if your program did

```rust,ignore
x = 2;
y = 3;
```

This has inverted the order of events *and* completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on `x` to *actually* be
assigned to 1 before `y` was assigned. We would *really* like the compiler to
be able to make these kinds of optimizations, because they can seriously
improve performance. On the other hand, we'd really like to be able to depend
on our program *doing the thing we said*.
# Hardware Reordering

On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our *hardware* might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of
each CPU core it is *so very far away* and *so very slow*. Each CPU would
rather work with its local cache of the data, and only go through all the
*anguish* of talking to shared memory when it doesn't actually have that
memory in cache.

After all, that's the whole *point* of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in the same order on *one* thread occur in the same
order on *another* thread. To guarantee this, we must issue special
instructions to the CPU telling it to be a bit less smart.
For instance, say we convince the compiler to emit this logic:

```text
initial state: x = 0, y = 1

THREAD 1        THREAD 2
y = 3;          if x == 1 {
x = 1;              y *= 2;
                }
```

Ideally this program has 2 possible final states:

* `y = 3`: (thread 2 did the check before thread 1 completed)
* `y = 6`: (thread 2 did the check after thread 1 completed)

However there's a third potential state that the hardware enables:

* `y = 2`: (thread 2 saw `x = 1`, but not `y = 3`, and then overwrote `y = 3`)
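To see this in real code, below is a minimal runnable sketch of the same
program (not part of the original text), written with the `Relaxed` atomics
introduced later in this chapter, since plain shared writes would be a data
race in Rust. `Relaxed` imposes no ordering, so all three outcomes remain
permitted:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static X: AtomicUsize = AtomicUsize::new(0);
static Y: AtomicUsize = AtomicUsize::new(1);

fn main() {
    let t1 = thread::spawn(|| {
        Y.store(3, Ordering::Relaxed);
        X.store(1, Ordering::Relaxed);
    });
    let t2 = thread::spawn(|| {
        if X.load(Ordering::Relaxed) == 1 {
            // May observe x = 1 without observing y = 3.
            let y = Y.load(Ordering::Relaxed);
            Y.store(y * 2, Ordering::Relaxed);
        }
    });
    t1.join().unwrap();
    t2.join().unwrap();
    // 3, 6, or (on weakly-ordered hardware, or after reordering) 2.
    println!("y = {}", Y.load(Ordering::Relaxed));
}
```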
It's worth noting that different kinds of CPU provide different guarantees. It
is common to separate hardware into two categories: strongly-ordered and
weakly-ordered. Most notably, x86/64 provides strong ordering guarantees, while
ARM provides weak ordering guarantees. This has two consequences for
concurrent programming:

* Asking for stronger guarantees on strongly-ordered hardware may be cheap or
  even *free* because they already provide strong guarantees unconditionally.
  Weaker guarantees may only yield performance wins on weakly-ordered hardware.

* Asking for guarantees that are *too* weak on strongly-ordered hardware
  is more likely to *happen* to work, even though your program is strictly
  incorrect. If possible, concurrent algorithms should be tested on
  weakly-ordered hardware.
# Data Accesses

The C11 memory model attempts to bridge the gap by allowing us to talk about
the *causality* of our program. Generally, this is done by establishing
*happens-before* relationships between parts of the program and the threads
that are running them. This gives the hardware and compiler room to optimize the
program more aggressively where a strict happens-before relationship isn't
established, but forces them to be more careful where one *is* established.
The way we communicate these relationships is through *data accesses* and
*atomic accesses*.

Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler
on the assumption that the program is single-threaded. The hardware is also free
to propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Most critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer *awful* semantics to try to write synchronized code with.
Actually, that's too weak. *It is literally impossible to write correct
synchronized code using only data accesses.*
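As a sketch of why, consider hand-rolling the classic flag idiom with plain
data accesses. This is a hypothetical, deliberately *broken* fragment (marked
`ignore` because safe Rust refuses to share unsynchronized mutable state
across threads at all):

```rust,ignore
let mut ready = false;
let mut data = 0;

thread::spawn(|| {
    data = 42;
    ready = true;   // plain store: establishes no happens-before
});

// Even if this compiled, the compiler could hoist the load out of the loop
// (spinning forever), and the hardware could make `ready = true` visible
// before `data = 42`.
while !ready {}
println!("{}", data);
```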
Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an *ordering* that
specifies what kind of relationship it establishes with other accesses. In
practice, this boils down to telling the compiler and hardware certain things
they *can't* do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes is:

* Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed

(Note: We explicitly do not expose the C11 *consume* ordering.)
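Concretely, every operation on the types in `std::sync::atomic` takes its
ordering as an explicit argument. A minimal sketch:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let a = AtomicUsize::new(0);
    // Every atomic operation is marked with an Ordering:
    a.store(1, Ordering::Release);
    let x = a.load(Ordering::Acquire);
    let old = a.fetch_add(1, Ordering::Relaxed); // atomic read-modify-write
    println!("loaded {}, fetch_add returned {}", x, old);
}
```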
TODO: negative reasoning vs positive reasoning?
TODO: "can't forget to synchronize"
# Sequentially Consistent

Sequentially Consistent is the most powerful ordering of all, implying the
restrictions of all the other orderings. Intuitively, a sequentially consistent
operation *cannot* be reordered: all accesses on one thread that happen before
and after it *stay* before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This
*does not* hold if you start using the weaker atomic orderings.

The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms, sequential consistency involves
emitting memory fences.

In practice, sequential consistency is rarely necessary for program
correctness. However, sequential consistency is definitely the right choice if
you're not confident about the other memory orders. Having your program run a
bit slower than it needs to is certainly better than having it run incorrectly!
It's also *mechanically* trivial to downgrade atomic operations to a weaker
consistency later on: just change `SeqCst` to e.g. `Relaxed` and you're done!
Of course, proving that this transformation is *correct* is a whole other
matter.
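For a taste of what sequential consistency buys you, here is a minimal sketch
(not from the original text) of the classic "store buffering" litmus test.
Because `SeqCst` imposes a single global order over all four operations, at
least one thread must observe the other's store; with weaker orderings, both
loads could return `false`:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static X: AtomicBool = AtomicBool::new(false);
static Y: AtomicBool = AtomicBool::new(false);

fn main() {
    let t1 = thread::spawn(|| {
        X.store(true, Ordering::SeqCst);
        Y.load(Ordering::SeqCst) // did we see thread 2's store?
    });
    let t2 = thread::spawn(|| {
        Y.store(true, Ordering::SeqCst);
        X.load(Ordering::SeqCst) // did we see thread 1's store?
    });
    let (saw_y, saw_x) = (t1.join().unwrap(), t2.join().unwrap());
    // A single global order means both loads can't come before both stores:
    assert!(saw_x || saw_y);
}
```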
# Acquire-Release

Acquire and Release are largely intended to be paired. Their names hint at
their use case: they're perfectly suited for acquiring and releasing locks,
and ensuring that critical sections don't overlap.

Intuitively, an acquire access ensures that every access after it *stays* after
it. However, operations that occur before an acquire are free to be reordered
to occur after it. Similarly, a release access ensures that every access before
it *stays* before it. However, operations that occur after a release are free
to be reordered to occur before it.

When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its acquire.
However, no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory.

Basic use of release-acquire is therefore simple: you acquire a location in
memory to begin the critical section, and then release that location to end it.
For instance, a simple spinlock might look like:
```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(true)); // true means the lock is free

    // ... distribute lock to threads somehow ...

    // Try to acquire the lock by setting it to false.
    // compare_and_swap returns the previous value, so `true` means we
    // swapped true -> false and now hold the lock.
    while !lock.compare_and_swap(true, false, Ordering::Acquire) { }
    // broke out of the loop, so we successfully acquired the lock!

    // ... scary data accesses ...

    // ok we're done, release the lock
    lock.store(true, Ordering::Release);
}
```
On strongly-ordered platforms most accesses have release or acquire semantics,
making release and acquire often totally free. This is not the case on
weakly-ordered platforms.
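Here is a minimal sketch (not part of the original text) of the other classic
use of release-acquire, message passing between two threads; the names `DATA`
and `READY` are this sketch's own. The acquire load that observes the release
store makes every earlier write visible, so the `Relaxed` read of `DATA` is
guaranteed to see 42:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

static DATA: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

fn main() {
    let producer = thread::spawn(|| {
        DATA.store(42, Ordering::Relaxed);     // happens before the release...
        READY.store(true, Ordering::Release);  // ...which publishes it
    });
    while !READY.load(Ordering::Acquire) {}    // pairs with the release store
    // The acquire load saw the release store, so the DATA write is visible.
    assert_eq!(DATA.load(Ordering::Relaxed), 42);
    producer.join().unwrap();
}
```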
# Relaxed

Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Even so, relaxed operations are still
atomic: they don't count as data accesses, and any read-modify-write operations
done to them occur atomically. Relaxed operations are appropriate for things
that you definitely want to happen, but don't particularly otherwise care
about. For instance, incrementing a counter can be safely done by multiple
threads using a relaxed `fetch_add` if you're not using the counter to
synchronize any other accesses.
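A minimal sketch of that counter: each increment is an atomic read-modify-write,
so no updates are lost and the final total is exact, even though `Relaxed`
orders nothing else:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn main() {
    let handles: Vec<_> = (0..4).map(|_| {
        thread::spawn(|| {
            for _ in 0..1000 {
                // Atomic read-modify-write: no data race, no lost updates.
                COUNTER.fetch_add(1, Ordering::Relaxed);
            }
        })
    }).collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(COUNTER.load(Ordering::Relaxed), 4000);
}
```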
There's rarely a benefit in making an operation relaxed on strongly-ordered
platforms, since they usually provide release-acquire semantics anyway. However,
relaxed operations can be cheaper on weakly-ordered platforms.

[C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
[C11-model]: http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899
