@@ -13,26 +13,66 @@ Where TRPL introduces the language and teaches the basics, TURPL dives deep into
13
13
the specification of the language, and all the nasty bits necessary to write
14
14
Unsafe Rust. TURPL does not assume you have read TRPL, but does assume you know
15
15
the basics of the language and systems programming. We will not explain the
16
- stack or heap, we will not explain the syntax.
16
+ stack or heap, we will not explain the basic syntax.
17
17
18
18
19
19
20
20
21
- # A Tale Of Two Languages
22
21
22
+ # Meet Safe and Unsafe
23
+
24
+ Safe and Unsafe are Rust's chief engineers.
25
+
26
+ TODO: ADORABLE PICTURES OMG
27
+
28
+ Unsafe handles all the dangerous internal stuff. They build the foundations
29
+ and handle all the dangerous materials. By all accounts, Unsafe is really a bit
30
+ unproductive, because the nature of their work means that they have to spend a
31
+ lot of time checking and double-checking everything. What if there's an earthquake
32
+ on a leap year? Are we ready for that? Unsafe better be, because if they get
33
+ * anything* wrong, everything will blow up! What Unsafe brings to the table is
34
+ * quality* , not quantity. Still, nothing would ever get done if everything was
35
+ built to Unsafe's standards!
36
+
37
+ That's where Safe comes in. Safe has to handle * everything else* . Since Safe needs
38
+ to * get work done* , they've grown to be fairly careless and clumsy! Safe doesn't worry
39
+ about all the crazy eventualities that Unsafe does, because life is too short to deal
40
+ with leap-year-earthquakes. Of course, this means there are some jobs that Safe just
41
+ can't handle. Safe is all about quantity over quality.
42
+
43
+ Unsafe loves Safe to bits, but knows that they * can never trust them to do the
44
+ right thing* . Still, Unsafe acknowledges that not every problem needs quite the
45
+ attention to detail that they apply. Indeed, Unsafe would * love* if Safe could do
46
+ * everything* for them. To accomplish this, Unsafe spends most of their time
47
+ building * safe abstractions* . These abstractions handle all the nitty-gritty
48
+ details for Safe, and choose good defaults so that the simplest solution (which
49
+ Safe will inevitably use) is usually the * right* one. Once a safe abstraction is
50
+ built, Unsafe ideally needs to never work on it again, and Safe can blindly use
51
+ it in all their work.
52
+
53
+ Unsafe's attention to detail means that all the things that they mark as ok for
54
+ Safe to use can be combined in arbitrarily ridiculous ways, and all the rules
55
+ that Unsafe is forced to uphold will never be violated. If they * can* be violated
56
+ by Safe, that means * Unsafe* 's the one in the wrong. Safe can work carelessly,
57
+ knowing that if anything blows up, it's not * their* fault. Safe can also call in
58
+ Unsafe at any time if there's a hard problem they can't quite work out, or if they
59
+ can't meet the client's quality demands. Of course, Unsafe will beg and plead with Safe
60
+ to try their latest safe abstraction first!
61
+
62
+ In addition to being adorable, Safe and Unsafe are what make Rust possible.
23
63
Rust can be thought of as two different languages: Safe Rust, and Unsafe Rust.
24
64
Any time someone opines the guarantees of Rust, they are almost surely talking about
25
- Safe Rust . However Safe Rust is not sufficient to write every program. For that,
26
- we need the Unsafe Rust superset.
65
+ Safe. However Safe is not sufficient to write every program. For that,
66
+ we need the Unsafe superset.
27
67
28
68
Most fundamentally, writing bindings to other languages
29
69
(such as the C exposed by your operating system) is never going to be safe. Rust
30
- can't control what other languages do to program execution! However Unsafe Rust is
70
+ can't control what other languages do to program execution! However Unsafe is
31
71
also necessary to construct fundamental abstractions where the type system is not
32
72
sufficient to automatically prove what you're doing is sound.
33
73
34
74
Indeed, the Rust standard library is implemented in Rust, and it makes substantial
35
- use of Unsafe Rust for implementing IO, memory allocation, collections,
75
+ use of Unsafe for implementing IO, memory allocation, collections,
36
76
synchronization, and other low-level computational primitives.
37
77
38
78
Upon hearing this, many wonder why they would not simply just use C or C++ in place of
@@ -47,46 +87,40 @@ one does not have to suddenly worry about indexing out of bounds on `y`.
47
87
C and C++, by contrast, have pervasive unsafety baked into the language. Even the
48
88
modern best practices like ` unique_ptr ` have various safety pitfalls.
49
89
50
- It should also be noted that writing Unsafe Rust should be regarded as an exceptional
51
- action . Unsafe Rust is often the domain of * fundamental libraries* . Anything that needs
90
+ It cannot be emphasized enough that Unsafe should be regarded as an exceptional
91
+ thing, not a normal one . Unsafe is often the domain of * fundamental libraries* : anything that needs
52
92
to make FFI bindings or define core abstractions. These fundamental libraries then expose
53
- a * safe* interface for intermediate libraries and applications to build upon. And these
93
+ a safe interface for intermediate libraries and applications to build upon. And these
54
94
safe interfaces make an important promise: if your application segfaults, it's not your
55
95
fault. * They* have a bug.
56
96
57
97
And really, how is that different from * any* safe language? Python, Ruby, and Java libraries
58
98
can internally do all sorts of nasty things. The languages themselves are no
59
- different. Safe languages regularly have bugs that cause critical vulnerabilities.
60
- The fact that Rust is written with a healthy spoonful of Unsafe Rust is no different.
99
+ different. Safe languages * regularly* have bugs that cause critical vulnerabilities.
100
+ The fact that Rust is written with a healthy spoonful of Unsafe is no different.
61
101
However it * does* mean that Rust doesn't need to fall back to the pervasive unsafety of
62
102
C to do the nasty things that need to get done.
63
103
64
104
65
105
66
106
67
- # What does ` unsafe ` mean?
68
107
69
- Rust tries to model memory safety through the ` unsafe ` keyword. Interestingly,
70
- the meaning of ` unsafe ` largely revolves around what
71
- its * absence* means. If the ` unsafe ` keyword is absent from a program, it should
72
- not be possible to violate memory safety under * any* conditions. The presence
73
- of ` unsafe ` means that there are conditions under which this code * could*
74
- violate memory safety.
108
+ # What do Safe and Unsafe really mean?
75
109
76
- To be more concrete, Rust cares about preventing the following things:
110
+ Rust cares about preventing the following things:
77
111
78
- * Dereferencing null/dangling pointers
79
- * Reading uninitialized memory
80
- * Breaking the pointer aliasing rules (TBD) (llvm rules + noalias on &mut and & w/o UnsafeCell)
81
- * Invoking Undefined Behaviour (in e.g. compiler intrinsics)
112
+ * Dereferencing null or dangling pointers
113
+ * Reading [ uninitialized memory] [ ]
114
+ * Breaking the [ pointer aliasing rules] [ ]
82
115
* Producing invalid primitive values:
83
116
* dangling/null references
84
117
* a ` bool ` that isn't 0 or 1
85
118
* an undefined ` enum ` discriminant
86
- * a ` char ` larger than char::MAX
119
+ * a ` char ` larger than char::MAX (TODO: check if stronger restrictions apply)
87
120
* A non-utf8 ` str `
88
- * Unwinding into an FFI function
89
- * Causing a data race
121
+ * Unwinding into another language
122
+ * Causing a [ data race] [ ]
123
+ * Invoking Misc. Undefined Behaviour (in e.g. compiler intrinsics)
90
124
91
125
That's it. That's all the Undefined Behaviour in Rust. Libraries are free to
92
126
declare arbitrary requirements if they could transitively cause memory safety
@@ -95,15 +129,17 @@ quite permisive with respect to other dubious operations. Rust considers it
95
129
"safe" to:
96
130
97
131
* Deadlock
132
+ * Have a Race Condition
98
133
* Leak memory
99
134
* Fail to call destructors
100
135
* Overflow integers
101
136
* Delete the production database
102
137
103
- However any program that does such a thing is * probably* incorrect. Rust just isn't
104
- interested in modeling these problems, as they are much harder to prevent in general,
105
- and it's literally impossible to prevent incorrect programs from getting written .
138
+ However any program that does such a thing is * probably* incorrect. Rust
139
+ provides lots of tools to make doing these things rare, but these problems are
140
+ considered impractical to categorically prevent .
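
As a rough illustration of the "safe but dubious" list above, here is a minimal sketch with no `unsafe` anywhere in it. The `Noisy` type and both function names are hypothetical, invented only for this example.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type that counts how many times its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Forgets one value and drops another, then reports how many
/// destructors actually ran. No `unsafe` is needed to skip a destructor.
fn destructors_run_after_forget() -> u32 {
    let drops = Cell::new(0);
    mem::forget(Noisy(&drops)); // destructor never runs: safe, but probably a bug
    drop(Noisy(&drops)); // destructor runs normally
    drops.get()
}

/// Integer overflow is also safe. The explicit `checked_*` and
/// `wrapping_*` methods make the chosen behaviour visible at the call site.
fn overflow_is_safe(x: u8) -> (Option<u8>, u8) {
    (x.checked_add(1), x.wrapping_add(1))
}
```

Calling `destructors_run_after_forget()` reports a single destructor call even though two values were created, and `overflow_is_safe(255)` shows the overflow being detected or wrapped rather than corrupting memory.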
106
141
142
+ Rust models the separation between Safe and Unsafe with the ` unsafe ` keyword.
107
143
There are several places ` unsafe ` can appear in Rust today, which can largely be
108
144
grouped into two categories:
109
145
@@ -112,7 +148,7 @@ you to write `unsafe` elsewhere:
112
148
* On functions, ` unsafe ` is declaring the function to be unsafe to call. Users
113
149
of the function must check the documentation to determine what this means,
114
150
and then have to write ` unsafe ` somewhere to identify that they're aware of
115
- the danger.
151
+ the danger.
116
152
* On trait declarations, ` unsafe ` is declaring that * implementing* the trait
117
153
is an unsafe operation, as it has contracts that other unsafe code is free to
118
154
trust blindly.
@@ -126,19 +162,19 @@ unchecked contracts:
126
162
127
163
There is also ` #[unsafe_no_drop_flag] ` , which is a special case that exists for
128
164
historical reasons and is in the process of being phased out. See the section on
129
- destructors for details.
165
+ [ destructors] [ ] for details.
130
166
131
167
Some examples of unsafe functions:
132
168
133
169
* ` slice::get_unchecked ` will perform unchecked indexing, allowing memory
134
170
safety to be freely violated.
135
- * ` ptr::offset ` in an intrinsic that invokes Undefined Behaviour if it is
171
+ * ` ptr::offset ` is an intrinsic that invokes Undefined Behaviour if it is
136
172
not "in bounds" as defined by LLVM (see the lifetimes section for details).
137
173
* ` mem::transmute ` reinterprets some value as having the given type,
138
- bypassing type safety in arbitrary ways. (see the conversions section for details)
174
+ bypassing type safety in arbitrary ways. (see [ conversions] [ ] for details)
139
175
* All FFI functions are ` unsafe ` because they can do arbitrary things.
140
176
C being an obvious culprit, but generally any language can do something
141
- that Rust isn't happy about. (see the FFI section for details)
177
+ that Rust isn't happy about.
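
To make the first bullet concrete, here is a small sketch of how a caller might use `slice::get_unchecked` responsibly. The helper `sum_first` is hypothetical; the point is only that the bounds check is done once by hand, and the `unsafe` block records that the caller has taken on that obligation.

```rust
/// Sums the first `n` elements of `data`, checking the bound once up front.
/// Returns `None` if `n` is larger than the slice.
fn sum_first(data: &[u32], n: usize) -> Option<u32> {
    if n > data.len() {
        return None;
    }
    let mut total = 0u32;
    for i in 0..n {
        // SAFETY: `i < n` and `n <= data.len()` was checked above,
        // so `i` is always in bounds.
        total = total.wrapping_add(unsafe { *data.get_unchecked(i) });
    }
    Some(total)
}
```

If the up-front check were removed, nothing in the type system would complain, and an out-of-range `n` would be Undefined Behaviour. That is exactly the contract `unsafe` asks the caller to uphold.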
142
178
143
179
As of Rust 1.0 there are exactly two unsafe traits:
144
180
@@ -147,25 +183,60 @@ As of Rust 1.0 there are exactly two unsafe traits:
147
183
* ` Sync ` is a marker trait that promises that threads can safely share
148
184
implementors through a shared reference.
149
185
150
- All other traits that declare any kind of contract * really* can't be trusted
151
- to adhere to their contract when memory-safety is at stake. For instance Rust has
152
- ` PartialOrd ` and ` Ord ` to differentiate between types which can "just" be
153
- compared and those that implement a total ordering. However you can't actually
154
- trust an implementor of ` Ord ` to actually provide a total ordering if failing to
155
- do so causes you to e.g. index out of bounds. But if it just makes your program
156
- do a stupid thing, then it's "fine" to rely on ` Ord ` .
157
-
158
- The reason this is the case is that ` Ord ` is safe to implement, and it should be
159
- impossible for bad * safe* code to violate memory safety. Rust has traditionally
160
- avoided making traits unsafe because it makes ` unsafe ` pervasive in the language,
161
- which is not desirable. The only reason ` Send ` and ` Sync ` are unsafe is because
162
- thread safety is a sort of fundamental thing that a program can't really guard
163
- against locally (even by-value message passing still requires a notion Send).
164
-
165
-
166
-
167
-
168
- # Working with unsafe
186
+ The need for unsafe traits boils down to the fundamental lack of trust that Unsafe
187
+ has for Safe. All safe traits are free to declare arbitrary contracts, but because
188
+ implementing them is a job for Safe, Unsafe can't trust those contracts to actually
189
+ be upheld.
190
+
191
+ For instance Rust has ` PartialOrd ` and ` Ord ` traits to try to differentiate
192
+ between types which can "just" be compared, and those that actually implement a
193
+ * total* ordering. Pretty much every API that wants to work with data that can be
194
+ compared * really* wants Ord data. For instance, a sorted map like BTreeMap
195
+ * doesn't even make sense* for partially ordered types. If you claim to implement
196
+ Ord for a type, but don't actually provide a proper total ordering, BTreeMap will
197
+ get * really confused* and start making a total mess of itself. Data that is
198
+ inserted may be impossible to find!
199
+
200
+ But that's ok. BTreeMap is safe, so it guarantees that even if you give it a
201
+ * completely* garbage Ord implementation, it will still do something * safe* . You
202
+ won't start reading uninitialized memory or unallocated memory. In fact, BTreeMap
203
+ manages to not actually lose any of your data. When the map is dropped, all the
204
+ destructors will be successfully called! Hooray!
205
+
206
+ However BTreeMap is implemented using a modest spoonful of Unsafe (most collections
207
+ are). That means that it is not necessarily * trivially true* that a bad Ord
208
+ implementation will make BTreeMap behave safely. Unsafe must be sure not to rely
209
+ on Ord * where safety is at stake* , because Ord is provided by Safe, and memory
210
+ safety is not Safe's responsibility to uphold. * It must be impossible for Safe
211
+ code to violate memory safety* .
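
The following sketch shows the kind of garbage `Ord` implementation described above. `Chaotic` is a hypothetical type that claims every value is less than every other value, itself included. The exact results are an assumption about current standard library behaviour, not a documented guarantee; the only thing BTreeMap promises here is that nothing memory-unsafe happens.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Hypothetical key type with a deliberately broken ordering.
#[derive(Debug, Clone, Copy)]
struct Chaotic(u32);

impl PartialEq for Chaotic {
    fn eq(&self, _other: &Self) -> bool {
        false // never equal to anything, not even itself
    }
}

impl Eq for Chaotic {}

impl PartialOrd for Chaotic {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

impl Ord for Chaotic {
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Less // not a total order: a < a for every a
    }
}

/// Inserts ten keys, then tries to find one of them again.
/// Returns the map's length and whether the lookup succeeded.
fn insert_then_lookup() -> (usize, bool) {
    let mut map = BTreeMap::new();
    for i in 0..10 {
        map.insert(Chaotic(i), i);
    }
    (map.len(), map.contains_key(&Chaotic(3)))
}
```

The lookup fails because no comparison ever reports equality, so the inserted data is impossible to find. The map is confused, but it is still safe to use and to drop.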
212
+
213
+ But wouldn't it be grand if there was some way for Unsafe to trust * some* trait
214
+ contracts * somewhere* ? This is the problem that unsafe traits tackle: by marking
215
+ * the trait itself* as unsafe * to implement* , Unsafe can trust the implementation
216
+ to be correct (because Unsafe can trust themself).
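
A minimal sketch of that idea follows. The trait `ExactLenHint`, the type `Bytes`, and the function `last_trusted` are all hypothetical names made up for this example. The contract is written in the trait's documentation, and `unsafe impl` is how an implementor promises to uphold it.

```rust
/// Hypothetical unsafe trait.
///
/// Contract: `len_hint()` must never exceed `as_slice().len()`.
/// Unsafe code is allowed to rely on this for memory safety.
unsafe trait ExactLenHint {
    fn as_slice(&self) -> &[u8];
    fn len_hint(&self) -> usize;
}

struct Bytes(Vec<u8>);

// SAFETY: `len_hint` returns exactly the length of the slice
// returned by `as_slice`, so the contract holds.
unsafe impl ExactLenHint for Bytes {
    fn as_slice(&self) -> &[u8] {
        &self.0
    }

    fn len_hint(&self) -> usize {
        self.0.len()
    }
}

/// Returns the last byte, trusting the trait's contract instead of
/// re-checking the bound.
fn last_trusted<T: ExactLenHint>(value: &T) -> Option<u8> {
    let n = value.len_hint();
    if n == 0 {
        return None;
    }
    // SAFETY: the `ExactLenHint` contract guarantees `n <= as_slice().len()`,
    // so `n - 1` is in bounds.
    Some(unsafe { *value.as_slice().get_unchecked(n - 1) })
}
```

Had `ExactLenHint` been a safe trait, `last_trusted` would be unsound: a careless safe implementation could return a bogus `len_hint` and trigger an out-of-bounds read without ever writing `unsafe`.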
217
+
218
+ Rust has traditionally avoided making traits unsafe because it makes Unsafe
219
+ pervasive, which is not desirable. Send and Sync are unsafe because
220
+ thread safety is a * fundamental property* that Unsafe cannot possibly hope to
221
+ defend against in the same way it would defend against a bad Ord implementation.
222
+ The only way to possibly defend against thread-unsafety would be to * not use
223
+ threading at all* . Making every operation atomic isn't even sufficient, because
224
+ it's possible for complex invariants to exist between disjoint locations in memory.
225
+
226
+ Even concurrent paradigms that are traditionally regarded as Totally Safe like
227
+ message passing implicitly rely on some notion of thread safety -- are you
228
+ really message-passing if you send a * pointer* ? Send and Sync therefore require
229
+ some * fundamental* level of trust that Safe code can't provide, so they must be
230
+ unsafe to implement. To help obviate the pervasive unsafety that this would
231
+ introduce, Send (resp. Sync) is * automatically* derived for all types composed only
232
+ of Send (resp. Sync) values. 99% of types are Send and Sync, and 99% of those
233
+ never actually say it (the remaining 1% is overwhelmingly synchronization
234
+ primitives).
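
The automatic derivation can be observed with a small compile-time check. In this sketch, `Config`, `assert_send_sync`, and `spawn_with_config` are hypothetical names; `Config` never mentions Send or Sync, yet it is accepted because all of its fields are Send and Sync.

```rust
use std::sync::Arc;
use std::thread;

/// Compiles only if `T` is both Send and Sync.
fn assert_send_sync<T: Send + Sync>() {}

/// Hypothetical plain-data type. It gets Send and Sync automatically
/// because `String` and `u32` are both Send and Sync.
struct Config {
    name: String,
    retries: u32,
}

/// Shares a `Config` with another thread and reads it there.
fn spawn_with_config() -> usize {
    assert_send_sync::<Config>();
    // A type containing `std::rc::Rc` would be rejected by the line above,
    // since `Rc` is neither Send nor Sync.

    let cfg = Arc::new(Config {
        name: String::from("worker"),
        retries: 3,
    });
    let shared = Arc::clone(&cfg);
    let handle = thread::spawn(move || shared.name.len() + shared.retries as usize);
    handle.join().unwrap()
}
```

Only types that contain something like a raw pointer need to write `unsafe impl Send` or `unsafe impl Sync` by hand, which keeps the `unsafe` in this area confined to a small number of low-level types.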
235
+
236
+
237
+
238
+
239
+ # Working with Unsafe
169
240
170
241
Rust generally only gives us the tools to talk about safety in a scoped and
171
242
binary manner. Unfortunately reality is significantly more complicated than that.
@@ -254,5 +325,11 @@ trust the capacity field because there's no way to verify it.
254
325
Generally, the only bullet-proof way to limit the scope of unsafe code is at the
255
326
module boundary with privacy.
256
327
257
- [ trpl ] : https://doc.rust-lang.org/book/
258
328
329
+
330
+ [ trpl ] : https://doc.rust-lang.org/book/
331
+ [ pointer aliasing rules ] : lifetimes.html#references
332
+ [ uninitialized memory ] : uninitialized.html
333
+ [ data race ] : concurrency.html
334
+ [ destructors ] : raii.html
335
+ [ conversions ] : conversions.html