
Commit 9e2bea8

---
yaml --- r: 229034 b: refs/heads/try c: 0a36ea7 h: refs/heads/master v: v3
1 parent 86b0c2c commit 9e2bea8

File tree

4 files changed: +210 -59 lines


[refs]

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 refs/heads/master: aca2057ed5fb7af3f8905b2bc01f72fa001c35c8
 refs/heads/snap-stage3: 1af31d4974e33027a68126fa5a5a3c2c6491824f
-refs/heads/try: 7a47ffcbc73fa3bd02429c92841dcb0792a1f9b8
+refs/heads/try: 0a36ea7db130dfaa6012d76ccf80b9b77e15796b
 refs/tags/release-0.1: 1f5c5126e96c79d22cb7862f75304136e204f105
 refs/tags/release-0.2: c870d2dffb391e14efb05aa27898f1f6333a9596
 refs/tags/release-0.3: b5f0d0f648d9a6153664837026ba1be43d3e2503

branches/try/src/doc/tarpl/vec-alloc.md

Lines changed: 131 additions & 25 deletions
@@ -1,5 +1,22 @@
 % Allocating Memory
 
+Using Unique throws a wrench in an important feature of Vec (and indeed all of
+the std collections): an empty Vec doesn't actually allocate at all. So if we
+can't allocate, but also can't put a null pointer in `ptr`, what do we do in
+`Vec::new`? Well, we just put some other garbage in there!
+
+This is perfectly fine because we already have `cap == 0` as our sentinel for no
+allocation. We don't even need to handle it specially in almost any code because
+we usually need to check if `cap > len` or `len > 0` anyway. The traditional
+Rust value to put here is `0x01`. The standard library actually exposes this
+as `std::rt::heap::EMPTY`. There are quite a few places where we'll
+want to use `heap::EMPTY` because there's no real allocation to talk about but
+`null` would make the compiler do bad things.
+
+All of the `heap` API is totally unstable under the `heap_api` feature, though.
+We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of
+the `heap` API anyway, so let's just get that dependency over with.
+
 So:
 
 ```rust,ignore
@@ -24,15 +41,29 @@ I slipped in that assert there because zero-sized types will require some
 special handling throughout our code, and I want to defer the issue for now.
 Without this assert, some of our early drafts will do some Very Bad Things.
 
-Next we need to figure out what to actually do when we *do* want space. For that,
-we'll need to use the rest of the heap APIs. These basically allow us to
-talk directly to Rust's instance of jemalloc.
-
-We'll also need a way to handle out-of-memory conditions. The standard library
-calls the `abort` intrinsic, but calling intrinsics from normal Rust code is a
-pretty bad idea. Unfortunately, the `abort` exposed by the standard library
-allocates. Not something we want to do during `oom`! Instead, we'll call
-`std::process::exit`.
+Next we need to figure out what to actually do when we *do* want space. For
+that, we'll need to use the rest of the heap APIs. These basically allow us to
+talk directly to Rust's allocator (jemalloc by default).
+
+We'll also need a way to handle out-of-memory (OOM) conditions. The standard
+library calls the `abort` intrinsic, which just calls an illegal instruction to
+crash the whole program. The reason we abort and don't panic is because
+unwinding can cause allocations to happen, and that seems like a bad thing to do
+when your allocator just came back with "hey I don't have any more memory".
+
+Of course, this is a bit silly since most platforms don't actually run out of
+memory in a conventional way. Your operating system will probably kill the
+application by another means if you legitimately start using up all the memory.
+The most likely way we'll trigger OOM is by just asking for ludicrous quantities
+of memory at once (e.g. half the theoretical address space). As such it's
+*probably* fine to panic and nothing bad will happen. Still, we're trying to be
+like the standard library as much as possible, so we'll just kill the whole
+program.
+
+We said we don't want to use intrinsics, so doing *exactly* what `std` does is
+out. `std::rt::util::abort` actually exists, but it takes a message to print,
+which will probably allocate. Also it's still unstable. Instead, we'll call
+`std::process::exit` with some random number.
 
 ```rust
 fn oom() {
@@ -51,29 +82,104 @@ else:
     cap *= 2
 ```
 
-But Rust's only supported allocator API is so low level that we'll need to
-do a fair bit of extra work, though. We also need to guard against some special
-conditions that can occur with really large allocations. In particular, we index
-into arrays using unsigned integers, but `ptr::offset` takes signed integers. This
-means Bad Things will happen if we ever manage to grow to contain more than
-`isize::MAX` elements. Thankfully, this isn't something we need to worry about
-in most cases.
+But Rust's only supported allocator API is so low level that we'll need to do a
+fair bit of extra work. We also need to guard against some special
+conditions that can occur with really large allocations or empty allocations.
+
+In particular, `ptr::offset` will cause us *a lot* of trouble, because it has
+the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
+not have dealt with this instruction, here's the basic story with GEP: alias
+analysis, alias analysis, alias analysis. It's super important to an optimizing
+compiler to be able to reason about data dependencies and aliasing.
 
-On 64-bit targets we're artifically limited to only 48-bits, so we'll run out
-of memory far before we reach that point. However on 32-bit targets, particularly
-those with extensions to use more of the address space, it's theoretically possible
-to successfully allocate more than `isize::MAX` bytes of memory. Still, we only
-really need to worry about that if we're allocating elements that are a byte large.
-Anything else will use up too much space.
+As a simple example, consider the following fragment of code:
+
+```rust
+# let x = &mut 0;
+# let y = &mut 0;
+*x *= 7;
+*y *= 3;
+```
 
-However since this is a tutorial, we're not going to be particularly optimal here,
-and just unconditionally check, rather than use clever platform-specific `cfg`s.
+If the compiler can prove that `x` and `y` point to different locations in
+memory, the two operations can in theory be executed in parallel (by e.g.
+loading them into different registers and working on them independently).
+However in *general* the compiler can't do this because if x and y point to
+the same location in memory, the operations need to be done to the same value,
+and they can't just be merged afterwards.
+
+When you use GEP inbounds, you are specifically telling LLVM that the offsets
+you're about to do are within the bounds of a single allocated entity. The
+ultimate payoff being that LLVM can assume that if two pointers are known to
+point to two disjoint objects, all the offsets of those pointers are *also*
+known to not alias (because you won't just end up in some random place in
+memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
+offsets are the best of all, so it's important that we use them as much as
+possible.
+
+So that's what GEP's about; how can it cause us trouble?
+
+The first problem is that we index into arrays with unsigned integers, but
+GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means
+that half of the seemingly valid indices into an array will overflow GEP and
+actually go in the wrong direction! As such we must limit all allocations to
+`isize::MAX` elements. This actually means we only need to worry about
+byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of
+the system's memory. However in order to avoid subtle corner cases where someone
+reinterprets some array of `< isize::MAX` objects as bytes, std limits all
+allocations to `isize::MAX` bytes.
+
+On all 64-bit targets that Rust currently supports we're artificially limited
+to significantly less than all 64 bits of the address space (modern x64
+platforms only expose 48-bit addressing), so we can rely on just running out of
+memory first. However on 32-bit targets, particularly those with extensions to
+use more of the address space (PAE x86 or x32), it's theoretically possible to
+successfully allocate more than `isize::MAX` bytes of memory.
+
+However since this is a tutorial, we're not going to be particularly optimal
+here, and just unconditionally check, rather than use clever platform-specific
+`cfg`s.
+
+The other corner-case we need to worry about is *empty* allocations. There will
+be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
+and `cap > 0` for zero-sized types.
+
+These cases are tricky because they come
+down to what LLVM means by "allocated". LLVM's notion of an
+allocation is significantly more abstract than how we usually use it. Because
+LLVM needs to work with different languages' semantics and custom allocators,
+it can't really intimately understand allocation. Instead, the main idea behind
+allocation is "doesn't overlap with other stuff". That is, heap allocations,
+stack allocations, and globals don't randomly overlap. Yep, it's about alias
+analysis. As such, Rust can technically play a bit fast and loose with the notion
+of an allocation as long as it's *consistent*.
+
+Getting back to the empty allocation case, there are a couple of places where
+we want to offset by 0 as a consequence of generic code. The question is then:
+is it consistent to do so? For zero-sized types, we have concluded that it is
+indeed consistent to do a GEP inbounds offset by an arbitrary number of
+elements. This is a runtime no-op because every element takes up no space,
+and it's fine to pretend that there are infinitely many zero-sized types allocated
+at `0x01`. No allocator will ever allocate that address, because they won't
+allocate `0x00` and they generally allocate to some minimal alignment higher
+than a byte.
+
+However what about for positive-sized types? That one's a bit trickier. In
+principle, you can argue that offsetting by 0 gives LLVM no information: either
+there's an element before the address, or after it, but it can't know which.
+However we've chosen to conservatively assume that it may do bad things. As
+such we *will* guard against this case explicitly.
+
+*Phew*
+
+Ok with all the nonsense out of the way, let's actually allocate some memory:
 
 ```rust,ignore
 fn grow(&mut self) {
     // this is all pretty delicate, so let's say it's all unsafe
     unsafe {
-        let align = mem::min_align_of::<T>();
+        // current API requires us to specify size and alignment manually.
+        let align = mem::align_of::<T>();
         let elem_size = mem::size_of::<T>();
 
         let (new_cap, ptr) = if self.cap == 0 {
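
The `grow` listing above is cut off by the hunk, but the overflow rule the new prose describes is easy to sketch in isolation. The following is a rough, hypothetical illustration (names are invented for this sketch, not code from this commit): double the capacity and refuse any allocation larger than `isize::MAX` bytes, so that `ptr::offset` (GEP inbounds, which takes a signed index) stays well-defined.

```rust
// Illustrative sketch of the "unconditionally check" rule; not the commit's code.
fn next_alloc_size(cap: usize, elem_size: usize) -> usize {
    // Double the capacity, starting from 1 on the first allocation.
    let new_cap = if cap == 0 { 1 } else { 2 * cap };
    // Multiplying out the byte count can itself overflow a usize.
    let new_num_bytes = new_cap.checked_mul(elem_size).expect("capacity overflow");
    // Keep every allocation at or below isize::MAX bytes.
    assert!(new_num_bytes <= isize::MAX as usize, "capacity overflow");
    new_num_bytes
}

fn main() {
    assert_eq!(next_alloc_size(0, 4), 4);
    assert_eq!(next_alloc_size(4, 4), 32);
}
```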

branches/try/src/doc/tarpl/vec-layout.md

Lines changed: 63 additions & 32 deletions
@@ -13,15 +13,64 @@ pub struct Vec<T> {
 # fn main() {}
 ```
 
-And indeed this would compile. Unfortunately, it would be incorrect. The
-compiler will give us too strict variance, so e.g. an `&Vec<&'static str>`
+And indeed this would compile. Unfortunately, it would be incorrect. First, the
+compiler will give us too strict variance. So a `&Vec<&'static str>`
 couldn't be used where an `&Vec<&'a str>` was expected. More importantly, it
-will give incorrect ownership information to dropck, as it will conservatively
-assume we don't own any values of type `T`. See [the chapter on ownership and
-lifetimes] (lifetimes.html) for details.
+will give incorrect ownership information to the drop checker, as it will
+conservatively assume we don't own any values of type `T`. See [the chapter
+on ownership and lifetimes][ownership] for all the details on variance and
+drop check.
 
-As we saw in the lifetimes chapter, we should use `Unique<T>` in place of
-`*mut T` when we have a raw pointer to an allocation we own:
+As we saw in the ownership chapter, we should use `Unique<T>` in place of
+`*mut T` when we have a raw pointer to an allocation we own. Unique is unstable,
+so we'd like to not use it if possible, though.
+
+As a recap, Unique is a wrapper around a raw pointer that declares that:
+
+* We are covariant over `T`
+* We may own a value of type `T` (for drop check)
+* We are Send/Sync if `T` is Send/Sync
+* We deref to `*mut T` (so it largely acts like a `*mut` in our code)
+* Our pointer is never null (so `Option<Vec<T>>` is null-pointer-optimized)
+
+We can implement all of the above requirements except for the last
+one in stable Rust:
+
+```rust
+use std::marker::PhantomData;
+use std::ops::Deref;
+use std::mem;
+
+struct Unique<T> {
+    ptr: *const T, // *const for variance
+    _marker: PhantomData<T>, // For the drop checker
+}
+
+// Deriving Send and Sync is safe because we are the Unique owners
+// of this data. It's like Unique<T> is "just" T.
+unsafe impl<T: Send> Send for Unique<T> {}
+unsafe impl<T: Sync> Sync for Unique<T> {}
+
+impl<T> Unique<T> {
+    pub fn new(ptr: *mut T) -> Self {
+        Unique { ptr: ptr, _marker: PhantomData }
+    }
+}
+
+impl<T> Deref for Unique<T> {
+    type Target = *mut T;
+    fn deref(&self) -> &*mut T {
+        // There's no way to cast the *const to a *mut
+        // while also taking a reference. So we just
+        // transmute it since it's all "just pointers".
+        unsafe { mem::transmute(&self.ptr) }
+    }
+}
+```
+
+Unfortunately the mechanism for stating that your value is non-zero is
+unstable and unlikely to be stabilized soon. As such we're just going to
+take the hit and use std's Unique:
 
 
 ```rust
@@ -38,29 +87,11 @@ pub struct Vec<T> {
 # fn main() {}
 ```
 
-As a recap, Unique is a wrapper around a raw pointer that declares that:
-
-* We may own a value of type `T`
-* We are Send/Sync iff `T` is Send/Sync
-* Our pointer is never null (and therefore `Option<Vec>` is
-  null-pointer-optimized)
-
-That last point is subtle. First, it makes `Unique::new` unsafe to call, because
-putting `null` inside of it is Undefined Behaviour. It also throws a
-wrench in an important feature of Vec (and indeed all of the std collections):
-an empty Vec doesn't actually allocate at all. So if we can't allocate,
-but also can't put a null pointer in `ptr`, what do we do in
-`Vec::new`? Well, we just put some other garbage in there!
-
-This is perfectly fine because we already have `cap == 0` as our sentinel for no
-allocation. We don't even need to handle it specially in almost any code because
-we usually need to check if `cap > len` or `len > 0` anyway. The traditional
-Rust value to put here is `0x01`. The standard library actually exposes this
-as `std::rt::heap::EMPTY`. There are quite a few places where we'll want to use
-`heap::EMPTY` because there's no real allocation to talk about but `null` would
-make the compiler angry.
-
-All of the `heap` API is totally unstable under the `heap_api` feature, though.
-We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of
-the `heap` API anyway, so let's just get that dependency over with.
+If you don't care about the null-pointer optimization, then you can use the
+stable code. However we will be designing the rest of the code around enabling
+the optimization. In particular, `Unique::new` is unsafe to call, because
+putting `null` inside of it is Undefined Behaviour. Our stable Unique doesn't
+need `new` to be unsafe because it doesn't make any interesting guarantees about
+its contents.
 
+[ownership]: ownership.html
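
The null-pointer optimization this file is built around is easy to observe with std's own Vec, whose pointer is already a `Unique`. This snippet is purely illustrative and not part of the commit:

```rust
use std::mem::size_of;

fn main() {
    // Because Vec's pointer is known to be non-null, Option<Vec<T>> can use
    // the null pointer to represent None and needs no extra discriminant.
    assert_eq!(size_of::<Vec<i32>>(), size_of::<Option<Vec<i32>>>());

    // A bare raw pointer makes no such promise, so Option must add a tag.
    assert!(size_of::<Option<*mut i32>>() > size_of::<*mut i32>());
}
```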

branches/try/src/doc/tarpl/vec.md

Lines changed: 15 additions & 1 deletion
@@ -2,5 +2,19 @@
 
 To bring everything together, we're going to write `std::Vec` from scratch.
 Because all the best tools for writing unsafe code are unstable, this
-project will only work on nightly (as of Rust 1.2.0).
+project will only work on nightly (as of Rust 1.2.0). With the exception of the
+allocator API, much of the unstable code we'll use is expected to be stabilized
+in a similar form as it is today.
 
+However we will generally try to avoid unstable code where possible. In
+particular we won't use any intrinsics that could make the code a little
+bit nicer or more efficient, because intrinsics are permanently unstable,
+although many intrinsics *do* become stabilized elsewhere (`std::ptr` and
+`std::mem` consist of many intrinsics).
+
+Ultimately this means our implementation may not take advantage of all
+possible optimizations, though it will be by no means *naive*. We will
+definitely get into the weeds over nitty-gritty details, even
+when the problem doesn't *really* merit it.
+
+You wanted advanced. We're gonna go advanced.
