% Allocating Memory

+ Using Unique throws a wrench in an important feature of Vec (and indeed all of
+ the std collections): an empty Vec doesn't actually allocate at all. So if we
+ can't allocate, but also can't put a null pointer in `ptr`, what do we do in
+ `Vec::new`? Well, we just put some other garbage in there!
+
+ This is perfectly fine because we already have `cap == 0` as our sentinel for no
+ allocation. We don't even need to handle it specially in almost any code because
+ we usually need to check if `cap > len` or `len > 0` anyway. The traditional
+ Rust value to put here is `0x01`. The standard library actually exposes this
+ as `std::rt::heap::EMPTY`. There are quite a few places where we'll
+ want to use `heap::EMPTY` because there's no real allocation to talk about but
+ `null` would make the compiler do bad things.
+
+ All of the `heap` API is totally unstable under the `heap_api` feature, though.
+ We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of
+ the `heap` API anyway, so let's just get that dependency over with.
+

So:

``` rust,ignore
@@ -24,15 +41,29 @@ I slipped in that assert there because zero-sized types will require some
special handling throughout our code, and I want to defer the issue for now.
Without this assert, some of our early drafts will do some Very Bad Things.

- Next we need to figure out what to actually do when we *do* want space. For that,
- we'll need to use the rest of the heap APIs. These basically allow us to
- talk directly to Rust's instance of jemalloc.
-
- We'll also need a way to handle out-of-memory conditions. The standard library
- calls the `abort` intrinsic, but calling intrinsics from normal Rust code is a
- pretty bad idea. Unfortunately, the `abort` exposed by the standard library
- allocates. Not something we want to do during `oom`! Instead, we'll call
- `std::process::exit`.
+ Next we need to figure out what to actually do when we *do* want space. For
+ that, we'll need to use the rest of the heap APIs. These basically allow us to
+ talk directly to Rust's allocator (jemalloc by default).
+
+ We'll also need a way to handle out-of-memory (OOM) conditions. The standard
+ library calls the `abort` intrinsic, which just executes an illegal instruction to
+ crash the whole program. The reason we abort and don't panic is that
+ unwinding can cause allocations to happen, and that seems like a bad thing to do
+ when your allocator just came back with "hey I don't have any more memory".
+
+ Of course, this is a bit silly since most platforms don't actually run out of
+ memory in a conventional way. Your operating system will probably kill the
+ application by another means if you legitimately start using up all the memory.
+ The most likely way we'll trigger OOM is by just asking for ludicrous quantities
+ of memory at once (e.g. half the theoretical address space). As such it's
+ *probably* fine to panic and nothing bad will happen. Still, we're trying to be
+ like the standard library as much as possible, so we'll just kill the whole
+ program.
+
+ We said we don't want to use intrinsics, so doing *exactly* what `std` does is
+ out. `std::rt::util::abort` actually exists, but it takes a message to print,
+ which will probably allocate. Also it's still unstable. Instead, we'll call
+ `std::process::exit` with some random number.

``` rust
fn oom() {
@@ -51,29 +82,104 @@ else:
    cap *= 2
```

- But Rust's only supported allocator API is so low level that we'll need to
- do a fair bit of extra work, though. We also need to guard against some special
- conditions that can occur with really large allocations. In particular, we index
- into arrays using unsigned integers, but `ptr::offset` takes signed integers. This
- means Bad Things will happen if we ever manage to grow to contain more than
- `isize::MAX` elements. Thankfully, this isn't something we need to worry about
- in most cases.
+ But Rust's only supported allocator API is so low level that we'll need to do a
+ fair bit of extra work. We also need to guard against some special
+ conditions that can occur with really large allocations or empty allocations.
+
+ In particular, `ptr::offset` will cause us *a lot* of trouble, because it has
+ the semantics of LLVM's GEP inbounds instruction. If you're fortunate enough to
+ not have dealt with this instruction, here's the basic story with GEP: alias
+ analysis, alias analysis, alias analysis. It's super important to an optimizing
+ compiler to be able to reason about data dependencies and aliasing.

- On 64-bit targets we're artifically limited to only 48-bits, so we'll run out
- of memory far before we reach that point. However on 32-bit targets, particularly
- those with extensions to use more of the address space, it's theoretically possible
- to successfully allocate more than `isize::MAX` bytes of memory. Still, we only
- really need to worry about that if we're allocating elements that are a byte large.
- Anything else will use up too much space.
+ As a simple example, consider the following fragment of code:
+
+ ```rust
+ # let x = &mut 0;
+ # let y = &mut 0;
+ *x *= 7;
+ *y *= 3;
+ ```

- However since this is a tutorial, we're not going to be particularly optimal here,
- and just unconditionally check, rather than use clever platform-specific `cfg`s.
+ If the compiler can prove that `x` and `y` point to different locations in
+ memory, the two operations can in theory be executed in parallel (by e.g.
+ loading them into different registers and working on them independently).
+ However in *general* the compiler can't do this because if x and y point to
+ the same location in memory, the operations need to be done to the same value,
+ and they can't just be merged afterwards.
+
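To make the aliasing point concrete, here is a small sketch that runs the same two multiplications through raw pointers, which (unlike `&mut`) are allowed to alias. The helper names `scale`, `scale_distinct`, and `scale_aliased` are made up for this illustration and are not part of the chapter's `Vec`.

```rust
/// Hypothetical helper mirroring the fragment above, but taking raw
/// pointers so that `x` and `y` may legally refer to the same location.
///
/// Safety: both pointers must be valid for reads and writes.
unsafe fn scale(x: *mut i32, y: *mut i32) {
    unsafe {
        *x *= 7;
        *y *= 3;
    }
}

/// Runs `scale` on two distinct locations: each value is scaled on its own.
fn scale_distinct(mut a: i32, mut b: i32) -> (i32, i32) {
    unsafe { scale(&mut a, &mut b) };
    (a, b)
}

/// Runs `scale` with both pointers aimed at the same location: the second
/// multiplication must observe the result of the first.
fn scale_aliased(mut a: i32) -> i32 {
    let p: *mut i32 = &mut a;
    unsafe { scale(p, p) };
    a
}
```

With distinct locations the two operations are independent, so `scale_distinct(1, 1)` gives `(7, 3)`. When the pointers alias, `scale_aliased(1)` gives `21`, which is why the compiler can't reorder or split the operations unless it can prove the pointers are disjoint.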
+ When you use GEP inbounds, you are specifically telling LLVM that the offsets
+ you're about to do are within the bounds of a single allocated entity. The
+ ultimate payoff is that LLVM can assume that if two pointers are known to
+ point to two disjoint objects, all the offsets of those pointers are *also*
+ known to not alias (because you won't just end up in some random place in
+ memory). LLVM is heavily optimized to work with GEP offsets, and inbounds
+ offsets are the best of all, so it's important that we use them as much as
+ possible.
+
+ So that's what GEP's about. How can it cause us trouble?
+
+ The first problem is that we index into arrays with unsigned integers, but
+ GEP (and as a consequence `ptr::offset`) takes a *signed integer*. This means
+ that half of the seemingly valid indices into an array will overflow GEP and
+ actually go in the wrong direction! As such we must limit all allocations to
+ `isize::MAX` elements. This actually means we only need to worry about
+ byte-sized objects, because e.g. `> isize::MAX` `u16`s will truly exhaust all of
+ the system's memory. However in order to avoid subtle corner cases where someone
+ reinterprets some array of `< isize::MAX` objects as bytes, std limits all
+ allocations to `isize::MAX` bytes.
+
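As a rough sketch of what such a guard might look like, the helper below computes the byte size of a requested buffer and rejects anything that overflows or exceeds `isize::MAX` bytes. Both function names are hypothetical, and this is only an illustration of the check, not the code the chapter goes on to write.

```rust
/// Hypothetical helper: total bytes for `cap` elements of `elem_size`
/// bytes each, or `None` if the product overflows `usize` or exceeds
/// the `isize::MAX` byte limit described above.
fn checked_alloc_bytes(cap: usize, elem_size: usize) -> Option<usize> {
    let bytes = cap.checked_mul(elem_size)?;
    if bytes <= isize::MAX as usize {
        Some(bytes)
    } else {
        None
    }
}

/// Hypothetical helper showing why the limit matters: an index above
/// `isize::MAX` turns negative when reinterpreted as a signed offset.
fn as_signed_offset(index: usize) -> isize {
    index as isize
}
```

For instance, `checked_alloc_bytes(4, 8)` yields `Some(32)`, while a request for `isize::MAX as usize + 1` single bytes yields `None`, and that same index passed to `as_signed_offset` comes back negative.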
+ On all 64-bit targets that Rust currently supports we're artificially limited
+ to significantly less than all 64 bits of the address space (modern x64
+ platforms only expose 48-bit addressing), so we can rely on just running out of
+ memory first. However on 32-bit targets, particularly those with extensions to
+ use more of the address space (PAE x86 or x32), it's theoretically possible to
+ successfully allocate more than `isize::MAX` bytes of memory.
+
+ However since this is a tutorial, we're not going to be particularly optimal
+ here, and just unconditionally check, rather than use clever platform-specific
+ `cfg`s.
+
+ The other corner-case we need to worry about is *empty* allocations. There will
+ be two kinds of empty allocations we need to worry about: `cap = 0` for all T,
+ and `cap > 0` for zero-sized types.
+
+ These cases are tricky because they come
+ down to what LLVM means by "allocated". LLVM's notion of an
+ allocation is significantly more abstract than how we usually use it. Because
+ LLVM needs to work with different languages' semantics and custom allocators,
+ it can't really intimately understand allocation. Instead, the main idea behind
+ allocation is "doesn't overlap with other stuff". That is, heap allocations,
+ stack allocations, and globals don't randomly overlap. Yep, it's about alias
+ analysis. As such, Rust can technically play a bit fast and loose with the notion of
+ an allocation as long as it's *consistent*.
+
+ Getting back to the empty allocation case, there are a couple of places where
+ we want to offset by 0 as a consequence of generic code. The question is then:
+ is it consistent to do so? For zero-sized types, we have concluded that it is
+ indeed consistent to do a GEP inbounds offset by an arbitrary number of
+ elements. This is a runtime no-op because every element takes up no space,
+ and it's fine to pretend that there are infinitely many zero-sized types allocated
+ at `0x01`. No allocator will ever allocate that address, because they won't
+ allocate `0x00` and they generally allocate to some minimal alignment higher
+ than a byte.
+
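The two empty cases can be sketched as a single predicate. Note the assumptions here: `is_empty_allocation` and `placeholder` are hypothetical names, and `std::ptr::NonNull::dangling()` is swapped in for `heap::EMPTY` as a way to get a non-null, well-aligned pointer that is never dereferenced. It is an illustration of the idea, not the API this chapter builds on.

```rust
use std::mem;
use std::ptr::NonNull;

/// Hypothetical helper: true when a buffer of `cap` elements of `T`
/// occupies zero bytes, so there is no real allocation behind it.
/// Covers both cases: `cap == 0` for any `T`, and any `cap` for a
/// zero-sized `T`.
fn is_empty_allocation<T>(cap: usize) -> bool {
    cap == 0 || mem::size_of::<T>() == 0
}

/// Hypothetical helper: a non-null, well-aligned placeholder pointer to
/// store when there is no allocation. It must never be dereferenced
/// for a positive-sized `T`.
fn placeholder<T>() -> NonNull<T> {
    NonNull::dangling()
}
```

Here `is_empty_allocation::<u32>(0)` and `is_empty_allocation::<()>(10)` are both true, while `is_empty_allocation::<u32>(1)` is false and would need a real allocation.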
+ But what about positive-sized types? That one's a bit trickier. In
+ principle, you can argue that offsetting by 0 gives LLVM no information: either
+ there's an element before the address, or after it, but it can't know which.
+ However we've chosen to conservatively assume that it may do bad things. As
+ such we *will* guard against this case explicitly.
+
+ *Phew*
+
+ Ok, with all the nonsense out of the way, let's actually allocate some memory:

``` rust,ignore
fn grow(&mut self) {
    // this is all pretty delicate, so let's say it's all unsafe
    unsafe {
-       let align = mem::min_align_of::<T>();
+       // current API requires us to specify size and alignment manually.
+       let align = mem::align_of::<T>();
        let elem_size = mem::size_of::<T>();

        let (new_cap, ptr) = if self.cap == 0 {