| 1 | +% Example: Implementing Vec |
| 2 | + |
| 3 | +To bring everything together, we're going to write `std::vec::Vec` from scratch. |
| 4 | +Because all the best tools for writing unsafe code are unstable, this |
| 5 | +project will only work on nightly (as of Rust 1.2.0). |
| 6 | + |
| 7 | +First off, we need to come up with the struct layout. Naively we want this |
| 8 | +design: |
| 9 | + |
| 10 | +``` |
| 11 | +struct Vec<T> { |
| 12 | + ptr: *mut T, |
| 13 | + cap: usize, |
| 14 | + len: usize, |
| 15 | +} |
| 16 | +``` |
| 17 | + |
| 18 | +And indeed this would compile. Unfortunately, it would be incorrect. The compiler |
| 19 | +will give us too strict variance, so e.g. an `&Vec<&'static str>` couldn't be used |
| 20 | +where an `&Vec<&'a str>` was expected. More importantly, it will give incorrect |
| 21 | +ownership information to dropck, as it will conservatively assume we don't own |
| 22 | +any values of type `T`. See |
| 23 | +[the chapter on ownership and lifetimes](lifetimes.html) for details. |
| 24 | + |
| 25 | +As we saw in the lifetimes chapter, we should use `Unique<T>` in place of `*mut T` |
| 26 | +when we have a raw pointer to an allocation we own: |
| 27 | + |
| 28 | + |
| 29 | +``` |
| 30 | +#![feature(unique)] |
| 31 | + |
| 32 | +use std::ptr::Unique; |
| 33 | + |
| 34 | +pub struct Vec<T> { |
| 35 | + ptr: Unique<T>, |
| 36 | + cap: usize, |
| 37 | + len: usize, |
| 38 | +} |
| 39 | +``` |
| 40 | + |
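The variance half of this problem can be reproduced on stable Rust. The sketch below is only an illustration under stated assumptions, not this chapter's code: `RawVec`, `OwnedVec`, and `shorten` are hypothetical names, and the stable `std::ptr::NonNull` plus `PhantomData<T>` stand in for the unstable `Unique<T>`.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

// Hypothetical: the naive layout. `*mut T` makes this type invariant in `T`.
#[allow(dead_code)]
struct RawVec<T> {
    ptr: *mut T,
}

// Hypothetical stand-in for the `Unique<T>` layout. `NonNull<T>` is covariant
// in `T`, and `PhantomData<T>` marks that we logically own values of type `T`.
struct OwnedVec<T> {
    ptr: NonNull<T>,
    _owns: PhantomData<T>,
}

// Compiles only because `OwnedVec<T>` is covariant in `T`.
fn shorten<'a>(v: &'a OwnedVec<&'static str>) -> &'a OwnedVec<&'a str> {
    v
}

// The same signature over `RawVec` is rejected with a lifetime error:
// fn shorten_raw<'a>(v: &'a RawVec<&'static str>) -> &'a RawVec<&'a str> { v }

fn main() {
    let v: OwnedVec<&'static str> = OwnedVec {
        ptr: NonNull::dangling(),
        _owns: PhantomData,
    };
    let w = shorten(&v);
    // Same pointer, viewed at a shorter lifetime.
    assert!(w.ptr == v.ptr);
}
```

Uncommenting `shorten_raw` should make the build fail, which is the "too strict variance" described above.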
| 41 | +As a recap, `Unique` is a wrapper around a raw pointer that declares that: |
| 42 | + |
| 43 | +* We own at least one value of type `T` |
| 44 | +* We are Send/Sync iff `T` is Send/Sync |
| 45 | +* Our pointer is never null (and therefore `Option<Vec<T>>` is null-pointer-optimized) |
| 46 | + |
| 47 | +That last point is subtle. First, it makes `Unique::new` unsafe to call, because |
| 48 | +putting `null` inside of it is Undefined Behaviour. It also throws a |
| 49 | +wrench in an important feature of Vec (and indeed all of the std collections): |
| 50 | +an empty Vec doesn't actually allocate at all. So if we can't allocate, |
| 51 | +but also can't put a null pointer in `ptr`, what do we do in |
| 52 | +`Vec::new`? Well, we just put some other garbage in there! |
| 53 | + |
| 54 | +This is perfectly fine because we already have `cap == 0` as our sentinel for no |
| 55 | +allocation. Almost no code needs to handle it specially because |
| 56 | +we usually need to check if `cap > len` or `len > 0` anyway. The traditional |
| 57 | +Rust value to put here is `0x01`. The standard library actually exposes this |
| 58 | +as `std::rt::heap::EMPTY`. There are quite a few places where we'll want to use |
| 59 | +`heap::EMPTY` because there's no real allocation to talk about but `null` would |
| 60 | +make the compiler angry. |
| 61 | + |
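Both the non-null guarantee and the "garbage but non-null" sentinel can be observed on stable Rust. This is a minimal sketch under assumptions: `MyVec` is a hypothetical name, `std::ptr::NonNull` stands in for `Unique`, and `NonNull::dangling()` plays the role that `heap::EMPTY` plays in this chapter.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

// Hypothetical stand-in for the chapter's Vec, using the stable `NonNull`
// instead of the unstable `Unique`.
struct MyVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

impl<T> MyVec<T> {
    fn new() -> Self {
        // `dangling()` returns a non-null, well-aligned pointer that must
        // never be dereferenced; `cap == 0` remains the "no allocation" flag.
        MyVec { ptr: NonNull::dangling(), cap: 0, len: 0 }
    }
}

fn main() {
    let v: MyVec<u32> = MyVec::new();
    assert!(!v.ptr.as_ptr().is_null());
    assert_eq!((v.cap, v.len), (0, 0));

    // The null bit pattern is free to represent `None`, so the `Option`
    // wrapper costs no extra space.
    assert_eq!(size_of::<Option<NonNull<u32>>>(), size_of::<*mut u32>());
    assert_eq!(size_of::<Option<MyVec<u32>>>(), size_of::<MyVec<u32>>());
}
```

No allocation happens anywhere in this program, so an empty `MyVec` is as cheap as three machine words.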
| 62 | +All of the `heap` API is totally unstable under the `alloc` feature, though. |
| 63 | +We could trivially define `heap::EMPTY` ourselves, but we'll want the rest of |
| 64 | +the `heap` API anyway, so let's just get that dependency over with. |
| 65 | + |
| 66 | +So: |
| 67 | + |
| 68 | +```rust |
| 69 | +#![feature(alloc)] |
| 70 | + |
| 71 | +use std::rt::heap; |
| 72 | +use std::mem; |
| 73 | + |
| 74 | +impl<T> Vec<T> { |
| 75 | + fn new() -> Self { |
| 76 | + assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs"); |
| 77 | + unsafe { |
| 78 | + // need to cast EMPTY to the actual ptr type we want, let |
| 79 | + // inference handle it. |
| 80 | + Vec { ptr: Unique::new(heap::EMPTY as *mut _), len: 0, cap: 0 } |
| 81 | + } |
| 82 | + } |
| 83 | +} |
| 84 | +``` |
| 85 | + |
| 86 | +I slipped in that assert there because zero-sized types will require some |
| 87 | +special handling throughout our code, and I want to defer the issue for now. |
| 88 | +Without this assert, some of our early drafts will do some Very Bad Things. |
| 89 | + |
| 90 | +Next we need to figure out what to actually do when we *do* want space. For that, |
| 91 | +we'll need to use the rest of the heap APIs. These basically allow us to |
| 92 | +talk directly to Rust's instance of jemalloc. |
| 93 | + |
| 94 | +We'll also need a way to handle out-of-memory conditions. The standard library |
| 95 | +calls the `abort` intrinsic, but calling intrinsics from normal Rust code is a |
| 96 | +pretty bad idea. Unfortunately, the `abort` exposed by the standard library |
| 97 | +allocates. Not something we want to do during `oom`! Instead, we'll call |
| 98 | +`std::process::exit`. |
| 99 | + |
| 100 | +```rust |
| 101 | +fn oom() { |
| 102 | + ::std::process::exit(-9999); |
| 103 | +} |
| 104 | +``` |
| 105 | + |
| 106 | +Okay, now we can write growing: |
| 107 | + |
| 108 | +```rust |
| 109 | +fn grow(&mut self) { |
| 110 | + unsafe { |
| 111 | + let align = mem::min_align_of::<T>(); |
| 112 | + let elem_size = mem::size_of::<T>(); |
| 113 | + |
| 114 | + let (new_cap, ptr) = if self.cap == 0 { |
| 115 | + let ptr = heap::allocate(elem_size, align); |
| 116 | + (1, ptr) |
| 117 | + } else { |
| 118 | + let new_cap = 2 * self.cap; |
| 119 | + let ptr = heap::reallocate(*self.ptr as *mut _, |
| 120 | + self.cap * elem_size, |
| 121 | + new_cap * elem_size, |
| 122 | + align); |
| 123 | + (new_cap, ptr) |
| 124 | + }; |
| 125 | + |
| 126 | + // If allocate or reallocate fail, we'll get `null` back |
| 127 | + if ptr.is_null() { oom() } |
| 128 | + |
| 129 | + self.ptr = Unique::new(ptr as *mut _); |
| 130 | + self.cap = new_cap; |
| 131 | + } |
| 132 | +} |
| 133 | +``` |
| 134 | + |
| 135 | +There's nothing particularly tricky in here: if we're totally empty, we need |
| 136 | +to do a fresh allocation. Otherwise, we need to reallocate the current pointer. |
| 137 | +There is a subtle bug here, though: the multiplications by `2` and by `elem_size` can overflow. |
| 138 | + |
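For comparison, here is one way the same growth strategy might look on stable Rust with the overflow guarded. It is a sketch under assumptions, not this chapter's implementation: `MyVec` is a hypothetical name, `std::alloc` replaces the unstable `heap` API, `NonNull` replaces `Unique`, and `handle_alloc_error` replaces the hand-written `oom`.

```rust
use std::alloc::{self, Layout};
use std::mem;
use std::ptr::NonNull;

// Hypothetical stand-in for the chapter's Vec.
struct MyVec<T> {
    ptr: NonNull<T>,
    cap: usize,
    len: usize,
}

impl<T> MyVec<T> {
    fn new() -> Self {
        assert!(mem::size_of::<T>() != 0, "We're not ready to handle ZSTs");
        MyVec { ptr: NonNull::dangling(), cap: 0, len: 0 }
    }

    fn grow(&mut self) {
        // Guard the doubling explicitly instead of writing `2 * self.cap`.
        let new_cap = if self.cap == 0 {
            1
        } else {
            self.cap.checked_mul(2).expect("capacity overflow")
        };
        // `Layout::array` returns an error if `new_cap * size_of::<T>()`
        // overflows or the total size is too large for an allocation.
        let new_layout = Layout::array::<T>(new_cap).expect("capacity overflow");

        let new_ptr = unsafe {
            if self.cap == 0 {
                alloc::alloc(new_layout)
            } else {
                let old_layout = Layout::array::<T>(self.cap).unwrap();
                alloc::realloc(self.ptr.as_ptr() as *mut u8, old_layout, new_layout.size())
            }
        };

        // A null return signals allocation failure.
        self.ptr = match NonNull::new(new_ptr as *mut T) {
            Some(p) => p,
            None => alloc::handle_alloc_error(new_layout),
        };
        self.cap = new_cap;
    }
}

impl<T> Drop for MyVec<T> {
    fn drop(&mut self) {
        // Only the buffer is freed; elements are not dropped because this
        // sketch has no push/pop yet.
        if self.cap != 0 {
            let layout = Layout::array::<T>(self.cap).unwrap();
            unsafe { alloc::dealloc(self.ptr.as_ptr() as *mut u8, layout) }
        }
    }
}

fn main() {
    let mut v: MyVec<u64> = MyVec::new();
    for _ in 0..3 {
        v.grow();
    }
    // Capacity goes 0 -> 1 -> 2 -> 4.
    assert_eq!((v.cap, v.len), (4, 0));
}
```

Panicking on overflow is a design choice made here for brevity; the point is only that both multiplications are checked before any allocator call.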
| 139 | +TODO: rest of this |
| 140 | + |
| 141 | + |