Commit d9c7508

Clean up and improvements to the types chapter
1 parent 266d429 commit d9c7508

File tree

1 file changed

+136
-90
lines changed

src/types.md

Lines changed: 136 additions & 90 deletions
@@ -3,21 +3,42 @@
33
Every variable, item and value in a Rust program has a type. The _type_ of a
44
*value* defines the interpretation of the memory holding it.
55

6-
Built-in types and type-constructors are tightly integrated into the language,
7-
in nontrivial ways that are not possible to emulate in user-defined types.
8-
User-defined types have limited capabilities.
6+
Built-in types are tightly integrated into the language, in nontrivial ways
7+
that are not possible to emulate in user-defined types. User-defined types have
8+
limited capabilities.
99

1010
## Primitive types
1111

12-
The primitive types are the following:
12+
Some types are defined by the language, rather than as part of the standard
13+
library; these are called _primitive types_. Some of these are individual types:
1314

1415
* The boolean type `bool` with values `true` and `false`.
15-
* The machine types (integer and floating-point).
16-
* The machine-dependent integer types.
17-
* Arrays
18-
* Tuples
19-
* Slices
20-
* Function pointers
16+
* The [machine types] (integer and floating-point).
17+
* The [machine-dependent integer types].
18+
* The [textual types] `char` and `str`.
19+
20+
There are also some primitive constructs for generic types built in to the language:
21+
22+
* [Tuples]
23+
* [Arrays]
24+
* [Slices]
25+
* [Function pointers]
26+
* [References]
27+
* [Pointers]
28+
29+
[machine types]: #machine-types
30+
[machine-dependent integer types]: #machine-dependent-integer-types
31+
[textual types]: #textual-types
32+
[Tuples]: #tuple-types
33+
[Arrays]: #array-and-slice-types
34+
[Slices]: #array-and-slice-types
35+
[References]: #pointer-types
36+
[Pointers]: #raw-pointers
37+
[Function pointers]: #function-pointer-types
38+
[function]: #function-types
39+
[closure]: #closure-types
40+
41+
## Numeric types
2142

2243
### Machine types
2344

@@ -52,13 +73,13 @@ The types `char` and `str` hold textual data.
5273

5374
A value of type `char` is a [Unicode scalar value](
5475
http://www.unicode.org/glossary/#unicode_scalar_value) (i.e. a code point that
55-
is not a surrogate) represented as a 32-bit unsigned word in the 0x0000 to
56-
0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` array is effectively a UCS-4 /
57-
UTF-32 string.
76+
is not a surrogate), represented as a 32-bit unsigned word in the 0x0000 to
77+
0xD7FF or 0xE000 to 0x10FFFF range. A `[char]` is effectively a UCS-4 / UTF-32
78+
string.
5879

5980
A value of type `str` is a Unicode string, represented as an array of 8-bit
60-
unsigned bytes holding a sequence of UTF-8 code points. Since `str` is of
61-
unknown size, it is not a _first-class_ type, but can only be instantiated
81+
unsigned bytes holding a sequence of UTF-8 code points. Since `str` is a
82+
[dynamically sized type], it is not a _first-class_ type, but can only be instantiated
6283
through a pointer type, such as `&str`.
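The two textual types can be contrasted with a minimal sketch. The helper names `scalar_value` and `utf8_len` are made up for this illustration; the escapes `\u{e9}` denote the single character U+00E9.

```rust
/// Returns the Unicode scalar value of a `char` as a 32-bit unsigned integer.
/// (Hypothetical helper, not part of the standard library.)
fn scalar_value(c: char) -> u32 {
    c as u32
}

/// Returns the number of UTF-8 bytes behind a `&str`.
/// (Hypothetical helper, not part of the standard library.)
fn utf8_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // A `char` is always four bytes wide, whatever character it holds.
    assert_eq!(std::mem::size_of::<char>(), 4);
    assert_eq!(scalar_value('\u{e9}'), 0xE9);

    // Surrogate code points are not valid `char` values.
    assert!(char::from_u32(0xD800).is_none());

    // `str` is only usable behind a pointer such as `&str`. U+00E9 takes two
    // bytes in UTF-8, so the byte length differs from the number of `char`s.
    let s: &str = "n\u{e9}";
    assert_eq!(utf8_len(s), 3);
    assert_eq!(s.chars().count(), 2);
}
```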
6384

6485
## Tuple types
@@ -94,42 +115,46 @@ is often called ‘unit’ or ‘the unit type’.
94115
Rust has two different types for a list of items:
95116

96117
* `[T; N]`, an 'array'
97-
* `&[T]`, a 'slice'
118+
* `[T]`, a 'slice'
98119

99120
An array has a fixed size, and can be allocated on either the stack or the
100121
heap.
101122

102-
A slice is a 'view' into an array. It doesn't own the data it points
103-
to, it borrows it.
123+
A slice is a [dynamically sized type] representing a 'view' into an array. To
124+
use a slice type, it generally has to be used behind a pointer, for example as:
125+
126+
* `&[T]`, a 'shared slice', often just called a 'slice'. It doesn't own the
127+
data it points to; it borrows it.
128+
* `&mut [T]`, a 'mutable slice', mutably borrows the data it points to.
129+
* `Box<[T]>`, a 'boxed slice'
104130

105131
Examples:
106132

107133
```rust
108134
// A stack-allocated array
109135
let array: [i32; 3] = [1, 2, 3];
110136

111-
// A heap-allocated array
112-
let vector: Vec<i32> = vec![1, 2, 3];
137+
// A heap-allocated array, coerced to a slice
138+
let boxed_array: Box<[i32]> = Box::new([1, 2, 3]);
113139

114-
// A slice into an array
115-
let slice: &[i32] = &vector[..];
140+
// A (shared) slice into an array
141+
let slice: &[i32] = &boxed_array[..];
116142
```
117143

118-
As you can see, the `vec!` macro allows you to create a `Vec<T>` easily. The
119-
`vec!` macro is also part of the standard library, rather than the language.
120-
121144
All in-bounds elements of arrays and slices are always initialized, and access
122-
to an array or slice is always bounds-checked.
145+
to an array or slice is always bounds-checked in safe methods and operators.
146+
147+
The [`Vec<T>`] standard library type provides a heap-allocated, resizable array
148+
type.
149+
150+
[dynamically sized type]: #dynamically-sized-types
151+
[`Vec<T>`]: ../std/vec/struct.Vec.html
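As a rough illustration of the pointer forms listed above, the sketch below passes an array, a boxed slice and a `Vec<T>` to functions that only know about slices. The functions `double_all` and `sum` are hypothetical names chosen for this example.

```rust
/// Doubles every element in place through a mutable slice.
fn double_all(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

/// Sums the elements behind a shared slice.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    // `&mut [i32; 3]` coerces to `&mut [i32]`.
    let mut array = [1, 2, 3];
    double_all(&mut array);
    assert_eq!(array, [2, 4, 6]);

    // A boxed slice owns its heap allocation; `&Box<[i32]>` coerces to `&[i32]`.
    let boxed: Box<[i32]> = Box::new([1, 2, 3]);
    assert_eq!(sum(&boxed), 6);

    // `Vec<T>` is resizable and also coerces to a slice.
    let mut vector = vec![1, 2];
    vector.push(3);
    assert_eq!(sum(&vector), 6);

    // Checked access: `get` returns `None` instead of panicking when the
    // index is out of bounds.
    assert_eq!(vector.get(10), None);
}
```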
123152

124153
## Struct types
125154

126155
A `struct` *type* is a heterogeneous product of other types, called the
127156
*fields* of the type.[^structtype]
128157

129-
[^structtype]: `struct` types are analogous to `struct` types in C,
130-
the *record* types of the ML family,
131-
or the *struct* types of the Lisp family.
132-
133158
New instances of a `struct` can be constructed with a [struct
134159
expression](expressions.html#struct-expressions).
135160

@@ -151,39 +176,42 @@ fields. The one value constructed by the associated [struct
151176
expression](expressions.html#struct-expressions) is the only value that inhabits such a
152177
type.
153178

179+
[^structtype]: `struct` types are analogous to `struct` types in C,
180+
the *record* types of the ML family,
181+
or the *struct* types of the Lisp family.
182+
154183
## Enumerated types
155184

156185
An *enumerated type* is a nominal, heterogeneous disjoint union type, denoted
157186
by the name of an [`enum` item](items.html#enumerations). [^enumtype]
158187

159-
[^enumtype]: The `enum` type is analogous to a `data` constructor declaration in
160-
ML, or a *pick ADT* in Limbo.
188+
An [`enum` item](items.html#enumerations) declares both the type and a number
189+
of *variants*, each of which is independently named and has the syntax of a
190+
struct, tuple struct or unit-like struct.
161191

162-
An [`enum` item](items.html#enumerations) declares both the type and a number of *variant
163-
constructors*, each of which is independently named and takes an optional tuple
164-
of arguments.
192+
New instances of an `enum` can be constructed with an [enumeration variant
193+
expression](expressions.html#enumeration-variant-expressions).
165194

166-
New instances of an `enum` can be constructed by calling one of the variant
167-
constructors, in a [call expression](expressions.html#call-expressions).
168-
169-
Any `enum` value consumes as much memory as the largest variant constructor for
170-
its corresponding `enum` type.
195+
Any `enum` value consumes as much memory as the largest variant for its
196+
corresponding `enum` type, as well as the size needed to store a discriminant.
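The size rule can be observed with `std::mem::size_of`. This is only a sketch: the `Shape` enum is invented for the example, and the exact layout of an `enum` without a `repr` attribute is left to the compiler, so the check below is an inequality rather than a fixed number.

```rust
use std::mem::size_of;

/// Hypothetical enum with a unit-like, a struct-like and a tuple-like variant.
#[allow(dead_code)]
enum Shape {
    Point,
    Circle { radius: f64 },
    Rect(f64, f64),
}

/// Size in bytes of a `Shape` value.
fn shape_size() -> usize {
    size_of::<Shape>()
}

fn main() {
    // The largest variant, `Rect`, holds two `f64`s. Every bit pattern of an
    // `f64` is a valid value, so the discriminant needs space of its own.
    assert!(shape_size() > 2 * size_of::<f64>());

    // When a payload has unused bit patterns the discriminant may need no
    // extra space at all: `Option<&T>` is the same size as `&T`.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
}
```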
171197

172198
Enum types cannot be denoted *structurally* as types, but must be denoted by
173199
named reference to an [`enum` item](items.html#enumerations).
174200

201+
[^enumtype]: The `enum` type is analogous to a `data` constructor declaration in
202+
ML, or a *pick ADT* in Limbo.
203+
175204
## Recursive types
176205

177-
Nominal types &mdash; [enumerations](#enumerated-types) and
178-
[structs](#struct-types) &mdash; may be recursive. That is, each `enum`
179-
constructor or `struct` field may refer, directly or indirectly, to the
180-
enclosing `enum` or `struct` type itself. Such recursion has restrictions:
206+
Nominal types &mdash; [structs](#struct-types),
207+
[enumerations](#enumerated-types) and [unions](#union-types) &mdash; may be
208+
recursive. That is, each `enum` variant or `struct` or `union` field may refer,
209+
directly or indirectly, to the enclosing `enum` or `struct` type itself. Such
210+
recursion has restrictions:
181211

182212
* Recursive types must include a nominal type in the recursion
183213
(not mere [type definitions](../grammar.html#type-definitions),
184214
or other structural types such as [arrays](#array-and-slice-types) or [tuples](#tuple-types)).
185-
* A recursive `enum` item must have at least one non-recursive constructor
186-
(in order to give the recursion a basis case).
187215
* The size of a recursive type must be finite;
188216
in other words the recursive fields of the type must be [pointer types](#pointer-types).
189217
* Recursive type definitions can cross module boundaries, but not module *visibility* boundaries,
@@ -206,38 +234,48 @@ All pointers in Rust are explicit first-class values. They can be copied,
206234
stored into data structs, and returned from functions. There are two
207235
varieties of pointer in Rust:
208236

209-
* References (`&`)
210-
: These point to memory _owned by some other value_.
211-
A reference type is written `&type`,
212-
or `&'a type` when you need to specify an explicit lifetime.
213-
Copying a reference is a "shallow" operation:
214-
it involves only copying the pointer itself.
215-
Releasing a reference has no effect on the value it points to,
216-
but a reference of a temporary value will keep it alive during the scope
217-
of the reference itself.
218-
219-
* Raw pointers (`*`)
220-
: Raw pointers are pointers without safety or liveness guarantees.
221-
Raw pointers are written as `*const T` or `*mut T`,
222-
for example `*const i32` means a raw pointer to a 32-bit integer.
223-
Copying or dropping a raw pointer has no effect on the lifecycle of any
224-
other value. Dereferencing a raw pointer or converting it to any other
225-
pointer type is an [`unsafe` operation](unsafe-functions.html).
226-
Raw pointers are generally discouraged in Rust code;
227-
they exist to support interoperability with foreign code,
228-
and writing performance-critical or low-level functions.
237+
### Shared references (`&`)
238+
239+
These point to memory _owned by some other value_. When a shared reference to a
240+
value is created it prevents direct mutation of the value. [Interior
241+
mutability](#interior-mutability) provides an exception for this in certain
242+
circumstances. As the name suggests, any number of shared references to a value
243+
may exist. A shared reference type is written `&type`, or `&'a type` when you
244+
need to specify an explicit lifetime. Copying a reference is a "shallow"
245+
operation: it involves only copying the pointer itself, that is, pointers are
246+
`Copy`. Releasing a reference has no effect on the value it points to, but
247+
referencing of a [temporary value](expressions.html#temporary-lifetimes) will
248+
keep it alive during the scope of the reference itself.
249+
250+
### Raw pointers (`*const` and `*mut`)
251+
252+
Raw pointers are pointers without safety or liveness guarantees. Raw pointers
253+
are written as `*const T` or `*mut T`, for example `*const i32` means a raw
254+
pointer to a 32-bit integer. Copying or dropping a raw pointer has no effect on
255+
the lifecycle of any other value. Dereferencing a raw pointer is an [`unsafe`
256+
operation](unsafe-functions.html); this can also be used to convert a raw
257+
pointer to a reference by reborrowing it (`&*` or `&mut *`). Raw pointers are
258+
generally discouraged in Rust code; they exist to support interoperability with
259+
foreign code, and writing performance-critical or low-level functions.
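A minimal sketch of the reborrowing described above follows. The function names are hypothetical; the raw pointers are derived from live references, which is the assumption that makes the `unsafe` blocks sound here.

```rust
/// Reads a value through a raw pointer by reborrowing it as a shared reference.
fn read_through_raw(value: &i32) -> i32 {
    // Creating a raw pointer from a reference is safe.
    let raw: *const i32 = value;
    // SAFETY: `raw` was just derived from a live reference, so it is valid,
    // aligned and points to an initialized `i32`.
    let reborrowed: &i32 = unsafe { &*raw };
    *reborrowed
}

/// Increments a value through a raw pointer reborrowed as a mutable reference.
fn increment_through_raw(value: &mut i32) {
    let raw: *mut i32 = value;
    // SAFETY: `raw` comes from a unique live reference that is not used again
    // while `reborrowed` exists.
    let reborrowed: &mut i32 = unsafe { &mut *raw };
    *reborrowed += 1;
}

fn main() {
    let mut x = 41;
    increment_through_raw(&mut x);
    assert_eq!(read_through_raw(&x), 42);
}
```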
260+
261+
When comparing pointers they are compared by their address, rather than by what
262+
they point to. When comparing pointers to [dynamically sized
263+
types](#dynamically-sized-types) they also have their additional data compared.
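The comparison rule can be sketched with `std::ptr::eq` and raw slice pointers. The helper `same_address` is a name chosen for this example; for slices, the extra data that takes part in the comparison is the length.

```rust
/// Returns `true` when both references point at the same address.
/// (Hypothetical helper wrapping `std::ptr::eq`.)
fn same_address(a: &i32, b: &i32) -> bool {
    std::ptr::eq(a, b)
}

fn main() {
    let data = [5, 5, 5];

    // Equal values at different addresses are different pointers.
    assert_eq!(data[0], data[1]);
    assert!(same_address(&data[0], &data[0]));
    assert!(!same_address(&data[0], &data[1]));

    // Pointers to slices start at the same address here, but their lengths
    // differ, so they compare unequal.
    let whole: *const [i32] = &data[..];
    let part: *const [i32] = &data[..2];
    assert!(whole != part);
    assert!(whole == &data[..] as *const [i32]);
}
```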
264+
265+
### Smart pointers
229266

230267
The standard library contains additional 'smart pointer' types beyond references
231268
and raw pointers.
232269

233270
## Function item types
234271

235-
When referred to, a function item yields a zero-sized value of its
236-
_function item type_. That type explicitly identifies the function - its name,
237-
its type arguments, and its early-bound lifetime arguments (but not its
238-
late-bound lifetime arguments, which are only assigned when the function
239-
is called) - so the value does not need to contain an actual function pointer,
240-
and no indirection is needed when the function is called.
272+
When referred to, a function item, or the constructor of a tuple-like struct or
273+
enum variant, yields a zero-sized value of its _function item type_. That type
274+
explicitly identifies the function - its name, its type arguments, and its
275+
early-bound lifetime arguments (but not its late-bound lifetime arguments,
276+
which are only assigned when the function is called) - so the value does not
277+
need to contain an actual function pointer, and no indirection is needed when
278+
the function is called.
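The zero-sized claim can be checked with `std::mem::size_of_val`. In this sketch `foo`, `Meters` and the three helper functions are hypothetical names; the comparison with a function pointer is included only for contrast.

```rust
use std::mem::{size_of, size_of_val};

fn foo<T>() {}

/// A tuple-like struct; its constructor `Meters` also has a function item type.
struct Meters(f64);

/// Size of a value of the function item type of `foo::<i32>`.
fn item_size() -> usize {
    let item = foo::<i32>;
    size_of_val(&item)
}

/// Size of a value of the function item type of the `Meters` constructor.
fn ctor_size() -> usize {
    let ctor = Meters;
    // The constructor is callable like any other function.
    assert_eq!(ctor(1.5).0, 1.5);
    size_of_val(&ctor)
}

/// Size of a function pointer obtained by coercing the function item.
fn pointer_size() -> usize {
    let pointer: fn() = foo::<i32>;
    size_of_val(&pointer)
}

fn main() {
    // Function item values carry no data: the type alone names the function.
    assert_eq!(item_size(), 0);
    assert_eq!(ctor_size(), 0);
    // A function pointer has to store an address.
    assert_eq!(pointer_size(), size_of::<usize>());
}
```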
241279

242280
There is currently no syntax that directly refers to a function item type, but
243281
the compiler will display the type as something like `fn() {foo::<u32>}` in error
@@ -247,7 +285,7 @@ Because the function item type explicitly identifies the function, the item
247285
types of different functions - different items, or the same item with different
248286
generics - are distinct, and mixing them will create a type error:
249287

250-
```rust,ignore
288+
```rust,compile_fail,E0308
251289
fn foo<T>() { }
252290
let x = &mut foo::<i32>;
253291
*x = foo::<u32>; //~ ERROR mismatched types
@@ -278,16 +316,16 @@ let foo_ptr_2 = if want_i32 {
278316

279317
## Function pointer types
280318

281-
Function pointer types, created using the `fn` type constructor, refer
282-
to a function whose identity is not necessarily known at compile-time. They
283-
can be created via a coercion from both [function items](#function-item-types)
284-
and non-capturing [closures](#closure-types).
319+
Function pointer types, written using the `fn` keyword, refer to a function
320+
whose identity is not necessarily known at compile-time. They can be created
321+
via a coercion from both [function items](#function-item-types) and
322+
non-capturing [closures](#closure-types).
285323

286324
A function pointer type consists of a possibly-empty set of function-type
287325
modifiers (such as `unsafe` or `extern`), a sequence of input types and an
288326
output type.
289327

290-
An example of a `fn` type:
328+
An example where `Binop` is defined as a function pointer type:
291329

292330
```rust
293331
fn add(x: i32, y: i32) -> i32 {
@@ -319,10 +357,12 @@ more of the closure traits:
319357
`FnOnce` (i.e. anything implementing `FnMut` also implements `FnOnce`).
320358

321359
* `Fn`
322-
: The closure can be called multiple times through a shared reference.
323-
A closure called as `Fn` can neither move out from nor mutate values
324-
from its environment, but read-only access to such values is allowed.
325-
`Fn` inherits from `FnMut`, which itself inherits from `FnOnce`.
360+
: The closure can be called multiple times through a shared reference. A
361+
closure called as `Fn` can neither move out from nor mutate captured
362+
variables, but read-only access to such values is allowed. Using `move` to
363+
capture variables by value is allowed so long as they aren't mutated or
364+
moved in the body of the closure. `Fn` inherits from `FnMut`, which itself
365+
inherits from `FnOnce`.
326366

327367
Closures that don't use anything from their environment ("non capturing closures")
328368
can be coerced to function pointers (`fn`) with the matching signature.
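Both points can be sketched briefly: a `move` closure that only reads its captured variable still implements `Fn`, and a non-capturing closure coerces to a function pointer. The function `call_twice` is a hypothetical helper for this example.

```rust
/// Calls a closure twice through the `Fn` trait and adds the results.
fn call_twice<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn main() {
    let name = String::from("rust");

    // `move` captures `name` by value, but the body only reads it, so the
    // closure can be called repeatedly and implements `Fn`.
    let len = move || name.len();
    assert_eq!(call_twice(len), 8);

    // A closure that captures nothing coerces to a plain function pointer.
    let add: fn(i32, i32) -> i32 = |x, y| x + y;
    assert_eq!(add(5, 7), 12);
}
```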
@@ -340,8 +380,9 @@ x = bo(5,7);
340380

341381
## Trait objects
342382

343-
In Rust, a type like `&SomeTrait` or `Box<SomeTrait>` is called a _trait object_.
344-
Each instance of a trait object includes:
383+
In Rust, trait names also refer to [dynamically sized types] called _trait
384+
objects_. Like all DSTs, trait objects are used behind some kind of pointer:
385+
`&SomeTrait` or `Box<SomeTrait>`. Each instance of a trait object includes:
345386

346387
- a pointer to an instance of a type `T` that implements `SomeTrait`
347388
- a _virtual method table_, often just called a _vtable_, which contains, for
@@ -354,11 +395,18 @@ function pointer is loaded from the trait object vtable and invoked indirectly.
354395
The actual implementation for each vtable entry can vary on an object-by-object
355396
basis.
356397

357-
Note that for a trait object to be instantiated, the trait must be
358-
_object-safe_. Object safety rules are defined in [RFC 255].
398+
Note that trait object types only exist for _object-safe_ traits ([RFC 255]):
359399

360400
[RFC 255]: https://github.com/rust-lang/rfcs/blob/master/text/0255-object-safety.md
361401

402+
* It must not require `Self: Sized`.
403+
* All associated functions must either have a `where Self: Sized` bound or:
404+
    * Not have any type parameters (lifetime parameters are allowed).
405+
    * Be a method: its first parameter must be called `self`, with type
406+
      `Self`, `&Self`, `&mut Self` or `Box<Self>`.
407+
    * Only use `Self` in the type of the receiver.
408+
* It must not have any associated constants.
408+
* It must not have any associated constants.
409+
362410
Given a pointer-typed expression `E` of type `&T` or `Box<T>`, where `T`
363411
implements trait `R`, casting `E` to the corresponding pointer type `&R` or
364412
`Box<R>` results in a value of the _trait object_ `R`. This result is
@@ -434,7 +482,7 @@ default bound for trait objects of that type. For example, `std::cell::Ref<'a,
434482
T>` contains a `T: 'a` bound, therefore trait objects of type `Ref<'a,
435483
SomeTrait>` are the same as `Ref<'a, (SomeTrait + 'a)>`.
436484

437-
### Type parameters
485+
## Type parameters
438486

439487
Within the body of an item that has type parameter declarations, the names of
440488
its type parameters are types:
@@ -487,6 +535,4 @@ impl Printable for String {
487535
}
488536
```
489537

490-
The notation `&self` is a shorthand for `self: &Self`. In this case,
491-
in the impl, `Self` refers to the value of type `String` that is the
492-
receiver for a call to the method `make_string`.
538+
The notation `&self` is a shorthand for `self: &Self`.
