Skip to content

Collect literal examples together in the reference #18441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 21, 2014
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions src/doc/reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,52 @@ reserved for future extension, that is, the above gives the lexical
grammar, but a Rust parser will reject everything but the 12 special
cases mentioned in [Number literals](#number-literals) below.

#### Examples

##### Characters and strings

| | Example | Number of `#` pairs allowed | Available characters | Escapes | Equivalent to |
|---|---------|-----------------------------|----------------------|---------|---------------|
| [Character](#character-literals) | `'H'` | `N/A` | All unicode | `\'` & [Byte escapes](#byte-escapes) & [Unicode escapes](#unicode-escapes) | `N/A` |
| [String](#string-literals) | `"hello"` | `N/A` | All unicode | `\"` & [Byte escapes](#byte-escapes) & [Unicode escapes](#unicode-escapes) | `N/A` |
| [Raw](#raw-string-literals) | `r##"hello"##` | `0...` | All unicode | `N/A` | `N/A` |
| [Byte](#byte-literals) | `b'H'` | `N/A` | All ASCII | `\'` & [Byte escapes](#byte-escapes) | `u8` |
| [Byte string](#byte-string-literals) | `b"hello"` | `N/A` | All ASCII | `\"` & [Byte escapes](#byte-escapes) | `&'static [u8]` |
| [Raw byte string](#raw-byte-string-literals) | `br##"hello"##` | `0...` | All ASCII | `N/A` | `&'static [u8]` (unsure...not stated) |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this last column, if it needs updating. Other than that though, it's good.


##### Byte escapes

| | Name |
|---|------|
| `\x7F` | 8-bit character code (exactly 2 digits) |
| `\n` | Newline |
| `\r` | Carriage return |
| `\t` | Tab |
| `\\` | Backslash |

##### Unicode escapes
| | Name |
|---|------|
| `\u7FFF` | 16-bit character code (exactly 4 digits) |
| `\U7EEEFFFF` | 32-bit character code (exactly 8 digits) |

##### Numbers

| [Number literals](#number-literals)`*` | Example | Exponentiation | Suffixes |
|----------------------------------------|---------|----------------|----------|
| Decimal integer | `98_222i` | `N/A` | Integer suffixes |
| Hex integer | `0xffi` | `N/A` | Integer suffixes |
| Octal integer | `0o77i` | `N/A` | Integer suffixes |
| Binary integer | `0b1111_0000i` | `N/A` | Integer suffixes |
| Floating-point | `123.0E+77f64` | `Optional` | Floating-point suffixes |

`*` All number literals allow `_` as a visual separator: `1_234.0E+18f64`

##### Suffixes
| Integer | Floating-point |
|---------|----------------|
| `i` (`int`), `u` (`uint`), `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64` | `f32`, `f64` |

#### Character and string literals

```{.ebnf .gram}
Expand Down Expand Up @@ -253,15 +299,21 @@ nonzero_dec: '1' | '2' | '3' | '4'
| '5' | '6' | '7' | '8' | '9' ;
```

##### Character literals

A _character literal_ is a single Unicode character enclosed within two
`U+0027` (single-quote) characters, with the exception of `U+0027` itself,
which must be _escaped_ by a preceding U+005C character (`\`).

##### String literals

A _string literal_ is a sequence of any Unicode characters enclosed within two
`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`), or a _raw
string literal_.

##### Character escapes

Some additional _escapes_ are available in either character or non-raw string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:
Expand All @@ -281,6 +333,8 @@ following forms:
* The _backslash escape_ is the character `U+005C` (`\`) which must be
escaped in order to denote *itself*.

##### Raw string literals

Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ is not defined in the
Expand Down Expand Up @@ -322,12 +376,16 @@ raw_byte_string : '"' raw_byte_string_body '"' | '#' raw_byte_string '#' ;

```

##### Byte literals

A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F`
range) enclosed within two `U+0027` (single-quote) characters, with the
exception of `U+0027` itself, which must be _escaped_ by a preceding U+005C
character (`\`), or a single _escape_. It is equivalent to a `u8` unsigned
8-bit integer _number literal_.

##### Byte string literals

A _byte string literal_ is a sequence of ASCII characters and _escapes_
enclosed within two `U+0022` (double-quote) characters, with the exception of
`U+0022` itself, which must be _escaped_ by a preceding `U+005C` character
Expand All @@ -347,6 +405,8 @@ following forms:
* The _backslash escape_ is the character `U+005C` (`\`) which must be
escaped in order to denote its ASCII encoding `0x5C`.

##### Raw byte string literals

Raw byte string literals do not process any escapes. They start with the
character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
Expand Down
6 changes: 3 additions & 3 deletions src/libcollections/str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -42,9 +42,9 @@
//! # Representation
//!
//! Rust's string type, `str`, is a sequence of Unicode scalar values encoded as a
//! stream of UTF-8 bytes. All strings are guaranteed to be validly encoded UTF-8
//! sequences. Additionally, strings are not null-terminated and can thus contain
//! null bytes.
//! stream of UTF-8 bytes. All [strings](../../reference.html#literals) are
//! guaranteed to be validly encoded UTF-8 sequences. Additionally, strings are
//! not null-terminated and can thus contain null bytes.
//!
//! The actual representation of strings have direct mappings to slices: `&str`
//! is the same as `&[u8]`.
Expand Down