Tokens #240

Merged
merged 3 commits into from
Feb 12, 2018
16 changes: 8 additions & 8 deletions src/notation.md
Expand Up @@ -10,15 +10,15 @@ productions.

## String table productions

Some rules in the grammar — notably [unary operators], [binary operators],
and [keywords] — are given in a simplified form: as a listing of a table
of unquoted, printable whitespace-separated strings. These cases form a subset
of the rules regarding the [token][tokens] rule, and are assumed to be the
result of a lexical-analysis phase feeding the parser, driven by a
<abbr title="Deterministic Finite Automaton">DFA</abbr>, operating over the
disjunction of all such string table entries.
Some rules in the grammar &mdash; notably [unary operators], [binary
operators], and [keywords] &mdash; are given in a simplified form: as a listing
of printable strings. These cases form a subset of the rules regarding the
[token][tokens] rule, and are assumed to be the result of a lexical-analysis
phase feeding the parser, driven by a <abbr title="Deterministic Finite
Automaton">DFA</abbr>, operating over the disjunction of all such string table
entries.

When such a string enclosed in double-quotes (`"`) occurs inside the grammar,
When such a string in `monospace` font occurs inside the grammar,
it is an implicit reference to a single member of such a string table
production. See [tokens] for more information.
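
As a rough illustration of matching against the disjunction of string table entries, the sketch below picks the longest entry that prefixes the input. It is a linear scan rather than a DFA, the longest-match policy is an assumption not stated in the text, and `TABLE` and `longest_entry` are hypothetical names with an abbreviated set of entries.

```rust
/// Hypothetical, abbreviated string table; the real tables are larger.
const TABLE: &[&str] = &["=", "==", "<", "<=", "<<", "<<=", "fn", "let"];

/// Returns the longest table entry that is a prefix of `input`, if any.
/// Assumption: longest match wins. This is a linear scan, not a DFA.
fn longest_entry(input: &str) -> Option<&'static str> {
    TABLE
        .iter()
        .copied()
        // Keep only the entries that the input starts with.
        .filter(|entry| input.starts_with(*entry))
        // Among those, prefer the longest one.
        .max_by_key(|entry| entry.len())
}
```

Under these assumptions, `longest_entry("<<= 1")` selects `<<=` rather than `<` or `<<`, and input that starts with no table entry yields `None`.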

Expand Down
20 changes: 15 additions & 5 deletions src/tokens.md
Expand Up @@ -3,7 +3,7 @@
Tokens are primitive productions in the grammar defined by regular
(non-recursive) languages. "Simple" tokens are given in [string table
production] form, and occur in the rest of the
grammar as double-quoted strings. Other tokens have exact rules given.
grammar in `monospace` font. Other tokens have exact rules given.

[string table production]: notation.html#string-table-productions

Expand Down Expand Up @@ -227,8 +227,7 @@ preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a byte string literal can be a _raw byte string literal_, defined
below. A byte string literal of length `n` is equivalent to a `&'static [u8; n]` borrowed fixed-sized array
of unsigned 8-bit integers.
below. The type of a byte string literal of length `n` is `&'static [u8; n]`.
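
The length in that type can be checked with a small sketch. The function names here are hypothetical and exist only for illustration; the return types compile only because each literal has exactly the stated number of bytes.

```rust
/// Returns a byte string literal of five bytes.
fn hello_bytes() -> &'static [u8; 5] {
    // `b"hello"` has five bytes, so its type is `&'static [u8; 5]`.
    b"hello"
}

/// An escaped double-quote counts as a single byte.
fn quoted_bytes() -> &'static [u8; 3] {
    // The literal holds `a`, `"` (U+0022) and `b`, so `n` is 3.
    b"a\"b"
}
```

Changing either array length in the signatures should make the compiler reject the function, since the length is part of the literal's type.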

Some additional _escapes_ are available in either byte or non-raw byte string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
Expand Down Expand Up @@ -295,10 +294,14 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
> INTEGER_LITERAL :
> &nbsp;&nbsp; ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )
> INTEGER_SUFFIX<sup>?</sup>
>
>
> DEC_LITERAL :
> &nbsp;&nbsp; DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup>
>
> TUPLE_INDEX :
> &nbsp;&nbsp; &nbsp;&nbsp; `0`
> &nbsp;&nbsp; | NON_ZERO_DEC_DIGIT DEC_DIGIT<sup>\*</sup>
>
> BIN_LITERAL :
> &nbsp;&nbsp; `0b` (BIN_DIGIT|`_`)<sup>\*</sup> BIN_DIGIT (BIN_DIGIT|`_`)<sup>\*</sup>
>
Expand All @@ -314,19 +317,23 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
>
> DEC_DIGIT : [`0`-`9`]
>
> NON_ZERO_DEC_DIGIT : [`1`-`9`]
>
> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F`]
>
> INTEGER_SUFFIX :
> &nbsp;&nbsp; &nbsp;&nbsp; `u8` | `u16` | `u32` | `u64` | `usize`
> &nbsp;&nbsp; | `i8` | `i16` | `i32` | `i64` | `isize`
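
The difference between the DEC_LITERAL and TUPLE_INDEX rules can be sketched with two small recognizers. This is only an illustration of the grammar above, not how any compiler implements it; `is_dec_literal` and `is_tuple_index` are hypothetical helper names.

```rust
/// DEC_LITERAL: a decimal digit followed by any mix of digits and `_`.
fn is_dec_literal(s: &str) -> bool {
    match s.as_bytes() {
        [b'0'..=b'9', rest @ ..] => rest
            .iter()
            .all(|b| b.is_ascii_digit() || *b == b'_'),
        _ => false,
    }
}

/// TUPLE_INDEX: either `0`, or a non-zero digit followed by digits.
fn is_tuple_index(s: &str) -> bool {
    match s.as_bytes() {
        // `0` on its own is the only form allowed to start with zero.
        [b'0'] => true,
        // NON_ZERO_DEC_DIGIT followed by zero or more DEC_DIGITs.
        [b'1'..=b'9', rest @ ..] => rest.iter().all(|b| b.is_ascii_digit()),
        _ => false,
    }
}
```

For instance, `01` and `1_0` satisfy the DEC_LITERAL sketch but not the TUPLE_INDEX one, which allows neither leading zeros nor underscores.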

<!-- FIXME: separate the DECIMAL_LITERAL with no prefix or suffix (used on tuple indexing and float_literal -->
<!-- FIXME: u128 and i128 -->

An _integer literal_ has one of four forms:

* A _decimal literal_ starts with a *decimal digit* and continues with any
mixture of *decimal digits* and _underscores_.
* A _tuple index_ is either `0`, or starts with a *non-zero decimal digit* and
continues with zero or more decimal digits. Tuple indexes are used to refer
to the fields of [tuples], [tuple structs] and [tuple variants].
* A _hex literal_ starts with the character sequence `U+0030` `U+0078`
(`0x`) and continues as any mixture (with at least one digit) of hex digits
and underscores.
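
To illustrate where tuple indexes appear, here is a minimal sketch covering a tuple, a tuple struct and a tuple variant. The types `Point` and `Shape` and both functions are hypothetical examples, not part of the reference text.

```rust
struct Point(i32, i32);

enum Shape {
    Segment(Point, Point),
}

/// Reads fields of a tuple and a tuple struct through tuple indexes.
fn sum_fields() -> i32 {
    let pair = (1, 2);
    let point = Point(3, 4);
    pair.0 + pair.1 + point.0 + point.1
}

/// Names field `0` of a tuple variant inside a struct-style pattern.
fn segment_start_x(shape: &Shape) -> i32 {
    match shape {
        // `0: start` binds the first field; `..` ignores the second.
        Shape::Segment { 0: start, .. } => start.0,
    }
}
```

Fields of a tuple variant are not reachable with `.0` on the enum value itself, so the sketch uses a pattern to name the field by its index.
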
Expand Down Expand Up @@ -501,3 +508,6 @@ They are catalogued in [the Symbols section][symbols] of the Grammar document.
[symbols]: ../grammar.html#symbols
[keywords]: keywords.html
[identifier]: identifiers.html
[tuples]: types.html#tuple-types
[tuple structs]: items/structs.html
[tuple variants]: items/enumerations.html