Commit 113f9ea

Add integer literal's grammar
1 parent d8dfa75 commit 113f9ea

1 file changed

+70
-4
lines changed

src/tokens.md

Lines changed: 70 additions & 4 deletions
@@ -216,16 +216,48 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
216216

217217
#### Integer literals
218218

219+
> **<sup>Lexer</sup>**
220+
> INTEGER_LITERAL :
221+
> &nbsp;&nbsp; ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )
222+
> INTEGER_SUFFIX<sup>?</sup>
223+
>
224+
> DEC_LITERAL :
225+
> &nbsp;&nbsp; DEC_DIGIT (DEC_DIGIT|`_`)<sup>\*</sup>
226+
>
227+
> BIN_LITERAL :
228+
> &nbsp;&nbsp; `0b` (BIN_DIGIT|`_`)<sup>\*</sup> BIN_DIGIT (BIN_DIGIT|`_`)<sup>\*</sup>
229+
>
230+
> OCT_LITERAL :
231+
> &nbsp;&nbsp; `0o` (OCT_DIGIT|`_`)<sup>\*</sup> OCT_DIGIT (OCT_DIGIT|`_`)<sup>\*</sup>
232+
>
233+
> HEX_LITERAL :
234+
> &nbsp;&nbsp; `0x` (HEX_DIGIT|`_`)<sup>\*</sup> HEX_DIGIT (HEX_DIGIT|`_`)<sup>\*</sup>
235+
>
236+
> BIN_DIGIT : [`0`-`1`]
237+
> OCT_DIGIT : [`0`-`7`]
238+
> DEC_DIGIT : [`0`-`9`]
239+
> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F`]
240+
>
241+
> INTEGER_SUFFIX :
242+
> &nbsp;&nbsp; &nbsp;&nbsp; `u8` | `u16` | `u32` | `u64` | `usize`
243+
> &nbsp;&nbsp; | `i8` | `i16` | `i32` | `i64` | `isize`
244+
245+
<!-- FIXME: separate the DEC_LITERAL with no prefix or suffix (used in tuple indexing and float_literal) -->
246+
<!-- FIXME: u128 and i128 -->
247+
219248
An _integer literal_ has one of four forms:
220249

221250
* A _decimal literal_ starts with a *decimal digit* and continues with any
222251
mixture of *decimal digits* and _underscores_.
223252
* A _hex literal_ starts with the character sequence `U+0030` `U+0078`
224-
(`0x`) and continues as any mixture of hex digits and underscores.
253+
(`0x`) and continues as any mixture (with at least one digit) of hex digits
254+
and underscores.
225255
* An _octal literal_ starts with the character sequence `U+0030` `U+006F`
226-
(`0o`) and continues as any mixture of octal digits and underscores.
256+
(`0o`) and continues as any mixture (with at least one digit) of octal digits
257+
and underscores.
227258
* A _binary literal_ starts with the character sequence `U+0030` `U+0062`
228-
(`0b`) and continues as any mixture of binary digits and underscores.
259+
(`0b`) and continues as any mixture (with at least one digit) of binary digits
260+
and underscores.
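
The four forms above can be sketched as a small classifier. This is only an illustration, not how the compiler's lexer is written: `literal_radix` is a hypothetical helper that takes an unsuffixed literal and returns its base, or `None` when the text matches none of the four forms. It assumes the digit classes do not themselves contain `_`.

```rust
/// Hypothetical helper (not part of any real lexer): returns the radix of an
/// unsuffixed integer literal, or `None` if it matches none of the four forms.
fn literal_radix(s: &str) -> Option<u32> {
    let (radix, body) = if let Some(rest) = s.strip_prefix("0x") {
        (16, rest)
    } else if let Some(rest) = s.strip_prefix("0o") {
        (8, rest)
    } else if let Some(rest) = s.strip_prefix("0b") {
        (2, rest)
    } else {
        // A decimal literal must start with a decimal digit, not `_`.
        if !s.chars().next()?.is_ascii_digit() {
            return None;
        }
        (10, s)
    };

    // The body is any mixture of digits and underscores,
    // but it must contain at least one digit of the right base.
    let mut digit_count = 0;
    for c in body.chars() {
        if c == '_' {
            continue;
        }
        if !c.is_digit(radix) {
            return None;
        }
        digit_count += 1;
    }
    if digit_count > 0 { Some(radix) } else { None }
}
```

Under these assumptions, `literal_radix("0b________1")` yields `Some(2)`, while `literal_radix("0b____")` and `literal_radix("0o0581")` both yield `None`.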
229261

230262
Like any literal, an integer literal may be followed (immediately,
231263
without any spaces) by an _integer suffix_, which forcibly sets the
@@ -247,15 +279,49 @@ The type of an _unsuffixed_ integer literal is determined by type inference:
247279
Examples of integer literals of various forms:
248280

249281
```rust
282+
123; // type i32
250283
123i32; // type i32
251284
123u32; // type u32
252285
123_u32; // type u32
286+
let a: u64 = 123; // type u64
287+
288+
0xff; // type i32
253289
0xff_u8; // type u8
290+
291+
0o70; // type i32
254292
0o70_i16; // type i16
255-
0b1111_1111_1001_0000_i32; // type i32
293+
294+
0b1111_1111_1001_0000; // type i32
295+
0b1111_1111_1001_0000i32; // type i32
296+
0b________1; // type i32
297+
256298
0usize; // type usize
257299
```
258300

301+
Examples of invalid integer literals:
302+
303+
```rust,ignore
304+
// invalid suffixes
305+
306+
0invalidSuffix;
307+
308+
// uses digits that are not valid for the base
309+
310+
123AFB43;
311+
0b0102;
312+
0o0581;
313+
314+
// integers too big for their type (they overflow)
315+
316+
128_i8;
317+
256_u8;
318+
319+
// bin, hex and octal literals must have at least one digit
320+
321+
0b_;
322+
0b____;
323+
```
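
The overflow cases above can be checked at run time with the standard library's integer parsing. The sketch below is illustrative only: `fits_in_i8` and `fits_in_u8` are hypothetical helpers, and the compiler itself rejects such literals at compile time (through the deny-by-default `overflowing_literals` lint) rather than by calling `parse`.

```rust
/// Hypothetical helper: does a decimal digit string (underscores allowed)
/// denote a value that fits in `i8`?
fn fits_in_i8(digits: &str) -> bool {
    // Underscores are separators only, so drop them before parsing.
    let cleaned: String = digits.chars().filter(|&c| c != '_').collect();
    cleaned.parse::<i8>().is_ok()
}

/// Hypothetical helper: the same check for `u8`.
fn fits_in_u8(digits: &str) -> bool {
    let cleaned: String = digits.chars().filter(|&c| c != '_').collect();
    cleaned.parse::<u8>().is_ok()
}
```

Here `fits_in_i8("127")` and `fits_in_u8("2_5_5")` hold, while `fits_in_i8("128")` and `fits_in_u8("256")` do not, matching the invalid `128_i8` and `256_u8` examples.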
324+
259325
Note that the Rust syntax considers `-1i8` as an application of the [unary minus
260326
operator] to an integer literal `1i8`, rather than
261327
a single integer literal.
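
One observable consequence of this rule is operator precedence: a method call binds more tightly than unary minus, so the call applies to the positive literal first. A minimal sketch, with function names chosen purely for illustration:

```rust
/// Parsed as `-(2i8.pow(2))`: the literal is `2i8`, and the minus is applied
/// to the result of the method call.
fn negate_after_call() -> i8 {
    -2i8.pow(2)
}

/// Parentheses force the negation to happen before the method call.
fn negate_before_call() -> i8 {
    (-2i8).pow(2)
}
```

The first function returns `-4` and the second returns `4`.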
