@@ -21,13 +21,24 @@ evaluated (primarily) at compile time.

| | Example | `#` sets | Characters | Escapes |
|---|---|---|---|---|
- | [Character](#character-literals) | `'H'` | `N/A` | All Unicode | [Quote](#quote-escapes) & [Byte](#byte-escapes) & [Unicode](#unicode-escapes) |
- | [String](#string-literals) | `"hello"` | `N/A` | All Unicode | [Quote](#quote-escapes) & [Byte](#byte-escapes) & [Unicode](#unicode-escapes) |
+ | [Character](#character-literals) | `'H'` | `N/A` | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
+ | [String](#string-literals) | `"hello"` | `N/A` | All Unicode | [Quote](#quote-escapes) & [ASCII](#ascii-escapes) & [Unicode](#unicode-escapes) |
| [Raw](#raw-string-literals) | `r#"hello"#` | `0...` | All Unicode | `N/A` |
| [Byte](#byte-literals) | `b'H'` | `N/A` | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
| [Byte string](#byte-string-literals) | `b"hello"` | `N/A` | All ASCII | [Quote](#quote-escapes) & [Byte](#byte-escapes) |
| [Raw byte string](#raw-byte-string-literals) | `br#"hello"#` | `0...` | All ASCII | `N/A` |

+ #### ASCII escapes
+
+ | | Name |
+ |---|------|
+ | `\x41` | 7-bit character code (exactly 2 digits, up to 0x7F) |
+ | `\n` | Newline |
+ | `\r` | Carriage return |
+ | `\t` | Tab |
+ | `\\` | Backslash |
+ | `\0` | Null |
+
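The ASCII escapes table added above can be checked with a minimal sketch. The helper name `ascii_escape_values` is hypothetical and not part of the change; it only pairs each escape with the code point it is expected to produce.

```rust
/// Hypothetical helper: each ASCII escape from the table, paired with the
/// code point it denotes in a `char` literal.
fn ascii_escape_values() -> [(char, u32); 6] {
    [
        ('\x41', 0x41), // 7-bit character code: the letter 'A'
        ('\n', 0x0A),   // newline
        ('\r', 0x0D),   // carriage return
        ('\t', 0x09),   // tab
        ('\\', 0x5C),   // backslash
        ('\0', 0x00),   // null
    ]
}

fn main() {
    for &(ch, code) in ascii_escape_values().iter() {
        assert_eq!(ch as u32, code);
    }
    // '\x80' is rejected in a char or string literal, because an ASCII
    // escape may not exceed 0x7F. In a byte literal, b'\x80' is accepted.
}
```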
#### Byte escapes

| | Name |
@@ -74,12 +85,45 @@ evaluated (primarily) at compile time.

#### Character literals

+ > **<sup>Lexer</sup>**
+ > CHAR_LITERAL :
+ > &nbsp;&nbsp; `'` ( ~[`'` `\` \\n \\r \\t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) `'`
+ >
+ > QUOTE_ESCAPE :
+ > &nbsp;&nbsp; `\'` | `\"`
+ >
+ > ASCII_ESCAPE :
+ > &nbsp;&nbsp; &nbsp;&nbsp; `\x` OCT_DIGIT HEX_DIGIT
+ > &nbsp;&nbsp; | `\n` | `\r` | `\t` | `\\` | `\0`
+ >
+ > UNICODE_ESCAPE :
+ > &nbsp;&nbsp; &nbsp;&nbsp; `\u{` HEX_DIGIT `}`
+ > &nbsp;&nbsp; | `\u{` HEX_DIGIT HEX_DIGIT `}`
+ > &nbsp;&nbsp; | `\u{` HEX_DIGIT HEX_DIGIT HEX_DIGIT `}`
+ > &nbsp;&nbsp; | `\u{` HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT `}`
+ > &nbsp;&nbsp; | `\u{` HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT `}`
+ > &nbsp;&nbsp; | `\u{` HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT `}`
+
A _character literal_ is a single Unicode character enclosed within two
`U+0027` (single-quote) characters, with the exception of `U+0027` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`).
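As an illustration of the character literal rules above, the following sketch pairs a few literals with their code points. The helper `char_literal_examples` is a hypothetical name chosen for this example only.

```rust
/// Hypothetical helper: character literals paired with their code points.
fn char_literal_examples() -> Vec<(char, u32)> {
    vec![
        ('H', 0x48),            // plain character
        ('\'', 0x27),           // QUOTE_ESCAPE: a single quote must be escaped
        ('"', 0x22),            // a double quote may appear unescaped here
        ('\"', 0x22),           // ... or escaped, with the same value
        ('\x7F', 0x7F),         // ASCII_ESCAPE at its upper bound
        ('\u{E9}', 0xE9),       // UNICODE_ESCAPE with 2 hex digits
        ('\u{1F600}', 0x1F600), // UNICODE_ESCAPE with 5 hex digits
    ]
}

fn main() {
    for (ch, code) in char_literal_examples() {
        assert_eq!(ch as u32, code);
    }
}
```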

#### String literals

+ > **<sup>Lexer</sup>**
+ > STRING_LITERAL :
+ > &nbsp;&nbsp; `"` (
+ > &nbsp;&nbsp; &nbsp;&nbsp; ~[`"` `\` _IsolatedCR_]
+ > &nbsp;&nbsp; &nbsp;&nbsp; | QUOTE_ESCAPE
+ > &nbsp;&nbsp; &nbsp;&nbsp; | ASCII_ESCAPE
+ > &nbsp;&nbsp; &nbsp;&nbsp; | UNICODE_ESCAPE
+ > &nbsp;&nbsp; &nbsp;&nbsp; | STRING_CONTINUE
+ > &nbsp;&nbsp; )<sup>\*</sup> `"`
+ >
+ > STRING_CONTINUE :
+ > &nbsp;&nbsp; `\` _followed by_ \\n
+
+
A _string literal_ is a sequence of any Unicode characters enclosed within two
`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`).
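The string literal grammar above can be exercised with a short sketch. The helper `string_literal_examples` is hypothetical; each pair holds a literal and the contents it is expected to have.

```rust
/// Hypothetical helper: string literals paired with their expected contents.
fn string_literal_examples() -> Vec<(&'static str, &'static str)> {
    vec![
        // QUOTE_ESCAPE: a double quote inside a string must be escaped.
        ("say \"hi\"", r#"say "hi""#),
        // A single quote needs no escape in a string literal.
        ("it's", "it\'s"),
        // ASCII_ESCAPE and UNICODE_ESCAPE can be mixed freely.
        ("\u{48}\x65llo", "Hello"),
        // STRING_CONTINUE: a backslash followed by a newline drops the
        // newline and the leading whitespace of the next line.
        ("foo\
          bar", "foobar"),
    ]
}

fn main() {
    for (literal, expected) in string_literal_examples() {
        assert_eq!(literal, expected);
    }
}
```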
@@ -120,6 +164,14 @@ following forms:

#### Raw string literals

+ > **<sup>Lexer</sup>**
+ > RAW_STRING_LITERAL :
+ > &nbsp;&nbsp; `r` RAW_STRING_CONTENT
+ >
+ > RAW_STRING_CONTENT :
+ > &nbsp;&nbsp; &nbsp;&nbsp; `"` ( ~ _IsolatedCR_ )<sup>* (non-greedy)</sup> `"`
+ > &nbsp;&nbsp; | `#` RAW_STRING_CONTENT `#`
+
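The recursive RAW_STRING_CONTENT rule means the number of `#` characters must match on both sides, and the body ends at the first `"` followed by that many `#`. A minimal sketch, using a hypothetical helper `raw_string_examples` that pairs each raw literal with an equivalent escaped literal:

```rust
/// Hypothetical helper: raw string literals paired with equivalent
/// ordinary string literals.
fn raw_string_examples() -> [(&'static str, &'static str); 3] {
    [
        // No escapes are processed: the backslash and the 'n' are kept.
        (r"a\nb", "a\\nb"),
        // One `#` on each side lets the body contain double quotes.
        (r#"say "hi""#, "say \"hi\""),
        // Two `#` on each side let the body contain the sequence `"#`.
        (r##"foo #"# bar"##, "foo #\"# bar"),
    ]
}

fn main() {
    for &(raw, escaped) in raw_string_examples().iter() {
        assert_eq!(raw, escaped);
    }
}
```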
Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ can contain any sequence
@@ -149,6 +201,17 @@ r##"foo #"# bar"##; // foo #"# bar

#### Byte literals

+ > **<sup>Lexer</sup>**
+ > BYTE_LITERAL :
+ > &nbsp;&nbsp; `b'` ( ASCII_FOR_CHAR | BYTE_ESCAPE ) `'`
+ >
+ > ASCII_FOR_CHAR :
+ > &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F), except_ `'`, `\`, \\n, \\r or \\t
+ >
+ > BYTE_ESCAPE :
+ > &nbsp;&nbsp; &nbsp;&nbsp; `\x` HEX_DIGIT HEX_DIGIT
+ > &nbsp;&nbsp; | `\n` | `\r` | `\t` | `\\` | `\0`
+
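A short sketch of the byte literal rules follows. The helper `byte_literal_examples` is hypothetical. It relies on the summary table, which lists quote escapes for byte literals, and on the backslash (not the forward slash) being the character that must be escaped.

```rust
/// Hypothetical helper: byte literals paired with their `u8` values.
fn byte_literal_examples() -> [(u8, u8); 6] {
    [
        (b'H', 72),     // plain ASCII character
        (b'/', 0x2F),   // a forward slash needs no escape
        (b'\\', 0x5C),  // a backslash must be escaped
        (b'\'', 0x27),  // quote escape, as listed in the summary table
        (b'\xFF', 255), // byte escapes accept two full hex digits
        (b'\n', 10),    // newline
    ]
}

fn main() {
    for &(literal, value) in byte_literal_examples().iter() {
        assert_eq!(literal, value);
    }
}
```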
A _byte literal_ is a single ASCII character (in the `U+0000` to `U+007F`
range) or a single _escape_ preceded by the characters `U+0062` (`b`) and
`U+0027` (single-quote), and followed by the character `U+0027`. If the character
@@ -158,6 +221,13 @@ _number literal_.

#### Byte string literals

+ > **<sup>Lexer</sup>**
+ > BYTE_STRING_LITERAL :
+ > &nbsp;&nbsp; `b"` ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )<sup>\*</sup> `"`
+ >
+ > ASCII_FOR_STRING :
+ > &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F), except_ `"`, `\` _and IsolatedCR_
+
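The byte string grammar can be illustrated with a minimal sketch. The helper `byte_string_examples` is hypothetical; each literal is sliced to `&[u8]` so that literals of different lengths share one type.

```rust
/// Hypothetical helper: byte string literals paired with their bytes.
fn byte_string_examples() -> Vec<(&'static [u8], Vec<u8>)> {
    vec![
        // Plain ASCII characters map to their byte values.
        (&b"hello"[..], vec![104, 101, 108, 108, 111]),
        // A double quote inside the body must be escaped.
        (&b"a\"b"[..], vec![0x61, 0x22, 0x62]),
        // BYTE_ESCAPE may produce values above 0x7F.
        (&b"\x80\xFF"[..], vec![0x80, 0xFF]),
        // STRING_CONTINUE works as in ordinary string literals.
        (&b"foo\
            bar"[..], b"foobar".to_vec()),
    ]
}

fn main() {
    for (literal, bytes) in byte_string_examples() {
        assert_eq!(literal, bytes.as_slice());
    }
}
```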
A non-raw _byte string literal_ is a sequence of ASCII characters and _escapes_,
preceded by the characters `U+0062` (`b`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
@@ -183,6 +253,18 @@ following forms:

#### Raw byte string literals

+ > **<sup>Lexer</sup>**
+ > RAW_BYTE_STRING_LITERAL :
+ > &nbsp;&nbsp; `br` RAW_BYTE_STRING_CONTENT
+ >
+ > RAW_BYTE_STRING_CONTENT :
+ > &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII<sup>* (non-greedy)</sup> `"`
+ > &nbsp;&nbsp; | `#` RAW_BYTE_STRING_CONTENT `#`
+ >
+ > ASCII :
+ > &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F)_
+
+
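Raw byte strings combine the `#` matching of raw strings with the ASCII-only body of byte strings. A minimal sketch, with a hypothetical helper `raw_byte_string_examples` that pairs each raw literal with an equivalent escaped byte string:

```rust
/// Hypothetical helper: raw byte string literals paired with equivalent
/// non-raw byte string literals.
fn raw_byte_string_examples() -> Vec<(&'static [u8], &'static [u8])> {
    vec![
        // No escapes are processed: the backslash and the 'n' are kept.
        (&br"a\nb"[..], &b"a\\nb"[..]),
        // One `#` on each side lets the body contain double quotes.
        (&br#"say "hi""#[..], &b"say \"hi\""[..]),
        // Two `#` on each side let the body contain the sequence `"#`.
        (&br##"foo #"# bar"##[..], &b"foo #\"# bar"[..]),
    ]
}

fn main() {
    for (raw, escaped) in raw_byte_string_examples() {
        assert_eq!(raw, escaped);
    }
}
```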
Raw byte string literals do not process any escapes. They start with the
character `U+0062` (`b`), followed by `U+0072` (`r`), followed by zero or more
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The