@@ -239,13 +239,14 @@ literal : string_lit | char_lit | num_lit ;

~~~~~~~~ {.ebnf .gram}
char_lit : '\x27' char_body '\x27' ;
- string_lit : '"' string_body * '"' ;
+ string_lit : '"' string_body * '"' | 'r' raw_string ;

char_body : non_single_quote
| '\x5c' [ '\x27' | common_escape ] ;

string_body : non_double_quote
| '\x5c' [ '\x22' | common_escape ] ;
+ raw_string : '"' raw_string_body '"' | '#' raw_string '#' ;

common_escape : '\x5c'
| 'n' | 'r' | 't' | '0'
@@ -267,9 +268,10 @@ which must be _escaped_ by a preceding U+005C character (`\`).

A _string literal_ is a sequence of any Unicode characters enclosed within
two `U+0022` (double-quote) characters, with the exception of `U+0022`
- itself, which must be _escaped_ by a preceding `U+005C` character (`\`).
+ itself, which must be _escaped_ by a preceding `U+005C` character (`\`),
+ or a _raw string literal_.

- Some additional _escapes_ are available in either character or string
+ Some additional _escapes_ are available in either character or non-raw string
literals. An escape starts with a `U+005C` (`\`) and continues with one of
the following forms:

@@ -285,9 +287,35 @@ the following forms:
* A _whitespace escape_ is one of the characters `U+006E` (`n`), `U+0072`
(`r`), or `U+0074` (`t`), denoting the unicode values `U+000A` (LF),
`U+000D` (CR) or `U+0009` (HT) respectively.
- * The _backslash escape_ is the character U+005C (`\`) which must be
+ * The _backslash escape_ is the character `U+005C` (`\`) which must be
escaped in order to denote *itself*.

+ Raw string literals do not process any escapes. They start with the character
+ `U+0072` (`r`), followed by zero or more of the character `U+0023` (`#`) and a
+ `U+0022` (double-quote) character. The _raw string body_ is not defined in the
+ EBNF grammar above: it can contain any sequence of Unicode characters and is
+ terminated only by another `U+0022` (double-quote) character, followed by the
+ same number of `U+0023` (`#`) characters that preceded the opening `U+0022`
+ (double-quote) character.
+
+ All Unicode characters contained in the raw string body represent themselves:
+ the characters `U+0022` (double-quote) (except when followed by at least as
+ many `U+0023` (`#`) characters as were used to start the raw string literal) or
+ `U+005C` (`\`) do not have any special meaning.
+
+ Examples for string literals:
+
+ ~~~
+ "foo"; r"foo"; // foo
+ "\"foo\""; r#""foo""#; // "foo"
+
+ "foo #\"# bar";
+ r##"foo #"# bar"##; // foo #"# bar
+
+ "\x52"; "R"; r"R"; // R
+ "\\x52"; r"\x52"; // \x52
+ ~~~
+
#### Number literals

~~~~~~~~ {.ebnf .gram}
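The equivalences listed in the "Examples for string literals" block added by this patch can be checked with a short program. The following is a minimal sketch, not part of the patch: it assumes a Rust toolchain that implements raw string literals as described above, and the helper name `raw_equivalents` is hypothetical, chosen only for illustration.

```rust
/// Hypothetical helper (not part of the patch): pairs an escaped string
/// literal with the raw string literal that should denote the same text.
fn raw_equivalents() -> Vec<(&'static str, &'static str)> {
    vec![
        // No escapes and no quotes: both forms are identical.
        ("foo", r"foo"),
        // One `#` on each side lets the body contain bare double quotes.
        ("\"foo\"", r#""foo""#),
        // The body contains `"#`, so two `#` are needed to delimit it.
        ("foo #\"# bar", r##"foo #"# bar"##),
        // `\x52` is processed as an escape only in the non-raw literal.
        ("\x52", r"R"),
        // In a raw literal the backslash represents itself.
        ("\\x52", r"\x52"),
    ]
}

fn main() {
    for (escaped, raw) in raw_equivalents() {
        // Each pair must denote exactly the same sequence of characters.
        assert_eq!(escaped, raw);
        println!("{}", raw);
    }
}
```

The third pair shows why the number of `#` characters matters: a single `#` would end the literal at the first `"#` inside the body, so the delimiter needs one more `#` than the longest quote-and-hash run in the text.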