Commit 333889f

---
yaml --- r: 23418 b: refs/heads/master c: 9ef56a6 h: refs/heads/master v: v3
1 parent 5b9fd0f commit 333889f

2 files changed

+155
-10
lines changed

[refs]

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
---
2-
refs/heads/master: 77e83d83a9a93188c3bb80ea2bb09f2e5fd2579f
2+
refs/heads/master: 9ef56a6ca85ba30494fe63020cde08c8e91f39d4
33
refs/heads/snap-stage1: e33de59e47c5076a89eadeb38f4934f58a3618a6
44
refs/heads/snap-stage3: cd6f24f9d14ac90d167386a56e7a6ac1f0318195
55
refs/heads/try: ffbe0e0e00374358b789b0037bcb3a577cd218be

trunk/doc/tutorial.md

Lines changed: 154 additions & 9 deletions
@@ -642,31 +642,31 @@ applies to the module or crate in which it appears.
642642

643643
## Syntax extensions
644644

645-
There are plans to support user-defined syntax (macros) in Rust. This
646-
currently only exists in very limited form.
647-
648645
The compiler defines a few built-in syntax extensions. The most useful
649-
one is `#fmt`, a printf-style text formatting macro that is expanded
646+
one is `fmt!`, a sprintf-style text formatter that is expanded
650647
at compile time.
651648

652649
~~~~
653650
io::println(fmt!("%s is %d", ~"the answer", 42));
654651
~~~~
655652

656-
`#fmt` supports most of the directives that [printf][pf] supports, but
653+
`fmt!` supports most of the directives that [printf][pf] supports, but
657654
will give you a compile-time error when the types of the directives
658655
don't match the types of the arguments.
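For comparison, the same compile-time checking is available in present-day Rust through `format!`, the descendant of `fmt!`. The sketch below uses modern syntax rather than the dialect this tutorial targets, and `answer_line` is a hypothetical helper name introduced only for illustration.

```rust
// Modern-Rust counterpart of the `fmt!` example above.
// Assumption of this sketch: the printf-style `%s`/`%d` directives are
// replaced by `{}` placeholders, as in current Rust.
fn answer_line() -> String {
    // The format string is checked at compile time: a placeholder with
    // no matching argument is rejected before the program runs.
    format!("{} is {}", "the answer", 42)
}
```

Calling `answer_line()` builds the string without printing it; pass the result to `println!("{}", ...)` to display it.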
659656

660657
[pf]: http://en.cppreference.com/w/cpp/io/c/fprintf
661658

662-
All syntax extensions look like `#word`. Another built-in one is
663-
`#env`, which will look up its argument as an environment variable at
659+
All syntax extensions look like `extension_name!`. Another built-in one is
660+
`env!`, which will look up its argument as an environment variable at
664661
compile-time.
665662

666663
~~~~
667664
io::println(env!("PATH"));
668665
~~~~
669666

667+
It is possible for the user to define new syntax extensions, within certain
668+
limits. These are called [macros](#macros).
669+
670670
# Control structures
671671

672672
## Conditionals
@@ -902,7 +902,7 @@ So except in code that needs to be really, really fast,
902902
you should feel free to scatter around debug logging statements, and
903903
leave them in.
904904

905-
Three macros that combine text-formatting (as with `#fmt`) and logging
905+
Three macros that combine text-formatting (as with `fmt!`) and logging
906906
are available. These take a string and any number of format arguments,
907907
and will log the formatted string:
908908

@@ -912,7 +912,7 @@ warn!("only %d seconds remaining", 10);
912912
error!("fatal: %s", get_error_string());
913913
~~~~
914914

915-
Because the macros `#debug`, `#warn`, and `#error` expand to calls to `log`,
915+
Because the macros `debug!`, `warn!`, and `error!` expand to calls to `log`,
916916
their arguments are also lazily evaluated.
917917

918918
# Functions
@@ -2334,6 +2334,151 @@ This makes it possible to rebind a variable without actually mutating
23342334
it, which is mostly useful for destructuring (which can rebind, but
23352335
not assign).
23362336

2337+
# Macros
2338+
2339+
Functions are the programmer's primary tool of abstraction, but there are
2340+
cases in which they are insufficient, because the programmer wants to
2341+
abstract over concepts not represented as values. Consider the following
2342+
example:
2343+
2344+
~~~~
2345+
# enum t { special_a(uint), special_b(uint) };
2346+
# fn f() -> uint {
2347+
# let input_1 = special_a(0), input_2 = special_a(0);
2348+
match input_1 {
2349+
special_a(x) => { return x; }
2350+
_ => {}
2351+
}
2352+
// ...
2353+
match input_2 {
2354+
special_b(x) => { return x; }
2355+
_ => {}
2356+
}
2357+
# return 0u;
2358+
# }
2359+
~~~~
2360+
2361+
This code could become tiresome if repeated many times. However, there is
2362+
no reasonable function that could be written to solve this problem. In such a
2363+
case, it's possible to define a macro to solve the problem. Macros are
2364+
lightweight custom syntax extensions, themselves defined using the
2365+
`macro_rules!` syntax extension:
2366+
2367+
~~~~
2368+
# enum t { special_a(uint), special_b(uint) };
2369+
# fn f() -> uint {
2370+
# let input_1 = special_a(0), input_2 = special_a(0);
2371+
macro_rules! early_return(
2372+
($inp:expr $sp:ident) => ( //invoke it like `(input_5 special_e)`
2373+
match $inp {
2374+
$sp(x) => { return x; }
2375+
_ => {}
2376+
}
2377+
);
2378+
);
2379+
// ...
2380+
early_return!(input_1 special_a);
2381+
// ...
2382+
early_return!(input_2 special_b);
2383+
# return 0;
2384+
# }
2385+
~~~~
2386+
2387+
Macros are defined in pattern-matching style:
2388+
2389+
## Invocation syntax
2390+
2391+
On the left-hand side of the `=>` is the macro invocation syntax. It is
2392+
free-form, excepting the following rules:
2393+
2394+
1. It must be surrounded in parentheses.
2395+
2. `$` has special meaning.
2396+
3. The `()`s, `[]`s, and `{}`s it contains must balance. For example, `([)` is
2397+
forbidden.
2398+
2399+
To take as an argument a fragment of Rust code, write `$` followed by a name
2400+
(for use on the right-hand side), followed by a `:`, followed by the sort of
2401+
fragment to match (the most common ones are `ident`, `expr`, `ty`, `pat`, and
2402+
`block`). Anything not preceded by a `$` is taken literally. The standard
2403+
rules of tokenization apply.
2404+
2405+
So `($x:ident => (($e:expr)))`, though excessively fancy, would create a macro
2406+
that could be invoked like `my_macro!(i=>(( 2+2 )))`.
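As a rough illustration, the "excessively fancy" matcher can be written out in full. This sketch assumes modern Rust syntax (braces around the rule list), which differs from the parenthesized form used in this tutorial, and the transcription body and the function name `four` are invented here for demonstration.

```rust
// Modern-Rust rendering of the matcher `($x:ident => (($e:expr)))`.
// Everything not preceded by `$` (the `=>` and both pairs of
// parentheses) must appear literally at the call site.
macro_rules! my_macro {
    ($x:ident => (($e:expr))) => {
        // Hypothetical transcription: bind the identifier to the expression.
        let $x = $e;
    };
}

fn four() -> i32 {
    // Whitespace is irrelevant; only the token sequence has to match.
    my_macro!(i=>(( 2+2 )));
    i
}
```

Because `i` is supplied by the caller, the binding introduced by the macro is visible to the code that follows the invocation.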
2407+
2408+
## Transcription syntax
2409+
2410+
The right-hand side of the `=>` follows the same rules as the left-hand side,
2411+
except that `$` need only be followed by the name of the syntactic fragment
2412+
to transcribe.
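The motivating `early_return` macro can be sketched in modern Rust to show both sides together. Two things here are assumptions of the sketch, not part of this commit: current compilers require a separator such as `,` after an `expr` fragment, so a comma is added to the matcher, and the enum `T` with variants `SpecialA` and `SpecialB` stands in for the tutorial's `t`.

```rust
#[allow(dead_code)]
enum T {
    SpecialA(u32),
    SpecialB(u32),
}

macro_rules! early_return {
    // Matcher: each fragment is written as `$name:kind`.
    ($inp:expr, $sp:ident) => {
        // Transcriber: only `$name` is needed to splice a fragment back in.
        match $inp {
            T::$sp(x) => { return x; }
            _ => {}
        }
    };
}

fn f(input_1: T, input_2: T) -> u32 {
    early_return!(input_1, SpecialA);
    // ...
    early_return!(input_2, SpecialB);
    0
}
```

The `return` inside the transcription leaves the function that contains the invocation, which is exactly what an ordinary helper function could not do.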
2413+
2414+
## Multiplicity
2415+
2416+
### Invocation
2417+
2418+
Going back to the motivating example, suppose that we wanted each invocation
2419+
of `early_return` to potentially accept multiple "special" identifiers. The
2420+
syntax `$(...)*` accepts zero or more occurrences of its contents, much like
2421+
the Kleene star operator in regular expressions. It also supports a separator
2422+
token (a comma-separated list could be written `$(...),*`), and `+` instead of
2423+
`*` to mean "at least one".
2424+
2425+
~~~~
2426+
# enum t { special_a(uint),special_b(uint),special_c(uint),special_d(uint)};
2427+
# fn f() -> uint {
2428+
# let input_1 = special_a(0), input_2 = special_a(0);
2429+
macro_rules! early_return(
2430+
($inp:expr, [ $($sp:ident)|+ ]) => (
2431+
match $inp {
2432+
$(
2433+
$sp(x) => { return x; }
2434+
)+
2435+
_ => {}
2436+
}
2437+
);
2438+
);
2439+
// ...
2440+
early_return!(input_1, [special_a|special_c|special_d]);
2441+
// ...
2442+
early_return!(input_2, [special_b]);
2443+
# return 0;
2444+
# }
2445+
~~~~
2446+
2447+
### Transcription
2448+
2449+
As the above example demonstrates, `$(...)*` is also valid on the right-hand
2450+
side of a macro definition. The behavior of Kleene star in transcription,
2451+
especially in cases where multiple stars are nested, and multiple different
2452+
names are involved, can seem somewhat magical and unintuitive at first. The
2453+
system that interprets them is called "Macro By Example". The two rules to
2454+
keep in mind are (1) the behavior of `$(...)*` is to walk through one "layer"
2455+
of repetitions for all of the `$name`s it contains in lockstep, and (2) each
2456+
`$name` must be under at least as many `$(...)*`s as it was matched against.
2457+
If it is under more, it will be repeated, as appropriate.
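The two rules can be seen in a pair of small macros. This is a sketch in modern Rust syntax, and the macro and function names (`sum_products`, `offsets`, `total`, `shifted`) are hypothetical, chosen only for this illustration.

```rust
// Rule 1: `$a` and `$b` were matched inside the same repetition, so the
// transcriber walks through them in lockstep, one pair per iteration.
macro_rules! sum_products {
    ($($a:expr => $b:expr),*) => {
        0 $(+ $a * $b)*
    };
}

// Rule 2: `$base` was matched under zero repetitions but is transcribed
// under one, so it is repeated for every `$delta`.
macro_rules! offsets {
    ($base:expr; $($delta:expr),*) => {
        vec![$($base + $delta),*]
    };
}

fn total() -> i32 {
    sum_products!(1 => 2, 3 => 4) // 0 + 1*2 + 3*4
}

fn shifted() -> Vec<i32> {
    offsets!(10; 1, 2, 3)
}
```

The reverse of rule 2 is an error: using `$delta` outside any `$(...)*` in the transcriber would be rejected, because it was matched under one level of repetition.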
2458+
2459+
## Parsing limitations
2460+
2461+
The parser used by the macro system is reasonably powerful, but the parsing of
2462+
Rust syntax is restricted in two ways:
2463+
2464+
1. The parser will always parse as much as possible. For example, if the comma
2465+
were omitted from the syntax of `early_return!` above, `input_1 [` would've
2466+
been interpreted as the beginning of an array index. In fact, invoking the
2467+
macro would have been impossible.
2468+
2. The parser must have eliminated all ambiguity by the time it reaches a
2469+
`$name:fragment_specifier`. This most often affects fragment specifiers that occur at
2470+
the beginning of, or immediately after, a `$(...)*`; requiring a distinctive
2471+
token in front can solve the problem.
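The second restriction, and the "distinctive token" remedy, can be sketched as follows. The example assumes modern Rust, where a matcher such as `($($init:ident)* $last:ident)` is typically rejected at the call site as locally ambiguous, because the parser cannot tell whether the next identifier continues the repetition or starts `$last`. The macro name `split_last` is hypothetical.

```rust
// A literal `;` in front of `$last` removes the ambiguity: on seeing an
// identifier the parser continues the repetition, and on seeing `;` it
// knows the repetition has ended.
macro_rules! split_last {
    ($($init:ident)* ; $last:ident) => {
        (vec![$(stringify!($init)),*], stringify!($last))
    };
}

fn demo() -> (Vec<&'static str>, &'static str) {
    split_last!(a b c ; d)
}
```

Any token that cannot begin the repeated fragment would serve equally well as the separator; `;` is just a convenient choice.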
2472+
2473+
## A final note
2474+
2475+
Macros, as currently implemented, are not for the faint of heart. Even
2476+
ordinary syntax errors can be more difficult to debug when they occur inside
2477+
a macro, and errors caused by parse problems in generated code can be very
2478+
tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
2479+
states, and using `--pretty expanded` as an argument to the compiler will
2480+
show the result of expansion.
2481+
23372482
# Traits
23382483

23392484
Traits are Rust's take on value polymorphism—the thing that
