@@ -200,7 +200,196 @@ parsing `e`. Changing the invocation syntax to require a distinctive token in
front can solve the problem. In the above example, `$(T $t:ty)* E $e:expr`
solves the problem.

- ## A final note
+ # Macro argument pattern matching
+
+ ## Motivation
+
+ Consider code like the following:
+
+ ~~~~
+ # enum t1 { good_1(t2, uint), bad_1 };
+ # pub struct t2 { body: t3 }
+ # enum t3 { good_2(uint), bad_2};
+ # fn f(x: t1) -> uint {
+ match x {
+     good_1(g1, val) => {
+         match g1.body {
+             good_2(result) => {
+                 // complicated stuff goes here
+                 return result + val;
+             },
+             _ => fail ~"Didn't get good_2"
+         }
+     }
+     _ => return 0 // default value
+ }
+ # }
+ ~~~~
+
+ All the complicated stuff is deeply indented, and each piece of error-handling
+ code sits far from the match whose failure it handles. We'd like to write a
+ macro that performs a match, but with a syntax that suits the problem better.
+ The following macro can solve the problem:
+
+ ~~~~
+ macro_rules! biased_match (
+     // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
+     ( ($e:expr) ~ ($p:pat) else $err:stmt ;
+       binds $bind_res:ident
+     ) => (
+         let $bind_res = match $e {
+             $p => ( $bind_res ),
+             _ => { $err }
+         };
+     );
+     // more than one name; use a tuple
+     ( ($e:expr) ~ ($p:pat) else $err:stmt ;
+       binds $( $bind_res:ident ),*
+     ) => (
+         let ( $( $bind_res ),* ) = match $e {
+             $p => ( $( $bind_res ),* ),
+             _ => { $err }
+         };
+     )
+ )
+
+ # enum t1 { good_1(t2, uint), bad_1 };
+ # pub struct t2 { body: t3 }
+ # enum t3 { good_2(uint), bad_2};
+ # fn f(x: t1) -> uint {
+ biased_match!((x)       ~ (good_1(g1, val)) else { return 0 };
+               binds g1, val )
+ biased_match!((g1.body) ~ (good_2(result) )
+                   else { fail ~"Didn't get good_2" };
+               binds result )
+ // complicated stuff goes here
+ return result + val;
+ # }
+ ~~~~
+
+ This solves the indentation problem. But if we have a lot of chained matches
+ like this, we might prefer to write a single macro invocation. The input
+ pattern we want is clear:
+ ~~~~
+ # macro_rules! b(
+     ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
+       binds $( $bind_res:ident ),*
+     )
+ # => (0))
+ ~~~~
+
+ However, it's not possible to expand this directly to nested match statements:
+ a `$( ... )*` repetition transcribes its pieces side by side, not one inside
+ another. But there is a solution.
+
+ ## The recursive approach to macro writing
+
+ A macro may accept multiple different input grammars. The first one to
+ successfully match the actual argument to a macro invocation is the one that
+ "wins".
+
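To make the "first rule wins" behaviour concrete, here is a minimal sketch. It is written in present-day Rust syntax (curly-brace rules rather than the parenthesized form used in this tutorial), and the macro name `classify` and its result strings are made up for illustration.

```rust
// Hypothetical macro for illustration: rules are tried from top to bottom,
// and the first rule whose pattern matches the invocation is used.
macro_rules! classify {
    // Most specific rule first: the literal token `zero`.
    (zero) => { "the literal token zero" };
    // Any other single identifier.
    ($name:ident) => { "some identifier" };
    // Fallback: any token sequence at all.
    ($($t:tt)*) => { "something else" };
}

// `zero` is also a valid identifier, but the first rule claims it.
fn classify_zero() -> &'static str {
    classify!(zero)
}

// `one` does not match the first rule, so the second rule is used.
fn classify_ident() -> &'static str {
    classify!(one)
}

// `1 + 2` is not a single identifier, so only the fallback rule matches.
fn classify_other() -> &'static str {
    classify!(1 + 2)
}
```

If the `$name:ident` rule were listed first, `classify!(zero)` would match it instead, and the `zero` rule would never be chosen.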
+
+ In the case of the example above, we want to write a recursive macro to
+ process the semicolon-terminated lines, one-by-one. So, we want the following
+ input patterns:
+
+ ~~~~
+ # macro_rules! b(
+     ( binds $( $bind_res:ident ),* )
+ # => (0))
+ ~~~~
+ ...and:
+
+ ~~~~
+ # macro_rules! b(
+     ( ($e     :expr) ~ ($p     :pat) else $err     :stmt ;
+       $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
+       binds $( $bind_res:ident ),*
+     )
+ # => (0))
+ ~~~~
+
+ The resulting macro looks like this. Note that the separation into
+ `biased_match!` and `biased_match_rec!` occurs only because we have an outer
+ piece of syntax (the `let`) which we only want to transcribe once.
+
+ ~~~~
+
+ macro_rules! biased_match_rec (
+     // Handle the first layer
+     (   ($e     :expr) ~ ($p     :pat) else $err     :stmt ;
+      $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:stmt ; )*
+      binds $( $bind_res:ident ),*
+     ) => (
+         match $e {
+             $p => {
+                 // Recursively handle the next layer
+                 biased_match_rec!($( ($e_rest) ~ ($p_rest) else $err_rest ; )*
+                                   binds $( $bind_res ),*
+                 )
+             }
+             _ => { $err }
+         }
+     );
+     // Base case: no layers left, so produce the bound names
+     ( binds $( $bind_res:ident ),* ) => ( ($( $bind_res ),*) )
+ )
+
+ // Wrap the whole thing in a `let`.
+ macro_rules! biased_match (
+     // special case: `let (x) = ...` is illegal, so use `let x = ...` instead
+     ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
+       binds $bind_res:ident
+     ) => (
+         let $bind_res = biased_match_rec!(
+             $( ($e) ~ ($p) else $err ; )*
+             binds $bind_res
+         );
+     );
+     // more than one name: use a tuple
+     ( $( ($e:expr) ~ ($p:pat) else $err:stmt ; )*
+       binds $( $bind_res:ident ),*
+     ) => (
+         let ( $( $bind_res ),* ) = biased_match_rec!(
+             $( ($e) ~ ($p) else $err ; )*
+             binds $( $bind_res ),*
+         );
+     )
+ )
+
+
+ # enum t1 { good_1(t2, uint), bad_1 };
+ # pub struct t2 { body: t3 }
+ # enum t3 { good_2(uint), bad_2};
+ # fn f(x: t1) -> uint {
+ biased_match!(
+     (x)       ~ (good_1(g1, val)) else { return 0 };
+     (g1.body) ~ (good_2(result) ) else { fail ~"Didn't get good_2" };
+     binds val, result )
+ // complicated stuff goes here
+ return result + val;
+ # }
+ ~~~~
+
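The listing above targets the compiler this tutorial was written for. For readers on a current toolchain, the same recursive structure can be sketched roughly as follows. This is an adaptation under stated assumptions, not the tutorial's own code: `$err` is captured as `expr` instead of `stmt`, `panic!` stands in for `fail`, the types are renamed to `T1`, `T2`, `T3` with a `u32` payload, and the helper function `g` is added only to exercise the single-name rule.

```rust
// Sketch in present-day Rust. Assumptions: `$err` is an `expr` (not `stmt`),
// `panic!` replaces `fail`, and the example types are renamed.
macro_rules! biased_match_rec {
    // Handle the first layer, then recurse on the remaining layers.
    ( ($e:expr) ~ ($p:pat) else $err:expr ;
      $( ($e_rest:expr) ~ ($p_rest:pat) else $err_rest:expr ; )*
      binds $( $bind_res:ident ),*
    ) => {
        match $e {
            $p => {
                biased_match_rec!(
                    $( ($e_rest) ~ ($p_rest) else $err_rest ; )*
                    binds $( $bind_res ),*
                )
            }
            _ => { $err }
        }
    };
    // Base case: no layers left, so produce the bound names.
    ( binds $( $bind_res:ident ),* ) => { ( $( $bind_res ),* ) };
}

// Wrap the whole thing in a single `let`.
macro_rules! biased_match {
    // One name: bind it directly.
    ( $( ($e:expr) ~ ($p:pat) else $err:expr ; )*
      binds $bind_res:ident
    ) => {
        let $bind_res = biased_match_rec!(
            $( ($e) ~ ($p) else $err ; )*
            binds $bind_res
        );
    };
    // More than one name: bind a tuple.
    ( $( ($e:expr) ~ ($p:pat) else $err:expr ; )*
      binds $( $bind_res:ident ),*
    ) => {
        let ( $( $bind_res ),* ) = biased_match_rec!(
            $( ($e) ~ ($p) else $err ; )*
            binds $( $bind_res ),*
        );
    };
}

#[allow(dead_code)]
enum T1 { Good1(T2, u32), Bad1 }
struct T2 { body: T3 }
#[allow(dead_code)]
enum T3 { Good2(u32), Bad2 }

fn f(x: T1) -> u32 {
    biased_match!(
        (x)       ~ (T1::Good1(g1, val)) else { return 0 };
        (g1.body) ~ (T3::Good2(result))  else { panic!("Didn't get Good2") };
        binds val, result
    );
    // complicated stuff goes here
    result + val
}

// Hypothetical helper that exercises the single-name rule.
fn g(x: T3) -> u32 {
    biased_match!((x) ~ (T3::Good2(n)) else { return 0 }; binds n);
    n
}
```

Each recursive step peels one `(expr) ~ (pat) else err ;` line off the front and nests the rest inside the matching arm, so the `binds` rule is only reached once every line has been consumed.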
+ This technique is applicable in many cases where transcribing a result "all
+ at once" is not possible. It resembles ordinary functional programming in some
+ respects, but it is important to recognize the differences.
+
+ The first difference is important, but also easy to forget: the transcription
+ (right-hand) side of a `macro_rules!` rule is literal syntax, which can only
+ be executed at run-time. If a piece of transcription syntax does not itself
+ appear inside another macro invocation, it will become part of the final
+ program. If it is inside a macro invocation (for example, the recursive
+ invocation of `biased_match_rec!`), it does have the opportunity to affect
+ transcription, but only through the process of attempted pattern matching.
+
+ The second difference is related: the evaluation order of macros feels
+ "backwards" compared to ordinary programming. Given an invocation
+ `m1!(m2!())`, the expander first expands `m1!`, giving it as input the literal
+ syntax `m2!()`. If `m1!` transcribes its argument unchanged into an appropriate
+ position (in particular, not as an argument to yet another macro invocation),
+ the expander will then proceed to evaluate `m2!()` (along with any other macro
+ invocations `m1!(m2!())` produced).
+
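Both differences can be observed directly. The following is a small sketch in present-day Rust syntax; the macros `m1`, `m2`, and `pass_through` are hypothetical and exist only to show what the outer macro receives.

```rust
// Hypothetical macros for illustration.
macro_rules! m2 {
    () => { 2 };
}

// `m1!` pattern-matches on the raw tokens of its argument. At this point an
// unexpanded `m2!()` is still just the tokens `m2`, `!`, and `()`.
macro_rules! m1 {
    (m2!()) => { "m1 saw the unexpanded tokens m2!()" };
    ($e:expr) => { "m1 saw some other expression" };
}

// Transcribes its argument unchanged into expression position, so the
// expander goes on to expand any macro invocation the argument contains.
macro_rules! pass_through {
    ($e:expr) => { $e };
}

// The outer macro is expanded first and receives `m2!()` as literal syntax.
fn outer_sees_tokens() -> &'static str {
    m1!(m2!())
}

// A plain literal does not match the first rule, so the second is used.
fn outer_sees_literal() -> &'static str {
    m1!(2)
}

// Here `m2!()` survives transcription and is expanded afterwards.
fn inner_expanded_later() -> i32 {
    pass_through!(m2!())
}
```

The first rule of `m1!` can only match because `m2!()` has not been expanded when `m1!` runs; had the inner macro been evaluated first, `m1!` would have seen `2` instead.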
+ # A final note

Macros, as currently implemented, are not for the faint of heart. Even
ordinary syntax errors can be more difficult to debug when they occur inside a
@@ -209,3 +398,4 @@ tricky. Invoking the `log_syntax!` macro can help elucidate intermediate
states, invoking `trace_macros!(true)` will automatically print those
intermediate states out, and passing the flag `--pretty expanded` as a
command-line argument to the compiler will show the result of expansion.
+