GC Implementation Lowering Tips
The GC optimization cookbook has useful suggestions for how to optimize WasmGC code effectively using Binaryen, in terms of which passes and optimizations to run. This page focuses on how to emit efficient WasmGC code from a compiler that Binaryen can optimize well.
The Binaryen optimizer can help remove unnecessarily boxed values in some situations, for example with Heap2Local, which performs escape analysis and replaces a GC allocation with locals. In general, however, the compiler targeting WasmGC needs to handle boxing effectively itself, because later optimizers (Binaryen and VMs) are limited in what they can do. To see why, first consider this simple situation:
(type $Box (struct (field i32)))
(type $Boxer (struct (field (mut (ref $Box)))))
;; All sets look like this:
(struct.set $Boxer 0
  ..  ;; the $Boxer reference
  (struct.new $Box
    ..
  )
)
;; All gets look like this:
(struct.get $Box 0
  (struct.get $Boxer 0
    ..
  )
)
Here every time we write to Boxer's field we box an integer. We could instead simply store the integer there:
(type $Boxer (struct (field (mut i32))))
;; All sets look like this:
(struct.set $Boxer 0
  ..  ;; the $Boxer reference
  ..  ;; the i32 value, stored directly
)
;; All gets look like this:
(struct.get $Boxer 0
  ..
)
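As a fuller sketch, the unboxed form can be written as a complete module. The function names $set and $get are hypothetical, and the field is assumed to be mutable so that struct.set validates:

```wat
(module
  ;; The integer lives directly in $Boxer; no $Box type is needed.
  (type $Boxer (struct (field (mut i32))))

  ;; Hypothetical setter: stores the integer directly, with no allocation.
  (func $set (param $boxer (ref $Boxer)) (param $value i32)
    (struct.set $Boxer 0
      (local.get $boxer)
      (local.get $value)
    )
  )

  ;; Hypothetical getter: a single load, with no second indirection.
  (func $get (param $boxer (ref $Boxer)) (result i32)
    (struct.get $Boxer 0
      (local.get $boxer)
    )
  )
)
```

A compiler can emit this form directly. Whether an optimizer can derive it from the boxed form is a separate question.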
In general, however, an optimizer cannot make this change automatically. First, it would have to see that only a struct.new flows into Boxer's field, because otherwise there may be other references to the Box object, which could change the value of its field if that field is mutable. But say that the field is immutable. We still may not wish to optimize here, because if we don't see struct.new in all sets then we'd have this situation:
(type $Box (struct (field i32)))
(type $Boxer (struct (field (mut (ref $Box)))))
;; All sets look like this:
(struct.set $Boxer 0
  ..  ;; the $Boxer reference
  ..  ;; an arbitrary (ref $Box), not necessarily a fresh struct.new
)
;; All gets look like this:
(struct.get $Box 0
  (struct.get $Boxer 0
    ..
  )
)
;; => optimize that to this: =>
(type $Box (struct (field i32)))
(type $Boxer (struct (field (mut i32))))
;; All sets look like this:
(struct.set $Boxer 0
  ..  ;; the $Boxer reference
  (struct.get $Box 0
    ..
  )
)
;; All gets look like this:
(struct.get $Boxer 0
  ..
)
What we do here is effectively "pull" the struct.get $Box from the reads of Boxer's field into the writes. That allows us to store only an i32 instead of a reference, which may be useful in itself, but whether this is actually faster depends on how many reads we have compared to writes, which a compile-time optimizer cannot see. If we have far more writes than reads then we'd be making the code slower.
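To make the tradeoff concrete, the transformed form can be sketched as a complete module. The function names $write and $read are hypothetical, and Boxer's field is assumed to be mutable so that struct.set validates:

```wat
(module
  (type $Box (struct (field i32)))
  (type $Boxer (struct (field (mut i32))))

  ;; Write path: now pays for a load from $Box on every write.
  (func $write (param $boxer (ref $Boxer)) (param $box (ref $Box))
    (struct.set $Boxer 0
      (local.get $boxer)
      (struct.get $Box 0
        (local.get $box)
      )
    )
  )

  ;; Read path: a single load, with no second indirection through $Box.
  (func $read (param $boxer (ref $Boxer)) (result i32)
    (struct.get $Boxer 0
      (local.get $boxer)
    )
  )
)
```

As a rough count that ignores caching and VM-level optimizations: with R reads and W writes at runtime, the original form performs about 2R loads (one from Boxer and one from Box per read), while this form performs about R + W loads. It therefore helps only when reads outnumber writes, which is exactly the information a compile-time optimizer lacks.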