Skip to content

GC Implementation Lowering Tips

Alon Zakai edited this page May 25, 2023 · 7 revisions

The GC optimization cookbook has useful suggestions for how to optimize WasmGC code effectively using Binaryen, in terms of which passes and optimizations to run. This page focuses on how to emit efficient WasmGC code from a compiler that Binaryen can optimize well.

Vtables and ITables

The most optimizable form for such data structures is to put them in immutable globals with immutable fields, like this:

(struct $Foo (field $vtable (ref $vtable.Foo)) ..)

(struct $vtable.Foo (field $func1 (ref $funcTypeX)) (field $func2 (ref $funcTypeY)) ..)

(global $vtable.Foo
  (struct.new $vtable.Foo
    (ref.func $A)
    (ref.func $B)
    (ref.func $C)
  )
)

(struct.new $Foo
  (global.get $vtable.Foo)
)

Note how the vtable field on the object is immutable as well. In this form, the optimizer can easily see what is being called when it can infer enough about the type of a reference. That is, if we can infer that a reference points to (ref $Foo) above (and not a subtype or supertype) then if we call the second item from the vtable, we can optimize to a static call to function $B.

(The optimizer will infer immutability when it sees no sets, but it is still good to emit fields as immutable yourself - that way you'd get an error if there is a bad set of the field.)

Boxing

The Binaryen optimizer help remove unnecessarily-boxed values in some situations, such as in Heap2Local which does escape analysis and replaces a GC allocation with locals. However, in general, boxing is something that the compiler to WasmGC needs to do effectively, because later optimizers (Binaryen and VMs) are limited in what they can do. To see why, first consider the simple situation:

(type $Box (struct (field i32)))

(type $Boxer (struct (field (ref $Box))))

;; All sets look like this:
(struct.set $Boxer
  (struct.new $Box
    ..
  )
)

;; All gets look like this:
(struct.get $Box 0
  (struct.get $Boxer 0
    ..
  )
)

Here every time we write to Boxer's field we box an integer. We could instead simply store the integer there:

(type $Boxer (struct (field i32)))

;; All sets look like this:
(struct.set $Boxer
  ..
)

;; All gets look like this:
(struct.get $Boxer 0
  ..
)

In general, however, we cannot do this. First, we'd have to see that only struct.new flows into Boxer's field, because otherwise there can be other references to the Box object, that potentially change the value of the field if it is mutable. But say that the field is immutable - we still may not wish to optimize here, because if we don't see struct.new in all sets then we'd have this situation:

(type $Box (struct (field i32)))

(type $Boxer (struct (field (ref $Box))))

;; All sets look like this:
(struct.set $Boxer
  ..
)

;; All gets look like this:
(struct.get $Box 0
  (struct.get $Boxer 0
    ..
  )
)

;; => optimize that to this: =>

(type $Box (struct (field i32)))

(type $Boxer (struct (field i32)))

;; All sets look like this:
(struct.set $Boxer
  (struct.get $Box 0
    ..
  )
)

;; All gets look like this:
(struct.get $Box 0
  ..
)

What we do here is effectively "pull" the struct.get $Box from the reads of Boxer's fields into the writes. That allows us to store only an i32 instead of a reference, which may be useful in itself, but whether this is actually faster depends on how many reads we have vs. writes, which is something a compile-time optimizer can't see. If we have far more writes than reads then we'd be making the code slower.

Clone this wiki locally