Commit c456b09

Add comments
1 parent 85a2f69 commit c456b09

2 files changed: 8 additions & 2 deletions

mlir/lib/Dialect/GPU/Transforms/SubgroupReduceLowering.cpp

Lines changed: 5 additions & 2 deletions
@@ -144,8 +144,11 @@ struct ScalarizeSingleElementReduce final
 };

 /// Emits a subgroup reduction using a sequence of shuffles. Uses the `packFn`
-/// and `unpackFn` to convert to/from the native shuffle type. Assumes that the
-/// subgroup is `subgroupSize` lanes wide and reduces across all of them.
+/// and `unpackFn` to convert to the native shuffle type and to the reduction
+/// type, respectively. For example, with `input` of type `f16`, `packFn` could
+/// build ops to cast the value to `i32` to perform shuffles, while `unpackFn`
+/// would cast it back to `f16` to perform arithmetic reduction on. Assumes that
+/// the subgroup is `subgroupSize` lanes wide and reduces across all of them.
 static Value createSubgroupShuffleReduction(
     OpBuilder &builder, Location loc, Value input, gpu::AllReduceOperation mode,
     unsigned subgroupSize, function_ref<Value(Value)> packFn,
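
The pack/shuffle/unpack/reduce flow described in the new doc comment can be illustrated with a host-side model. The following is only a sketch and is not taken from the commit: `simulateShuffleReduce`, `packFloat`, and `unpackFloat` are hypothetical names, each vector element stands in for one lane, a `float` to `uint32_t` bit copy stands in for the `f16` to `i32` cast, and an XOR (butterfly) shuffle with a power-of-two subgroup size is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for `packFn`: copy the bits of the reduction type
// (float here) into the "native shuffle type" (a 32-bit integer here).
inline std::uint32_t packFloat(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Hypothetical stand-in for `unpackFn`: convert back to the reduction type.
inline float unpackFloat(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof(value));
  return value;
}

// Host-side model of a shuffle-based subgroup reduction. `lanes[i]` is the
// value held by lane i; lanes.size() plays the role of `subgroupSize` and is
// assumed to be a power of two.
template <typename T, typename PackFn, typename UnpackFn, typename ReduceFn>
std::vector<T> simulateShuffleReduce(std::vector<T> lanes, PackFn packFn,
                                     UnpackFn unpackFn, ReduceFn reduce) {
  const std::size_t subgroupSize = lanes.size();
  for (std::size_t offset = 1; offset < subgroupSize; offset <<= 1) {
    std::vector<T> next(subgroupSize);
    for (std::size_t lane = 0; lane < subgroupSize; ++lane) {
      // The partner lane's value travels in the packed (shuffle) type...
      auto shuffled = packFn(lanes[lane ^ offset]);
      // ...and is unpacked before the arithmetic reduction is applied.
      next[lane] = reduce(lanes[lane], unpackFn(shuffled));
    }
    lanes = std::move(next);
  }
  return lanes; // after log2(subgroupSize) steps every lane holds the result
}
```

Calling `simulateShuffleReduce(std::vector<float>{1, 2, 3, 4}, packFloat, unpackFloat, add)` with an addition lambda for `add` leaves the full sum in every lane, which is the "reduces across all of them" behaviour the comment describes.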

mlir/test/lib/Dialect/GPU/TestGpuRewrite.cpp

Lines changed: 3 additions & 0 deletions
@@ -72,6 +72,9 @@ struct TestGpuSubgroupReduceLoweringPass

   void runOnOperation() override {
     RewritePatternSet patterns(&getContext());
+
+    // Since both pattern sets match on the same ops, set higher benefit to
+    // perform fewer failing matches.
     populateGpuBreakDownSubgrupReducePatterns(patterns,
                                               /*maxShuffleBitwidth=*/32,
                                               PatternBenefit(2));
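
The reasoning in the added comment, that a higher benefit means fewer failing matches, can be sketched with a toy model. This is not MLIR's pattern driver: `ToyPattern`, `countFailedMatches`, and the bitwidth predicates are hypothetical. The only behaviour assumed is that patterns with a higher benefit are tried first and that registration order breaks ties.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a rewrite pattern: a name, a benefit, and a hypothetical
// match predicate that looks only at an operand bitwidth.
struct ToyPattern {
  std::string name;
  int benefit;
  std::function<bool(int)> matches;
};

// Tries patterns in order of decreasing benefit (registration order breaks
// ties) and returns how many match attempts failed before the first success.
inline int countFailedMatches(std::vector<ToyPattern> patterns, int bitwidth) {
  std::stable_sort(patterns.begin(), patterns.end(),
                   [](const ToyPattern &a, const ToyPattern &b) {
                     return a.benefit > b.benefit;
                   });
  int failed = 0;
  for (const ToyPattern &pattern : patterns) {
    if (pattern.matches(bitwidth))
      break;
    ++failed;
  }
  return failed;
}
```

In this model, if a "lower" pattern that only accepts widths up to 32 is registered before a "break down" pattern that accepts wider values, a 64-bit operand costs one failed match when both have the same benefit and none once the "break down" pattern is given benefit 2.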
