Remove an incorrect assert in MFMASmallGemmSingleWaveOpt. #130131

Merged
llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp (3 changes: 0 additions & 3 deletions)
@@ -1891,7 +1891,6 @@ class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
}
}

- assert(Cache->size());
Contributor:

Can this return false?
There are other InstructionRule subclasses in MFMASmallGemmSingleWaveOpt that make a similar assert -- can you apply the same change in those places as well?

Contributor (Author):

I didn't think an explicit "return false" would be helpful - if Cache is empty at this point then the loop immediately below will do zero iterations and we'll hit the "return false" immediately below that.

The only other similar assert I found was the one in the next inner class down, MFMASmallGemmSingleWaveOpt::IsPermForDSW. Again this will return false if Cache is empty due to the llvm::any_of() call (see the sketch after this diff).

auto *DAG = SyncPipe[0].DAG;
for (auto &Elt : *Cache) {
if (DAG->IsReachable(Elt, const_cast<SUnit *>(SU)))
@@ -1928,8 +1927,6 @@ class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
return FitsInGroup;
}

- assert(Cache->size());
-
// Does the VALU have a DS_WRITE successor that is the same as other
// VALU already in the group. The V_PERMs will all share 1 DS_W succ
return llvm::any_of(*Cache, [&SU](SUnit *Elt) {
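
To make the author's point above concrete, here is a minimal standalone sketch (not taken from the PR): a range-for over an empty container runs zero iterations, and any_of over an empty range is false by definition, so both rules already yield false for an empty Cache even without the asserts. The names ruleOneShape/ruleTwoShape and the pointer-equality predicate are hypothetical stand-ins, and std::any_of is used so the snippet builds without LLVM headers (llvm::any_of is a range-based wrapper with the same result on an empty range).

// Sketch only, not LLVM code: SUnit is a stand-in struct, and std::any_of
// stands in for llvm::any_of.
#include <algorithm>
#include <iostream>
#include <vector>

struct SUnit {}; // stand-in for the scheduler's SUnit

// Shape of the first rule: scan the cache, return true on a hit, otherwise
// fall through to the trailing "return false".
bool ruleOneShape(const std::vector<SUnit *> &Cache, const SUnit *SU) {
  for (SUnit *Elt : Cache) {
    // Placeholder for the real DAG->IsReachable(Elt, SU) test.
    if (Elt == SU)
      return true;
  }
  return false; // empty cache: the loop body never runs, so we land here
}

// Shape of IsPermForDSW: any_of over an empty range is false by definition.
bool ruleTwoShape(const std::vector<SUnit *> &Cache, const SUnit *SU) {
  return std::any_of(Cache.begin(), Cache.end(),
                     [SU](SUnit *Elt) { return Elt == SU; });
}

int main() {
  std::vector<SUnit *> EmptyCache; // the case the removed asserts rejected
  SUnit SU;
  std::cout << ruleOneShape(EmptyCache, &SU) << ' '
            << ruleTwoShape(EmptyCache, &SU) << '\n'; // prints "0 0"
}

With a non-empty Cache nothing changes; the removed asserts only rejected the empty case, which both paths already answer with false.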
New test file (43 lines added):
@@ -0,0 +1,43 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -O1 < %s | FileCheck -check-prefix=GCN %s

define amdgpu_kernel void @test_iglp_opt_rev_mfma_gemm(<1 x i64> %L1) {
; GCN-LABEL: test_iglp_opt_rev_mfma_gemm:
; GCN: ; %bb.0: ; %entry
; GCN-NEXT: v_mov_b32_e32 v32, 0
; GCN-NEXT: s_load_dwordx2 s[0:1], s[8:9], 0x0
; GCN-NEXT: ds_read_b128 v[28:31], v32 offset:112
; GCN-NEXT: ds_read_b128 v[24:27], v32 offset:96
; GCN-NEXT: ds_read_b128 v[20:23], v32 offset:80
; GCN-NEXT: ds_read_b128 v[16:19], v32 offset:64
; GCN-NEXT: ds_read_b128 v[0:3], v32
; GCN-NEXT: ds_read_b128 v[4:7], v32 offset:16
; GCN-NEXT: ds_read_b128 v[8:11], v32 offset:32
; GCN-NEXT: ds_read_b128 v[12:15], v32 offset:48
; GCN-NEXT: v_mov_b32_e32 v34, 0
; GCN-NEXT: v_mov_b32_e32 v35, v34
; GCN-NEXT: s_waitcnt lgkmcnt(0)
; GCN-NEXT: s_cmp_lg_u64 s[0:1], 0
; GCN-NEXT: ; iglp_opt mask(0x00000001)
; GCN-NEXT: ds_write_b128 v32, v[28:31] offset:112
; GCN-NEXT: ds_write_b128 v32, v[24:27] offset:96
; GCN-NEXT: ds_write_b128 v32, v[20:23] offset:80
; GCN-NEXT: ds_write_b128 v32, v[16:19] offset:64
; GCN-NEXT: ds_write_b128 v32, v[12:15] offset:48
; GCN-NEXT: ds_write_b128 v32, v[8:11] offset:32
; GCN-NEXT: ds_write_b128 v32, v[4:7] offset:16
; GCN-NEXT: ds_write_b128 v32, v[0:3]
; GCN-NEXT: ds_write_b64 v32, v[34:35]
; GCN-NEXT: s_endpgm
entry:
call void @llvm.amdgcn.iglp.opt(i32 1)
%load.4 = load <32 x float>, ptr addrspace(3) null, align 128
%B = urem <1 x i64> zeroinitializer, %L1
store <32 x float> %load.4, ptr addrspace(3) null, align 128
store <1 x i64> %B, ptr addrspace(3) null, align 8
ret void
}

declare void @llvm.amdgcn.iglp.opt(i32 immarg) #0

attributes #0 = { convergent nocallback nofree nounwind willreturn }