Remove an incorrect assert in MFMASmallGemmSingleWaveOpt. #130131

anjenner · 2025-03-06T16:20:29Z

This assert was failing in a fuzzing test. I consulted with @jrbyrnes who said:

The MFMASmallGemmSingleWaveOpt::apply() method is invoked if and only if the user has inserted the intrinsic llvm.amdgcn.iglp.opt(i32 1) into their source code. This intrinsic applies a highly specialized DAG mutation to produce specific scheduling for a specific set of kernels. These assertions are really just confirming that the characteristics of the kernel match what is expected (i.e., the kernels are similar to the ones this DAG mutation strategy was designed against).

However, if we apply this DAG mutation to kernels for which it was not designed, then we may not find the types of instructions we are looking for, and may end up with empty caches.

I think it should be fine to just return false if the cache is empty instead of the assert.

llvmbot · 2025-03-06T16:21:09Z

@llvm/pr-subscribers-backend-amdgpu

Author: None (anjenner)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/130131.diff

1 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp (-1)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp b/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
index bbd262748d680..a284cc0e5af51 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
@@ -1891,7 +1891,6 @@ class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
         }
       }
 
-      assert(Cache->size());
       auto *DAG = SyncPipe[0].DAG;
       for (auto &Elt : *Cache) {
         if (DAG->IsReachable(Elt, const_cast<SUnit *>(SU)))

jrbyrnes · 2025-03-06T17:26:17Z

llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp

@@ -1891,7 +1891,6 @@ class MFMASmallGemmSingleWaveOpt final : public IGLPStrategy {
        }
      }

-      assert(Cache->size());


Can this return false instead?

There are other InstructionRules in MFMASmallGemmSingleWaveOpt that make a similar assert -- can you apply the same change in those places as well?

anjenner · 2025-03-13

I didn't think an explicit "return false" would be helpful: if Cache is empty at this point, the loop immediately below does zero iterations and we hit the "return false" immediately below that.

The only other similar assert I found was the one in the next inner class down, MFMASmallGemmSingleWaveOpt::IsPermForDSW. Again, this will return false if Cache is empty, because of the llvm::any_of() call.
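
The reasoning in this reply can be checked with a small standalone sketch. It is not the code from AMDGPUIGroupLP.cpp: Node, isReachable, anyCachedNodeReaches and anyCachedNodeMatches are hypothetical stand-ins, std::any_of is used in place of llvm::any_of so that the snippet has no LLVM dependency, and the "return true on a match" body of the loop is an assumption, since the diff only shows the `if` line.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for llvm::SUnit; only an ID is needed for the sketch.
struct Node {
  int Id;
};

// Hypothetical stand-in for DAG->IsReachable(Elt, SU).
bool isReachable(const Node &From, const Node &To) { return From.Id < To.Id; }

// Assumed shape of the first rule once the assert is gone: walk the cache and
// fall through to `return false` when nothing matches.
bool anyCachedNodeReaches(const std::vector<Node> &Cache, const Node &SU) {
  for (const Node &Elt : Cache) {
    if (isReachable(Elt, SU))
      return true;
  }
  return false; // Also reached when Cache is empty (zero loop iterations).
}

// Assumed shape of the any_of-based check: any_of over an empty range yields
// false, so no separate emptiness test is required.
bool anyCachedNodeMatches(const std::vector<Node> &Cache, const Node &SU) {
  return std::any_of(Cache.begin(), Cache.end(),
                     [&SU](const Node &Elt) { return isReachable(Elt, SU); });
}
```

With an empty cache both functions return false without any special-casing, which is why removing the assert is enough and no explicit early return has to be added.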

arsenm

Needs tests

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.iglp.AFLCustomIRMutator.opt.ll

anjenner · 2025-03-13T14:33:42Z

> Needs tests

I have added the testcase generated by the fuzzer (I don't know of any other way to reproduce this). I've verified that the test fails when the assert is present and passes now that it has been removed.

arsenm · 2025-03-18T02:59:58Z

> I have added the testcase generated by the fuzzer (I don't know of any other way to reproduce this). I've verified that the test fails when the assert is present and passes now that it has been removed.

This seems to reproduce fine without GlobalISel. The test also needs to check the output and drop other unnecessary bits. You can also try running llvm-reduce on it to find a smaller reproducer.

llvmbot added the backend:AMDGPU label Mar 6, 2025

anjenner requested a review from jrbyrnes March 6, 2025 16:20

jrbyrnes reviewed Mar 6, 2025

arsenm requested changes Mar 7, 2025

Address review feedback.

0a47cb5

arsenm reviewed Mar 13, 2025

Reduce testcase, remove -global-isel, drop Attrs comments and check output.

097fbb7

arsenm approved these changes Apr 22, 2025

Make mtriple change suggested by Matt Arsenault.

26aafac

arsenm approved these changes Apr 23, 2025

anjenner merged commit a3d05e8 into llvm:main Apr 24, 2025
11 checks passed
