[MLIR][AMDGPU] Add a wrapper for global LDS load intrinsics in AMDGPU #133498

Merged (17 commits) on Apr 8, 2025

lialan (Member) commented Mar 28, 2025

Defining a new amdgpu.global_load op, which is a thin wrapper around the ROCDL global_load_lds intrinsic, along with its lowering logic to rocdl.global.load.lds.

lialan force-pushed the users/lialan/global_load_lds branch from f271c01 to 92a1ef9 on March 28, 2025 20:56
lialan requested review from krzysz00 and kuhar on April 1, 2025 22:39
github-actions bot commented Apr 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

lialan requested a review from krzysz00 on April 2, 2025 18:13
lialan requested a review from krzysz00 on April 3, 2025 01:00
lialan marked this pull request as ready for review on April 3, 2025 02:30
krzysz00 (Contributor) left a comment

Looks good minus doc wording and formatting nits

kuhar (Member) left a comment

LGTM % nits

krzysz00 (Contributor) commented Apr 7, 2025

Wanted to flag #133015 landing for a future PR

lialan merged commit dae0ef5 into llvm:main on Apr 8, 2025
10 of 11 checks passed
lialan deleted the users/lialan/global_load_lds branch on April 8, 2025 13:18
llvm-ci (Collaborator) commented Apr 8, 2025

LLVM Buildbot has detected a new failure on builder amdgpu-offload-ubuntu-22-cmake-build-only running on rocm-docker-ubu-22 while building mlir at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/203/builds/6999

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure)
...
[6819/7736] Linking CXX shared library lib/libFortranSemantics.so.21.0git
[6820/7736] Creating library symlink lib/libFortranSemantics.so
[6821/7736] Linking CXX executable tools/flang/unittests/Evaluate/logical.test
[6822/7736] Linking CXX executable tools/flang/unittests/Evaluate/integer.test
[6823/7736] Linking CXX executable tools/flang/unittests/Evaluate/real.test
[6824/7736] Building CXX object tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o
[6825/7736] Linking CXX executable tools/flang/unittests/Evaluate/intrinsics.test
[6826/7736] Linking CXX executable tools/flang/unittests/Evaluate/folding.test
[6827/7736] Linking CXX executable tools/flang/unittests/Evaluate/expression.test
[6828/7736] Linking CXX shared library lib/libMLIRAMDGPUDialect.so.21.0git
FAILED: lib/libMLIRAMDGPUDialect.so.21.0git 
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wundef -Wno-unused-but-set-parameter -Wno-deprecated-copy -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete   -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/./lib  -Wl,--gc-sections -shared -Wl,-soname,libMLIRAMDGPUDialect.so.21.0git -o lib/libMLIRAMDGPUDialect.so.21.0git tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib:"  lib/libMLIRROCDLDialect.so.21.0git  lib/libMLIRGPUDialect.so.21.0git  lib/libMLIRLLVMDialect.so.21.0git  lib/libLLVMBitWriter.so.21.0git  lib/libLLVMBitReader.so.21.0git  lib/libLLVMAsmParser.so.21.0git  lib/libLLVMCore.so.21.0git  lib/libLLVMBinaryFormat.so.21.0git  lib/libMLIRDLTIDialect.so.21.0git  lib/libMLIRMemRefDialect.so.21.0git  lib/libMLIRMemorySlotInterfaces.so.21.0git  lib/libMLIRArithUtils.so.21.0git  lib/libMLIRComplexDialect.so.21.0git  lib/libMLIRArithDialect.so.21.0git  lib/libMLIRCastInterfaces.so.21.0git  lib/libMLIRInferIntRangeCommon.so.21.0git  lib/libMLIRShapedOpInterfaces.so.21.0git  lib/libMLIRUBDialect.so.21.0git  lib/libMLIRDialect.so.21.0git  lib/libMLIRDialectUtils.so.21.0git  lib/libMLIRValueBoundsOpInterface.so.21.0git  lib/libMLIRAnalysis.so.21.0git  lib/libMLIRSideEffectInterfaces.so.21.0git  lib/libMLIRInferIntRangeInterface.so.21.0git  lib/libMLIRInferTypeOpInterface.so.21.0git  lib/libMLIRControlFlowInterfaces.so.21.0git  
lib/libMLIRDataLayoutInterfaces.so.21.0git  lib/libMLIRLoopLikeInterface.so.21.0git  lib/libMLIRFunctionInterfaces.so.21.0git  lib/libMLIRCallInterfaces.so.21.0git  lib/libMLIRPresburger.so.21.0git  lib/libMLIRDestinationStyleOpInterface.so.21.0git  lib/libMLIRViewLikeInterface.so.21.0git  lib/libMLIRIR.so.21.0git  lib/libMLIRSupport.so.21.0git  lib/libLLVMSupport.so.21.0git  -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/lib && :
/usr/bin/ld: tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o: in function `mlir::amdgpu::GatherToLDSOp::verify() [clone .localalias]':
AMDGPUDialect.cpp:(.text._ZN4mlir6amdgpu13GatherToLDSOp6verifyEv+0xa0): undefined reference to `mlir::memref::isStaticShapeAndContiguousRowMajor(mlir::MemRefType)'
collect2: error: ld returned 1 exit status
[6829/7736] Building InstCombineTables.inc...
ninja: build stopped: subcommand failed.
['ninja'] exited with return code 1.
The build step threw an exception...
Traceback (most recent call last):
  File "/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py", line 50, in step
    yield
  File "/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py", line 41, in main
    run_command(["ninja"])
  File "/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/build/../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py", line 63, in run_command
    util.report_run_cmd(cmd, cwd=directory)
  File "/home/botworker/bbot/amdgpu-offload-ubuntu-22-cmake-build-only/llvm-zorg/zorg/buildbot/builders/annotated/util.py", line 49, in report_run_cmd
    subprocess.check_call(cmd, shell=shell, *args, **kwargs)
  File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ninja']' returned non-zero exit status 1.
@@@STEP_FAILURE@@@
Step 7 (build cmake config) failure: build cmake config (failure)
(Build log identical to the step 4 log above.)
program finished with exit code 0
elapsedTime=51.452654
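
For readers skimming the log above, the actionable part is the pair of ld lines near the end: the function that references the symbol and the symbol that could not be resolved. Because the link command passes -Wl,-z,defs, any symbol left unresolved when linking the shared library is a hard error, which typically points at a missing link dependency. The following is a minimal Python sketch, not part of the PR or of the buildbot tooling, that pulls those pairs out of a long ninja log. The helper name undefined_references and the embedded sample are illustrative assumptions, and the regexes only target the GNU ld message format seen here.

```python
import re

# Sample lines copied from the buildbot log above (assumption: GNU ld wording).
LOG = """\
/usr/bin/ld: tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o: in function `mlir::amdgpu::GatherToLDSOp::verify() [clone .localalias]':
AMDGPUDialect.cpp:(.text._ZN4mlir6amdgpu13GatherToLDSOp6verifyEv+0xa0): undefined reference to `mlir::memref::isStaticShapeAndContiguousRowMajor(mlir::MemRefType)'
collect2: error: ld returned 1 exit status
"""

# Some ld versions print "In function" with a capital I, so match case-insensitively.
CALLER_RE = re.compile(r"in function `(?P<caller>[^']+)'", re.IGNORECASE)
UNDEF_RE = re.compile(r"undefined reference to `(?P<symbol>[^']+)'")


def undefined_references(log_text):
    """Return (caller, missing_symbol) pairs found in GNU ld output.

    Hypothetical helper: `caller` is the most recent "in function" line
    seen before the undefined reference, or None if there was none.
    """
    results = []
    caller = None
    for line in log_text.splitlines():
        match = CALLER_RE.search(line)
        if match:
            caller = match.group("caller")
            continue
        match = UNDEF_RE.search(line)
        if match:
            results.append((caller, match.group("symbol")))
    return results


for caller, symbol in undefined_references(LOG):
    print(f"{caller} needs {symbol}")
```

Running this over a full log reduces several hundred lines of ninja output to the one fact that matters here: GatherToLDSOp::verify() references mlir::memref::isStaticShapeAndContiguousRowMajor, and no library on the link line provides it.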

llvm-ci (Collaborator) commented Apr 8, 2025

LLVM Buildbot has detected a new failure on builder amdgpu-offload-rhel-8-cmake-build-only running on rocm-docker-rhel-8 while building mlir at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/204/builds/5812

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure)
...
[7150/7736] Building CXX object tools/clang/tools/clang-linker-wrapper/CMakeFiles/clang-linker-wrapper.dir/ClangLinkerWrapper.cpp.o
[7151/7736] Building CXX object tools/clang/tools/clang-nvlink-wrapper/CMakeFiles/clang-nvlink-wrapper.dir/ClangNVLinkWrapper.cpp.o
[7152/7736] Building CXX object tools/clang/tools/clang-sycl-linker/CMakeFiles/clang-sycl-linker.dir/ClangSYCLLinker.cpp.o
[7153/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/BuildSystem.cpp.o
[7154/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexCXX.cpp.o
[7155/7736] Linking CXX shared library lib/libLLVMAMDGPUUtils.so.21.0git
[7156/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXCompilationDatabase.cpp.o
[7157/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXLoadedDiagnostic.cpp.o
[7158/7736] Creating library symlink lib/libLLVMAMDGPUUtils.so
[7159/7736] Linking CXX shared library lib/libMLIRAMDGPUDialect.so.21.0git
FAILED: lib/libMLIRAMDGPUDialect.so.21.0git 
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -Wundef -Wno-unused-but-set-parameter -Wno-deprecated-copy -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete   -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/./lib  -Wl,--gc-sections -shared -Wl,-soname,libMLIRAMDGPUDialect.so.21.0git -o lib/libMLIRAMDGPUDialect.so.21.0git tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/lib:"  lib/libMLIRROCDLDialect.so.21.0git  lib/libMLIRGPUDialect.so.21.0git  lib/libMLIRLLVMDialect.so.21.0git  lib/libLLVMBitWriter.so.21.0git  lib/libLLVMBitReader.so.21.0git  lib/libLLVMAsmParser.so.21.0git  lib/libLLVMCore.so.21.0git  lib/libLLVMBinaryFormat.so.21.0git  lib/libMLIRDLTIDialect.so.21.0git  lib/libMLIRMemRefDialect.so.21.0git  lib/libMLIRMemorySlotInterfaces.so.21.0git  lib/libMLIRArithUtils.so.21.0git  lib/libMLIRComplexDialect.so.21.0git  lib/libMLIRArithDialect.so.21.0git  lib/libMLIRCastInterfaces.so.21.0git  lib/libMLIRInferIntRangeCommon.so.21.0git  lib/libMLIRShapedOpInterfaces.so.21.0git  lib/libMLIRUBDialect.so.21.0git  lib/libMLIRDialect.so.21.0git  lib/libMLIRDialectUtils.so.21.0git  lib/libMLIRValueBoundsOpInterface.so.21.0git  lib/libMLIRAnalysis.so.21.0git  lib/libMLIRSideEffectInterfaces.so.21.0git  lib/libMLIRInferIntRangeInterface.so.21.0git  lib/libMLIRInferTypeOpInterface.so.21.0git  lib/libMLIRControlFlowInterfaces.so.21.0git  lib/libMLIRDataLayoutInterfaces.so.21.0git  lib/libMLIRLoopLikeInterface.so.21.0git  
lib/libMLIRFunctionInterfaces.so.21.0git  lib/libMLIRCallInterfaces.so.21.0git  lib/libMLIRPresburger.so.21.0git  lib/libMLIRDestinationStyleOpInterface.so.21.0git  lib/libMLIRViewLikeInterface.so.21.0git  lib/libMLIRIR.so.21.0git  lib/libMLIRSupport.so.21.0git  lib/libLLVMSupport.so.21.0git  -lpthread  -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-8-cmake-build-only/build/lib && :
tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o: In function `mlir::amdgpu::GatherToLDSOp::verify() [clone .localalias.308]':
AMDGPUDialect.cpp:(.text._ZN4mlir6amdgpu13GatherToLDSOp6verifyEv+0xb6): undefined reference to `mlir::memref::isStaticShapeAndContiguousRowMajor(mlir::MemRefType)'
collect2: error: ld returned 1 exit status
[7160/7736] Building CXX object tools/clang/tools/driver/CMakeFiles/clang.dir/cc1gen_reproducer_main.cpp.o
[7161/7736] Linking CXX shared library lib/libLLVMAMDGPUDesc.so.21.0git
[7162/7736] Building CXX object tools/clang/tools/driver/CMakeFiles/clang.dir/cc1as_main.cpp.o
[7163/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexer.cpp.o
[7164/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexDiagnostic.cpp.o
[7165/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXComment.cpp.o
[7166/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexUSRs.cpp.o
[7167/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXStoredDiagnostic.cpp.o
[7168/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXSourceLocation.cpp.o
[7169/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexInclusionStack.cpp.o
[7170/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXCursor.cpp.o
[7171/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXString.cpp.o
[7172/7736] Building CXX object tools/clang/tools/driver/CMakeFiles/clang.dir/driver.cpp.o
[7173/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexHigh.cpp.o
[7174/7736] Building CXX object tools/clang/tools/clang-extdef-mapping/CMakeFiles/clang-extdef-mapping.dir/ClangExtDefMapGen.cpp.o
[7175/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXType.cpp.o
[7176/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndexCodeCompletion.cpp.o
[7177/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXIndexDataConsumer.cpp.o
[7178/7736] Building CXX object tools/clang/tools/clang-check/CMakeFiles/clang-check.dir/ClangCheck.cpp.o
[7179/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CIndex.cpp.o
[7180/7736] Building CXX object tools/clang/tools/c-index-test/CMakeFiles/c-index-test.dir/core_main.cpp.o
[7181/7736] Building CXX object tools/clang/tools/clang-repl/CMakeFiles/clang-repl.dir/ClangRepl.cpp.o
[7182/7736] Building CXX object tools/clang/tools/driver/CMakeFiles/clang.dir/cc1_main.cpp.o
[7183/7736] Building CXX object tools/clang/tools/clang-scan-deps/CMakeFiles/clang-scan-deps.dir/ClangScanDeps.cpp.o
[7184/7736] Building CXX object tools/clang/tools/libclang/CMakeFiles/libclang.dir/CXExtractAPI.cpp.o
[7185/7736] Building CXX object tools/mlir/tools/mlir-query/CMakeFiles/mlir-query.dir/mlir-query.cpp.o
[7186/7736] Building CXX object tools/mlir/tools/mlir-rewrite/CMakeFiles/mlir-rewrite.dir/mlir-rewrite.cpp.o
[7187/7736] Building CXX object tools/mlir/tools/mlir-lsp-server/CMakeFiles/mlir-lsp-server.dir/mlir-lsp-server.cpp.o
[7188/7736] Building CXX object tools/mlir/lib/CAPI/RegisterEverything/CMakeFiles/obj.MLIRCAPIRegisterEverything.dir/RegisterEverything.cpp.o
[7189/7736] Building CXX object tools/mlir/tools/mlir-opt/CMakeFiles/MLIRMlirOptMain.dir/mlir-opt.cpp.o
[7190/7736] Building CXX object tools/mlir/tools/mlir-opt/CMakeFiles/mlir-opt.dir/mlir-opt.cpp.o
[7191/7736] Building CXX object tools/mlir/tools/mlir-reduce/CMakeFiles/mlir-reduce.dir/mlir-reduce.cpp.o
[7192/7736] Building CXX object tools/mlir/examples/transform-opt/CMakeFiles/mlir-transform-opt.dir/mlir-transform-opt.cpp.o
ninja: build stopped: subcommand failed.
Step 7 (build cmake config) failure: build cmake config (failure)
(Build log identical to the step 4 log above.)
ninja: build stopped: subcommand failed.
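
The section name in the linker message, .text._ZN4mlir6amdgpu13GatherToLDSOp6verifyEv, carries the Itanium C++ ABI mangled form of the function named on the preceding line. As a rough illustration of how the two relate, the sketch below decodes only the simplest case: a nested name made of length-prefixed identifiers. The function demangle_nested_name is a hypothetical helper written for this note; for real work a full demangler such as c++filt or llvm-cxxfilt is the appropriate tool.

```python
def demangle_nested_name(mangled):
    """Decode the simple `_ZN<len><id>...E` form of an Itanium-mangled name.

    Illustrative only: templates, substitutions, operators, and the
    parameter types that follow the closing `E` are not handled.
    """
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a nested Itanium-mangled name")

    pos = len(prefix)
    parts = []
    while pos < len(mangled) and mangled[pos] != "E":
        # Each component is a decimal length followed by that many characters.
        start = pos
        while pos < len(mangled) and mangled[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError(f"unsupported construct at offset {pos}")
        length = int(mangled[start:pos])
        if pos + length > len(mangled):
            raise ValueError("truncated identifier")
        parts.append(mangled[pos:pos + length])
        pos += length
    return "::".join(parts)


# Symbol taken from the section name in the log above.
print(demangle_nested_name("_ZN4mlir6amdgpu13GatherToLDSOp6verifyEv"))
```

The trailing `v` after `E` encodes an empty (void) parameter list, which is why the linker prints the function as verify().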

llvm-ci (Collaborator) commented Apr 8, 2025

LLVM Buildbot has detected a new failure on builder amdgpu-offload-rhel-9-cmake-build-only running on rocm-docker-rhel-9 while building mlir at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/205/builds/5790

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: '../llvm-zorg/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py --jobs=32' (failure)
...
[6820/7736] Creating library symlink lib/libFortranSemantics.so
[6821/7736] Linking CXX executable tools/flang/unittests/Evaluate/logical.test
[6822/7736] Linking CXX executable tools/flang/unittests/Evaluate/real.test
[6823/7736] Linking CXX executable tools/flang/unittests/Evaluate/integer.test
[6824/7736] Linking CXX executable tools/flang/unittests/Evaluate/intrinsics.test
[6825/7736] Linking CXX executable tools/flang/unittests/Evaluate/folding.test
[6826/7736] Linking CXX executable tools/flang/unittests/Evaluate/expression.test
[6827/7736] Building CXX object tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o
[6828/7736] Building InstCombineTables.inc...
[6829/7736] Linking CXX shared library lib/libMLIRAMDGPUDialect.so.21.0git
FAILED: lib/libMLIRAMDGPUDialect.so.21.0git 
: && /usr/bin/c++ -fPIC -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-uninitialized -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -Wundef -Wno-unused-but-set-parameter -Wno-deprecated-copy -O3 -DNDEBUG  -Wl,-z,defs -Wl,-z,nodelete   -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/./lib  -Wl,--gc-sections -shared -Wl,-soname,libMLIRAMDGPUDialect.so.21.0git -o lib/libMLIRAMDGPUDialect.so.21.0git tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o  -Wl,-rpath,"\$ORIGIN/../lib:/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/lib:"  lib/libMLIRROCDLDialect.so.21.0git  lib/libMLIRGPUDialect.so.21.0git  lib/libMLIRLLVMDialect.so.21.0git  lib/libLLVMBitWriter.so.21.0git  lib/libLLVMBitReader.so.21.0git  lib/libLLVMAsmParser.so.21.0git  lib/libLLVMCore.so.21.0git  lib/libLLVMBinaryFormat.so.21.0git  lib/libMLIRDLTIDialect.so.21.0git  lib/libMLIRMemRefDialect.so.21.0git  lib/libMLIRMemorySlotInterfaces.so.21.0git  lib/libMLIRArithUtils.so.21.0git  lib/libMLIRComplexDialect.so.21.0git  lib/libMLIRArithDialect.so.21.0git  lib/libMLIRCastInterfaces.so.21.0git  lib/libMLIRInferIntRangeCommon.so.21.0git  lib/libMLIRShapedOpInterfaces.so.21.0git  lib/libMLIRUBDialect.so.21.0git  lib/libMLIRDialect.so.21.0git  lib/libMLIRDialectUtils.so.21.0git  lib/libMLIRValueBoundsOpInterface.so.21.0git  lib/libMLIRAnalysis.so.21.0git  lib/libMLIRSideEffectInterfaces.so.21.0git  lib/libMLIRInferIntRangeInterface.so.21.0git  lib/libMLIRInferTypeOpInterface.so.21.0git  lib/libMLIRControlFlowInterfaces.so.21.0git  
lib/libMLIRDataLayoutInterfaces.so.21.0git  lib/libMLIRLoopLikeInterface.so.21.0git  lib/libMLIRFunctionInterfaces.so.21.0git  lib/libMLIRCallInterfaces.so.21.0git  lib/libMLIRPresburger.so.21.0git  lib/libMLIRDestinationStyleOpInterface.so.21.0git  lib/libMLIRViewLikeInterface.so.21.0git  lib/libMLIRIR.so.21.0git  lib/libMLIRSupport.so.21.0git  lib/libLLVMSupport.so.21.0git  -Wl,-rpath-link,/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/build/lib && :
/usr/bin/ld: tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o: in function `mlir::amdgpu::GatherToLDSOp::verify() [clone .localalias]':
AMDGPUDialect.cpp:(.text._ZN4mlir6amdgpu13GatherToLDSOp6verifyEv+0x8a): undefined reference to `mlir::memref::isStaticShapeAndContiguousRowMajor(mlir::MemRefType)'
collect2: error: ld returned 1 exit status
[6830/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAliasAnalysis.cpp.o
[6831/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUFrameLowering.cpp.o
[6832/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUCtorDtorLowering.cpp.o
[6833/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUExportKernelRuntimeHandles.cpp.o
[6834/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULibFunc.cpp.o
[6835/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAsanInstrumentation.cpp.o
[6836/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAnnotateUniformValues.cpp.o
[6837/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInstrInfo.cpp.o
[6838/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULowerKernelAttributes.cpp.o
[6839/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUArgumentUsageInfo.cpp.o
[6840/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUGlobalISelUtils.cpp.o
[6841/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInsertDelayAlu.cpp.o
[6842/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUHSAMetadataStreamer.cpp.o
[6843/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUImageIntrinsicOptimizer.cpp.o
[6844/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUISelLowering.cpp.o
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp: In member function ‘llvm::SDValue llvm::AMDGPUTargetLowering::lowerFEXP10Unsafe(llvm::SDValue, const llvm::SDLoc&, llvm::SelectionDAG&, llvm::SDNodeFlags) const’:
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp:2974: warning: enumerated mismatch in conditional expression: ‘llvm::AMDGPUISD::NodeType’ vs ‘llvm::ISD::NodeType’ [-Wenum-compare]
 2974 |   const unsigned Exp2Op = VT == MVT::f32 ? AMDGPUISD::EXP : ISD::FEXP2;
      | 
[6845/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUGlobalISelDivergenceLowering.cpp.o
[6846/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAnnotateKernelFeatures.cpp.o
[6847/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAlwaysInlinePass.cpp.o
[6848/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUInstructionSelector.cpp.o
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp: In member function ‘bool llvm::AMDGPUInstructionSelector::selectG_TRUNC(llvm::MachineInstr&) const’:
/home/botworker/bbot/amdgpu-offload-rhel-9-cmake-build-only/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:2531: warning: enumerated and non-enumerated type in conditional expression [-Wextra]
 2531 |         DstSize < 32 ? AMDGPU::sub0 : TRI.getSubRegFromChannel(0, DstSize / 32);
      | 
[6849/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUCodeGenPrepare.cpp.o
[6850/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUCallLowering.cpp.o
[6851/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAtomicOptimizer.cpp.o
[6852/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPULegalizerInfo.cpp.o
[6853/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAsmPrinter.cpp.o
[6854/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUAttributor.cpp.o
[6855/7736] Building CXX object lib/Target/AMDGPU/CMakeFiles/LLVMAMDGPUCodeGen.dir/AMDGPUExportClustering.cpp.o
Step 7 (build cmake config) failure: build cmake config (failure)
...
[6820/7736] Creating library symlink lib/libFortranSemantics.so
[6821/7736] Linking CXX executable tools/flang/unittests/Evaluate/logical.test
[6822/7736] Linking CXX executable tools/flang/unittests/Evaluate/real.test
[6823/7736] Linking CXX executable tools/flang/unittests/Evaluate/integer.test
[6824/7736] Linking CXX executable tools/flang/unittests/Evaluate/intrinsics.test
[6825/7736] Linking CXX executable tools/flang/unittests/Evaluate/folding.test
[6826/7736] Linking CXX executable tools/flang/unittests/Evaluate/expression.test
[6827/7736] Building CXX object tools/mlir/lib/Dialect/AMDGPU/IR/CMakeFiles/obj.MLIRAMDGPUDialect.dir/AMDGPUDialect.cpp.o
[6828/7736] Building InstCombineTables.inc...
[6829/7736] Linking CXX shared library lib/libMLIRAMDGPUDialect.so.21.0git
FAILED: lib/libMLIRAMDGPUDialect.so.21.0git 

@jplehr
Contributor

jplehr commented Apr 8, 2025

To me, this looks like some missed dependency.
Can you please take a look?

Comment on lines +1038 to +1040
} else {
return transferType.getIntOrFloatBitWidth() / 8;
}
Member


@@ -15,6 +15,7 @@
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "mlir/Dialect/LLVMIR/ROCDLDialect.h"
#include "mlir/Dialect/MemRef/Utils/MemRefUtils.h"
Contributor


Need to link the MLIRMemRefUtils library in CMake to fix the buildbot failure: undefined reference to `mlir::memref::isStaticShapeAndContiguousRowMajor(mlir::MemRefType)`

Member Author


Fix up here: #134862

lialan added a commit to lialan/llvm-project that referenced this pull request Apr 8, 2025
lialan added a commit that referenced this pull request Apr 8, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Apr 8, 2025
lialan added a commit to iree-org/iree that referenced this pull request May 28, 2025
## Summary
This PR lays the foundation for using the `global_load_lds` instruction to
load values from global memory into LDS memory. The pipeline is as follows:
* Only convert `linalg.copy` ops emitted in `PromoteGPUMatMulOperands`. When
appropriate, attach a different attribute
(`#iree_gpu.use_global_load_dma`) to the `linalg.copy` to tag it along the
pipeline.
* A tagged `linalg.copy` is not decomposed or tiled until bufferization.
* After distribution to threads and bufferization, the tagged
`linalg.copy` is lowered to a code sequence built around the
subgroup-coalesced load op `iree_gpu.global_load_dma`.
* `iree_gpu.global_load_dma` is mapped to the `amdgpu.gather_to_lds`
op, which is in turn mapped to the corresponding ROCDL op.
* Disable the pass that pads to reduce bank conflicts, because the
destination workgroup memory has to be contiguous.

## Lowering `linalg.copy`
After bufferization and distribution to threads, the tagged `linalg.copy`
still exists in the IR:
```
linalg.copy {lowering_config = #iree_gpu.use_global_load_dma}
  ins(%subview_12 : memref<64x128xi8, strided<[256, 1], offset: ?>, #amdgpu.address_space<fat_raw_buffer>>)
  outs(%alloc_4 : memref<64x128xi8, #gpu.address_space<workgroup>>)
```

Note that this `linalg.copy` is kept in the thread's code. The op itself
is then converted into a `for` loop, in which each subgroup of threads loads
a coalesced chunk of values. For example, assume there are N subgroups
loading from `tensor<a x b x c>`:
* The `i`-th subgroup loads a sub-tensor of size `[a/N, b, c]`, so
each slice is consecutive.
	* At this moment, assume row-major layout and tile only the outermost dim.
* The reason we currently handle only `linalg.copy` ops emitted by
`GPUPromoteMatmulOperands` is that we know the destination is allocated
contiguously.
	* TODO: expand to any memref slices.
* Given `gpu.subgroup_id` and `gpu.lane_id`, each thread calculates the
consecutive data chunk that its subgroup is responsible for loading:
	* The chunk indices are the delinearized indices of the input tensor,
from:
		* `affine.delinearize_index[gpu.subgroup_id * (num_elems_of(tensor) /
num_subgroups)]`, to
		* `affine.delinearize_index[(gpu.subgroup_id + 1) *
(num_elems_of(tensor) / num_subgroups) - 1]`
* Assume each subgroup loads `n` values from the linearized index range `[N_f,
N_b]`; then the thread with lane id `i` will try to load: `iter = 0 to n :
N_f + subgroup_size * iter + (i - 1)`.

This is then converted to something like the following (in the
example, assume `workgroup size = 256`, `subgroup_size = 64`, loading
`64x128xi8`):
```mlir
scf.for %indvar = %c0 to %c32 step %c1 {
  // Thread-specific gather index into the global (source) buffer.
  %17 = affine.apply affine_map<()[s0, s1, s2] -> (s0 + s1 * 2048 + s2 * 64)>()[%lane_id, %subgroup_id, %indvar]
  %18:2 = affine.delinearize_index %17 into (128, 64) : index, index
  // This iteration's base store index in the destination buffer.
  %19 = affine.apply affine_map<()[s0, s1] -> (s0 * 2048 + s1 * 64)>()[%subgroup_id, %indvar]
  %20:2 = affine.delinearize_index %19 into (128, 64) : index, index
  iree_gpu.global_load_dma %subview_13[%18#0, %18#1] -> %alloc_5[%20#0, %20#1] : memref<128x64xi8, strided<[256, 1], offset: ?>, #amdgpu.address_space<fat_raw_buffer>> -> memref<128x64xi8, #gpu.address_space<workgroup>>
}
// If there are residual elements (subgroup_copy_region_size % subgroup_size != 0), copy them here.
gpu.barrier
```
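
The index arithmetic in the loop above can be checked with a small host-side sketch. This is only an illustration under stated assumptions, not code from IREE or MLIR: `CopyConfig`, `elemsPerSubgroup`, `tripCount`, `gatherIndex`, `storeBaseIndex`, and `delinearize` are hypothetical names, the buffer is assumed to be 2-D and row-major, the element count is assumed to divide evenly among subgroups, and the lane id is used directly as the offset, as in the IR.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical description of one copy; field names are illustrative only.
struct CopyConfig {
  int64_t rows;          // outer dimension of the copied buffer
  int64_t cols;          // inner (contiguous) dimension
  int64_t numSubgroups;  // workgroup size / subgroup size
  int64_t subgroupSize;  // lanes per subgroup
};

// Number of elements each subgroup is responsible for.
inline int64_t elemsPerSubgroup(const CopyConfig &c) {
  assert((c.rows * c.cols) % c.numSubgroups == 0 && "assumes even split");
  return (c.rows * c.cols) / c.numSubgroups;
}

// Loop trip count: each iteration moves one element per lane.
inline int64_t tripCount(const CopyConfig &c) {
  return elemsPerSubgroup(c) / c.subgroupSize;
}

// Linearized source index for one lane in one iteration; with the example
// constants this is s0 + s1 * 2048 + s2 * 64 from the first affine map.
inline int64_t gatherIndex(const CopyConfig &c, int64_t lane,
                           int64_t subgroup, int64_t iter) {
  return lane + subgroup * elemsPerSubgroup(c) + iter * c.subgroupSize;
}

// Linearized base destination index shared by the whole subgroup; with the
// example constants this is s0 * 2048 + s1 * 64 from the second affine map.
inline int64_t storeBaseIndex(const CopyConfig &c, int64_t subgroup,
                              int64_t iter) {
  return subgroup * elemsPerSubgroup(c) + iter * c.subgroupSize;
}

// Row-major delinearization into (row, col).
inline std::pair<int64_t, int64_t> delinearize(const CopyConfig &c,
                                               int64_t linear) {
  return {linear / c.cols, linear % c.cols};
}
```

With `CopyConfig{128, 64, 4, 64}`, which matches the `(128, 64)` delinearization basis and a 256-thread workgroup of 64-lane subgroups, `elemsPerSubgroup` gives 2048 and `tripCount` gives 32, the same constants that appear in the affine maps and the loop bound.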

## Dependent PRs:
* design doc: https://hackmd.io/N0RitxPzT9GPhM0jEPtOCg?view
* upstream changes required: 
  * llvm/llvm-project#133498
  * llvm/llvm-project#136405
  * llvm/llvm-project#137671
  * llvm/llvm-project#137425
  * #20800 (review)

---------

Signed-off-by: Alan Li <[email protected]>
tonykuttai pushed a commit to tonykuttai/llvm-project that referenced this pull request May 29, 2025