Skip to content

[flang][cuda] Update stream type for cuf kernel op #136627

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 22, 2025

Conversation

clementval
Copy link
Contributor

Update the type of the stream operand to be similar to KernelLaunchOp.

@clementval clementval requested a review from wangzpgi April 21, 2025 22:31
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Apr 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 21, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Valentin Clement (バレンタイン クレメン) (clementval)

Changes

Update the type of the stream operand to be similar to KernelLaunchOp.


Full diff: https://github.com/llvm/llvm-project/pull/136627.diff

4 Files Affected:

  • (modified) flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td (+7-12)
  • (modified) flang/lib/Lower/Bridge.cpp (+4-6)
  • (modified) flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp (+1-1)
  • (modified) flang/test/Lower/CUDA/cuda-kernel-loop-directive.cuf (+1-3)
diff --git a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
index 926983d364ed1..46cc59cda1612 100644
--- a/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
+++ b/flang/include/flang/Optimizer/Dialect/CUF/CUFOps.td
@@ -254,24 +254,19 @@ def cuf_KernelOp : cuf_Op<"kernel", [AttrSizedOperandSegments,
     represented by a 0 constant value.
   }];
 
-  let arguments = (ins
-    Variadic<I32>:$grid, // empty means `*`
-    Variadic<I32>:$block, // empty means `*`
-    Optional<I32>:$stream,
-    Variadic<Index>:$lowerbound,
-    Variadic<Index>:$upperbound,
-    Variadic<Index>:$step,
-    OptionalAttr<I64Attr>:$n,
-    Variadic<AnyType>:$reduceOperands,
-    OptionalAttr<ArrayAttr>:$reduceAttrs
-  );
+  let arguments = (ins Variadic<I32>:$grid, // empty means `*`
+      Variadic<I32>:$block,                 // empty means `*`
+      Optional<fir_ReferenceType>:$stream, Variadic<Index>:$lowerbound,
+      Variadic<Index>:$upperbound, Variadic<Index>:$step,
+      OptionalAttr<I64Attr>:$n, Variadic<AnyType>:$reduceOperands,
+      OptionalAttr<ArrayAttr>:$reduceAttrs);
 
   let regions = (region AnyRegion:$region);
 
   let assemblyFormat = [{
     `<` `<` `<` custom<CUFKernelValues>($grid, type($grid)) `,` 
                 custom<CUFKernelValues>($block, type($block))
-        ( `,` `stream` `=` $stream^ )? `>` `>` `>`
+        ( `,` `stream` `=` $stream^ `:` qualified(type($stream)))? `>` `>` `>`
         ( `reduce` `(` $reduceOperands^ `:` type($reduceOperands) `:` $reduceAttrs `)` )?
         custom<CUFKernelLoopControl>($region, $lowerbound, type($lowerbound),
             $upperbound, type($upperbound), $step, type($step))
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 1652a86ed7e63..7b76845b5af05 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -3097,7 +3097,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 
     llvm::SmallVector<mlir::Value> gridValues;
     llvm::SmallVector<mlir::Value> blockValues;
-    mlir::Value streamValue;
+    mlir::Value streamAddr;
 
     if (launchConfig) {
       const std::list<Fortran::parser::CUFKernelDoConstruct::StarOrExpr> &grid =
@@ -3130,10 +3130,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
       }
 
       if (stream)
-        streamValue = builder->createConvert(
-            loc, builder->getI32Type(),
-            fir::getBase(
-                genExprValue(*Fortran::semantics::GetExpr(*stream), stmtCtx)));
+        streamAddr = fir::getBase(
+            genExprAddr(*Fortran::semantics::GetExpr(*stream), stmtCtx));
     }
 
     const auto &outerDoConstruct =
@@ -3267,7 +3265,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
     }
 
     auto op = builder->create<cuf::KernelOp>(
-        loc, gridValues, blockValues, streamValue, lbs, ubs, steps, n,
+        loc, gridValues, blockValues, streamAddr, lbs, ubs, steps, n,
         mlir::ValueRange(reduceOperands), builder->getArrayAttr(reduceAttrs));
     builder->createBlock(&op.getRegion(), op.getRegion().end(), ivTypes,
                          ivLocs);
diff --git a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp
index a86f12c2c4a55..24033bc15b8eb 100644
--- a/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp
+++ b/flang/lib/Optimizer/Dialect/CUF/CUFOps.cpp
@@ -271,7 +271,7 @@ llvm::LogicalResult cuf::KernelOp::verify() {
         return emitOpError("expect reduce attributes to be ReduceAttr");
     }
   }
-  return mlir::success();
+  return checkStreamType(*this);
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/flang/test/Lower/CUDA/cuda-kernel-loop-directive.cuf b/flang/test/Lower/CUDA/cuda-kernel-loop-directive.cuf
index 0fceb292f10d2..10f0b9e3d1215 100644
--- a/flang/test/Lower/CUDA/cuda-kernel-loop-directive.cuf
+++ b/flang/test/Lower/CUDA/cuda-kernel-loop-directive.cuf
@@ -75,9 +75,7 @@ subroutine sub1()
   end do
 end
 
-! CHECK: %[[STREAM_LOAD:.*]] = fir.load %[[STREAM]]#0 : !fir.ref<i64>
-! CHECK: %[[STREAM_I32:.*]] = fir.convert %[[STREAM_LOAD]] : (i64) -> i32
-! CHECK: cuf.kernel<<<*, *, stream = %[[STREAM_I32]]>>>
+! CHECK: cuf.kernel<<<*, *, stream = %[[STREAM]]#0 : !fir.ref<i64>>>>
 
 
 ! Test lowering with unstructured construct inside.

@clementval clementval merged commit 46e7347 into llvm:main Apr 22, 2025
14 checks passed
@clementval clementval deleted the cuf_kernel_stream2 branch April 22, 2025 02:22
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 22, 2025

LLVM Buildbot has detected a new failure on builder flang-x86_64-windows running on minipc-ryzen-win while building flang at step 8 "test-build-unified-tree-check-flang-rt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/166/builds/1299

Here is the relevant piece of the build log for the reference
Step 8 (test-build-unified-tree-check-flang-rt) failure: test (failure)
******************** TEST 'flang-rt-OldUnit :: Runtime/RuntimeTests.exe' FAILED ********************
[==========] Running 225 tests from 54 test suites.

[----------] Global test environment set-up.

[----------] 4 tests from AllocatableTest

[ RUN      ] AllocatableTest.MoveAlloc

[       OK ] AllocatableTest.MoveAlloc (1 ms)

[ RUN      ] AllocatableTest.AllocateFromScalarSource

[       OK ] AllocatableTest.AllocateFromScalarSource (0 ms)

[ RUN      ] AllocatableTest.AllocateSourceZeroSize

[       OK ] AllocatableTest.AllocateSourceZeroSize (0 ms)

[ RUN      ] AllocatableTest.DoubleAllocation

[       OK ] AllocatableTest.DoubleAllocation (0 ms)

[----------] 4 tests from AllocatableTest (1 ms total)



[----------] 3 tests from ArrayConstructor

[ RUN      ] ArrayConstructor.Basic

[       OK ] ArrayConstructor.Basic (0 ms)

[ RUN      ] ArrayConstructor.Character

[       OK ] ArrayConstructor.Character (0 ms)

[ RUN      ] ArrayConstructor.CharacterRuntimeCheck

[       OK ] ArrayConstructor.CharacterRuntimeCheck (74 ms)

[----------] 3 tests from ArrayConstructor (74 ms total)



[----------] 1 test from BufferTests

[ RUN      ] BufferTests.TestFrameBufferReadAndWrite

[       OK ] BufferTests.TestFrameBufferReadAndWrite (0 ms)
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 22, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve2-vla running on linaro-g4-01 while building flang at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/198/builds/3882

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'LLVM :: tools/llvm-exegesis/RISCV/rvv/filter.test' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=inverse_throughput --opcode-name=PseudoVNCLIPU_WX_M1_MASK     --riscv-filter-config='vtype = {VXRM: rod, AVL: VLMAX, SEW: e(8|16), Policy: ta/mu}' --max-configs-per-opcode=1000 --min-instructions=10 | /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test # RUN: at line 1
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/llvm-exegesis -mtriple=riscv64 -mcpu=sifive-x280 -benchmark-phase=assemble-measured-code --mode=inverse_throughput --opcode-name=PseudoVNCLIPU_WX_M1_MASK '--riscv-filter-config=vtype = {VXRM: rod, AVL: VLMAX, SEW: e(8|16), Policy: ta/mu}' --max-configs-per-opcode=1000 --min-instructions=10
+ /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test
PseudoVNCLIPU_WX_M1_MASK: Failed to produce any snippet via: instruction has tied variables, avoiding Read-After-Write issue, picking random def and use registers not aliasing each other, for uses, one unique register for each position
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/llvm/test/tools/llvm-exegesis/RISCV/rvv/filter.test

--

********************


IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Update the type of the stream operand to be similar to KernelLaunchOp.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Update the type of the stream operand to be similar to KernelLaunchOp.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Update the type of the stream operand to be similar to KernelLaunchOp.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants