Skip to content

[flang][cuda] Change induction variable from i32 to index for doconcurrent inside cuf kernel directive #129924

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 5, 2025

Conversation

wangzpgi
Copy link
Contributor

@wangzpgi wangzpgi commented Mar 5, 2025

Use index instead of i32 for induction variables for doconcurrent inside cuf kernel directive. Regular do loop inside cuf kernel directive also uses index:

cuf.kernel<<<*, *>>> (%arg0 : index) = ...

…e from i32 to index to be the same as regular do loops inside cuf kernel.
@wangzpgi wangzpgi requested a review from clementval March 5, 2025 20:00
@wangzpgi wangzpgi self-assigned this Mar 5, 2025
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Mar 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Zhen Wang (wangzpgi)

Changes

Use index instead of i32 for induction variables for doconcurrent inside cuf kernel directive. Regular do loop inside cuf kernel directive also uses index:

cuf.kernel&lt;&lt;&lt;*, *&gt;&gt;&gt; (%arg0 : index) = ...

Full diff: https://github.com/llvm/llvm-project/pull/129924.diff

2 Files Affected:

  • (modified) flang/lib/Lower/Bridge.cpp (+5-10)
  • (modified) flang/test/Lower/CUDA/cuda-doconc.cuf (+7-7)
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 4c6e47d250329..95f431983d442 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -3150,10 +3150,9 @@ class FirConverter : public Fortran::lower::AbstractConverter {
               loc, 1); // Use index type directly
 
         // Ensure lb, ub, and step are of index type using fir.convert
-        mlir::Type indexType = builder->getIndexType();
-        lb = builder->create<fir::ConvertOp>(loc, indexType, lb);
-        ub = builder->create<fir::ConvertOp>(loc, indexType, ub);
-        step = builder->create<fir::ConvertOp>(loc, indexType, step);
+        lb = builder->create<fir::ConvertOp>(loc, idxTy, lb);
+        ub = builder->create<fir::ConvertOp>(loc, idxTy, ub);
+        step = builder->create<fir::ConvertOp>(loc, idxTy, step);
 
         lbs.push_back(lb);
         ubs.push_back(ub);
@@ -3163,10 +3162,6 @@ class FirConverter : public Fortran::lower::AbstractConverter {
 
         // Handle induction variable
         mlir::Value ivValue = getSymbolAddress(*name.symbol);
-        std::size_t ivTypeSize = name.symbol->size();
-        if (ivTypeSize == 0)
-          llvm::report_fatal_error("unexpected induction variable size");
-        mlir::Type ivTy = builder->getIntegerType(ivTypeSize * 8);
 
         if (!ivValue) {
           // DO CONCURRENT induction variables are not mapped yet since they are
@@ -3174,7 +3169,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
           mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint();
           builder->setInsertionPointToStart(builder->getAllocaBlock());
           ivValue = builder->createTemporaryAlloc(
-              loc, ivTy, toStringRef(name.symbol->name()));
+              loc, idxTy, toStringRef(name.symbol->name()));
           builder->restoreInsertionPoint(insPt);
         }
 
@@ -3186,7 +3181,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
         // Bind the symbol to the declared variable
         bindSymbol(*name.symbol, ivValue);
         ivValues.push_back(ivValue);
-        ivTypes.push_back(ivTy);
+        ivTypes.push_back(idxTy);
         ivLocs.push_back(loc);
       }
     } else {
diff --git a/flang/test/Lower/CUDA/cuda-doconc.cuf b/flang/test/Lower/CUDA/cuda-doconc.cuf
index 32cd1676b22f4..e240b1adc206a 100644
--- a/flang/test/Lower/CUDA/cuda-doconc.cuf
+++ b/flang/test/Lower/CUDA/cuda-doconc.cuf
@@ -15,9 +15,9 @@ subroutine doconc1
 end
 
 ! CHECK: func.func @_QPdoconc1() {
-! CHECK: %[[DECL:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
+! CHECK: %[[DECL:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc1Ei"} : (!fir.ref<index>) -> (!fir.ref<index>, !fir.ref<index>)
 ! CHECK: cuf.kernel<<<*, *>>>
-! CHECK: %{{.*}} = fir.load %[[DECL]]#0 : !fir.ref<i32>
+! CHECK: %{{.*}} = fir.load %[[DECL]]#0 : !fir.ref<index>
 
 subroutine doconc2
   integer :: i, j, m, n
@@ -32,8 +32,8 @@ subroutine doconc2
 end
 
 ! CHECK: func.func @_QPdoconc2() {
-! CHECK: %[[DECLI:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc2Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
-! CHECK: %[[DECLJ:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc2Ej"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
-! CHECK: cuf.kernel<<<*, *>>> (%arg0 : i32, %arg1 : i32) = (%{{.*}}, %{{.*}} : index, index) to (%{{.*}}, %{{.*}} : index, index)  step (%{{.*}}, %{{.*}} : index, index) {
-! CHECK: %{{.*}} = fir.load %[[DECLI]]#0 : !fir.ref<i32>
-! CHECK: %{{.*}} = fir.load %[[DECLJ]]#0 : !fir.ref<i32> 
+! CHECK: %[[DECLI:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc2Ei"} : (!fir.ref<index>) -> (!fir.ref<index>, !fir.ref<index>)
+! CHECK: %[[DECLJ:.*]]:2 = hlfir.declare %{{.*}}#0 {uniq_name = "_QFdoconc2Ej"} : (!fir.ref<index>) -> (!fir.ref<index>, !fir.ref<index>)
+! CHECK: cuf.kernel<<<*, *>>> (%arg0 : index, %arg1 : index) = (%{{.*}}, %{{.*}} : index, index) to (%{{.*}}, %{{.*}} : index, index)  step (%{{.*}}, %{{.*}} : index, index) {
+! CHECK: %{{.*}} = fir.load %[[DECLI]]#0 : !fir.ref<index>
+! CHECK: %{{.*}} = fir.load %[[DECLJ]]#0 : !fir.ref<index>

@clementval clementval changed the title Change induction variable from i32 to index for doconcurrent inside cuf kernel directive [flang][cuda] Change induction variable from i32 to index for doconcurrent inside cuf kernel directive Mar 5, 2025
std::size_t ivTypeSize = name.symbol->size();
if (ivTypeSize == 0)
llvm::report_fatal_error("unexpected induction variable size");
mlir::Type ivTy = builder->getIntegerType(ivTypeSize * 8);

if (!ivValue) {
// DO CONCURRENT induction variables are not mapped yet since they are
// local to the DO CONCURRENT scope.
mlir::OpBuilder::InsertPoint insPt = builder->saveInsertionPoint();
builder->setInsertionPointToStart(builder->getAllocaBlock());
ivValue = builder->createTemporaryAlloc(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should map the iv symbol to the block argument directly so we don't need a temporary. This can be done in a separate patch.

@wangzpgi wangzpgi merged commit d1abbb4 into llvm:main Mar 5, 2025
14 checks passed
@wangzpgi wangzpgi deleted the doconc_index branch March 5, 2025 22:50
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…rrent inside cuf kernel directive (llvm#129924)

Use `index` instead of `i32` for induction variables for doconcurrent
inside cuf kernel directive. Regular do loop inside cuf kernel directive
also uses `index`:
```
cuf.kernel<<<*, *>>> (%arg0 : index) = ...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants