Skip to content

[flang][OpenMP] Add frontend support for ompx_bare clause #111106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 13, 2024

Conversation

ivanradanov
Copy link
Contributor

No description provided.

@llvmbot llvmbot added mlir flang Flang issues not falling into any other category mlir:openmp flang:fir-hlfir flang:openmp flang:parser clang:openmp OpenMP related changes to Clang labels Oct 4, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-flang-semantics
@llvm/pr-subscribers-flang-parser
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-mlir-openmp

Author: Ivan R. Ivanov (ivanradanov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/111106.diff

11 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1)
  • (modified) flang/lib/Lower/OpenMP/Clauses.h (+1-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2)
  • (modified) flang/lib/Parser/openmp-parsers.cpp (+1)
  • (added) flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 (+10)
  • (modified) llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h (+9)
  • (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+1)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+25)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+2-1)
  • (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+2-1)
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a4d2524bccf5c3..c7f34b3d65f324 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -354,6 +354,10 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const {
   return markClauseOccurrence<omp::clause::Nowait>(result.nowait);
 }
 
+bool ClauseProcessor::processBare(mlir::omp::BareClauseOps &result) const {
+  return markClauseOccurrence<omp::clause::OmpxBare>(result.bare);
+}
+
 bool ClauseProcessor::processNumTeams(
     lower::StatementContext &stmtCtx,
     mlir::omp::NumTeamsClauseOps &result) const {
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 0c8e7bd47ab5a6..a04731f94ede37 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -74,6 +74,7 @@ class ClauseProcessor {
   bool processHint(mlir::omp::HintClauseOps &result) const;
   bool processMergeable(mlir::omp::MergeableClauseOps &result) const;
   bool processNowait(mlir::omp::NowaitClauseOps &result) const;
+  bool processBare(mlir::omp::BareClauseOps &result) const;
   bool processNumTeams(lower::StatementContext &stmtCtx,
                        mlir::omp::NumTeamsClauseOps &result) const;
   bool processNumThreads(lower::StatementContext &stmtCtx,
diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h
index 51bf0eab0f8d07..10bd1439563244 100644
--- a/flang/lib/Lower/OpenMP/Clauses.h
+++ b/flang/lib/Lower/OpenMP/Clauses.h
@@ -229,8 +229,8 @@ using NumTasks = tomp::clause::NumTasksT<TypeTy, IdTy, ExprTy>;
 using NumTeams = tomp::clause::NumTeamsT<TypeTy, IdTy, ExprTy>;
 using NumThreads = tomp::clause::NumThreadsT<TypeTy, IdTy, ExprTy>;
 using OmpxAttribute = tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy>;
-using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using OmpxDynCgroupMem = tomp::clause::OmpxDynCgroupMemT<TypeTy, IdTy, ExprTy>;
+using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using Ordered = tomp::clause::OrderedT<TypeTy, IdTy, ExprTy>;
 using Order = tomp::clause::OrderT<TypeTy, IdTy, ExprTy>;
 using Partial = tomp::clause::PartialT<TypeTy, IdTy, ExprTy>;
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 60c83586e468b6..b8b1a0ba2a69b7 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1180,6 +1180,7 @@ static void genTargetClauses(
     cp.processNowait(clauseOps);
 
   cp.processThreadLimit(stmtCtx, clauseOps);
+  cp.processBare(clauseOps);
 
   cp.processTODO<clause::Allocate, clause::Defaultmap, clause::Firstprivate,
                  clause::InReduction, clause::UsesAllocators>(
@@ -2764,6 +2765,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
         !std::holds_alternative<clause::ThreadLimit>(clause.u) &&
         !std::holds_alternative<clause::Threads>(clause.u) &&
         !std::holds_alternative<clause::UseDeviceAddr>(clause.u) &&
+        !std::holds_alternative<clause::OmpxBare>(clause.u) &&
         !std::holds_alternative<clause::UseDevicePtr>(clause.u)) {
       TODO(clauseLocation, "OpenMP Block construct clause");
     }
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index cc2930cbd7ded5..f18a9929aa75ef 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -315,6 +315,7 @@ TYPE_PARSER(
                        parenthesized(scalarIntExpr))) ||
     "NUM_THREADS" >> construct<OmpClause>(construct<OmpClause::NumThreads>(
                          parenthesized(scalarIntExpr))) ||
+    "OMPX_BARE" >> construct<OmpClause>(construct<OmpClause::OmpxBare>()) ||
     "ORDER" >> construct<OmpClause>(construct<OmpClause::Order>(
                    parenthesized(Parser<OmpOrderClause>{}))) ||
     "ORDERED" >> construct<OmpClause>(construct<OmpClause::Ordered>(
diff --git a/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
new file mode 100644
index 00000000000000..2a97a14516ec3a
--- /dev/null
+++ b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
@@ -0,0 +1,10 @@
+! RUN: %flang_fc1 -emit-hlfir %openmp_flags -fopenmp-version=51 %s -o - | FileCheck %s
+
+program test
+    integer :: tmp
+    !$omp target teams ompx_bare num_teams(42) thread_limit(43)
+    tmp = 1
+    !$omp end target teams
+end program
+
+! CHECK: omp.target {{.*}} ompx_bare
diff --git a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
index 4bdfa1cf4c1490..7c0dbe1aae1221 100644
--- a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+++ b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
@@ -236,6 +236,8 @@ struct ConstructDecompositionT {
                    const ClauseTy *);
   bool applyClause(const tomp::clause::NowaitT<TypeTy, IdTy, ExprTy> &clause,
                    const ClauseTy *);
+  bool applyClause(const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+                   const ClauseTy *);
   bool
   applyClause(const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
               const ClauseTy *);
@@ -1103,6 +1105,13 @@ bool ConstructDecompositionT<C, H>::applyClause(
   return applyToOutermost(node);
 }
 
+template <typename C, typename H>
+bool ConstructDecompositionT<C, H>::applyClause(
+    const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+    const ClauseTy *node) {
+  return applyToAll(node);
+}
+
 template <typename C, typename H>
 bool ConstructDecompositionT<C, H>::applyClause(
     const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index fcf087d1f9c6e4..ac5cf1babb4176 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -980,6 +980,7 @@ def OMP_Target : Directive<"target"> {
     VersionedClause<OMPC_Device>,
     VersionedClause<OMPC_If>,
     VersionedClause<OMPC_NoWait>,
+    VersionedClause<OMPC_OMPX_Bare>,
     VersionedClause<OMPC_OMPX_DynCGroupMem>,
     VersionedClause<OMPC_ThreadLimit, 51>,
   ];
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index 886554f66afffc..fe84aa744b9dbb 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1270,4 +1270,29 @@ class OpenMP_UseDevicePtrClauseSkip<
 
 def OpenMP_UseDevicePtrClause : OpenMP_UseDevicePtrClauseSkip<>;
 
+//===----------------------------------------------------------------------===//
+// LLVM OpenMP extension `ompx_bare` clause
+//===----------------------------------------------------------------------===//
+
+class OpenMP_BareClauseSkip<
+    bit traits = false, bit arguments = false, bit assemblyFormat = false,
+    bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause<traits, arguments, assemblyFormat, description,
+                    extraClassDeclaration> {
+  let arguments = (ins
+    UnitAttr:$bare
+  );
+
+  let optAssemblyFormat = [{
+    `ompx_bare` $bare
+  }];
+
+  let description = [{
+    ompx_bare allows `omp target teams` to be executed on a GPU with multi-dim
+    teams and threads.
+  }];
+}
+
+def OpenMP_BareClause : OpenMP_BareClauseSkip<>;
+
 #endif // OPENMP_CLAUSES
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 66f63fc02fe2f3..ee8aeee1278b1c 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1108,7 +1108,8 @@ def TargetOp : OpenMP_Op<"target", traits = [
     OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
     OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
     OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip<assemblyFormat = true>,
-    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause,
+    OpenMP_BareClause,
   ], singleRegion = true> {
   let summary = "target construct";
   let description = [{
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d516c8d9e0be6c..a7c35a3bf3365b 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1677,7 +1677,8 @@ void TargetOp::build(OpBuilder &builder, OperationState &state,
                   /*in_reduction_vars=*/{}, /*in_reduction_byref=*/nullptr,
                   /*in_reduction_syms=*/nullptr, clauses.isDevicePtrVars,
                   clauses.mapVars, clauses.nowait, clauses.privateVars,
-                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit);
+                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit,
+                  clauses.bare);
 }
 
 LogicalResult TargetOp::verify() {

@llvmbot
Copy link
Member

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-mlir

Author: Ivan R. Ivanov (ivanradanov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/111106.diff

11 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1)
  • (modified) flang/lib/Lower/OpenMP/Clauses.h (+1-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2)
  • (modified) flang/lib/Parser/openmp-parsers.cpp (+1)
  • (added) flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 (+10)
  • (modified) llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h (+9)
  • (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+1)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+25)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+2-1)
  • (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+2-1)
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a4d2524bccf5c3..c7f34b3d65f324 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -354,6 +354,10 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const {
   return markClauseOccurrence<omp::clause::Nowait>(result.nowait);
 }
 
+bool ClauseProcessor::processBare(mlir::omp::BareClauseOps &result) const {
+  return markClauseOccurrence<omp::clause::OmpxBare>(result.bare);
+}
+
 bool ClauseProcessor::processNumTeams(
     lower::StatementContext &stmtCtx,
     mlir::omp::NumTeamsClauseOps &result) const {
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 0c8e7bd47ab5a6..a04731f94ede37 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -74,6 +74,7 @@ class ClauseProcessor {
   bool processHint(mlir::omp::HintClauseOps &result) const;
   bool processMergeable(mlir::omp::MergeableClauseOps &result) const;
   bool processNowait(mlir::omp::NowaitClauseOps &result) const;
+  bool processBare(mlir::omp::BareClauseOps &result) const;
   bool processNumTeams(lower::StatementContext &stmtCtx,
                        mlir::omp::NumTeamsClauseOps &result) const;
   bool processNumThreads(lower::StatementContext &stmtCtx,
diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h
index 51bf0eab0f8d07..10bd1439563244 100644
--- a/flang/lib/Lower/OpenMP/Clauses.h
+++ b/flang/lib/Lower/OpenMP/Clauses.h
@@ -229,8 +229,8 @@ using NumTasks = tomp::clause::NumTasksT<TypeTy, IdTy, ExprTy>;
 using NumTeams = tomp::clause::NumTeamsT<TypeTy, IdTy, ExprTy>;
 using NumThreads = tomp::clause::NumThreadsT<TypeTy, IdTy, ExprTy>;
 using OmpxAttribute = tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy>;
-using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using OmpxDynCgroupMem = tomp::clause::OmpxDynCgroupMemT<TypeTy, IdTy, ExprTy>;
+using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using Ordered = tomp::clause::OrderedT<TypeTy, IdTy, ExprTy>;
 using Order = tomp::clause::OrderT<TypeTy, IdTy, ExprTy>;
 using Partial = tomp::clause::PartialT<TypeTy, IdTy, ExprTy>;
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 60c83586e468b6..b8b1a0ba2a69b7 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1180,6 +1180,7 @@ static void genTargetClauses(
     cp.processNowait(clauseOps);
 
   cp.processThreadLimit(stmtCtx, clauseOps);
+  cp.processBare(clauseOps);
 
   cp.processTODO<clause::Allocate, clause::Defaultmap, clause::Firstprivate,
                  clause::InReduction, clause::UsesAllocators>(
@@ -2764,6 +2765,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
         !std::holds_alternative<clause::ThreadLimit>(clause.u) &&
         !std::holds_alternative<clause::Threads>(clause.u) &&
         !std::holds_alternative<clause::UseDeviceAddr>(clause.u) &&
+        !std::holds_alternative<clause::OmpxBare>(clause.u) &&
         !std::holds_alternative<clause::UseDevicePtr>(clause.u)) {
       TODO(clauseLocation, "OpenMP Block construct clause");
     }
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index cc2930cbd7ded5..f18a9929aa75ef 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -315,6 +315,7 @@ TYPE_PARSER(
                        parenthesized(scalarIntExpr))) ||
     "NUM_THREADS" >> construct<OmpClause>(construct<OmpClause::NumThreads>(
                          parenthesized(scalarIntExpr))) ||
+    "OMPX_BARE" >> construct<OmpClause>(construct<OmpClause::OmpxBare>()) ||
     "ORDER" >> construct<OmpClause>(construct<OmpClause::Order>(
                    parenthesized(Parser<OmpOrderClause>{}))) ||
     "ORDERED" >> construct<OmpClause>(construct<OmpClause::Ordered>(
diff --git a/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
new file mode 100644
index 00000000000000..2a97a14516ec3a
--- /dev/null
+++ b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
@@ -0,0 +1,10 @@
+! RUN: %flang_fc1 -emit-hlfir %openmp_flags -fopenmp-version=51 %s -o - | FileCheck %s
+
+program test
+    integer :: tmp
+    !$omp target teams ompx_bare num_teams(42) thread_limit(43)
+    tmp = 1
+    !$omp end target teams
+end program
+
+! CHECK: omp.target {{.*}} ompx_bare
diff --git a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
index 4bdfa1cf4c1490..7c0dbe1aae1221 100644
--- a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+++ b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
@@ -236,6 +236,8 @@ struct ConstructDecompositionT {
                    const ClauseTy *);
   bool applyClause(const tomp::clause::NowaitT<TypeTy, IdTy, ExprTy> &clause,
                    const ClauseTy *);
+  bool applyClause(const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+                   const ClauseTy *);
   bool
   applyClause(const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
               const ClauseTy *);
@@ -1103,6 +1105,13 @@ bool ConstructDecompositionT<C, H>::applyClause(
   return applyToOutermost(node);
 }
 
+template <typename C, typename H>
+bool ConstructDecompositionT<C, H>::applyClause(
+    const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+    const ClauseTy *node) {
+  return applyToAll(node);
+}
+
 template <typename C, typename H>
 bool ConstructDecompositionT<C, H>::applyClause(
     const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index fcf087d1f9c6e4..ac5cf1babb4176 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -980,6 +980,7 @@ def OMP_Target : Directive<"target"> {
     VersionedClause<OMPC_Device>,
     VersionedClause<OMPC_If>,
     VersionedClause<OMPC_NoWait>,
+    VersionedClause<OMPC_OMPX_Bare>,
     VersionedClause<OMPC_OMPX_DynCGroupMem>,
     VersionedClause<OMPC_ThreadLimit, 51>,
   ];
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index 886554f66afffc..fe84aa744b9dbb 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1270,4 +1270,29 @@ class OpenMP_UseDevicePtrClauseSkip<
 
 def OpenMP_UseDevicePtrClause : OpenMP_UseDevicePtrClauseSkip<>;
 
+//===----------------------------------------------------------------------===//
+// LLVM OpenMP extension `ompx_bare` clause
+//===----------------------------------------------------------------------===//
+
+class OpenMP_BareClauseSkip<
+    bit traits = false, bit arguments = false, bit assemblyFormat = false,
+    bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause<traits, arguments, assemblyFormat, description,
+                    extraClassDeclaration> {
+  let arguments = (ins
+    UnitAttr:$bare
+  );
+
+  let optAssemblyFormat = [{
+    `ompx_bare` $bare
+  }];
+
+  let description = [{
+    ompx_bare allows `omp target teams` to be executed on a GPU with multi-dim
+    teams and threads.
+  }];
+}
+
+def OpenMP_BareClause : OpenMP_BareClauseSkip<>;
+
 #endif // OPENMP_CLAUSES
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 66f63fc02fe2f3..ee8aeee1278b1c 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1108,7 +1108,8 @@ def TargetOp : OpenMP_Op<"target", traits = [
     OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
     OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
     OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip<assemblyFormat = true>,
-    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause,
+    OpenMP_BareClause,
   ], singleRegion = true> {
   let summary = "target construct";
   let description = [{
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d516c8d9e0be6c..a7c35a3bf3365b 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1677,7 +1677,8 @@ void TargetOp::build(OpBuilder &builder, OperationState &state,
                   /*in_reduction_vars=*/{}, /*in_reduction_byref=*/nullptr,
                   /*in_reduction_syms=*/nullptr, clauses.isDevicePtrVars,
                   clauses.mapVars, clauses.nowait, clauses.privateVars,
-                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit);
+                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit,
+                  clauses.bare);
 }
 
 LogicalResult TargetOp::verify() {

@llvmbot
Copy link
Member

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-flang-openmp

Author: Ivan R. Ivanov (ivanradanov)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/111106.diff

11 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.h (+1)
  • (modified) flang/lib/Lower/OpenMP/Clauses.h (+1-1)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+2)
  • (modified) flang/lib/Parser/openmp-parsers.cpp (+1)
  • (added) flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 (+10)
  • (modified) llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h (+9)
  • (modified) llvm/include/llvm/Frontend/OpenMP/OMP.td (+1)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td (+25)
  • (modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+2-1)
  • (modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+2-1)
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index a4d2524bccf5c3..c7f34b3d65f324 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -354,6 +354,10 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const {
   return markClauseOccurrence<omp::clause::Nowait>(result.nowait);
 }
 
+bool ClauseProcessor::processBare(mlir::omp::BareClauseOps &result) const {
+  return markClauseOccurrence<omp::clause::OmpxBare>(result.bare);
+}
+
 bool ClauseProcessor::processNumTeams(
     lower::StatementContext &stmtCtx,
     mlir::omp::NumTeamsClauseOps &result) const {
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.h b/flang/lib/Lower/OpenMP/ClauseProcessor.h
index 0c8e7bd47ab5a6..a04731f94ede37 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.h
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.h
@@ -74,6 +74,7 @@ class ClauseProcessor {
   bool processHint(mlir::omp::HintClauseOps &result) const;
   bool processMergeable(mlir::omp::MergeableClauseOps &result) const;
   bool processNowait(mlir::omp::NowaitClauseOps &result) const;
+  bool processBare(mlir::omp::BareClauseOps &result) const;
   bool processNumTeams(lower::StatementContext &stmtCtx,
                        mlir::omp::NumTeamsClauseOps &result) const;
   bool processNumThreads(lower::StatementContext &stmtCtx,
diff --git a/flang/lib/Lower/OpenMP/Clauses.h b/flang/lib/Lower/OpenMP/Clauses.h
index 51bf0eab0f8d07..10bd1439563244 100644
--- a/flang/lib/Lower/OpenMP/Clauses.h
+++ b/flang/lib/Lower/OpenMP/Clauses.h
@@ -229,8 +229,8 @@ using NumTasks = tomp::clause::NumTasksT<TypeTy, IdTy, ExprTy>;
 using NumTeams = tomp::clause::NumTeamsT<TypeTy, IdTy, ExprTy>;
 using NumThreads = tomp::clause::NumThreadsT<TypeTy, IdTy, ExprTy>;
 using OmpxAttribute = tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy>;
-using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using OmpxDynCgroupMem = tomp::clause::OmpxDynCgroupMemT<TypeTy, IdTy, ExprTy>;
+using OmpxBare = tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy>;
 using Ordered = tomp::clause::OrderedT<TypeTy, IdTy, ExprTy>;
 using Order = tomp::clause::OrderT<TypeTy, IdTy, ExprTy>;
 using Partial = tomp::clause::PartialT<TypeTy, IdTy, ExprTy>;
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 60c83586e468b6..b8b1a0ba2a69b7 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -1180,6 +1180,7 @@ static void genTargetClauses(
     cp.processNowait(clauseOps);
 
   cp.processThreadLimit(stmtCtx, clauseOps);
+  cp.processBare(clauseOps);
 
   cp.processTODO<clause::Allocate, clause::Defaultmap, clause::Firstprivate,
                  clause::InReduction, clause::UsesAllocators>(
@@ -2764,6 +2765,7 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
         !std::holds_alternative<clause::ThreadLimit>(clause.u) &&
         !std::holds_alternative<clause::Threads>(clause.u) &&
         !std::holds_alternative<clause::UseDeviceAddr>(clause.u) &&
+        !std::holds_alternative<clause::OmpxBare>(clause.u) &&
         !std::holds_alternative<clause::UseDevicePtr>(clause.u)) {
       TODO(clauseLocation, "OpenMP Block construct clause");
     }
diff --git a/flang/lib/Parser/openmp-parsers.cpp b/flang/lib/Parser/openmp-parsers.cpp
index cc2930cbd7ded5..f18a9929aa75ef 100644
--- a/flang/lib/Parser/openmp-parsers.cpp
+++ b/flang/lib/Parser/openmp-parsers.cpp
@@ -315,6 +315,7 @@ TYPE_PARSER(
                        parenthesized(scalarIntExpr))) ||
     "NUM_THREADS" >> construct<OmpClause>(construct<OmpClause::NumThreads>(
                          parenthesized(scalarIntExpr))) ||
+    "OMPX_BARE" >> construct<OmpClause>(construct<OmpClause::OmpxBare>()) ||
     "ORDER" >> construct<OmpClause>(construct<OmpClause::Order>(
                    parenthesized(Parser<OmpOrderClause>{}))) ||
     "ORDERED" >> construct<OmpClause>(construct<OmpClause::Ordered>(
diff --git a/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90 b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
new file mode 100644
index 00000000000000..2a97a14516ec3a
--- /dev/null
+++ b/flang/test/Lower/OpenMP/KernelLanguage/bare-clause.f90
@@ -0,0 +1,10 @@
+! RUN: %flang_fc1 -emit-hlfir %openmp_flags -fopenmp-version=51 %s -o - | FileCheck %s
+
+program test
+    integer :: tmp
+    !$omp target teams ompx_bare num_teams(42) thread_limit(43)
+    tmp = 1
+    !$omp end target teams
+end program
+
+! CHECK: omp.target {{.*}} ompx_bare
diff --git a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
index 4bdfa1cf4c1490..7c0dbe1aae1221 100644
--- a/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
+++ b/llvm/include/llvm/Frontend/OpenMP/ConstructDecompositionT.h
@@ -236,6 +236,8 @@ struct ConstructDecompositionT {
                    const ClauseTy *);
   bool applyClause(const tomp::clause::NowaitT<TypeTy, IdTy, ExprTy> &clause,
                    const ClauseTy *);
+  bool applyClause(const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+                   const ClauseTy *);
   bool
   applyClause(const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
               const ClauseTy *);
@@ -1103,6 +1105,13 @@ bool ConstructDecompositionT<C, H>::applyClause(
   return applyToOutermost(node);
 }
 
+template <typename C, typename H>
+bool ConstructDecompositionT<C, H>::applyClause(
+    const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
+    const ClauseTy *node) {
+  return applyToAll(node);
+}
+
 template <typename C, typename H>
 bool ConstructDecompositionT<C, H>::applyClause(
     const tomp::clause::OmpxAttributeT<TypeTy, IdTy, ExprTy> &clause,
diff --git a/llvm/include/llvm/Frontend/OpenMP/OMP.td b/llvm/include/llvm/Frontend/OpenMP/OMP.td
index fcf087d1f9c6e4..ac5cf1babb4176 100644
--- a/llvm/include/llvm/Frontend/OpenMP/OMP.td
+++ b/llvm/include/llvm/Frontend/OpenMP/OMP.td
@@ -980,6 +980,7 @@ def OMP_Target : Directive<"target"> {
     VersionedClause<OMPC_Device>,
     VersionedClause<OMPC_If>,
     VersionedClause<OMPC_NoWait>,
+    VersionedClause<OMPC_OMPX_Bare>,
     VersionedClause<OMPC_OMPX_DynCGroupMem>,
     VersionedClause<OMPC_ThreadLimit, 51>,
   ];
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
index 886554f66afffc..fe84aa744b9dbb 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
@@ -1270,4 +1270,29 @@ class OpenMP_UseDevicePtrClauseSkip<
 
 def OpenMP_UseDevicePtrClause : OpenMP_UseDevicePtrClauseSkip<>;
 
+//===----------------------------------------------------------------------===//
+// LLVM OpenMP extension `ompx_bare` clause
+//===----------------------------------------------------------------------===//
+
+class OpenMP_BareClauseSkip<
+    bit traits = false, bit arguments = false, bit assemblyFormat = false,
+    bit description = false, bit extraClassDeclaration = false
+  > : OpenMP_Clause<traits, arguments, assemblyFormat, description,
+                    extraClassDeclaration> {
+  let arguments = (ins
+    UnitAttr:$bare
+  );
+
+  let optAssemblyFormat = [{
+    `ompx_bare` $bare
+  }];
+
+  let description = [{
+    ompx_bare allows `omp target teams` to be executed on a GPU with multi-dim
+    teams and threads.
+  }];
+}
+
+def OpenMP_BareClause : OpenMP_BareClauseSkip<>;
+
 #endif // OPENMP_CLAUSES
diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index 66f63fc02fe2f3..ee8aeee1278b1c 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -1108,7 +1108,8 @@ def TargetOp : OpenMP_Op<"target", traits = [
     OpenMP_AllocateClause, OpenMP_DependClause, OpenMP_DeviceClause,
     OpenMP_HasDeviceAddrClause, OpenMP_IfClause, OpenMP_InReductionClause,
     OpenMP_IsDevicePtrClause, OpenMP_MapClauseSkip<assemblyFormat = true>,
-    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause
+    OpenMP_NowaitClause, OpenMP_PrivateClause, OpenMP_ThreadLimitClause,
+    OpenMP_BareClause,
   ], singleRegion = true> {
   let summary = "target construct";
   let description = [{
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index d516c8d9e0be6c..a7c35a3bf3365b 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1677,7 +1677,8 @@ void TargetOp::build(OpBuilder &builder, OperationState &state,
                   /*in_reduction_vars=*/{}, /*in_reduction_byref=*/nullptr,
                   /*in_reduction_syms=*/nullptr, clauses.isDevicePtrVars,
                   clauses.mapVars, clauses.nowait, clauses.privateVars,
-                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit);
+                  makeArrayAttr(ctx, clauses.privateSyms), clauses.threadLimit,
+                  clauses.bare);
 }
 
 LogicalResult TargetOp::verify() {

Copy link
Contributor

@mjklemm mjklemm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a TODO message in Lowering if there is no translation from OpenMP dialect to LLVM IR. Otherwise this will manifest as a crash.

Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general approach to add the clause to the dialect looks good to me. One question I have is whether ompx_bare is only specifically supported for the combined target teams construct or if it can appear on a standalone target. If that's the case, I guess a semantics check for it would also be in order.

At the moment, the num_teams_upper and thread_limit arguments are defined as optional. I'm guessing they should become variadic and add some MLIR verification code to ensure the ompx_bare clause, num_teams and thread_limit are always in sync. Is that the plan or did you have something else in mind?

bool ConstructDecompositionT<C, H>::applyClause(
const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
const ClauseTy *node) {
return applyToAll(node);
Copy link
Member

@skatrak skatrak Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be applyToOutermost? I'm not so familiar with this clause, but in this patch it's only processed for the target construct. For example, in target teams ompx_bare ... this actually results in the clause being applied to both target and teams, but it would be missing lowering for the latter.

Edit: Thinking about it a bit more, the clause seems like it should be applied to the teams leaf construct as well. Perhaps the actual change should be to add that clause to the omp.teams operation and process it in lowering too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent of ompx_bare is to support writing cuda/hip-like kernels on GPUs with a specific number of teams and threads, so it should be fine as is with only applying it to the target. The sema check should make sure it is only allowed if it is a part of target teams (thank you @kparzysz)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this a bit more, and perhaps we should apply it only to the innermost (i.e. teams). The reason I'm thinking this is that in that case it could be processed the same as a thread_limit + a num_teams clause attached to that construct. In that case, we wouldn't need to explicitly represent ompx_bare in the dialect either, we'd only update these other clauses to accept multiple dimensions.

Copy link
Contributor Author

@ivanradanov ivanradanov Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to know whether ompx_bare was present when we translate to llvm, because the runtime must run the target region with the number of threads and teams specified (instead of thread_limit and num_teams being just mere hints and depending on ICV's otherwise). So I think the easiest way to do that is to represent the clause in the dialect. I am fine with having in on either omp.target or omp.teams (or both) and which one we have it on only has minor implications on the op verification and translation implementation and does not matter much in my opinion, so having it on the omp.target should be fine in my opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, if it involves some slightly different semantics then I agree that it should be represented separately. In that case, having it attached to omp.target seems reasonable to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging, please remember to swap the call from applyToAll to applyToOutermost.

@kparzysz
Copy link
Contributor

kparzysz commented Oct 7, 2024

I guess a semantics check for it would also be in order.

I actually tried to implement the semantic check for ompx_bare a while back before I realized that the parser doesn't handle it. I might still have that code lying around somewhere. We should include the corresponding check in clang as well. I'll try to add a commit to this PR with these changes.

@kparzysz
Copy link
Contributor

kparzysz commented Oct 7, 2024

I'll try to add a commit to this PR with these changes.

I don't have permission to push to Ivan's fork, so I pushed it to my fork. It's the most recent commit:
https://github.com/kparzysz/llvm-project/commits/ivan/flang-bare-clause

@ivanradanov
Copy link
Contributor Author

@jdoerfert Could you have a quick look at the clang-side change if that is acceptable?

Copy link
Member

@skatrak skatrak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Ivan, Flang and MLIR changes look good to me. Sorry for the delay getting back to this!

@@ -350,6 +350,10 @@ bool ClauseProcessor::processNowait(mlir::omp::NowaitClauseOps &result) const {
return markClauseOccurrence<omp::clause::Nowait>(result.nowait);
}

bool ClauseProcessor::processBare(mlir::omp::BareClauseOps &result) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Move above ClauseProcessor::processCollapse.

bool ConstructDecompositionT<C, H>::applyClause(
const tomp::clause::OmpxBareT<TypeTy, IdTy, ExprTy> &clause,
const ClauseTy *node) {
return applyToAll(node);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before merging, please remember to swap the call from applyToAll to applyToOutermost.

Copy link

github-actions bot commented Dec 11, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Fix ompx_bare printing

Add test

test

Order

Only accept ompx_bare on a combined "TARGET TEAMS" construct

Alphabetical ordering

Fail in the backend if we find ompx_bare

Fix test

Fix `omp target` error in clang

Add one more test

Fix rebase error

%openmp_flags is only available when omprt is built

Update mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp

Co-authored-by: Sergio Afonso <[email protected]>

Address review comments

fix rebase

Remove stray include
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Dec 13, 2024
@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Dec 13, 2024
@ivanradanov ivanradanov merged commit 7c9404c into llvm:main Dec 13, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category flang:fir-hlfir flang:openmp flang:parser flang:semantics flang Flang issues not falling into any other category mlir:llvm mlir:openmp mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants