-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[OpenACC][CIR] Implement basic lowering for combined constructs #139119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Combined constructs are emitted a little oddly, in that they are the only one where there are two operations for a single construct. First, the compute variant is emitted with 'combined(loop)', then the loop operation is emitted with 'combined(<variant>)'. Each gets its own normal terminator. This patch does not yet implement clauses at all, since that is going to require special attention to make sure we get the emitting of them correct, since certain clauses go to different locations, and need their insertion-points set correctly. So this patch sets it up so that we will emit the 'not implemented' diagnostic for all clauses.
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clangir Author: Erich Keane (erichkeane) ChangesCombined constructs are emitted a little oddly, in that they are the only one where there are two operations for a single construct. First, the compute variant is emitted with 'combined(loop)', then the loop operation is emitted with 'combined(<variant>)'. Each gets its own normal terminator. This patch does not yet implement clauses at all, since that is going to require special attention to make sure we get the emitting of them correct, since certain clauses go to different locations, and need their insertion-points set correctly. So this patch sets it up so that we will emit the 'not implemented' diagnostic for all clauses. Full diff: https://github.com/llvm/llvm-project/pull/139119.diff 4 Files Affected:
diff --git a/clang/lib/CIR/CodeGen/CIRGenFunction.h b/clang/lib/CIR/CodeGen/CIRGenFunction.h
index 9066107af595e..f7670eda7ef87 100644
--- a/clang/lib/CIR/CodeGen/CIRGenFunction.h
+++ b/clang/lib/CIR/CodeGen/CIRGenFunction.h
@@ -718,6 +718,12 @@ class CIRGenFunction : public CIRGenTypeCache {
SourceLocation dirLoc, llvm::ArrayRef<const OpenACCClause *> clauses,
const Stmt *associatedStmt);
+ template <typename Op, typename TermOp>
+ mlir::LogicalResult emitOpenACCOpCombinedConstruct(
+ mlir::Location start, mlir::Location end, OpenACCDirectiveKind dirKind,
+ SourceLocation dirLoc, llvm::ArrayRef<const OpenACCClause *> clauses,
+ const Stmt *loopStmt);
+
public:
mlir::LogicalResult
emitOpenACCComputeConstruct(const OpenACCComputeConstruct &s);
diff --git a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
index fbbbf70ea97c3..cc2470b395cd5 100644
--- a/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp
@@ -56,6 +56,65 @@ mlir::LogicalResult CIRGenFunction::emitOpenACCOpAssociatedStmt(
return res;
}
+namespace {
+template <typename Op> struct CombinedType;
+template <> struct CombinedType<ParallelOp> {
+ static constexpr mlir::acc::CombinedConstructsType value =
+ mlir::acc::CombinedConstructsType::ParallelLoop;
+};
+template <> struct CombinedType<SerialOp> {
+ static constexpr mlir::acc::CombinedConstructsType value =
+ mlir::acc::CombinedConstructsType::SerialLoop;
+};
+template <> struct CombinedType<KernelsOp> {
+ static constexpr mlir::acc::CombinedConstructsType value =
+ mlir::acc::CombinedConstructsType::KernelsLoop;
+};
+} // namespace
+
+template <typename Op, typename TermOp>
+mlir::LogicalResult CIRGenFunction::emitOpenACCOpCombinedConstruct(
+ mlir::Location start, mlir::Location end, OpenACCDirectiveKind dirKind,
+ SourceLocation dirLoc, llvm::ArrayRef<const OpenACCClause *> clauses,
+ const Stmt *loopStmt) {
+ mlir::LogicalResult res = mlir::success();
+
+ llvm::SmallVector<mlir::Type> retTy;
+ llvm::SmallVector<mlir::Value> operands;
+
+ auto computeOp = builder.create<Op>(start, retTy, operands);
+ computeOp.setCombinedAttr(builder.getUnitAttr());
+ mlir::acc::LoopOp loopOp;
+
+ // First, emit the bodies of both operations, with the loop inside the body of
+ // the combined construct.
+ {
+ mlir::Block &block = computeOp.getRegion().emplaceBlock();
+ mlir::OpBuilder::InsertionGuard guardCase(builder);
+ builder.setInsertionPointToEnd(&block);
+
+ LexicalScope ls{*this, start, builder.getInsertionBlock()};
+ auto loopOp = builder.create<LoopOp>(start, retTy, operands);
+ loopOp.setCombinedAttr(mlir::acc::CombinedConstructsTypeAttr::get(
+ builder.getContext(), CombinedType<Op>::value));
+
+ {
+ mlir::Block &innerBlock = loopOp.getRegion().emplaceBlock();
+ mlir::OpBuilder::InsertionGuard guardCase(builder);
+ builder.setInsertionPointToEnd(&innerBlock);
+
+ LexicalScope ls{*this, start, builder.getInsertionBlock()};
+ res = emitStmt(loopStmt, /*useCurrentScope=*/true);
+
+ builder.create<mlir::acc::YieldOp>(end);
+ }
+
+ builder.create<TermOp>(end);
+ }
+
+ return res;
+}
+
template <typename Op>
Op CIRGenFunction::emitOpenACCOp(
mlir::Location start, OpenACCDirectiveKind dirKind, SourceLocation dirLoc,
@@ -170,8 +229,25 @@ CIRGenFunction::emitOpenACCWaitConstruct(const OpenACCWaitConstruct &s) {
mlir::LogicalResult CIRGenFunction::emitOpenACCCombinedConstruct(
const OpenACCCombinedConstruct &s) {
- cgm.errorNYI(s.getSourceRange(), "OpenACC Combined Construct");
- return mlir::failure();
+ mlir::Location start = getLoc(s.getSourceRange().getBegin());
+ mlir::Location end = getLoc(s.getSourceRange().getEnd());
+
+ switch (s.getDirectiveKind()) {
+ case OpenACCDirectiveKind::ParallelLoop:
+ return emitOpenACCOpCombinedConstruct<ParallelOp, mlir::acc::YieldOp>(
+ start, end, s.getDirectiveKind(), s.getDirectiveLoc(), s.clauses(),
+ s.getLoop());
+ case OpenACCDirectiveKind::SerialLoop:
+ return emitOpenACCOpCombinedConstruct<SerialOp, mlir::acc::YieldOp>(
+ start, end, s.getDirectiveKind(), s.getDirectiveLoc(), s.clauses(),
+ s.getLoop());
+ case OpenACCDirectiveKind::KernelsLoop:
+ return emitOpenACCOpCombinedConstruct<KernelsOp, mlir::acc::TerminatorOp>(
+ start, end, s.getDirectiveKind(), s.getDirectiveLoc(), s.clauses(),
+ s.getLoop());
+ default:
+ llvm_unreachable("invalid compute construct kind");
+ }
}
mlir::LogicalResult CIRGenFunction::emitOpenACCEnterDataConstruct(
const OpenACCEnterDataConstruct &s) {
diff --git a/clang/test/CIR/CodeGenOpenACC/combined.cpp b/clang/test/CIR/CodeGenOpenACC/combined.cpp
new file mode 100644
index 0000000000000..4ea192cdcc9f0
--- /dev/null
+++ b/clang/test/CIR/CodeGenOpenACC/combined.cpp
@@ -0,0 +1,34 @@
+// RUN: %clang_cc1 -fopenacc -Wno-openacc-self-if-potential-conflict -emit-cir -fclangir %s -o - | FileCheck %s
+
+extern "C" void acc_combined(int N) {
+ // CHECK: cir.func @acc_combined(%[[ARG_N:.*]]: !s32i loc{{.*}}) {
+ // CHECK-NEXT: %[[ALLOCA_N:.*]] = cir.alloca !s32i, !cir.ptr<!s32i>, ["N", init]
+ // CHECK-NEXT: cir.store %[[ARG_N]], %[[ALLOCA_N]] : !s32i, !cir.ptr<!s32i>
+
+#pragma acc parallel loop
+ for(unsigned I = 0; I < N; ++I);
+ // CHECK: acc.parallel combined(loop) {
+ // CHECK: acc.loop combined(parallel) {
+ // CHECK: acc.yield
+ // CHECK-NEXT: } loc
+ // CHECK: acc.yield
+ // CHECK-NEXT: } loc
+
+#pragma acc serial loop
+ for(unsigned I = 0; I < N; ++I);
+ // CHECK: acc.serial combined(loop) {
+ // CHECK: acc.loop combined(serial) {
+ // CHECK: acc.yield
+ // CHECK-NEXT: } loc
+ // CHECK: acc.yield
+ // CHECK-NEXT: } loc
+#pragma acc kernels loop
+ for(unsigned I = 0; I < N; ++I);
+
+ // CHECK: acc.kernels combined(loop) {
+ // CHECK: acc.loop combined(kernels) {
+ // CHECK: acc.yield
+ // CHECK-NEXT: } loc
+ // CHECK: acc.terminator
+ // CHECK-NEXT: } loc
+}
diff --git a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
index e95d4b8bfacbd..c560ab32aac31 100644
--- a/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
+++ b/clang/test/CIR/CodeGenOpenACC/openacc-not-implemented.cpp
@@ -3,11 +3,10 @@
void HelloWorld(int *A, int *B, int *C, int N) {
-// expected-error@+2{{ClangIR code gen Not Yet Implemented: OpenACC Combined Construct}}
+// expected-error@+2{{ClangIR code gen Not Yet Implemented: OpenACC Atomic Construct}}
// expected-error@+1{{ClangIR code gen Not Yet Implemented: statement}}
-#pragma acc parallel loop
- for (unsigned I = 0; I < N; ++I)
- A[I] = B[I] + C[I];
+#pragma acc atomic
+ N = N + 1;
// expected-error@+1{{ClangIR code gen Not Yet Implemented: OpenACC Declare Construct}}
#pragma acc declare create(A)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#operation-categories |
Thanks, corrected! I meant to more say "that different attributes and operation insertions are going to need to happen", but didn't do a good job of that. Corrected the commit message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Combined constructs are emitted a little oddly, in that they are the first ones where there are two operations for a single construct. First, the compute variant is emitted with 'combined(loop)', then the loop operation is emitted with 'combined()'. Each gets its own normal terminator.
This patch does not yet implement clauses at all, since that is going to require special attention to make sure we get the emitting of them correct, since certain clauses go to different locations, and need their insertion-points set correctly. So this patch sets it up so that we will emit the 'not implemented' diagnostic for all clauses.