Skip to content

[LLVM][NVPTX]: Add aligned versions of cluster barriers #77940

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 13, 2024

Conversation

durga4github
Copy link
Contributor

PTX Doc link for these intrinsics:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster

This patch adds the '.aligned' variants of the
barrier.cluster intrinsics. lit tests are added
to verify the generated PTX.

PTX Doc for these intrinsics:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster

This patch adds the '.aligned' variants of the
barrier.cluster intrinsics. lit tests are added
to verify the generated PTX.

Signed-off-by: Durgadoss R <[email protected]>
@llvmbot
Copy link
Member

llvmbot commented Jan 12, 2024

@llvm/pr-subscribers-llvm-ir

Author: Durgadoss R (durga4github)

Changes

PTX Doc link for these intrinsics:
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster

This patch adds the '.aligned' variants of the
barrier.cluster intrinsics. lit tests are added
to verify the generated PTX.


Full diff: https://github.com/llvm/llvm-project/pull/77940.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/IntrinsicsNVVM.td (+8)
  • (modified) llvm/lib/Target/NVPTX/NVPTXIntrinsics.td (+10)
  • (modified) llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll (+13)
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index cf50f2a59f602f..4665a1169ef4ee 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -1372,6 +1372,14 @@ let TargetPrefix = "nvvm" in {
   def int_nvvm_barrier_cluster_wait :
       Intrinsic<[], [], [IntrConvergent, IntrNoCallback]>;
 
+  // 'aligned' versions of the above barrier.cluster.* intrinsics
+  def int_nvvm_barrier_cluster_arrive_aligned :
+      Intrinsic<[], [], [IntrConvergent, IntrNoCallback]>;
+  def int_nvvm_barrier_cluster_arrive_relaxed_aligned :
+      Intrinsic<[], [], [IntrConvergent, IntrNoCallback]>;
+  def int_nvvm_barrier_cluster_wait_aligned :
+      Intrinsic<[], [], [IntrConvergent, IntrNoCallback]>;
+
   // Membar
   def int_nvvm_membar_cta : ClangBuiltin<"__nvvm_membar_cta">,
       Intrinsic<[], [], [IntrNoCallback]>;
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index 6b062a7f39127f..c5dbe350e44472 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -132,6 +132,7 @@ def INT_BARRIER_SYNC_CNT_II : NVPTXInst<(outs), (ins i32imm:$id, i32imm:$cnt),
                  "barrier.sync \t$id, $cnt;",
                  [(int_nvvm_barrier_sync_cnt imm:$id, imm:$cnt)]>,
         Requires<[hasPTX<60>, hasSM<30>]>;
+
 class INT_BARRIER_CLUSTER<string variant, Intrinsic Intr,
                           list<Predicate> Preds = [hasPTX<78>, hasSM<90>]>:
         NVPTXInst<(outs), (ins), "barrier.cluster."# variant #";", [(Intr)]>,
@@ -145,6 +146,15 @@ def barrier_cluster_arrive_relaxed:
 def barrier_cluster_wait:
         INT_BARRIER_CLUSTER<"wait", int_nvvm_barrier_cluster_wait>;
 
+// 'aligned' versions of the cluster barrier intrinsics
+def barrier_cluster_arrive_aligned:
+        INT_BARRIER_CLUSTER<"arrive.aligned", int_nvvm_barrier_cluster_arrive_aligned>;
+def barrier_cluster_arrive_relaxed_aligned:
+        INT_BARRIER_CLUSTER<"arrive.relaxed.aligned",
+        int_nvvm_barrier_cluster_arrive_relaxed_aligned, [hasPTX<80>, hasSM<90>]>;
+def barrier_cluster_wait_aligned:
+        INT_BARRIER_CLUSTER<"wait.aligned", int_nvvm_barrier_cluster_wait_aligned>;
+
 class SHFL_INSTR<bit sync, string mode, string reg, bit return_pred,
                  bit offset_imm, bit mask_imm, bit threadmask_imm>
       : NVPTXInst<(outs), (ins), "?", []> {
diff --git a/llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll b/llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll
index a157616db9fb4f..181fbf21129102 100644
--- a/llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll
+++ b/llvm/test/CodeGen/NVPTX/intrinsics-sm90.ll
@@ -133,6 +133,16 @@ define void @test_barrier_cluster() {
        ret void
 }
 
+; CHECK-LABEL: test_barrier_cluster_aligned(
+define void @test_barrier_cluster_aligned() {
+; CHECK: barrier.cluster.arrive.aligned;
+       call void @llvm.nvvm.barrier.cluster.arrive.aligned()
+; CHECK: barrier.cluster.arrive.relaxed.aligned;
+       call void @llvm.nvvm.barrier.cluster.arrive.relaxed.aligned()
+; CHECK: barrier.cluster.wait.aligned;
+       call void @llvm.nvvm.barrier.cluster.wait.aligned()
+       ret void
+}
 
 declare i1 @llvm.nvvm.isspacep.shared.cluster(ptr %p);
 declare ptr @llvm.nvvm.mapa(ptr %p, i32 %r);
@@ -153,4 +163,7 @@ declare i1 @llvm.nvvm.is_explicit_cluster()
 declare void @llvm.nvvm.barrier.cluster.arrive()
 declare void @llvm.nvvm.barrier.cluster.arrive.relaxed()
 declare void @llvm.nvvm.barrier.cluster.wait()
+declare void @llvm.nvvm.barrier.cluster.arrive.aligned()
+declare void @llvm.nvvm.barrier.cluster.arrive.relaxed.aligned()
+declare void @llvm.nvvm.barrier.cluster.wait.aligned()
 declare void @llvm.nvvm.fence.sc.cluster()

@durga4github
Copy link
Contributor Author

@Artem-B , @grypp , Kindly help review

Copy link
Member

@grypp grypp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. But I would wait @Artem-B to review

@durga4github
Copy link
Contributor Author

Could one of you please merge it?

@grypp grypp merged commit 8d817f6 into llvm:main Jan 13, 2024
@durga4github durga4github deleted the durgadossr/cluster_bar_aligned branch January 15, 2024 09:33
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants