[LLVM][NVPTX] Add support for div.full instruction #116482

Merged

schwarzschild-radius
Contributor

This commit adds NVPTX support for the div.full PTX instruction through two new intrinsics, llvm.nvvm.div.full and llvm.nvvm.div.full.ftz, with a test in div.ll. For more information, see the PTX ISA.

@llvmbot
Member

llvmbot commented Nov 16, 2024

@llvm/pr-subscribers-backend-nvptx

@llvm/pr-subscribers-llvm-ir

Author: Pradeep Kumar (schwarzschild-radius)

Changes

This commit adds NVPTX support for the div.full PTX instruction through two new intrinsics, llvm.nvvm.div.full and llvm.nvvm.div.full.ftz, with a test in div.ll. For more information, see the PTX ISA.


Full diff: https://github.com/llvm/llvm-project/pull/116482.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/IntrinsicsNVVM.td (+7)
  • (modified) llvm/lib/Target/NVPTX/NVPTXIntrinsics.td (+12)
  • (added) llvm/test/CodeGen/NVPTX/div.ll (+10)
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index 115fcee0b04f22..8802ca2534355c 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -820,6 +820,13 @@ let TargetPrefix = "nvvm" in {
       DefaultAttrsIntrinsic<[llvm_double_ty], [llvm_double_ty, llvm_double_ty],
         [IntrNoMem]>;
 
+  def int_nvvm_div_full : ClangBuiltin<"__nvvm_div_full">,
+      DefaultAttrsIntrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty],
+        [IntrNoMem]>;
+  def int_nvvm_div_full_ftz : ClangBuiltin<"__nvvm_div_full_ftz">,
+      DefaultAttrsIntrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty],
+        [IntrNoMem]>;
+
 //
 // Sad
 //
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index 5878940812f62b..5528e7b9fe0dda 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -1096,6 +1096,18 @@ def INT_NVVM_DIV_RM_D : F_MATH_2<"div.rm.f64 \t$dst, $src0, $src1;",
 def INT_NVVM_DIV_RP_D : F_MATH_2<"div.rp.f64 \t$dst, $src0, $src1;",
   Float64Regs, Float64Regs, Float64Regs, int_nvvm_div_rp_d>;
 
+def : Pat<(int_nvvm_div_full Float32Regs:$a, Float32Regs:$b),
+          (FDIV32rr Float32Regs:$a, Float32Regs:$b)>;
+
+def : Pat<(int_nvvm_div_full Float32Regs:$a, fpimm:$b),
+          (FDIV32ri Float32Regs:$a, f32imm:$b)>;
+
+def : Pat<(int_nvvm_div_full_ftz Float32Regs:$a, Float32Regs:$b),
+          (FDIV32rr_ftz Float32Regs:$a, Float32Regs:$b)>;
+
+def : Pat<(int_nvvm_div_full_ftz Float32Regs:$a, fpimm:$b),
+          (FDIV32ri_ftz Float32Regs:$a, f32imm:$b)>;
+
 //
 // Sad
 //
diff --git a/llvm/test/CodeGen/NVPTX/div.ll b/llvm/test/CodeGen/NVPTX/div.ll
new file mode 100644
index 00000000000000..e75461999c65e4
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/div.ll
@@ -0,0 +1,10 @@
+; RUN: llc < %s -march=nvptx64 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 | %ptxas-verify %}
+
+define float @div_full(float %a, float %b) {
+  ; CHECK: div.full.f32 {{%f[0-9]+}}, {{%f[0-9]+}}, {{%f[0-9]+}}
+  %1 = call float @llvm.nvvm.div.full(float %a, float %b)
+  ; CHECK: div.full.ftz.f32 {{%f[0-9]+}}, {{%f[0-9]+}}, {{%f[0-9]+}}
+  %2 = call float @llvm.nvvm.div.full.ftz(float %1, float %b)
+  ret float %2
+}
\ No newline at end of file
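
The test in this diff exercises only the register-register patterns. A minimal sketch of an extra case for constant divisors (the `fpimm` patterns in NVPTXIntrinsics.td) might look like the following. The function name `div_full_imm` and the constants are illustrative assumptions and are not part of this PR. The CHECK lines match only the opcode, because this sketch does not assume whether the constant is emitted as an immediate operand or materialized in a register first.

```llvm
; RUN: llc < %s -march=nvptx64 | FileCheck %s

; Hypothetical extra test, not part of the PR: constant divisors.
define float @div_full_imm(float %a) {
  ; CHECK-LABEL: div_full_imm
  ; CHECK: div.full.f32
  %1 = call float @llvm.nvvm.div.full(float %a, float 3.0)
  ; CHECK: div.full.ftz.f32
  %2 = call float @llvm.nvvm.div.full.ftz(float %1, float 4.0)
  ret float %2
}

; Explicit declarations of the two intrinsics added by this PR.
declare float @llvm.nvvm.div.full(float, float)
declare float @llvm.nvvm.div.full.ftz(float, float)
```

If the immediate patterns are expected to fire, the CHECK lines could be tightened to match the operand form once the actual llc output has been inspected.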

Member

@Artem-B left a comment

LGTM w/ a test nit.

@schwarzschild-radius merged commit e846148 into llvm:main on Nov 26, 2024
8 checks passed