[NVPTX] Use prmt.f4e to lower pointer alignment fshr idiom #143407

AlexMaclean · 2025-06-09T16:33:58Z

No description provided.

llvmbot · 2025-06-09T16:34:33Z

@llvm/pr-subscribers-backend-nvptx

Author: Alex MacLean (AlexMaclean)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/143407.diff

2 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (+4)
(modified) llvm/test/CodeGen/NVPTX/prmt.ll (+19-2)

diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index b646d39194c7e..a2c85feda1d18 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -1636,6 +1636,10 @@ let hasSideEffects = false in {
 
 }
 
+// PRMT folding patterns
+def : Pat<(fshr i32:$hi, i32:$lo, (shl i32:$amt, (i32 3))),
+          (PRMT_B32rrr $lo, $hi, $amt, PrmtF4E)>;
+
 
 // byte extraction + signed/unsigned extension to i32.
 def : Pat<(i32 (sext_inreg (bfe i32:$s, i32:$o, 8), i8)),
diff --git a/llvm/test/CodeGen/NVPTX/prmt.ll b/llvm/test/CodeGen/NVPTX/prmt.ll
index 271e4c86cd23e..48b9eefb9fb30 100644
--- a/llvm/test/CodeGen/NVPTX/prmt.ll
+++ b/llvm/test/CodeGen/NVPTX/prmt.ll
@@ -1,6 +1,6 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
-; RUN: llc < %s -verify-machineinstrs | FileCheck %s
-; RUN: %if ptxas %{ llc < %s -verify-machineinstrs | %ptxas-verify %}
+; RUN: llc < %s -verify-machineinstrs -mcpu=sm_50 | FileCheck %s
+; RUN: %if ptxas %{ llc < %s -verify-machineinstrs -mcpu=sm_50 | %ptxas-verify %}
 
 target triple = "nvptx64-nvidia-cuda"
 
@@ -111,3 +111,20 @@ define i32 @test_prmt_rc16(i32 %lo, i32 %selector) {
   %val = call i32 @llvm.nvvm.prmt.rc16(i32 %lo, i32 %selector)
   ret i32 %val
 }
+
+define i32 @test_prmt_f4e_folding(i32 %lo, i32 %hi, i32 %ptr) {
+; CHECK-LABEL: test_prmt_f4e_folding(
+; CHECK:       {
+; CHECK-NEXT:    .reg .b32 %r<5>;
+; CHECK-EMPTY:
+; CHECK-NEXT:  // %bb.0:
+; CHECK-NEXT:    ld.param.b32 %r1, [test_prmt_f4e_folding_param_0];
+; CHECK-NEXT:    ld.param.b32 %r2, [test_prmt_f4e_folding_param_1];
+; CHECK-NEXT:    ld.param.b32 %r3, [test_prmt_f4e_folding_param_2];
+; CHECK-NEXT:    prmt.b32.f4e %r4, %r1, %r2, %r3;
+; CHECK-NEXT:    st.param.b32 [func_retval0], %r4;
+; CHECK-NEXT:    ret;
+  %sh_amt = shl i32 %ptr, 3
+  %val = call i32 @llvm.fshr.i32(i32 %hi, i32 %lo, i32 %sh_amt)
+  ret i32 %val
+}
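
Why the fold is sound: `fshr(hi, lo, amt << 3)` shifts the 64-bit value `hi:lo` right by `(8 * amt) mod 32` bits, which is a shift of `amt mod 4` whole bytes. That appears to match what `prmt.b32.f4e` does with `lo` as the first source, `hi` as the second, and `amt` as the selector. The sketch below is a small Python model written for illustration only and is not part of the patch; the helper names `fshr32`, `prmt_f4e`, and `check_equivalence` are hypothetical, and the `prmt_f4e` model assumes the forward-4-extract behavior as described in the PTX ISA documentation (only the low two bits of the selector are used).

```python
MASK32 = 0xFFFFFFFF


def fshr32(hi, lo, amt):
    """Model of llvm.fshr.i32: shift hi:lo right by amt mod 32, keep the low 32 bits."""
    concat = ((hi & MASK32) << 32) | (lo & MASK32)
    return (concat >> (amt % 32)) & MASK32


def prmt_f4e(a, b, c):
    """Assumed model of PTX prmt.b32.f4e: take 4 consecutive bytes of {b, a},
    starting at byte index c & 3 (a holds bytes 0..3, b holds bytes 4..7)."""
    src = ((b & MASK32) << 32) | (a & MASK32)
    return (src >> (8 * (c & 3))) & MASK32


def check_equivalence(hi, lo, amounts):
    """Compare fshr(hi, lo, amt << 3) with prmt.f4e(lo, hi, amt) for each amount."""
    return all(
        fshr32(hi, lo, (amt << 3) & MASK32) == prmt_f4e(lo, hi, amt)
        for amt in amounts
    )


if __name__ == "__main__":
    # Covers selector values 0..3 and larger amounts that wrap around.
    print(check_equivalence(0x11223344, 0xAABBCCDD, range(64)))
```

Note the operand order: the pattern passes `$lo` first and `$hi` second to `PRMT_B32rrr`, which is mirrored by `prmt_f4e(lo, hi, amt)` in the model.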

justinfargnoli

Ignorable nit: it would be nice to also test the other cases (0, 1, 2) and to add negative test cases.

LGTM regardless :)

[NVPTX] Use prmt.f4e to lower pointer alignment fshr idiom

85b6f5b

AlexMaclean requested review from Artem-B and justinfargnoli June 9, 2025 16:33

AlexMaclean self-assigned this Jun 9, 2025

llvmbot added the backend:NVPTX label Jun 9, 2025

justinfargnoli approved these changes Jun 9, 2025

Artem-B approved these changes Jun 9, 2025

AlexMaclean merged commit 4e80a03 into llvm:main Jun 13, 2025
9 checks passed

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025

[NVPTX] Use prmt.f4e to lower pointer alignment fshr idiom (llvm#143407)

706530a

akuhlens pushed a commit to akuhlens/llvm-project that referenced this pull request Jun 24, 2025

[NVPTX] Use prmt.f4e to lower pointer alignment fshr idiom (llvm#143407)

551b254
