
[SelectionDAG][RISCV] Preserve nneg flag when folding (trunc (zext X))->(zext X). #144807


Merged
merged 2 commits into llvm:main from pr/zext-nneg
Jun 19, 2025

Conversation

topperc
Collaborator

@topperc topperc commented Jun 18, 2025

If X is known non-negative, that's still true if we fold the truncate
to create a smaller zext.

In the i128 tests, SelectionDAGBuilder aggressively truncates the
zext nneg to i64 to match getShiftAmountTy. If we don't preserve
the nneg, we can't see that the shift amount argument being signext
means we don't need to do any extension.

topperc added 2 commits June 18, 2025 15:06
…)->(zext X).

If X is known non-negative, that's still true if we fold the truncate
to create a smaller zext.

In the i128 tests, SelectionDAGBuilder aggressively truncates the
zext nneg to i64 to match getShiftAmountTy. If we don't preserve
the nneg, we can't see that the shift amount argument being signext
means we don't need to do any extension.
@topperc topperc requested review from nikic, preames and RKSimon June 18, 2025 22:18
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Jun 18, 2025
@llvmbot
Member

llvmbot commented Jun 18, 2025

@llvm/pr-subscribers-llvm-selectiondag

Author: Craig Topper (topperc)

Changes

If X is known non-negative, that's still true if we fold the truncate
to create a smaller zext.

In the i128 tests, SelectionDAGBuilder aggressively truncates the
zext nneg to i64 to match getShiftAmountTy. If we don't preserve
the nneg, we can't see that the shift amount argument being signext
means we don't need to do any extension.


Full diff: https://github.com/llvm/llvm-project/pull/144807.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+6-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+6-2)
  • (modified) llvm/test/CodeGen/RISCV/shifts.ll (+295)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 0e078f9dd88b4..a6b9cc81edde6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -15740,8 +15740,12 @@ SDValue DAGCombiner::visitTRUNCATE(SDNode *N) {
       N0.getOpcode() == ISD::SIGN_EXTEND ||
       N0.getOpcode() == ISD::ANY_EXTEND) {
     // if the source is smaller than the dest, we still need an extend.
-    if (N0.getOperand(0).getValueType().bitsLT(VT))
-      return DAG.getNode(N0.getOpcode(), DL, VT, N0.getOperand(0));
+    if (N0.getOperand(0).getValueType().bitsLT(VT)) {
+      SDNodeFlags Flags;
+      if (N0.getOpcode() == ISD::ZERO_EXTEND)
+        Flags.setNonNeg(N0->getFlags().hasNonNeg());
+      return DAG.getNode(N0.getOpcode(), DL, VT, N0.getOperand(0), Flags);
+    }
     // if the source is larger than the dest, than we just need the truncate.
     if (N0.getOperand(0).getValueType().bitsGT(VT))
       return DAG.getNode(ISD::TRUNCATE, DL, VT, N0.getOperand(0));
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index b0e3f534e2aaa..5d8db8be9731f 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -6474,8 +6474,12 @@ SDValue SelectionDAG::getNode(unsigned Opcode, const SDLoc &DL, EVT VT,
         OpOpcode == ISD::ANY_EXTEND) {
       // If the source is smaller than the dest, we still need an extend.
       if (N1.getOperand(0).getValueType().getScalarType().bitsLT(
-              VT.getScalarType()))
-        return getNode(OpOpcode, DL, VT, N1.getOperand(0));
+              VT.getScalarType())) {
+        SDNodeFlags Flags;
+        if (OpOpcode == ISD::ZERO_EXTEND)
+          Flags.setNonNeg(N1->getFlags().hasNonNeg());
+        return getNode(OpOpcode, DL, VT, N1.getOperand(0), Flags);
+      }
       if (N1.getOperand(0).getValueType().bitsGT(VT))
         return getNode(ISD::TRUNCATE, DL, VT, N1.getOperand(0));
       return N1.getOperand(0);
diff --git a/llvm/test/CodeGen/RISCV/shifts.ll b/llvm/test/CodeGen/RISCV/shifts.ll
index 249dabba0cc28..32a037918a5a7 100644
--- a/llvm/test/CodeGen/RISCV/shifts.ll
+++ b/llvm/test/CodeGen/RISCV/shifts.ll
@@ -484,3 +484,298 @@ define i128 @fshr128_minsize(i128 %a, i128 %b) minsize nounwind {
   %res = tail call i128 @llvm.fshr.i128(i128 %a, i128 %a, i128 %b)
   ret i128 %res
 }
+
+define i64 @lshr64_shamt32(i64 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: lshr64_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi a4, a2, -32
+; RV32I-NEXT:    srl a3, a1, a2
+; RV32I-NEXT:    bltz a4, .LBB11_2
+; RV32I-NEXT:  # %bb.1:
+; RV32I-NEXT:    mv a0, a3
+; RV32I-NEXT:    j .LBB11_3
+; RV32I-NEXT:  .LBB11_2:
+; RV32I-NEXT:    srl a0, a0, a2
+; RV32I-NEXT:    not a2, a2
+; RV32I-NEXT:    slli a1, a1, 1
+; RV32I-NEXT:    sll a1, a1, a2
+; RV32I-NEXT:    or a0, a0, a1
+; RV32I-NEXT:  .LBB11_3:
+; RV32I-NEXT:    srai a1, a4, 31
+; RV32I-NEXT:    and a1, a1, a3
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: lshr64_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    srl a0, a0, a1
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i64
+  %1 = lshr i64 %a, %zext
+  ret i64 %1
+}
+
+define i64 @ashr64_shamt32(i64 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: ashr64_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    mv a3, a1
+; RV32I-NEXT:    addi a4, a2, -32
+; RV32I-NEXT:    sra a1, a1, a2
+; RV32I-NEXT:    bltz a4, .LBB12_2
+; RV32I-NEXT:  # %bb.1:
+; RV32I-NEXT:    srai a3, a3, 31
+; RV32I-NEXT:    mv a0, a1
+; RV32I-NEXT:    mv a1, a3
+; RV32I-NEXT:    ret
+; RV32I-NEXT:  .LBB12_2:
+; RV32I-NEXT:    srl a0, a0, a2
+; RV32I-NEXT:    not a2, a2
+; RV32I-NEXT:    slli a3, a3, 1
+; RV32I-NEXT:    sll a2, a3, a2
+; RV32I-NEXT:    or a0, a0, a2
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: ashr64_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sra a0, a0, a1
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i64
+  %1 = ashr i64 %a, %zext
+  ret i64 %1
+}
+
+define i64 @shl64_shamt32(i64 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: shl64_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi a4, a2, -32
+; RV32I-NEXT:    sll a3, a0, a2
+; RV32I-NEXT:    bltz a4, .LBB13_2
+; RV32I-NEXT:  # %bb.1:
+; RV32I-NEXT:    mv a1, a3
+; RV32I-NEXT:    j .LBB13_3
+; RV32I-NEXT:  .LBB13_2:
+; RV32I-NEXT:    sll a1, a1, a2
+; RV32I-NEXT:    not a2, a2
+; RV32I-NEXT:    srli a0, a0, 1
+; RV32I-NEXT:    srl a0, a0, a2
+; RV32I-NEXT:    or a1, a1, a0
+; RV32I-NEXT:  .LBB13_3:
+; RV32I-NEXT:    srai a0, a4, 31
+; RV32I-NEXT:    and a0, a0, a3
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: shl64_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    sll a0, a0, a1
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i64
+  %1 = shl i64 %a, %zext
+  ret i64 %1
+}
+
+define i128 @lshr128_shamt32(i128 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: lshr128_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -32
+; RV32I-NEXT:    lw a3, 0(a1)
+; RV32I-NEXT:    lw a4, 4(a1)
+; RV32I-NEXT:    lw a5, 8(a1)
+; RV32I-NEXT:    lw a1, 12(a1)
+; RV32I-NEXT:    sw zero, 16(sp)
+; RV32I-NEXT:    sw zero, 20(sp)
+; RV32I-NEXT:    sw zero, 24(sp)
+; RV32I-NEXT:    sw zero, 28(sp)
+; RV32I-NEXT:    srli a6, a2, 3
+; RV32I-NEXT:    mv a7, sp
+; RV32I-NEXT:    andi t0, a2, 31
+; RV32I-NEXT:    andi a6, a6, 12
+; RV32I-NEXT:    xori t0, t0, 31
+; RV32I-NEXT:    add a6, a7, a6
+; RV32I-NEXT:    sw a3, 0(sp)
+; RV32I-NEXT:    sw a4, 4(sp)
+; RV32I-NEXT:    sw a5, 8(sp)
+; RV32I-NEXT:    sw a1, 12(sp)
+; RV32I-NEXT:    lw a1, 0(a6)
+; RV32I-NEXT:    lw a3, 4(a6)
+; RV32I-NEXT:    lw a4, 8(a6)
+; RV32I-NEXT:    lw a5, 12(a6)
+; RV32I-NEXT:    srl a1, a1, a2
+; RV32I-NEXT:    slli a6, a3, 1
+; RV32I-NEXT:    srl a3, a3, a2
+; RV32I-NEXT:    slli a7, a4, 1
+; RV32I-NEXT:    srl a4, a4, a2
+; RV32I-NEXT:    srl a2, a5, a2
+; RV32I-NEXT:    slli a5, a5, 1
+; RV32I-NEXT:    sll a6, a6, t0
+; RV32I-NEXT:    sll a7, a7, t0
+; RV32I-NEXT:    sll a5, a5, t0
+; RV32I-NEXT:    or a1, a1, a6
+; RV32I-NEXT:    or a3, a3, a7
+; RV32I-NEXT:    or a4, a4, a5
+; RV32I-NEXT:    sw a1, 0(a0)
+; RV32I-NEXT:    sw a3, 4(a0)
+; RV32I-NEXT:    sw a4, 8(a0)
+; RV32I-NEXT:    sw a2, 12(a0)
+; RV32I-NEXT:    addi sp, sp, 32
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: lshr128_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi a4, a2, -64
+; RV64I-NEXT:    srl a3, a1, a2
+; RV64I-NEXT:    bltz a4, .LBB14_2
+; RV64I-NEXT:  # %bb.1:
+; RV64I-NEXT:    mv a0, a3
+; RV64I-NEXT:    j .LBB14_3
+; RV64I-NEXT:  .LBB14_2:
+; RV64I-NEXT:    srl a0, a0, a2
+; RV64I-NEXT:    not a2, a2
+; RV64I-NEXT:    slli a1, a1, 1
+; RV64I-NEXT:    sll a1, a1, a2
+; RV64I-NEXT:    or a0, a0, a1
+; RV64I-NEXT:  .LBB14_3:
+; RV64I-NEXT:    srai a1, a4, 63
+; RV64I-NEXT:    and a1, a1, a3
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i128
+  %1 = lshr i128 %a, %zext
+  ret i128 %1
+}
+
+define i128 @ashr128_shamt32(i128 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: ashr128_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -32
+; RV32I-NEXT:    lw a3, 0(a1)
+; RV32I-NEXT:    lw a4, 4(a1)
+; RV32I-NEXT:    lw a5, 8(a1)
+; RV32I-NEXT:    lw a1, 12(a1)
+; RV32I-NEXT:    srli a6, a2, 3
+; RV32I-NEXT:    mv a7, sp
+; RV32I-NEXT:    andi t0, a2, 31
+; RV32I-NEXT:    andi a6, a6, 12
+; RV32I-NEXT:    xori t0, t0, 31
+; RV32I-NEXT:    add a6, a7, a6
+; RV32I-NEXT:    sw a3, 0(sp)
+; RV32I-NEXT:    sw a4, 4(sp)
+; RV32I-NEXT:    sw a5, 8(sp)
+; RV32I-NEXT:    sw a1, 12(sp)
+; RV32I-NEXT:    srai a1, a1, 31
+; RV32I-NEXT:    sw a1, 16(sp)
+; RV32I-NEXT:    sw a1, 20(sp)
+; RV32I-NEXT:    sw a1, 24(sp)
+; RV32I-NEXT:    sw a1, 28(sp)
+; RV32I-NEXT:    lw a1, 0(a6)
+; RV32I-NEXT:    lw a3, 4(a6)
+; RV32I-NEXT:    lw a4, 8(a6)
+; RV32I-NEXT:    lw a5, 12(a6)
+; RV32I-NEXT:    srl a1, a1, a2
+; RV32I-NEXT:    slli a6, a3, 1
+; RV32I-NEXT:    srl a3, a3, a2
+; RV32I-NEXT:    slli a7, a4, 1
+; RV32I-NEXT:    srl a4, a4, a2
+; RV32I-NEXT:    sra a2, a5, a2
+; RV32I-NEXT:    slli a5, a5, 1
+; RV32I-NEXT:    sll a6, a6, t0
+; RV32I-NEXT:    sll a7, a7, t0
+; RV32I-NEXT:    sll a5, a5, t0
+; RV32I-NEXT:    or a1, a1, a6
+; RV32I-NEXT:    or a3, a3, a7
+; RV32I-NEXT:    or a4, a4, a5
+; RV32I-NEXT:    sw a1, 0(a0)
+; RV32I-NEXT:    sw a3, 4(a0)
+; RV32I-NEXT:    sw a4, 8(a0)
+; RV32I-NEXT:    sw a2, 12(a0)
+; RV32I-NEXT:    addi sp, sp, 32
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: ashr128_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    mv a3, a1
+; RV64I-NEXT:    addi a4, a2, -64
+; RV64I-NEXT:    sra a1, a1, a2
+; RV64I-NEXT:    bltz a4, .LBB15_2
+; RV64I-NEXT:  # %bb.1:
+; RV64I-NEXT:    srai a3, a3, 63
+; RV64I-NEXT:    mv a0, a1
+; RV64I-NEXT:    mv a1, a3
+; RV64I-NEXT:    ret
+; RV64I-NEXT:  .LBB15_2:
+; RV64I-NEXT:    srl a0, a0, a2
+; RV64I-NEXT:    not a2, a2
+; RV64I-NEXT:    slli a3, a3, 1
+; RV64I-NEXT:    sll a2, a3, a2
+; RV64I-NEXT:    or a0, a0, a2
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i128
+  %1 = ashr i128 %a, %zext
+  ret i128 %1
+}
+
+define i128 @shl128_shamt32(i128 %a, i32 signext %b) nounwind {
+; RV32I-LABEL: shl128_shamt32:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    addi sp, sp, -32
+; RV32I-NEXT:    lw a3, 0(a1)
+; RV32I-NEXT:    lw a4, 4(a1)
+; RV32I-NEXT:    lw a5, 8(a1)
+; RV32I-NEXT:    lw a1, 12(a1)
+; RV32I-NEXT:    sw zero, 0(sp)
+; RV32I-NEXT:    sw zero, 4(sp)
+; RV32I-NEXT:    sw zero, 8(sp)
+; RV32I-NEXT:    sw zero, 12(sp)
+; RV32I-NEXT:    srli a6, a2, 3
+; RV32I-NEXT:    addi a7, sp, 16
+; RV32I-NEXT:    andi t0, a2, 31
+; RV32I-NEXT:    andi a6, a6, 12
+; RV32I-NEXT:    sub a6, a7, a6
+; RV32I-NEXT:    sw a3, 16(sp)
+; RV32I-NEXT:    sw a4, 20(sp)
+; RV32I-NEXT:    sw a5, 24(sp)
+; RV32I-NEXT:    sw a1, 28(sp)
+; RV32I-NEXT:    lw a1, 0(a6)
+; RV32I-NEXT:    lw a3, 4(a6)
+; RV32I-NEXT:    lw a4, 8(a6)
+; RV32I-NEXT:    lw a5, 12(a6)
+; RV32I-NEXT:    xori a6, t0, 31
+; RV32I-NEXT:    sll a7, a3, a2
+; RV32I-NEXT:    srli t0, a1, 1
+; RV32I-NEXT:    sll a5, a5, a2
+; RV32I-NEXT:    sll a1, a1, a2
+; RV32I-NEXT:    sll a2, a4, a2
+; RV32I-NEXT:    srli a3, a3, 1
+; RV32I-NEXT:    srli a4, a4, 1
+; RV32I-NEXT:    srl t0, t0, a6
+; RV32I-NEXT:    srl a3, a3, a6
+; RV32I-NEXT:    srl a4, a4, a6
+; RV32I-NEXT:    or a6, a7, t0
+; RV32I-NEXT:    or a2, a2, a3
+; RV32I-NEXT:    or a4, a5, a4
+; RV32I-NEXT:    sw a1, 0(a0)
+; RV32I-NEXT:    sw a6, 4(a0)
+; RV32I-NEXT:    sw a2, 8(a0)
+; RV32I-NEXT:    sw a4, 12(a0)
+; RV32I-NEXT:    addi sp, sp, 32
+; RV32I-NEXT:    ret
+;
+; RV64I-LABEL: shl128_shamt32:
+; RV64I:       # %bb.0:
+; RV64I-NEXT:    addi a4, a2, -64
+; RV64I-NEXT:    sll a3, a0, a2
+; RV64I-NEXT:    bltz a4, .LBB16_2
+; RV64I-NEXT:  # %bb.1:
+; RV64I-NEXT:    mv a1, a3
+; RV64I-NEXT:    j .LBB16_3
+; RV64I-NEXT:  .LBB16_2:
+; RV64I-NEXT:    sll a1, a1, a2
+; RV64I-NEXT:    not a2, a2
+; RV64I-NEXT:    srli a0, a0, 1
+; RV64I-NEXT:    srl a0, a0, a2
+; RV64I-NEXT:    or a1, a1, a0
+; RV64I-NEXT:  .LBB16_3:
+; RV64I-NEXT:    srai a0, a4, 63
+; RV64I-NEXT:    and a0, a0, a3
+; RV64I-NEXT:    ret
+  %zext = zext nneg i32 %b to i128
+  %1 = shl i128 %a, %zext
+  ret i128 %1
+}

Collaborator

@RKSimon RKSimon left a comment


LGTM

@topperc topperc merged commit 5eb24fd into llvm:main Jun 19, 2025
9 checks passed
@topperc topperc deleted the pr/zext-nneg branch June 19, 2025 15:06
@llvm-ci
Collaborator

llvm-ci commented Jun 19, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/14825

Here is the relevant piece of the build log for reference:
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x59153dd66fa0 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x59153dd66fa0 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

Labels
llvm:SelectionDAG SelectionDAGISel as well
5 participants