[PHIElimination] Reuse existing COPY in predecessor basic block #131837


Merged
merged 3 commits on Jun 29, 2025

Conversation

guy-david
Contributor

@guy-david commented Mar 18, 2025

The insertion point of the COPY isn't always optimal and can eventually lead to a worse block layout; see the regression test in the first commit.

This change affects many architectures, but the total number of instructions across the test cases appears to be slightly lower.
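
In rough terms (a sketch based on the new PHIElimination-reuse-copy.mir test below; the virtual-register names are illustrative), PHI elimination used to append a fresh COPY at the end of the predecessor block even when that block already ended with a COPY defining the PHI's source vreg. With this change, the existing COPY is simply retargeted to the incoming register:

    ; Before: PHI lowering adds a second, back-to-back COPY in bb.1.
    bb.1:
      %x:gpr32 = COPY $wzr
      %incoming:gpr32 = COPY killed %x   ; inserted by PHI elimination
    bb.2:
      ; uses %incoming

    ; After: the original COPY's destination is rewritten, so no extra copy is emitted.
    bb.1:
      %incoming:gpr32 = COPY $wzr
    bb.2:
      ; uses %incoming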

@llvmbot
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-regalloc
@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-hexagon

Author: Guy David (guy-david)

Changes

The insertion point of the COPY isn't always optimal and could lead to a worse block layout; see the regression test in the first commit (which needs to be reduced).


Patch is 2.30 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131837.diff

127 Files Affected:

  • (modified) llvm/lib/CodeGen/PHIElimination.cpp (+9)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-outline_atomics.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-rcpc.ll (+24-24)
  • (modified) llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-v8a.ll (+24-24)
  • (modified) llvm/test/CodeGen/AArch64/PHIElimination-debugloc.mir (+1-1)
  • (added) llvm/test/CodeGen/AArch64/PHIElimination-reuse-copy.mir (+35)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/atomicrmw-O0.ll (+30-30)
  • (modified) llvm/test/CodeGen/AArch64/bfis-in-loop.ll (+1-1)
  • (added) llvm/test/CodeGen/AArch64/block-layout-regression.mir (+107)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-crash.ll (+15-15)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions-predicated-scalable.ll (+14-14)
  • (modified) llvm/test/CodeGen/AArch64/complex-deinterleaving-reductions.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/phi.ll (+20-20)
  • (modified) llvm/test/CodeGen/AArch64/pr48188.ll (+6-6)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-csr.ll (+11-11)
  • (modified) llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll (+56-57)
  • (modified) llvm/test/CodeGen/AArch64/reduce-or-opt.ll (+12-12)
  • (modified) llvm/test/CodeGen/AArch64/sink-and-fold.ll (+3-3)
  • (modified) llvm/test/CodeGen/AArch64/sve-lsrchain.ll (+7-7)
  • (modified) llvm/test/CodeGen/AArch64/sve-ptest-removal-sink.ll (+4-4)
  • (modified) llvm/test/CodeGen/AArch64/swifterror.ll (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/tbl-loops.ll (+8-8)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmax.ll (+74-72)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_fmin.ll (+74-72)
  • (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/divergence-temporal-divergent-i1.ll (+7-7)
  • (modified) llvm/test/CodeGen/AMDGPU/atomic_optimizations_global_pointer.ll (+832-789)
  • (modified) llvm/test/CodeGen/AMDGPU/branch-folding-implicit-def-subreg.ll (+110-100)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fadd.ll (+1387-1378)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmax.ll (+924-908)
  • (modified) llvm/test/CodeGen/AMDGPU/buffer-fat-pointer-atomicrmw-fmin.ll (+924-908)
  • (modified) llvm/test/CodeGen/AMDGPU/div_i128.ll (+914-922)
  • (modified) llvm/test/CodeGen/AMDGPU/div_v2i128.ll (+114-114)
  • (modified) llvm/test/CodeGen/AMDGPU/divergent-branch-uniform-condition.ll (+1-1)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll (+29-33)
  • (modified) llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll (+29-33)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll (+952-950)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll (+658-656)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll (+658-656)
  • (modified) llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll (+793-791)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i32_system.ll (+323-323)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_i64_system.ll (+461-461)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fadd.ll (+255-255)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmax.ll (+225-225)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fmin.ll (+225-225)
  • (modified) llvm/test/CodeGen/AMDGPU/global_atomics_scan_fsub.ll (+227-227)
  • (modified) llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll (+62-77)
  • (modified) llvm/test/CodeGen/AMDGPU/move-to-valu-atomicrmw-system.ll (+17-17)
  • (modified) llvm/test/CodeGen/AMDGPU/mul.ll (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/rem_i128.ll (+869-871)
  • (modified) llvm/test/CodeGen/AMDGPU/sdiv64.ll (+117-117)
  • (modified) llvm/test/CodeGen/AMDGPU/srem64.ll (+117-117)
  • (modified) llvm/test/CodeGen/AMDGPU/udiv64.ll (+105-105)
  • (modified) llvm/test/CodeGen/AMDGPU/urem64.ll (+89-89)
  • (modified) llvm/test/CodeGen/AMDGPU/vni8-across-blocks.ll (+42-41)
  • (modified) llvm/test/CodeGen/AMDGPU/wave32.ll (+4-4)
  • (modified) llvm/test/CodeGen/ARM/and-cmp0-sink.ll (+11-11)
  • (modified) llvm/test/CodeGen/ARM/cttz.ll (+46-46)
  • (modified) llvm/test/CodeGen/ARM/select-imm.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/struct-byval-loop.ll (+8-8)
  • (modified) llvm/test/CodeGen/ARM/swifterror.ll (+154-154)
  • (modified) llvm/test/CodeGen/AVR/bug-81911.ll (+17-17)
  • (modified) llvm/test/CodeGen/Hexagon/swp-conv3x3-nested.ll (+1-2)
  • (modified) llvm/test/CodeGen/Hexagon/swp-epilog-phi7.ll (+1)
  • (modified) llvm/test/CodeGen/Hexagon/swp-matmul-bitext.ll (+1-1)
  • (modified) llvm/test/CodeGen/Hexagon/swp-stages4.ll (+2-5)
  • (modified) llvm/test/CodeGen/Hexagon/tinycore.ll (+8-3)
  • (modified) llvm/test/CodeGen/LoongArch/machinelicm-address-pseudos.ll (+28-28)
  • (modified) llvm/test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir (+1-2)
  • (modified) llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll (+3-3)
  • (modified) llvm/test/CodeGen/PowerPC/phi-eliminate.mir (+3-6)
  • (modified) llvm/test/CodeGen/PowerPC/ppcf128-freeze.mir (+15-15)
  • (modified) llvm/test/CodeGen/PowerPC/pr116071.ll (+18-7)
  • (modified) llvm/test/CodeGen/PowerPC/sms-phi-2.ll (+6-7)
  • (modified) llvm/test/CodeGen/PowerPC/sms-phi-3.ll (+12-12)
  • (modified) llvm/test/CodeGen/PowerPC/stack-restore-with-setjmp.ll (+4-6)
  • (modified) llvm/test/CodeGen/PowerPC/subreg-postra-2.ll (+9-9)
  • (modified) llvm/test/CodeGen/PowerPC/vsx.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/abds.ll (+100-100)
  • (modified) llvm/test/CodeGen/RISCV/machine-pipeliner.ll (+13-11)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+60-60)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-strided-load-store-asm.ll (+30-31)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/xcvbi.ll (+30-30)
  • (modified) llvm/test/CodeGen/SystemZ/swifterror.ll (+2-2)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll (+48-48)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/tail-pred-disabled-in-loloops.ll (+22-22)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/varying-outer-2d-reduction.ll (+16-16)
  • (modified) llvm/test/CodeGen/Thumb2/LowOverheadLoops/while-loops.ll (+53-58)
  • (modified) llvm/test/CodeGen/Thumb2/mve-blockplacement.ll (+9-12)
  • (modified) llvm/test/CodeGen/Thumb2/mve-float32regloops.ll (+23-20)
  • (modified) llvm/test/CodeGen/Thumb2/mve-laneinterleaving-reduct.ll (+4-4)
  • (modified) llvm/test/CodeGen/Thumb2/mve-memtp-loop.ll (+50-51)
  • (modified) llvm/test/CodeGen/Thumb2/mve-phireg.ll (+7-7)
  • (modified) llvm/test/CodeGen/Thumb2/mve-pipelineloops.ll (+41-44)
  • (modified) llvm/test/CodeGen/Thumb2/mve-postinc-dct.ll (+8-11)
  • (modified) llvm/test/CodeGen/Thumb2/mve-postinc-distribute.ll (+9-8)
  • (modified) llvm/test/CodeGen/Thumb2/mve-postinc-lsr.ll (+22-22)
  • (modified) llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll (+17-16)
  • (modified) llvm/test/CodeGen/Thumb2/pr52817.ll (+8-8)
  • (modified) llvm/test/CodeGen/VE/Scalar/br_jt.ll (+19-19)
  • (modified) llvm/test/CodeGen/X86/2012-01-10-UndefExceptionEdge.ll (+2-2)
  • (modified) llvm/test/CodeGen/X86/AMX/amx-ldtilecfg-insert.ll (+9-9)
  • (modified) llvm/test/CodeGen/X86/AMX/amx-spill-merge.ll (+16-16)
  • (modified) llvm/test/CodeGen/X86/atomic32.ll (+72-54)
  • (modified) llvm/test/CodeGen/X86/atomic64.ll (+20-15)
  • (modified) llvm/test/CodeGen/X86/atomic6432.ll (+36-36)
  • (modified) llvm/test/CodeGen/X86/callbr-asm-branch-folding.ll (+4-4)
  • (modified) llvm/test/CodeGen/X86/callbr-asm-kill.mir (+3-6)
  • (modified) llvm/test/CodeGen/X86/coalescer-breaks-subreg-to-reg-liveness-reduced.ll (+1-1)
  • (modified) llvm/test/CodeGen/X86/combine-pmuldq.ll (+4-4)
  • (modified) llvm/test/CodeGen/X86/fp128-select.ll (+11-10)
  • (modified) llvm/test/CodeGen/X86/madd.ll (+58-58)
  • (modified) llvm/test/CodeGen/X86/masked_load.ll (+13-14)
  • (modified) llvm/test/CodeGen/X86/min-legal-vector-width.ll (+15-15)
  • (modified) llvm/test/CodeGen/X86/pcsections-atomics.ll (+158-138)
  • (modified) llvm/test/CodeGen/X86/pr15705.ll (+9-8)
  • (modified) llvm/test/CodeGen/X86/pr32256.ll (+6-6)
  • (modified) llvm/test/CodeGen/X86/pr38795.ll (+9-6)
  • (modified) llvm/test/CodeGen/X86/pr49451.ll (+3-3)
  • (modified) llvm/test/CodeGen/X86/pr63108.ll (+1-1)
  • (modified) llvm/test/CodeGen/X86/sad.ll (+13-13)
  • (modified) llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll (+40-48)
  • (modified) llvm/test/CodeGen/X86/statepoint-cmp-sunk-past-statepoint.ll (+1-1)
  • (modified) llvm/test/CodeGen/X86/swifterror.ll (+9-8)
  • (modified) llvm/test/DebugInfo/MIR/InstrRef/phi-regallocd-to-stack.mir (+3-4)
  • (modified) llvm/test/Transforms/LoopStrengthReduce/RISCV/lsr-drop-solution.ll (+7-11)
diff --git a/llvm/lib/CodeGen/PHIElimination.cpp b/llvm/lib/CodeGen/PHIElimination.cpp
index 14f91a87f75b4..cc3d4aac55b9d 100644
--- a/llvm/lib/CodeGen/PHIElimination.cpp
+++ b/llvm/lib/CodeGen/PHIElimination.cpp
@@ -587,6 +587,15 @@ void PHIEliminationImpl::LowerPHINode(MachineBasicBlock &MBB,
     MachineBasicBlock::iterator InsertPos =
         findPHICopyInsertPoint(&opBlock, &MBB, SrcReg);
 
+    // Reuse an existing copy in the block if possible.
+    if (MachineInstr *DefMI = MRI->getUniqueVRegDef(SrcReg)) {
+      if (DefMI->isCopy() && DefMI->getParent() == &opBlock &&
+          MRI->use_empty(SrcReg)) {
+        DefMI->getOperand(0).setReg(IncomingReg);
+        continue;
+      }
+    }
+
     // Insert the copy.
     MachineInstr *NewSrcInstr = nullptr;
     if (!reusedIncoming && IncomingReg) {
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-outline_atomics.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-outline_atomics.ll
index c1c5c53aa7df2..6c300b04508b2 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-outline_atomics.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-outline_atomics.ll
@@ -118,8 +118,8 @@ define dso_local void @store_atomic_i64_aligned_seq_cst(i64 %value, ptr %ptr) {
 define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_unordered:
 ; -O0:    bl __aarch64_cas16_relax
-; -O0:    subs x10, x10, x11
-; -O0:    ccmp x8, x9, #0, eq
+; -O0:    subs x9, x0, x9
+; -O0:    ccmp x1, x8, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_unordered:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -131,8 +131,8 @@ define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr
 define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_monotonic:
 ; -O0:    bl __aarch64_cas16_relax
-; -O0:    subs x10, x10, x11
-; -O0:    ccmp x8, x9, #0, eq
+; -O0:    subs x9, x0, x9
+; -O0:    ccmp x1, x8, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_monotonic:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -144,8 +144,8 @@ define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr
 define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_release:
 ; -O0:    bl __aarch64_cas16_rel
-; -O0:    subs x10, x10, x11
-; -O0:    ccmp x8, x9, #0, eq
+; -O0:    subs x9, x0, x9
+; -O0:    ccmp x1, x8, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_release:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -157,8 +157,8 @@ define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr)
 define dso_local void @store_atomic_i128_aligned_seq_cst(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_seq_cst:
 ; -O0:    bl __aarch64_cas16_acq_rel
-; -O0:    subs x10, x10, x11
-; -O0:    ccmp x8, x9, #0, eq
+; -O0:    subs x9, x0, x9
+; -O0:    ccmp x1, x8, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_seq_cst:
 ; -O1:    ldaxp xzr, x8, [x2]
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-rcpc.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-rcpc.ll
index d1047d84e2956..2a7bbad9d6454 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-rcpc.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-rcpc.ll
@@ -117,13 +117,13 @@ define dso_local void @store_atomic_i64_aligned_seq_cst(i64 %value, ptr %ptr) {
 
 define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_unordered:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stxp w8, x14, x15, [x9]
-; -O0:    stxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stxp w12, x14, x15, [x13]
+; -O0:    stxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_unordered:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -134,13 +134,13 @@ define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr
 
 define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_monotonic:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stxp w8, x14, x15, [x9]
-; -O0:    stxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stxp w12, x14, x15, [x13]
+; -O0:    stxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_monotonic:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -151,13 +151,13 @@ define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr
 
 define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_release:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stlxp w8, x14, x15, [x9]
-; -O0:    stlxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stlxp w12, x14, x15, [x13]
+; -O0:    stlxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_release:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -168,13 +168,13 @@ define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr)
 
 define dso_local void @store_atomic_i128_aligned_seq_cst(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_seq_cst:
-; -O0:    ldaxp x10, x12, [x9]
+; -O0:    ldaxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stlxp w8, x14, x15, [x9]
-; -O0:    stlxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stlxp w12, x14, x15, [x13]
+; -O0:    stlxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_seq_cst:
 ; -O1:    ldaxp xzr, x8, [x2]
diff --git a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-v8a.ll b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-v8a.ll
index 1a79c73355143..493bc742f7663 100644
--- a/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-v8a.ll
+++ b/llvm/test/CodeGen/AArch64/Atomics/aarch64_be-atomic-store-v8a.ll
@@ -117,13 +117,13 @@ define dso_local void @store_atomic_i64_aligned_seq_cst(i64 %value, ptr %ptr) {
 
 define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_unordered:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stxp w8, x14, x15, [x9]
-; -O0:    stxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stxp w12, x14, x15, [x13]
+; -O0:    stxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_unordered:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -134,13 +134,13 @@ define dso_local void @store_atomic_i128_aligned_unordered(i128 %value, ptr %ptr
 
 define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_monotonic:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stxp w8, x14, x15, [x9]
-; -O0:    stxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stxp w12, x14, x15, [x13]
+; -O0:    stxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_monotonic:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -151,13 +151,13 @@ define dso_local void @store_atomic_i128_aligned_monotonic(i128 %value, ptr %ptr
 
 define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_release:
-; -O0:    ldxp x10, x12, [x9]
+; -O0:    ldxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stlxp w8, x14, x15, [x9]
-; -O0:    stlxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stlxp w12, x14, x15, [x13]
+; -O0:    stlxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_release:
 ; -O1:    ldxp xzr, x8, [x2]
@@ -168,13 +168,13 @@ define dso_local void @store_atomic_i128_aligned_release(i128 %value, ptr %ptr)
 
 define dso_local void @store_atomic_i128_aligned_seq_cst(i128 %value, ptr %ptr) {
 ; -O0-LABEL: store_atomic_i128_aligned_seq_cst:
-; -O0:    ldaxp x10, x12, [x9]
+; -O0:    ldaxp x8, x10, [x13]
+; -O0:    cmp x8, x9
 ; -O0:    cmp x10, x11
-; -O0:    cmp x12, x13
-; -O0:    stlxp w8, x14, x15, [x9]
-; -O0:    stlxp w8, x10, x12, [x9]
-; -O0:    subs x12, x12, x13
-; -O0:    ccmp x10, x11, #0, eq
+; -O0:    stlxp w12, x14, x15, [x13]
+; -O0:    stlxp w12, x8, x10, [x13]
+; -O0:    subs x10, x10, x11
+; -O0:    ccmp x8, x9, #0, eq
 ;
 ; -O1-LABEL: store_atomic_i128_aligned_seq_cst:
 ; -O1:    ldaxp xzr, x8, [x2]
diff --git a/llvm/test/CodeGen/AArch64/PHIElimination-debugloc.mir b/llvm/test/CodeGen/AArch64/PHIElimination-debugloc.mir
index 01c44e3f253bb..993d1c1f1b5f0 100644
--- a/llvm/test/CodeGen/AArch64/PHIElimination-debugloc.mir
+++ b/llvm/test/CodeGen/AArch64/PHIElimination-debugloc.mir
@@ -37,7 +37,7 @@ body: |
   bb.1:
     %x:gpr32 = COPY $wzr
   ; Test that the debug location is not copied into bb1!
-  ; CHECK: %3:gpr32 = COPY killed %x{{$}}
+  ; CHECK: %3:gpr32 = COPY $wzr
   ; CHECK-LABEL: bb.2:
   bb.2:
     %y:gpr32 = PHI %x:gpr32, %bb.1, undef %undef:gpr32, %bb.0, debug-location !14
diff --git a/llvm/test/CodeGen/AArch64/PHIElimination-reuse-copy.mir b/llvm/test/CodeGen/AArch64/PHIElimination-reuse-copy.mir
new file mode 100644
index 0000000000000..883d130bfac4e
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/PHIElimination-reuse-copy.mir
@@ -0,0 +1,35 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -run-pass=phi-node-elimination -mtriple=aarch64-linux-gnu -o - %s | FileCheck %s
+
+# Verify that the original COPY in bb.1 is reappropriated as the PHI source in bb.2,
+# instead of creating a new COPY with the same source register.
+
+---
+name: test
+tracksRegLiveness: true
+body: |
+  ; CHECK-LABEL: name: test
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.2(0x40000000), %bb.1(0x40000000)
+  ; CHECK-NEXT:   liveins: $nzcv, $wzr
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[DEF:%[0-9]+]]:gpr32 = IMPLICIT_DEF
+  ; CHECK-NEXT:   Bcc 8, %bb.2, implicit $nzcv
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[DEF:%[0-9]+]]:gpr32 = COPY $wzr
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   %y:gpr32 = COPY [[DEF]]
+  ; CHECK-NEXT:   $wzr = COPY %y
+  bb.0:
+    liveins: $nzcv, $wzr
+    Bcc 8, %bb.2, implicit $nzcv
+  bb.1:
+    %x:gpr32 = COPY $wzr
+  bb.2:
+    %y:gpr32 = PHI %x:gpr32, %bb.1, undef %undef:gpr32, %bb.0
+    $wzr = COPY %y:gpr32
+...
diff --git a/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
index fb6575cc0ee83..10fc431b07b18 100644
--- a/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
+++ b/llvm/test/CodeGen/AArch64/aarch64-matrix-umull-smull.ll
@@ -587,8 +587,8 @@ define i16 @red_mla_dup_ext_u8_s8_s16(ptr noalias nocapture noundef readonly %A,
 ; CHECK-SD-NEXT:    mov w10, w2
 ; CHECK-SD-NEXT:    b.hi .LBB5_4
 ; CHECK-SD-NEXT:  // %bb.2:
-; CHECK-SD-NEXT:    mov x11, xzr
 ; CHECK-SD-NEXT:    mov w8, wzr
+; CHECK-SD-NEXT:    mov x11, xzr
 ; CHECK-SD-NEXT:    b .LBB5_7
 ; CHECK-SD-NEXT:  .LBB5_3:
 ; CHECK-SD-NEXT:    mov w8, wzr
diff --git a/llvm/test/CodeGen/AArch64/atomicrmw-O0.ll b/llvm/test/CodeGen/AArch64/atomicrmw-O0.ll
index 37a7782caeed9..cab6fba59cbd1 100644
--- a/llvm/test/CodeGen/AArch64/atomicrmw-O0.ll
+++ b/llvm/test/CodeGen/AArch64/atomicrmw-O0.ll
@@ -45,7 +45,7 @@ define i8 @test_rmw_add_8(ptr %dst)   {
 ;
 ; LSE-LABEL: test_rmw_add_8:
 ; LSE:       // %bb.0: // %entry
-; LSE-NEXT:    mov w8, #1
+; LSE-NEXT:    mov w8, #1 // =0x1
 ; LSE-NEXT:    ldaddalb w8, w0, [x0]
 ; LSE-NEXT:    ret
 entry:
@@ -94,7 +94,7 @@ define i16 @test_rmw_add_16(ptr %dst)   {
 ;
 ; LSE-LABEL: test_rmw_add_16:
 ; LSE:       // %bb.0: // %entry
-; LSE-NEXT:    mov w8, #1
+; LSE-NEXT:    mov w8, #1 // =0x1
 ; LSE-NEXT:    ldaddalh w8, w0, [x0]
 ; LSE-NEXT:    ret
 entry:
@@ -143,7 +143,7 @@ define i32 @test_rmw_add_32(ptr %dst)   {
 ;
 ; LSE-LABEL: test_rmw_add_32:
 ; LSE:       // %bb.0: // %entry
-; LSE-NEXT:    mov w8, #1
+; LSE-NEXT:    mov w8, #1 // =0x1
 ; LSE-NEXT:    ldaddal w8, w0, [x0]
 ; LSE-NEXT:    ret
 entry:
@@ -192,7 +192,7 @@ define i64 @test_rmw_add_64(ptr %dst)   {
 ;
 ; LSE-LABEL: test_rmw_add_64:
 ; LSE:       // %bb.0: // %entry
-; LSE-NEXT:    mov w8, #1
+; LSE-NEXT:    mov w8, #1 // =0x1
 ; LSE-NEXT:    // kill: def $x8 killed $w8
 ; LSE-NEXT:    ldaddal x8, x0, [x0]
 ; LSE-NEXT:    ret
@@ -207,16 +207,16 @@ define i128 @test_rmw_add_128(ptr %dst)   {
 ; NOLSE-NEXT:    sub sp, sp, #48
 ; NOLSE-NEXT:    .cfi_def_cfa_offset 48
 ; NOLSE-NEXT:    str x0, [sp, #24] // 8-byte Folded Spill
-; NOLSE-NEXT:    ldr x8, [x0, #8]
-; NOLSE-NEXT:    ldr x9, [x0]
+; NOLSE-NEXT:    ldr x9, [x0, #8]
+; NOLSE-NEXT:    ldr x8, [x0]
 ; NOLSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
 ; NOLSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
 ; NOLSE-NEXT:    b .LBB4_1
 ; NOLSE-NEXT:  .LBB4_1: // %atomicrmw.start
 ; NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; NOLSE-NEXT:    // Child Loop BB4_2 Depth 2
-; NOLSE-NEXT:    ldr x13, [sp, #40] // 8-byte Folded Reload
-; NOLSE-NEXT:    ldr x11, [sp, #32] // 8-byte Folded Reload
+; NOLSE-NEXT:    ldr x13, [sp, #32] // 8-byte Folded Reload
+; NOLSE-NEXT:    ldr x11, [sp, #40] // 8-byte Folded Reload
 ; NOLSE-NEXT:    ldr x9, [sp, #24] // 8-byte Folded Reload
 ; NOLSE-NEXT:    adds x14, x11, #1
 ; NOLSE-NEXT:    cinc x15, x13, hs
@@ -246,8 +246,8 @@ define i128 @test_rmw_add_128(ptr %dst)   {
 ; NOLSE-NEXT:    str x9, [sp, #16] // 8-byte Folded Spill
 ; NOLSE-NEXT:    subs x12, x12, x13
 ; NOLSE-NEXT:    ccmp x10, x11, #0, eq
-; NOLSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
-; NOLSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
+; NOLSE-NEXT:    str x9, [sp, #40] // 8-byte Folded Spill
+; NOLSE-NEXT:    str x8, [sp, #32] // 8-byte Folded Spill
 ; NOLSE-NEXT:    b.ne .LBB4_1
 ; NOLSE-NEXT:    b .LBB4_6
 ; NOLSE-NEXT:  .LBB4_6: // %atomicrmw.end
@@ -261,15 +261,15 @@ define i128 @test_rmw_add_128(ptr %dst)   {
 ; LSE-NEXT:    sub sp, sp, #48
 ; LSE-NEXT:    .cfi_def_cfa_offset 48
 ; LSE-NEXT:    str x0, [sp, #24] // 8-byte Folded Spill
-; LSE-NEXT:    ldr x8, [x0, #8]
-; LSE-NEXT:    ldr x9, [x0]
+; LSE-NEXT:    ldr x9, [x0, #8]
+; LSE-NEXT:    ldr x8, [x0]
 ; LSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
 ; LSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
 ; LSE-NEXT:    b .LBB4_1
 ; LSE-NEXT:  .LBB4_1: // %atomicrmw.start
 ; LSE-NEXT:    // =>This Inner Loop Header: Depth=1
-; LSE-NEXT:    ldr x11, [sp, #40] // 8-byte Folded Reload
-; LSE-NEXT:    ldr x10, [sp, #32] // 8-byte Folded Reload
+; LSE-NEXT:    ldr x11, [sp, #32] // 8-byte Folded Reload
+; LSE-NEXT:    ldr x10, [sp, #40] // 8-byte Folded Reload
 ; LSE-NEXT:    ldr x8, [sp, #24] // 8-byte Folded Reload
 ; LSE-NEXT:    mov x0, x10
 ; LSE-NEXT:    mov x1, x11
@@ -284,8 +284,8 @@ define i128 @test_rmw_add_128(ptr %dst)   {
 ; LSE-NEXT:    str x8, [sp, #16] // 8-byte Folded Spill
 ; LSE-NEXT:    subs x11, x8, x11
 ; LSE-NEXT:    ccmp x9, x10, #0, eq
-; LSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
-; LSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
+; LSE-NEXT:    str x9, [sp, #40] // 8-byte Folded Spill
+; LSE-NEXT:    str x8, [sp, #32] // 8-byte Folded Spill
 ; LSE-NEXT:    b.ne .LBB4_1
 ; LSE-NEXT:    b .LBB4_2
 ; LSE-NEXT:  .LBB4_2: // %atomicrmw.end
@@ -597,23 +597,23 @@ define i128 @test_rmw_nand_128(ptr %dst)   {
 ; NOLSE-NEXT:    sub sp, sp, #48
 ; NOLSE-NEXT:    .cfi_def_cfa_offset 48
 ; NOLSE-NEXT:    str x0, [sp, #24] // 8-byte Folded Spill
-; NOLSE-NEXT:    ldr x8, [x0, #8]
-; NOLSE-NEXT:    ldr x9, [x0]
+; NOLSE-NEXT:    ldr x9, [x0, #8]
+; NOLSE-NEXT:    ldr x8, [x0]
 ; NOLSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
 ; NOLSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
 ; NOLSE-NEXT:    b .LBB9_1
 ; NOLSE-NEXT:  .LBB9_1: // %atomicrmw.start
 ; NOLSE-NEXT:    // =>This Loop Header: Depth=1
 ; NOLSE-NEXT:    // Child Loop BB9_2 Depth 2
-; NOLSE-NEXT:    ldr x13, [sp, #40] // 8-byte Folded Reload
-; NOLSE-NEXT:    ldr x11, [sp, #32] // 8-byte Folded Reload
+; NOLSE-NEXT:    ldr x13, [sp, #32] // 8-byte Folded Reload
+; NOLSE-NEXT:    ldr x11, [sp, #40] // 8-byte Folded Reload
 ; NOLSE-NEXT:    ldr x9, [sp, #24] // 8-byte Folded Reload
 ; NOLSE-NEXT:    mov w8, w11
 ; NOLSE-NEXT:    mvn w10, w8
 ; NOLSE-NEXT:    // implicit-def: $x8
 ; NOLSE-NEXT:    mov w8, w10
 ; NOLSE-NEXT:    orr x14, x8, #0xfffffffffffffffe
-; NOLSE-NEXT:    mov x15, #-1
+; NOLSE-NEXT:    mov x15, #-1 // =0xffffffffffffffff
 ; NOLSE-NEXT:  .LBB9_2: // %atomicrmw.start
 ; NOLSE-NEXT:    // Parent Loop BB9_1 Depth=1
 ; NOLSE-NEXT:    // => This Inner Loop Header: Depth=2
@@ -640,8 +640,8 @@ define i128 @test_rmw_nand_128(ptr %dst)   {
 ; NOLSE-NEXT:    str x9, [sp, #16] // 8-byte Folded Spill
 ; NOLSE-NEXT:    subs x12, x12, x13
 ; NOLSE-NEXT:    ccmp x10, x11, #0, eq
-; NOLSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
-; NOLSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
+; NOLSE-NEXT:    str x9, [sp, #40] // 8-byte Folded Spill
+; NOLSE-NEXT:    str x8, [sp, #32] // 8-byte Folded Spill
 ; NOLSE-NEXT:    b.ne .LBB9_1
 ; NOLSE-NEXT:    b .LBB9_6
 ; NOLSE-NEXT:  .LBB9_6: // %atomicrmw.end
@@ -655,15 +655,15 @@ define i128 @test_rmw_nand_128(ptr %dst)   {
 ; LSE-NEXT:    sub sp, sp, #48
 ; LSE-NEXT:    .cfi_def_cfa_offset 48
 ; LSE-NEXT:    str x0, [sp, #24] // 8-byte Folded Spill
-; LSE-NEXT:    ldr x8, [x0, #8]
-; LSE-NEXT:    ldr x9, [x0]
+; LSE-NEXT:    ldr x9, [x0, #8]
+; LSE-NEXT:    ldr x8, [x0]
 ; LSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
 ; LSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
 ; LSE-NEXT:    b .LBB9_1
 ; LSE-NEXT:  .LBB9_1: // %atomicrmw.start
 ; LSE-NEXT:    // =>This Inner Loop Header: Depth=1
-; LSE-NEXT:    ldr x11, [sp, #40] // 8-byte Folded Reload
-; LSE-NEXT:    ldr x10, [sp, #32] // 8-byte Folded Reload
+; LSE-NEXT:    ldr x11, [sp, #32] // 8-byte Folded Reload
+; LSE-NEXT:    ldr x10, [sp, #40] // 8-byte Folded Reload
 ; LSE-NEXT:    ldr x8, [sp, #24] // 8-byte Folded Reload
 ; LSE-NEXT:    mov x0, x10
 ; LSE-NEXT:    mov x1, x11
@@ -672,7 +672,7 @@ define i128 @test_rmw_nand_128(ptr %dst)   {
 ; LSE-NEXT:    // implicit-def: $x9
 ; LSE-NEXT:    mov w9, w12
 ; LSE-NEXT:    orr x2, x9, #0xfffffffffffffffe
-; LSE-NEXT:    mov x9, #-1
+; LSE-NEXT:    mov x9, #-1 // =0xffffffffffffffff
 ; LSE-NEXT:    // kill: def $x2 killed $x2 def $x2_x3
 ; LSE-NEXT:    mov x3, x9
 ; LSE-NEXT:    caspal x0, x1, x2, x3, [x8]
@@ -682,8 +682,8 @@ define i128 @test_rmw_nand_128(ptr %dst)   {
 ; LSE-NEXT:    str x8, [sp, #16] // 8-byte Folded Spill
 ; LSE-NEXT:    subs x11, x8, x11
 ; LSE-NEXT:    ccmp x9, x10, #0, eq
-; LSE-NEXT:    str x9, [sp, #32] // 8-byte Folded Spill
-; LSE-NEXT:    str x8, [sp, #40] // 8-byte Folded Spill
+; LSE-NEXT:    str x9, [sp, #40] // 8-byte Folded Spill
+; LSE-NEXT:    str x8, [sp, #32] // 8-byte Folded Spill
 ; LSE-NEXT:    b.ne .LBB9_1
 ; LSE-NEXT:    b .LBB9_2
 ; LSE-NEXT:  .LBB9_2: // %atomicrmw.end
diff --git a/llvm/test/CodeGen/AArch64/bfis-in-loop.ll b/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
index 43d49da1abd21..b0339222bc2df 100644
--- a/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
+++ b/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
@@ -14,8 +14,8 @@ define i64 @bfi...
[truncated]

@llvmbot
Member

llvmbot commented Mar 18, 2025

@llvm/pr-subscribers-backend-loongarch

 ; LSE-NEXT:    b.ne .LBB9_1
 ; LSE-NEXT:    b .LBB9_2
 ; LSE-NEXT:  .LBB9_2: // %atomicrmw.end
diff --git a/llvm/test/CodeGen/AArch64/bfis-in-loop.ll b/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
index 43d49da1abd21..b0339222bc2df 100644
--- a/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
+++ b/llvm/test/CodeGen/AArch64/bfis-in-loop.ll
@@ -14,8 +14,8 @@ define i64 @bfi...
[truncated]

@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch from 1f7635b to 3593737 Compare March 20, 2025 12:26
@guy-david guy-david requested a review from arsenm March 20, 2025 12:26
@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch 2 times, most recently from 0ae66b8 to d87dc5b Compare March 24, 2025 07:57
@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch from d87dc5b to c7d638d Compare March 30, 2025 07:50
@guy-david
Contributor Author

ping :)

@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch from c7d638d to 8848f2e Compare April 2, 2025 15:38
@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch 2 times, most recently from e549696 to 045edd6 Compare April 20, 2025 19:43
@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch from 045edd6 to 5d768f6 Compare April 28, 2025 09:38
@guy-david guy-david force-pushed the users/guy-david/phi-elimination-reuse-copy branch from 5d768f6 to 8cb3107 Compare June 26, 2025 14:17
@llvm-ci
Collaborator

llvm-ci commented Jun 29, 2025

LLVM Buildbot has detected a new failure on builder clang-ppc64le-linux-multistage running on ppc64le-clang-multistage-test while building llvm at step 10 "build stage 2".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/76/builds/10862

Here is the relevant piece of the build log for reference
Step 10 (build stage 2) failure: 'ninja' (failure)
...
[2413/6442] Building CXX object lib/Passes/CMakeFiles/LLVMPasses.dir/CodeGenPassBuilder.cpp.o
[2414/6442] Building CXX object lib/Transforms/Coroutines/CMakeFiles/LLVMCoroutines.dir/CoroSplit.cpp.o
[2415/6442] Building CXX object lib/IR/CMakeFiles/LLVMCore.dir/Dominators.cpp.o
[2416/6442] Building CXX object lib/Transforms/IPO/CMakeFiles/LLVMipo.dir/GlobalOpt.cpp.o
[2417/6442] Building CXX object lib/Passes/CMakeFiles/LLVMPasses.dir/PassBuilderBindings.cpp.o
[2418/6442] Building CXX object utils/TableGen/CMakeFiles/llvm-tblgen.dir/GlobalISelEmitter.cpp.o
[2419/6442] Building CXX object lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/LoopIdiomRecognize.cpp.o
[2420/6442] Building CXX object lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/MachineBasicBlock.cpp.o
[2421/6442] Building CXX object lib/Transforms/Scalar/CMakeFiles/LLVMScalarOpts.dir/ConstraintElimination.cpp.o
[2422/6442] Building CXX object lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o
FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o 
ccache /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/clang++ -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/lib/CodeGen -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/lib/CodeGen -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fPIC  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o -MF lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o.d -o lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/lib/CodeGen/LiveDebugVariables.cpp
clang++: /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/include/llvm/CodeGen/Register.h:83: unsigned int llvm::Register::virtRegIndex() const: Assertion `isVirtual() && "Not a virtual register"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/clang++ -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -std=c++17 -fPIC -fno-exceptions -funwind-tables -fno-rtti -DGTEST_HAS_RTTI=0 -DLLVM_EXPORTS -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/lib/CodeGen -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/lib/CodeGen -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage2/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/include -DNDEBUG -UNDEBUG -c -o lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/LiveDebugVariables.cpp.o /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/lib/CodeGen/LiveDebugVariables.cpp
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module '/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/llvm/lib/CodeGen/LiveDebugVariables.cpp'.
4.	Running pass 'Register Coalescer' on function '@_ZN4llvm18LiveDebugVariables7LDVImpl18collectDebugValuesERNS_15MachineFunctionEb'
 #0 0x00007fff8def6e40 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMSupport.so.21.0git+0x256e40)
 #1 0x00007fff8def4754 llvm::sys::CleanupOnSignal(unsigned long) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMSupport.so.21.0git+0x254754)
 #2 0x00007fff8dd95858 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007fff9bbe04d8 (linux-vdso64.so.1+0x4d8)
 #4 0x00007fff8d71a4c8 raise (/lib64/libc.so.6+0x4a4c8)
 #5 0x00007fff8d6f4a54 abort (/lib64/libc.so.6+0x24a54)
 #6 0x00007fff8d70dcb0 __assert_fail_base (/lib64/libc.so.6+0x3dcb0)
 #7 0x00007fff8d70dd54 __assert_fail (/lib64/libc.so.6+0x3dd54)
 #8 0x00007fff922a2154 llvm::VirtReg2IndexFunctor::operator()(llvm::Register) const (.isra.72.part.73) InlineSpiller.cpp:0:0
 #9 0x00007fff922a382c llvm::MachineRegisterInfo::getRegClass(llvm::Register) const (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMCodeGen.so.21.0git+0x31382c)
#10 0x00007fff926cb3e0 llvm::CoalescerPair::setRegisters(llvm::MachineInstr const*) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMCodeGen.so.21.0git+0x73b3e0)
#11 0x00007fff926d6d14 (anonymous namespace)::RegisterCoalescer::copyCoalesceWorkList(llvm::MutableArrayRef<llvm::MachineInstr*>) RegisterCoalescer.cpp:0:0
#12 0x00007fff926dd1f0 (anonymous namespace)::RegisterCoalescer::run(llvm::MachineFunction&) RegisterCoalescer.cpp:0:0
#13 0x00007fff926de290 (anonymous namespace)::RegisterCoalescerLegacy::runOnMachineFunction(llvm::MachineFunction&) RegisterCoalescer.cpp:0:0
#14 0x00007fff92424e80 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (.part.100) MachineFunctionPass.cpp:0:0
#15 0x00007fff8e40fb5c llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMCore.so.21.0git+0x32fb5c)
#16 0x00007fff8e40feb8 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMCore.so.21.0git+0x32feb8)
#17 0x00007fff8e4110dc llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libLLVMCore.so.21.0git+0x3310dc)
#18 0x00007fff92eedfdc clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangCodeGen.so.21.0git+0x14dfdc)
#19 0x00007fff9335fa1c clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangCodeGen.so.21.0git+0x5bfa1c)
#20 0x00007fff89f54204 clang::ParseAST(clang::Sema&, bool, bool) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/../lib/libclangParse.so.21.0git+0x44204)
#21 0x00007fff90ff920c clang::ASTFrontendAction::ExecuteAction() (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangFrontend.so.21.0git+0x17920c)
#22 0x00007fff93360700 clang::CodeGenAction::ExecuteAction() (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangCodeGen.so.21.0git+0x5c0700)
#23 0x00007fff910000c8 clang::FrontendAction::Execute() (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangFrontend.so.21.0git+0x1800c8)
#24 0x00007fff90f79340 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangFrontend.so.21.0git+0xf9340)
#25 0x00007fff952a6094 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/../lib/libclangFrontendTool.so.21.0git+0x6094)
#26 0x000000001001b2ac cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1.install/bin/clang+++0x1001b2ac)
#27 0x00000000100107e0 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#28 0x00007fff90b0c688 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::'lambda'()>(long) Job.cpp:0:0

mtrofin added a commit to mtrofin/llvm-project that referenced this pull request Jun 30, 2025
mtrofin added a commit that referenced this pull request Jun 30, 2025
@mikaelholmen
Collaborator

Hello @guy-david

The following starts crashing with this patch:

llc -O0 -o /dev/null bbi-108462.ll -enable-subreg-liveness=1 -optimize-regalloc -mtriple=aarch64-none-linux-gnu

It crashes like this:

llc: ../include/llvm/CodeGen/Register.h:83: unsigned int llvm::Register::virtRegIndex() const: Assertion `isVirtual() && "Not a virtual register"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all/bin/llc -O0 -o /dev/null bbi-108462.ll -enable-subreg-liveness=1 -optimize-regalloc -mtriple=aarch64-none-linux-gnu
1.	Running pass 'Function Pass Manager' on module 'bbi-108462.ll'.
2.	Running pass 'Register Coalescer' on function '@f3'
 #0 0x000055cf75860066 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all/bin/llc+0x76a7066)
 #1 0x000055cf7585db85 llvm::sys::RunSignalHandlers() (build-all/bin/llc+0x76a4b85)
 #2 0x000055cf75860799 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x00007f433c539d10 __restore_rt (/lib64/libpthread.so.0+0x12d10)
 #4 0x00007f4339ed952f raise (/lib64/libc.so.6+0x4e52f)
 #5 0x00007f4339eace65 abort (/lib64/libc.so.6+0x21e65)
 #6 0x00007f4339eacd39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39)
 #7 0x00007f4339ed1e86 (/lib64/libc.so.6+0x46e86)
 #8 0x000055cf74ac72d5 llvm::CoalescerPair::setRegisters(llvm::MachineInstr const*) (build-all/bin/llc+0x690e2d5)
 #9 0x000055cf74acc643 (anonymous namespace)::RegisterCoalescer::joinCopy(llvm::MachineInstr*, bool&, llvm::SmallPtrSetImpl<llvm::MachineInstr*>&) RegisterCoalescer.cpp:0:0
#10 0x000055cf74acbd38 (anonymous namespace)::RegisterCoalescer::copyCoalesceWorkList(llvm::MutableArrayRef<llvm::MachineInstr*>) RegisterCoalescer.cpp:0:0
#11 0x000055cf74ac9220 (anonymous namespace)::RegisterCoalescer::run(llvm::MachineFunction&) RegisterCoalescer.cpp:0:0
#12 0x000055cf74aca7a6 (anonymous namespace)::RegisterCoalescerLegacy::runOnMachineFunction(llvm::MachineFunction&) RegisterCoalescer.cpp:0:0
#13 0x000055cf74879de7 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (build-all/bin/llc+0x66c0de7)
#14 0x000055cf74dd6209 llvm::FPPassManager::runOnFunction(llvm::Function&) (build-all/bin/llc+0x6c1d209)
#15 0x000055cf74dde7e2 llvm::FPPassManager::runOnModule(llvm::Module&) (build-all/bin/llc+0x6c257e2)
#16 0x000055cf74dd6cc8 llvm::legacy::PassManagerImpl::run(llvm::Module&) (build-all/bin/llc+0x6c1dcc8)
#17 0x000055cf72819c70 compileModule(char**, llvm::LLVMContext&) llc.cpp:0:0
#18 0x000055cf72817380 main (build-all/bin/llc+0x465e380)
#19 0x00007f4339ec57e5 __libc_start_main (/lib64/libc.so.6+0x3a7e5)
#20 0x000055cf728167ee _start (build-all/bin/llc+0x465d7ee)
Abort (core dumped)

I originally saw the same crash without any special flags for my out-of-tree target and then saw I could reproduce on aarch64 (with added flags) as well.

PHI elimination turns

dead %5:gpr64 = COPY $xzr

into

dead $noreg = COPY $xzr

and I think that is what then trips the coalescer over.
bbi-108462.ll.gz

@guy-david
Contributor Author

Thanks for looking into this, issued a fix in: #146320.
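
For context, here is a minimal sketch of the shape of that guard, using names assumed from LowerPHINode (IncomingReg, SrcReg, opBlock, MRI); it illustrates the idea and is not necessarily the exact code of #146320:

// Sketch only. For a dead PHI no IncomingReg is allocated, so blindly
// redirecting the existing COPY rewrites its destination to $noreg and
// yields `dead $noreg = COPY $xzr`. Guarding the reuse path on a real
// virtual register avoids that case.
if (IncomingReg.isVirtual()) {
  if (MachineInstr *DefMI = MRI->getUniqueVRegDef(SrcReg)) {
    if (DefMI->isCopy() && DefMI->getParent() == &opBlock &&
        MRI->use_empty(SrcReg))
      DefMI->getOperand(0).setReg(IncomingReg); // reuse instead of a new COPY
  }
}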

@mikaelholmen
Collaborator

Thanks for looking into this, issued a fix in: #146320.

Thanks, that fix seems to solve that problem.

It looks like there are other problems as well, though. I don't have a reproducer I can share right now, but if we have virtual registers of two different register classes, "32BitRC" with 32-bit registers and "16BitRC" with 16-bit registers, it looks like it turns

  %8:32BitRC = COPY %7:32BitRC
  [...]
  %2:16BitRC = PHI %8.low16:32BitRC, %bb.0, %1:16BitRC, %bb.1

into

  %9:16BitRC = COPY %7:32BitRC
  [...]
  %2:16BitRC = COPY killed %9:16BitRC

i.e., it ignores the subregister in the PHI?
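
One plausible shape of the missing check, as a hedged sketch with assumed names (SrcSubReg for the subregister index on the PHI operand); the actual fix is referenced later in this thread:

// Sketch only. %2:16BitRC = PHI %8.low16, ... reads 16 bits of %8, so
// redirecting the wider COPY %8 = COPY %7 would hand the PHI the
// full-width value. Only reuse when no subregister index is involved
// and the register classes agree.
bool CanReuse = SrcSubReg == 0 && SrcReg.isVirtual() &&
                IncomingReg.isVirtual() &&
                MRI->getRegClass(SrcReg) == MRI->getRegClass(IncomingReg);
if (!CanReuse) {
  // Fall back to inserting a fresh COPY of the (sub)register at the end
  // of the predecessor block.
}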

guy-david added a commit that referenced this pull request Jun 30, 2025
PR which introduced the bug:
#131837.
Fixes a crash around dead registers which started in f5c62ee by
verifying that the reused incoming register is also virtual.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 30, 2025
PR which introduced the bug:
llvm/llvm-project#131837.
Fixes a crash around dead registers which started in f5c62ee by
verifying that the reused incoming register is also virtual.
@mikaelholmen
Collaborator

It looks like there are other problems as well, though. I don't have a reproducer I can share right now

I fiddled with the repro for my out-of-tree target and changed it to something for aarch64:
llc bbi-108462_2_aarch64.mir -mtriple=aarch64 -o - -run-pass phi-node-elimination

Now, I don't know aarch64 and its register classes, but the subregister access "%1.sub_32" is just dropped in the output.
For my target at least, that is wrong, because in the case where it originally broke, the subregister specifies whether we should use the high or low 16 bits of the 32-bit operand.

bbi-108462_2_aarch64.mir.gz

@guy-david
Contributor Author

guy-david commented Jun 30, 2025

Now I feel bad 😆 Can you verify whether c4c9e0e solves the issue?

@mikaelholmen
Collaborator

Now I feel bad 😆 Can you verify whether c4c9e0e solves the issue?

It does. Thanks! :)

@jayfoad
Contributor

jayfoad commented Jun 30, 2025

The insertion point of COPY isn't always optimal and could eventually lead to a worse block layout, see the regression test in the first commit.

This change affects many architectures but the amount of total instructions in the test cases seems to be slightly lower.

The COPY you reuse is dead (right?) so why is reusing it any better than inserting a new one and allowing the dead one to be DCEd? (Or, why wasn't the dead COPY already DCEd before we got to this point?)

// Reuse an existing copy in the block if possible.
if (MachineInstr *DefMI = MRI->getUniqueVRegDef(SrcReg)) {
  if (DefMI->isCopy() && DefMI->getParent() == &opBlock &&
      MRI->use_empty(SrcReg)) {

use_nodbg_empty to avoid debug instruction effects
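
As a sketch, the suggested variant of the quoted condition, with the debug-insensitive query swapped in:

// use_nodbg_empty ignores DBG_VALUE users, so whether the COPY gets
// reused cannot differ between -g and non-debug builds of the same input.
if (DefMI->isCopy() && DefMI->getParent() == &opBlock &&
    MRI->use_nodbg_empty(SrcReg)) {
  // ... reuse path ...
}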

@guy-david
Contributor Author

guy-david commented Jun 30, 2025

The insertion point of COPY isn't always optimal and could eventually lead to a worse block layout, see the regression test in the first commit.
This change affects many architectures but the amount of total instructions in the test cases seems to be slightly lower.

The COPY you reuse is dead (right?) so why is reusing it any better than inserting a new one and allowing the dead one to be DCEd? (Or, why wasn't the dead COPY already DCEd before we got to this point?)

That's an unfortunate edge case for the problem I was trying to solve, for which I added a regression test in #146320. In the original issue there were no dead instructions.

@jayfoad
Contributor

jayfoad commented Jun 30, 2025

The insertion point of COPY isn't always optimal and could eventually lead to a worse block layout, see the regression test in the first commit.
This change affects many architectures but the amount of total instructions in the test cases seems to be slightly lower.

The COPY you reuse is dead (right?) so why is reusing it any better than inserting a new one and allowing the dead one to be DCEd? (Or, why wasn't the dead COPY already DCEd before we got to this point?)

That's an unfortunate edge case for the problem I was trying to solve, for which I added a regression test in #146320. In the original issue there were no dead instructions.

No, I mean in your patch DefMI is a COPY which defines SrcReg which has no uses, therefore it's a dead COPY, right?

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Jun 30, 2025
@guy-david
Contributor Author

guy-david commented Jun 30, 2025

The insertion point of COPY isn't always optimal and could eventually lead to a worse block layout, see the regression test in the first commit.
This change affects many architectures but the amount of total instructions in the test cases seems to be slightly lower.

The COPY you reuse is dead (right?) so why is reusing it any better than inserting a new one and allowing the dead one to be DCEd? (Or, why wasn't the dead COPY already DCEd before we got to this point?)

That's an unfortunate edge case for the problem I was trying to solve, for which I added a regression test in #146320. In the original issue there were no dead instructions.

No, I mean in your patch DefMI is a COPY which defines SrcReg which has no uses, therefore it's a dead COPY, right?

The user is the PHI node itself, which is being removed in flight, so the COPY only looks dead at this point in the pass.

@mikaelholmen
Collaborator

Hi @guy-david,

Another verifier error like this:

llc bbi-108462_3_aarch64.mir -o - -verify-machineinstrs -run-pass livevars,phi-node-elimination -mtriple=aarch64

It fails with:

# After Eliminate PHI nodes for register allocation
# Machine code for function main: NoPHIs, TracksLiveness

bb.0:
  successors: %bb.1(0x80000000); %bb.1(100.00%)
  liveins: $w0, $w1, $nzcv
  %0:gpr32 = COPY killed $w0
  %4:gpr32 = COPY killed $w1
  B %bb.1

bb.1:
; predecessors: %bb.0, %bb.1, %bb.2
  successors: %bb.2(0x40000000), %bb.1(0x40000000); %bb.2(50.00%), %bb.1(50.00%)
  liveins: $nzcv
  dead %2:gpr32 = COPY killed %4:gpr32
  %4:gpr32 = COPY %0:gpr32
  Bcc 1, %bb.1, implicit $nzcv

bb.2:
; predecessors: %bb.1
  successors: %bb.1(0x80000000); %bb.1(100.00%)
  liveins: $nzcv
  %4:gpr32 = IMPLICIT_DEF
  B %bb.1

# End machine code for function main.

*** Bad machine code: LiveVariables: Block should not be in AliveBlocks ***
- function:    main
- basic block: %bb.2  (0x5620b8ee9ba0)
Virtual register %3 is not needed live through the block.
LLVM ERROR: Found 1 machine code errors.

bbi-108462_3_aarch64.mir.gz
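
The stale liveness lives in LiveVariables' per-register VarInfo. A hedged sketch of the kind of bookkeeping a fix needs, assuming the pass holds a LiveVariables *LV; the follow-up that landed later in this thread (#146337, "update livevars") may well do this differently:

// Sketch only. Once the reused COPY takes over the definition, the old
// liveness must not claim the register is live through blocks it no
// longer reaches, or the verifier reports
// "Block should not be in AliveBlocks".
if (LV) {
  LiveVariables::VarInfo &VI = LV->getVarInfo(IncomingReg);
  VI.AliveBlocks.reset(opBlock.getNumber()); // drop the stale bit
}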

@mstorsjo
Member

mstorsjo commented Jul 1, 2025

I'm also running into miscompilations caused by this in libvpx, for armv7 targets, with the latest git main of llvm.

The reproducer for the miscompile is https://martin.st/temp/y4minput-preproc.c, compiled like this:

$ clang -target armv7-w64-mingw32 y4minput-preproc.c -c -o y4minput.c.o -O2

@guy-david
Contributor Author

guy-david commented Jul 1, 2025

I'm also running into miscompilations caused by this in libvpx, for armv7 targets, with the latest git main of llvm.

The reproducer for the miscompile is https://martin.st/temp/y4minput-preproc.c, compiled like this:

$ clang -target armv7-w64-mingw32 y4minput-preproc.c -c -o y4minput.c.o -O2

Sorry for the inconvenience. I was not able to reproduce locally; can you test whether #146337 fixes the issue?

rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…#131837)

The insertion point of COPY isn't always optimal and could eventually
lead to a worse block layout, see the regression test in the first
commit.

This change affects many architectures but the amount of total
instructions in the test cases seems to be slightly lower.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
PR which introduced the bug:
llvm#131837.
Fixes a crash around dead registers which started in f5c62ee by
verifying that the reused incoming register is also virtual.
guy-david added a commit that referenced this pull request Jul 1, 2025
…ass, update livevars. (#146337)

Follow-up to the second bug that #131837 introduced, described in #131837 (comment).
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jul 1, 2025
…register class, update livevars. (#146337)

Follow-up to the second bug that llvm/llvm-project#131837 introduced, described in llvm/llvm-project#131837 (comment).