RegAllocGreedy: Fix use after free during last chance recoloring #120697
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking.
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-regalloc

Author: Matt Arsenault (arsenm)

Changes

Last chance recoloring can delete the current fixed interval during recursive assignment of interfering live intervals. Check if the virtual register value was assigned before attempting the unassignment, as is done in other scenarios. This relies on the fact that we do not recycle virtual register numbers.

I have only seen this occur in error situations where the allocation will fail, but I think this can theoretically happen in working allocations.

This feels very brute force, but I've spent over a week debugging this and this is what works without any lit regressions. The surprising piece to me was that unspillable live ranges may be spilled, and a number of tests rely on optimizations occurring on them. My other attempts to fix this mostly revolved around not identifying unspillable live ranges as snippet copies. I've also discovered we're making some unproductive live range splits with subranges. If we avoid such splits, some of the unspillable copies disappear, but mandating that be precise to fix a use after free doesn't sound right.

Full diff: https://github.com/llvm/llvm-project/pull/120697.diff

2 Files Affected:

- (modified) llvm/lib/CodeGen/RegAllocGreedy.cpp
- (added) llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
diff --git a/llvm/lib/CodeGen/RegAllocGreedy.cpp b/llvm/lib/CodeGen/RegAllocGreedy.cpp
index 4fa2bc76b38b4b..d81262de0e73c9 100644
--- a/llvm/lib/CodeGen/RegAllocGreedy.cpp
+++ b/llvm/lib/CodeGen/RegAllocGreedy.cpp
@@ -2029,6 +2029,9 @@ unsigned RAGreedy::tryLastChanceRecoloring(const LiveInterval &VirtReg,
// available colors.
Matrix->assign(VirtReg, PhysReg);
+ // VirtReg may be deleted during tryRecoloringCandidates, save a copy.
+ Register ThisVirtReg = VirtReg.reg();
+
// Save the current recoloring state.
// If we cannot recolor all the interferences, we will have to start again
// at this point for the next physical register.
@@ -2040,8 +2043,16 @@ unsigned RAGreedy::tryLastChanceRecoloring(const LiveInterval &VirtReg,
NewVRegs.push_back(NewVReg);
// Do not mess up with the global assignment process.
// I.e., VirtReg must be unassigned.
- Matrix->unassign(VirtReg);
- return PhysReg;
+ if (VRM->hasPhys(ThisVirtReg)) {
+ Matrix->unassign(VirtReg);
+ return PhysReg;
+ }
+
+ // It is possible VirtReg will be deleted during tryRecoloringCandidates.
+ LLVM_DEBUG(dbgs() << "tryRecoloringCandidates deleted a fixed register "
+ << printReg(ThisVirtReg) << '\n');
+ FixedRegisters.erase(ThisVirtReg);
+ return 0;
}
LLVM_DEBUG(dbgs() << "Fail to assign: " << VirtReg << " to "
diff --git a/llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir b/llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
new file mode 100644
index 00000000000000..503f27edf70df2
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir
@@ -0,0 +1,204 @@
+# RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -simplify-mir -start-before=greedy,2 -stress-regalloc=4 -stop-before=virtregrewriter,2 -o - -verify-regalloc %s 2> %t.err | FileCheck %s
+# RUN: FileCheck -check-prefix=ERR %s < %t.err
+
+# To allocate the vreg_512_align2, the allocation will attempt to
+# inflate the register class to av_512_align2. This will ultimately
+# not work, and the allocation will fail. There is an unproductive
+# live range split, and we end up with a snippet copy of an
+# unspillable register. Recursive assignment of interfering ranges
+# during last chance recoloring would delete the unspillable snippet
+# live range. Make sure there's no use after free when rolling back
+# the last chance assignment.
+
+# ERR: error: <unknown>:0:0: ran out of registers during register allocation in function 'inflated_reg_class_copy_use_after_free'
+# ERR: error: <unknown>:0:0: ran out of registers during register allocation in function 'inflated_reg_class_copy_use_after_free_lane_subset'
+
+--- |
+ define amdgpu_kernel void @inflated_reg_class_copy_use_after_free() {
+ ret void
+ }
+
+ define amdgpu_kernel void @inflated_reg_class_copy_use_after_free_lane_subset() {
+ ret void
+ }
+
+...
+
+# CHECK-LABEL: name: inflated_reg_class_copy_use_after_free
+# CHECK: S_NOP 0, implicit-def [[ORIG_REG:%[0-9]+]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: SI_SPILL_AV512_SAVE [[ORIG_REG]], %stack.0, $sgpr32, 0, implicit $exec :: (store (s512) into %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: [[RESTORE0:%[0-9]+]]:vreg_512_align2 = SI_SPILL_V512_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: early-clobber [[MFMA0:%[0-9]+]]:vreg_512_align2 = V_MFMA_F32_16X16X1F32_vgprcd_e64 undef %3:vgpr_32, undef %3:vgpr_32, [[RESTORE0]], 0, 0, 0, implicit $mode, implicit $exec, implicit $mode, implicit $exec
+# CHECK-NEXT: undef [[SPLIT0:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[MFMA0]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT0]].sub0:av_512_align2 = COPY [[MFMA0]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT1:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT0]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT1]].sub0:av_512_align2 = COPY [[SPLIT0]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT2:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT1]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT2]].sub0:av_512_align2 = COPY [[SPLIT1]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: SI_SPILL_AV512_SAVE [[SPLIT2]], %stack.1, $sgpr32, 0, implicit $exec :: (store (s512) into %stack.1, align 4, addrspace 5)
+# CHECK-NEXT: [[RESTORE1:%[0-9]+]]:av_512_align2 = SI_SPILL_AV512_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: undef [[SPLIT3:%[0-9]+]].sub0_sub1:av_512_align2 = COPY [[RESTORE1]].sub0_sub1
+# CHECK-NEXT: [[RESTORE2:%[0-9]+]]:av_512_align2 = SI_SPILL_AV512_RESTORE %stack.1, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.1, align 4, addrspace 5)
+# CHECK-NEXT: undef [[SPLIT3:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[RESTORE2]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT3]].sub0:av_512_align2 = COPY [[RESTORE2]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT4:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT3]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT4]].sub0:av_512_align2 = COPY [[SPLIT3]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: [[SPLIT5:%[0-9]+]].sub2:av_512_align2 = COPY [[SPLIT4]].sub3
+# CHECK-NEXT: undef [[SPLIT6:%[0-9]+]].sub0_sub1_sub2:av_512_align2 = COPY [[SPLIT5]].sub0_sub1_sub2
+# CHECK-NEXT: undef [[SPLIT7:%[0-9]+]].sub0_sub1_sub2:av_512_align2 = COPY [[SPLIT6]].sub0_sub1_sub2
+# CHECK-NEXT: undef [[SPLIT8:%[0-9]+]].sub0:av_512_align2 = COPY [[SPLIT4]].sub0 {
+# CHECK-NEXT: internal [[SPLIT8]].sub2:av_512_align2 = COPY [[SPLIT4]].sub2
+# CHECK-NEXT: }
+# CHECK-NEXT: [[SPLIT9:%[0-9]+]].sub3:av_512_align2 = COPY [[SPLIT8]].sub2
+# CHECK-NEXT: undef [[SPLIT10:%[0-9]+]].sub0_sub1_sub2_sub3:av_512_align2 = COPY [[SPLIT9]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: undef [[SPLIT13:%[0-9]+]].sub0_sub1_sub2_sub3:vreg_512_align2 = COPY [[SPLIT10]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: [[MFMA_USE1:%[0-9]+]].sub4:vreg_512_align2 = COPY [[SPLIT8]].sub0
+# CHECK-NEXT: [[MFMA_USE1]].sub5:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub6:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub7:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub8:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub9:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub10:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub11:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub12:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub13:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub14:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]].sub15:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[MFMA_USE1]]:vreg_512_align2 = V_MFMA_F32_16X16X1F32_mac_vgprcd_e64 undef %3:vgpr_32, undef %3:vgpr_32, [[MFMA_USE1]], 0, 0, 0, implicit $mode, implicit $exec
+
+---
+name: inflated_reg_class_copy_use_after_free
+tracksRegLiveness: true
+machineFunctionInfo:
+ isEntryFunction: true
+ scratchRSrcReg: '$sgpr72_sgpr73_sgpr74_sgpr75'
+ stackPtrOffsetReg: '$sgpr32'
+ occupancy: 7
+ vgprForAGPRCopy: '$vgpr255'
+ sgprForEXECCopy: '$sgpr74_sgpr75'
+body: |
+ bb.0:
+ liveins: $vgpr0, $sgpr4_sgpr5
+
+ %0:vgpr_32 = IMPLICIT_DEF
+ renamable $sgpr0_sgpr1 = S_LOAD_DWORDX2_IMM killed undef renamable $sgpr4_sgpr5, 0, 0 :: (load (s64), addrspace 4)
+ S_NOP 0, implicit-def undef %1.sub12_sub13_sub14_sub15:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub8_sub9_sub10_sub11:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub4_sub5_sub6_sub7:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub0_sub1_sub2_sub3:vreg_512_align2
+ early-clobber %2:vreg_512_align2 = V_MFMA_F32_16X16X1F32_vgprcd_e64 undef %3:vgpr_32, undef %3:vgpr_32, %1, 0, 0, 0, implicit $mode, implicit $exec, implicit $mode, implicit $exec
+ %1.sub2:vreg_512_align2 = COPY %2.sub3
+ %1.sub3:vreg_512_align2 = COPY %2.sub2
+ %1.sub4:vreg_512_align2 = COPY %2.sub0
+ %1.sub5:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub6:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub7:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub8:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub9:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub10:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub11:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub12:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub13:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub14:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub15:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1:vreg_512_align2 = V_MFMA_F32_16X16X1F32_mac_vgprcd_e64 undef %3:vgpr_32, undef %3:vgpr_32, %1, 0, 0, 0, implicit $mode, implicit $exec
+ GLOBAL_STORE_DWORDX4_SADDR undef %3:vgpr_32, %1.sub12_sub13_sub14_sub15, undef renamable $sgpr0_sgpr1, 96, 0, implicit $exec :: (store (s128), addrspace 1)
+ S_ENDPGM 0
+
+...
+
+# This test is similar to the previous one, except it is still broken when the use
+# instruction does not read the full set of lanes after one attempted fix.
+
+# CHECK-LABEL: name: inflated_reg_class_copy_use_after_free_lane_subset
+# CHECK: S_NOP 0, implicit-def [[ORIG_REG:%[0-9]+]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: SI_SPILL_AV512_SAVE [[ORIG_REG]], %stack.0, $sgpr32, 0, implicit $exec :: (store (s512) into %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: [[RESTORE_0:%[0-9]+]]:av_512_align2 = SI_SPILL_AV512_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: S_NOP 0, implicit-def early-clobber [[REG1:%[0-9]+]], implicit [[RESTORE_0]].sub0_sub1_sub2_sub3, implicit [[RESTORE_0]].sub4_sub5_sub6_sub7
+# CHECK-NEXT: undef [[SPLIT0:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[REG1]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT0]].sub0:av_512_align2 = COPY [[REG1]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT1:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT0]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT1]].sub0:av_512_align2 = COPY [[SPLIT0]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT2:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT1]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT2]].sub0:av_512_align2 = COPY [[SPLIT1]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: SI_SPILL_AV512_SAVE [[SPLIT2]], %stack.1, $sgpr32, 0, implicit $exec :: (store (s512) into %stack.1, align 4, addrspace 5)
+# CHECK-NEXT: [[RESTORE_1:%[0-9]+]]:av_512_align2 = SI_SPILL_AV512_RESTORE %stack.0, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.0, align 4, addrspace 5)
+# CHECK-NEXT: undef [[SPLIT3:%[0-9]+]].sub0_sub1:av_512_align2 = COPY [[RESTORE_1]].sub0_sub1
+# CHECK-NEXT: [[RESTORE_2:%[0-9]+]]:av_512_align2 = SI_SPILL_AV512_RESTORE %stack.1, $sgpr32, 0, implicit $exec :: (load (s512) from %stack.1, align 4, addrspace 5)
+# CHECK-NEXT: undef [[SPLIT4:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[RESTORE_2]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT4]].sub0:av_512_align2 = COPY [[RESTORE_2]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: undef [[SPLIT5:%[0-9]+]].sub2_sub3:av_512_align2 = COPY [[SPLIT4]].sub2_sub3 {
+# CHECK-NEXT: internal [[SPLIT5]].sub0:av_512_align2 = COPY [[SPLIT4]].sub0
+# CHECK-NEXT: }
+# CHECK-NEXT: [[SPLIT3]].sub2:av_512_align2 = COPY [[SPLIT5]].sub3
+# CHECK-NEXT: undef [[SPLIT6:%[0-9]+]].sub0_sub1_sub2:av_512_align2 = COPY [[SPLIT3]].sub0_sub1_sub2
+# CHECK-NEXT: undef [[SPLIT7:%[0-9]+]].sub0_sub1_sub2:av_512_align2 = COPY [[SPLIT6]].sub0_sub1_sub2
+# CHECK-NEXT: undef [[SPLIT8:%[0-9]+]].sub0:av_512_align2 = COPY [[SPLIT5]].sub0 {
+# CHECK-NEXT: internal [[SPLIT8]].sub2:av_512_align2 = COPY [[SPLIT5]].sub2
+# CHECK-NEXT: }
+# CHECK-NEXT: [[SPLIT7]].sub3:av_512_align2 = COPY [[SPLIT8]].sub2
+# CHECK-NEXT: undef [[SPLIT9:%[0-9]+]].sub0_sub1_sub2_sub3:av_512_align2 = COPY [[SPLIT7]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: undef [[LAST_USE:%[0-9]+]].sub0_sub1_sub2_sub3:vreg_512_align2 = COPY [[SPLIT9]].sub0_sub1_sub2_sub3
+# CHECK-NEXT: [[LAST_USE]].sub4:vreg_512_align2 = COPY [[SPLIT8]].sub0
+# CHECK-NEXT: [[LAST_USE]].sub5:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub6:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub7:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub8:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub9:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub10:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub11:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub12:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub13:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub14:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: [[LAST_USE]].sub15:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+# CHECK-NEXT: S_NOP 0, implicit-def [[LAST_USE]], implicit [[LAST_USE]].sub0_sub1_sub2_sub3, implicit [[LAST_USE]].sub4_sub5_sub6_sub7, implicit [[LAST_USE]].sub8_sub9_sub10_sub11
+
+---
+name: inflated_reg_class_copy_use_after_free_lane_subset
+tracksRegLiveness: true
+machineFunctionInfo:
+ isEntryFunction: true
+ scratchRSrcReg: '$sgpr72_sgpr73_sgpr74_sgpr75'
+ stackPtrOffsetReg: '$sgpr32'
+ occupancy: 7
+ vgprForAGPRCopy: '$vgpr255'
+ sgprForEXECCopy: '$sgpr74_sgpr75'
+body: |
+ bb.0:
+ liveins: $vgpr0, $sgpr4_sgpr5
+
+ %0:vgpr_32 = IMPLICIT_DEF
+ renamable $sgpr0_sgpr1 = S_LOAD_DWORDX2_IMM killed undef renamable $sgpr4_sgpr5, 0, 0 :: (load (s64), addrspace 4)
+ S_NOP 0, implicit-def undef %1.sub12_sub13_sub14_sub15:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub8_sub9_sub10_sub11:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub4_sub5_sub6_sub7:vreg_512_align2
+ S_NOP 0, implicit-def %1.sub0_sub1_sub2_sub3:vreg_512_align2
+ S_NOP 0, implicit-def early-clobber %2:vreg_512_align2, implicit %1.sub0_sub1_sub2_sub3, implicit %1.sub4_sub5_sub6_sub7
+ %1.sub2:vreg_512_align2 = COPY %2.sub3
+ %1.sub3:vreg_512_align2 = COPY %2.sub2
+ %1.sub4:vreg_512_align2 = COPY %2.sub0
+ %1.sub5:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub6:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub7:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub8:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub9:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub10:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub11:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub12:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub13:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub14:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ %1.sub15:vreg_512_align2 = V_MOV_B32_e32 0, implicit $exec
+ S_NOP 0, implicit-def %1:vreg_512_align2, implicit %1.sub0_sub1_sub2_sub3, implicit %1.sub4_sub5_sub6_sub7, implicit %1.sub8_sub9_sub10_sub11
+ GLOBAL_STORE_DWORDX4_SADDR undef %3:vgpr_32, %1.sub12_sub13_sub14_sub15, undef renamable $sgpr0_sgpr1, 96, 0, implicit $exec :: (store (s128), addrspace 1)
+ S_ENDPGM 0
+
+...
I have found a test which succeeds in allocating after hitting the use after free. However, I haven't managed to come up with a MIR test which I believe will be stable enough to test it forever.
Force-pushed from 5398c21 to c65ef86.
Added a reasonably sized test which manages to hit the use after free, and succeeds after the fix.
Probably, but it seems to be well baked in (and not really related to this patch).
LGTM.
Out of curiosity does AMDGPU rely on last chance coloring to work?
I would have expected this stage to be prohibitively expensive on this target, and that you would have just disabled it (by overriding the TargetRegisterInfo::shouldUseLastChanceRecoloringForVirtReg method).
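For context, a minimal sketch of such an opt-out, assuming the upstream TargetRegisterInfo hook signature (the class name here is hypothetical, not AMDGPU's actual subclass):

// Illustrative only: a target can skip last chance recoloring entirely
// by overriding the hook mentioned above.
bool MyTargetRegisterInfo::shouldUseLastChanceRecoloringForVirtReg(
    const MachineFunction &MF, const LiveInterval &VirtReg) const {
  // Returning false tells RAGreedy not to run the recursive recoloring
  // search for this virtual register.
  return false;
}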
I encounter it semi-regularly. There are many issues related to subrange splitting (the most pressing may be that spill/restore loses live lanes). I think we need a new form of subrange split too.
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/11499
@arsenm Please can you take a look at the expensive-checks buildbot failures?
Probably should just put an UNSUPPORTED: expensive_checks for now.
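For reference, that would be a single lit directive near the top of the failing test (assuming the standard expensive_checks feature defined by expensive-checks builds):

# UNSUPPORTED: expensive_checks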
Also, got a failed test on
I xfailed this for expensive checks in a8f3eba |
Also, it turned out the input was just broken; it wasn't an allocation-induced error: 93e6346
Last chance recoloring can delete the current fixed interval
during recursive assignment of interfering live intervals. Check
if the virtual register value was assigned before attempting the
unassignment, as is done in other scenarios. This relies on the fact
that we do not recycle virtual register numbers.
I have only seen this occur in error situations where the allocation
will fail, but I think this can theoretically happen in working
allocations.
This feels very brute force, but I've spent over a week debugging
this and this is what works without any lit regressions. The surprising
piece to me was that unspillable live ranges may be spilled, and
a number of tests rely on optimizations occurring on them. My other
attempts to fix this mostly revolved around not identifying unspillable
live ranges as snippet copies. I've also discovered we're making some
unproductive live range splits with subranges. If we avoid such splits,
some of the unspillable copies disappear but mandating that be precise
to fix a use after free doesn't sound right.
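For illustration, a condensed sketch of the rollback guard the commit message describes, paraphrased from the diff above (the call to tryRecoloringCandidates is abbreviated; the member names follow RegAllocGreedy.cpp):

// VirtReg may be deleted during tryRecoloringCandidates; save its
// register number, which is never recycled, before recursing.
Register ThisVirtReg = VirtReg.reg();
if (tryRecoloringCandidates(RecoloringQueue, CurrentNewVRegs,
                            FixedRegisters, RecolorStack, Depth + 1)) {
  // Only roll back the temporary assignment if the interval survived;
  // otherwise VirtReg is a dangling reference.
  if (VRM->hasPhys(ThisVirtReg)) {
    Matrix->unassign(VirtReg);
    return PhysReg;
  }
  // The fixed interval was deleted during recursion; drop it from the
  // fixed set and report that no register was assigned.
  FixedRegisters.erase(ThisVirtReg);
  return 0;
}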