Skip to content

[llvm][CodeGen] update live intervals for ModuloScheduleExpanderMVE #132677

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Apr 10, 2025

Conversation

huaatian
Copy link
Contributor

ModuloScheduleExpanderMVE and ModuloScheduleExpander are used
sequentially in certain use cases. It is necessary to update
live intervals for ModuloScheduleExpanderMVE; otherwise, crashes
may occur.

ModuloScheduleExpanderMVE and ModuloScheduleExpander are used
sequentially in certain use cases. It is necessary to update
live intervals for ModuloScheduleExpanderMVE; otherwise, crashes
may occur.
@llvmbot
Copy link
Member

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Hua Tian (huaatian)

Changes

ModuloScheduleExpanderMVE and ModuloScheduleExpander are used
sequentially in certain use cases. It is necessary to update
live intervals for ModuloScheduleExpanderMVE; otherwise, crashes
may occur.


Full diff: https://github.com/llvm/llvm-project/pull/132677.diff

3 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/ModuloSchedule.h (+6)
  • (modified) llvm/lib/CodeGen/ModuloSchedule.cpp (+51-25)
  • (added) llvm/test/CodeGen/AArch64/aarch64-swp-ws-live-intervals-1.mir (+180)
diff --git a/llvm/include/llvm/CodeGen/ModuloSchedule.h b/llvm/include/llvm/CodeGen/ModuloSchedule.h
index b6000ba05d882..bfb302147b058 100644
--- a/llvm/include/llvm/CodeGen/ModuloSchedule.h
+++ b/llvm/include/llvm/CodeGen/ModuloSchedule.h
@@ -60,6 +60,7 @@
 #ifndef LLVM_CODEGEN_MODULOSCHEDULE_H
 #define LLVM_CODEGEN_MODULOSCHEDULE_H
 
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineLoopUtils.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
@@ -404,6 +405,9 @@ class ModuloScheduleExpanderMVE {
   /// NumUnroll = 1 means no unrolling.
   int NumUnroll;
 
+  /// Record the registers that need to compute live intervals.
+  SmallSet<Register, 8> NoIntervalRegs;
+
   void calcNumUnroll();
   void generatePipelinedLoop();
   void generateProlog(SmallVectorImpl<ValueMapTy> &VRMap);
@@ -436,6 +440,8 @@ class ModuloScheduleExpanderMVE {
                         MachineBasicBlock &GreaterThan,
                         MachineBasicBlock &Otherwise);
 
+  void calculateIntervals();
+
 public:
   ModuloScheduleExpanderMVE(MachineFunction &MF, ModuloSchedule &S,
                             LiveIntervals &LIS)
diff --git a/llvm/lib/CodeGen/ModuloSchedule.cpp b/llvm/lib/CodeGen/ModuloSchedule.cpp
index d208a62a99372..9cd4a177a8b6e 100644
--- a/llvm/lib/CodeGen/ModuloSchedule.cpp
+++ b/llvm/lib/CodeGen/ModuloSchedule.cpp
@@ -2159,7 +2159,8 @@ MachineInstr *ModuloScheduleExpanderMVE::cloneInstr(MachineInstr *OldMI) {
 /// If it is already dedicated exit, return it. Otherwise, insert a new
 /// block between them and return the new block.
 static MachineBasicBlock *createDedicatedExit(MachineBasicBlock *Loop,
-                                              MachineBasicBlock *Exit) {
+                                              MachineBasicBlock *Exit,
+                                              LiveIntervals &LIS) {
   if (Exit->pred_size() == 1)
     return Exit;
 
@@ -2169,6 +2170,7 @@ static MachineBasicBlock *createDedicatedExit(MachineBasicBlock *Loop,
   MachineBasicBlock *NewExit =
       MF->CreateMachineBasicBlock(Loop->getBasicBlock());
   MF->insert(Loop->getIterator(), NewExit);
+  LIS.insertMBBInMaps(NewExit);
 
   MachineBasicBlock *TBB = nullptr, *FBB = nullptr;
   SmallVector<MachineOperand, 4> Cond;
@@ -2213,6 +2215,14 @@ void ModuloScheduleExpanderMVE::insertCondBranch(MachineBasicBlock &MBB,
   }
 }
 
+/// Some registers are generated during the kernel expansion. We calculate the
+/// live intervals of these registers after the expansion.
+void ModuloScheduleExpanderMVE::calculateIntervals() {
+  for (Register Reg : NoIntervalRegs)
+    LIS.createAndComputeVirtRegInterval(Reg);
+  NoIntervalRegs.clear();
+}
+
 /// Generate a pipelined loop that is unrolled by using MVE algorithm and any
 /// other necessary blocks. The control flow is modified to execute the
 /// pipelined loop if the trip count satisfies the condition, otherwise the
@@ -2304,12 +2314,17 @@ void ModuloScheduleExpanderMVE::generatePipelinedLoop() {
   NewPreheader = MF.CreateMachineBasicBlock(OrigKernel->getBasicBlock());
 
   MF.insert(OrigKernel->getIterator(), Check);
+  LIS.insertMBBInMaps(Check);
   MF.insert(OrigKernel->getIterator(), Prolog);
+  LIS.insertMBBInMaps(Prolog);
   MF.insert(OrigKernel->getIterator(), NewKernel);
+  LIS.insertMBBInMaps(NewKernel);
   MF.insert(OrigKernel->getIterator(), Epilog);
+  LIS.insertMBBInMaps(Epilog);
   MF.insert(OrigKernel->getIterator(), NewPreheader);
+  LIS.insertMBBInMaps(NewPreheader);
 
-  NewExit = createDedicatedExit(OrigKernel, OrigExit);
+  NewExit = createDedicatedExit(OrigKernel, OrigExit, LIS);
 
   NewPreheader->transferSuccessorsAndUpdatePHIs(OrigPreheader);
   TII->insertUnconditionalBranch(*NewPreheader, OrigKernel, DebugLoc());
@@ -2339,6 +2354,8 @@ void ModuloScheduleExpanderMVE::generatePipelinedLoop() {
   generateProlog(PrologVRMap);
   generateKernel(PrologVRMap, KernelVRMap, LastStage0Insts);
   generateEpilog(KernelVRMap, EpilogVRMap, LastStage0Insts);
+
+  calculateIntervals();
 }
 
 /// Replace MI's use operands according to the maps.
@@ -2393,9 +2410,10 @@ void ModuloScheduleExpanderMVE::updateInstrUse(
       UseMO.setReg(NewReg);
     else {
       Register SplitReg = MRI.createVirtualRegister(MRI.getRegClass(OrigReg));
-      BuildMI(*OrigKernel, MI, MI->getDebugLoc(), TII->get(TargetOpcode::COPY),
-              SplitReg)
-          .addReg(NewReg);
+      MachineInstr *NewCopy = BuildMI(*OrigKernel, MI, MI->getDebugLoc(),
+                                      TII->get(TargetOpcode::COPY), SplitReg)
+                                  .addReg(NewReg);
+      LIS.InsertMachineInstrInMaps(*NewCopy);
       UseMO.setReg(SplitReg);
     }
   }
@@ -2479,12 +2497,14 @@ void ModuloScheduleExpanderMVE::generatePhi(
 
     assert(CorrespondReg.isValid());
     Register PhiReg = MRI.createVirtualRegister(MRI.getRegClass(OrigReg));
-    BuildMI(*NewKernel, NewKernel->getFirstNonPHI(), DebugLoc(),
-            TII->get(TargetOpcode::PHI), PhiReg)
-        .addReg(NewReg->second)
-        .addMBB(NewKernel)
-        .addReg(CorrespondReg)
-        .addMBB(Prolog);
+    MachineInstr *NewPhi =
+        BuildMI(*NewKernel, NewKernel->getFirstNonPHI(), DebugLoc(),
+                TII->get(TargetOpcode::PHI), PhiReg)
+            .addReg(NewReg->second)
+            .addMBB(NewKernel)
+            .addReg(CorrespondReg)
+            .addMBB(Prolog);
+    LIS.InsertMachineInstrInMaps(*NewPhi);
     PhiVRMap[UnrollNum][OrigReg] = PhiReg;
   }
 }
@@ -2522,18 +2542,19 @@ void ModuloScheduleExpanderMVE::mergeRegUsesAfterPipeline(Register OrigReg,
   // remaining iterations) with the route that execute the original loop.
   if (!UsesAfterLoop.empty()) {
     Register PhiReg = MRI.createVirtualRegister(MRI.getRegClass(OrigReg));
-    BuildMI(*NewExit, NewExit->getFirstNonPHI(), DebugLoc(),
-            TII->get(TargetOpcode::PHI), PhiReg)
-        .addReg(OrigReg)
-        .addMBB(OrigKernel)
-        .addReg(NewReg)
-        .addMBB(Epilog);
+    MachineInstr *NewPhi =
+        BuildMI(*NewExit, NewExit->getFirstNonPHI(), DebugLoc(),
+                TII->get(TargetOpcode::PHI), PhiReg)
+            .addReg(OrigReg)
+            .addMBB(OrigKernel)
+            .addReg(NewReg)
+            .addMBB(Epilog);
+    LIS.InsertMachineInstrInMaps(*NewPhi);
 
     for (MachineOperand *MO : UsesAfterLoop)
       MO->setReg(PhiReg);
 
-    if (!LIS.hasInterval(PhiReg))
-      LIS.createEmptyInterval(PhiReg);
+    NoIntervalRegs.insert(PhiReg);
   }
 
   // Merge routes from the pipelined loop and the bypassed route before the
@@ -2543,12 +2564,14 @@ void ModuloScheduleExpanderMVE::mergeRegUsesAfterPipeline(Register OrigReg,
       Register InitReg, LoopReg;
       getPhiRegs(*Phi, OrigKernel, InitReg, LoopReg);
       Register NewInit = MRI.createVirtualRegister(MRI.getRegClass(InitReg));
-      BuildMI(*NewPreheader, NewPreheader->getFirstNonPHI(), Phi->getDebugLoc(),
-              TII->get(TargetOpcode::PHI), NewInit)
-          .addReg(InitReg)
-          .addMBB(Check)
-          .addReg(NewReg)
-          .addMBB(Epilog);
+      MachineInstr *NewPhi =
+          BuildMI(*NewPreheader, NewPreheader->getFirstNonPHI(),
+                  Phi->getDebugLoc(), TII->get(TargetOpcode::PHI), NewInit)
+              .addReg(InitReg)
+              .addMBB(Check)
+              .addReg(NewReg)
+              .addMBB(Epilog);
+      LIS.InsertMachineInstrInMaps(*NewPhi);
       replacePhiSrc(*Phi, InitReg, NewInit, NewPreheader);
     }
   }
@@ -2571,6 +2594,7 @@ void ModuloScheduleExpanderMVE::generateProlog(
       updateInstrDef(NewMI, PrologVRMap[PrologNum], false);
       NewMIMap[NewMI] = {PrologNum, StageNum};
       Prolog->push_back(NewMI);
+      LIS.InsertMachineInstrInMaps(*NewMI);
     }
   }
 
@@ -2609,6 +2633,7 @@ void ModuloScheduleExpanderMVE::generateKernel(
       generatePhi(MI, UnrollNum, PrologVRMap, KernelVRMap, PhiVRMap);
       NewMIMap[NewMI] = {UnrollNum, StageNum};
       NewKernel->push_back(NewMI);
+      LIS.InsertMachineInstrInMaps(*NewMI);
     }
   }
 
@@ -2647,6 +2672,7 @@ void ModuloScheduleExpanderMVE::generateEpilog(
       updateInstrDef(NewMI, EpilogVRMap[EpilogNum], StageNum - 1 == EpilogNum);
       NewMIMap[NewMI] = {EpilogNum, StageNum};
       Epilog->push_back(NewMI);
+      LIS.InsertMachineInstrInMaps(*NewMI);
     }
   }
 
diff --git a/llvm/test/CodeGen/AArch64/aarch64-swp-ws-live-intervals-1.mir b/llvm/test/CodeGen/AArch64/aarch64-swp-ws-live-intervals-1.mir
new file mode 100644
index 0000000000000..7cc1bcb05c70a
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/aarch64-swp-ws-live-intervals-1.mir
@@ -0,0 +1,180 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc --mtriple=aarch64 %s -run-pass=pipeliner -o -| FileCheck %s
+
+...
+---
+name:            foo
+tracksRegLiveness: true
+body:             |
+  ; CHECK-LABEL: name: foo
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.12(0x80000000)
+  ; CHECK-NEXT:   liveins: $x0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr64common = COPY $x0
+  ; CHECK-NEXT:   [[COPY1:%[0-9]+]]:gpr64sp = COPY $xzr
+  ; CHECK-NEXT:   [[COPY2:%[0-9]+]]:gpr64all = COPY [[COPY1]]
+  ; CHECK-NEXT:   [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
+  ; CHECK-NEXT:   [[SUBREG_TO_REG:%[0-9]+]]:gpr64sp = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.12:
+  ; CHECK-NEXT:   successors: %bb.13(0x80000000), %bb.14(0x00000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[LDRDui:%[0-9]+]]:fpr64 = LDRDui [[COPY1]], 0
+  ; CHECK-NEXT:   [[SUBSXri:%[0-9]+]]:gpr64 = nsw SUBSXri [[SUBREG_TO_REG]], 1, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   [[COPY3:%[0-9]+]]:gpr64all = COPY [[SUBSXri]]
+  ; CHECK-NEXT:   [[SUBREG_TO_REG1:%[0-9]+]]:gpr64all = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
+  ; CHECK-NEXT:   Bcc 0, %bb.14, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.13
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.13:
+  ; CHECK-NEXT:   successors: %bb.14(0x04000000), %bb.13(0x7c000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI:%[0-9]+]]:gpr64sp = PHI [[COPY3]], %bb.12, %55, %bb.13
+  ; CHECK-NEXT:   [[PHI1:%[0-9]+]]:gpr64 = PHI [[SUBREG_TO_REG1]], %bb.12, %54, %bb.13
+  ; CHECK-NEXT:   [[PHI2:%[0-9]+]]:fpr64 = PHI [[LDRDui]], %bb.12, %52, %bb.13
+  ; CHECK-NEXT:   STRDui [[PHI2]], [[COPY]], 0
+  ; CHECK-NEXT:   [[LDRDui1:%[0-9]+]]:fpr64 = LDRDui [[COPY1]], 0
+  ; CHECK-NEXT:   [[SUBSXri1:%[0-9]+]]:gpr64 = nsw SUBSXri [[PHI]], 1, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   [[SUBREG_TO_REG2:%[0-9]+]]:gpr64all = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
+  ; CHECK-NEXT:   [[COPY4:%[0-9]+]]:gpr64all = COPY [[SUBSXri1]]
+  ; CHECK-NEXT:   Bcc 1, %bb.13, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.14
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.14:
+  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI3:%[0-9]+]]:gpr64 = PHI [[COPY2]], %bb.12, [[PHI1]], %bb.13
+  ; CHECK-NEXT:   [[PHI4:%[0-9]+]]:fpr64 = PHI [[LDRDui]], %bb.12, [[LDRDui1]], %bb.13
+  ; CHECK-NEXT:   STRDui [[PHI4]], [[COPY]], 0
+  ; CHECK-NEXT:   B %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT:   successors: %bb.3(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.3:
+  ; CHECK-NEXT:   successors: %bb.4(0x04000000), %bb.3(0x7c000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   CBNZX [[PHI3]], %bb.3
+  ; CHECK-NEXT:   B %bb.4
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.4:
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[FMOVD0_:%[0-9]+]]:fpr64 = FMOVD0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.5:
+  ; CHECK-NEXT:   successors: %bb.7(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI5:%[0-9]+]]:fpr64 = PHI [[FMOVD0_]], %bb.4, %44, %bb.11
+  ; CHECK-NEXT:   [[COPY5:%[0-9]+]]:gpr64all = COPY $xzr
+  ; CHECK-NEXT:   [[COPY6:%[0-9]+]]:gpr64common = COPY [[COPY5]]
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.7:
+  ; CHECK-NEXT:   successors: %bb.8(0x40000000), %bb.11(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[UBFMXri:%[0-9]+]]:gpr64common = UBFMXri [[COPY6]], 61, 60
+  ; CHECK-NEXT:   [[ADDSXri:%[0-9]+]]:gpr64 = ADDSXri [[COPY6]], 1, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   [[COPY7:%[0-9]+]]:gpr64common = COPY [[ADDSXri]]
+  ; CHECK-NEXT:   Bcc 2, %bb.11, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.8
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.8:
+  ; CHECK-NEXT:   successors: %bb.9(0x80000000), %bb.10(0x00000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[LDRDui2:%[0-9]+]]:fpr64 = LDRDui [[UBFMXri]], 1
+  ; CHECK-NEXT:   [[UBFMXri1:%[0-9]+]]:gpr64common = UBFMXri [[COPY7]], 61, 60
+  ; CHECK-NEXT:   [[ADDSXri1:%[0-9]+]]:gpr64 = ADDSXri [[COPY7]], 1, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   [[COPY8:%[0-9]+]]:gpr64all = COPY [[ADDSXri1]]
+  ; CHECK-NEXT:   Bcc 2, %bb.10, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.9
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.9:
+  ; CHECK-NEXT:   successors: %bb.10(0x04000000), %bb.9(0x7c000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI6:%[0-9]+]]:gpr64common = PHI [[COPY8]], %bb.8, %33, %bb.9
+  ; CHECK-NEXT:   [[PHI7:%[0-9]+]]:fpr64 = PHI [[PHI5]], %bb.8, %32, %bb.9
+  ; CHECK-NEXT:   [[PHI8:%[0-9]+]]:gpr64common = PHI [[UBFMXri1]], %bb.8, %31, %bb.9
+  ; CHECK-NEXT:   [[PHI9:%[0-9]+]]:fpr64 = PHI [[LDRDui2]], %bb.8, %29, %bb.9
+  ; CHECK-NEXT:   [[LDRDui3:%[0-9]+]]:fpr64 = LDRDui [[PHI8]], 1
+  ; CHECK-NEXT:   [[ADDSXri2:%[0-9]+]]:gpr64 = ADDSXri [[PHI6]], 1, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   [[UBFMXri2:%[0-9]+]]:gpr64common = UBFMXri [[PHI6]], 61, 60
+  ; CHECK-NEXT:   [[FADDDrr:%[0-9]+]]:fpr64 = nofpexcept FADDDrr [[PHI7]], [[PHI9]], implicit $fpcr
+  ; CHECK-NEXT:   [[COPY9:%[0-9]+]]:gpr64all = COPY [[ADDSXri2]]
+  ; CHECK-NEXT:   Bcc 2, %bb.10, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.9
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.10:
+  ; CHECK-NEXT:   successors: %bb.11(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI10:%[0-9]+]]:fpr64 = PHI [[PHI5]], %bb.8, [[FADDDrr]], %bb.9
+  ; CHECK-NEXT:   [[PHI11:%[0-9]+]]:gpr64common = PHI [[UBFMXri1]], %bb.8, [[UBFMXri2]], %bb.9
+  ; CHECK-NEXT:   [[PHI12:%[0-9]+]]:fpr64 = PHI [[LDRDui2]], %bb.8, [[LDRDui3]], %bb.9
+  ; CHECK-NEXT:   [[FADDDrr1:%[0-9]+]]:fpr64 = nofpexcept FADDDrr [[PHI10]], [[PHI12]], implicit $fpcr
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.11:
+  ; CHECK-NEXT:   successors: %bb.5(0x80000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   [[PHI13:%[0-9]+]]:fpr64 = PHI [[PHI5]], %bb.7, [[FADDDrr1]], %bb.10
+  ; CHECK-NEXT:   [[PHI14:%[0-9]+]]:gpr64common = PHI [[UBFMXri]], %bb.7, [[PHI11]], %bb.10
+  ; CHECK-NEXT:   [[LDRDui4:%[0-9]+]]:fpr64 = LDRDui [[PHI14]], 1
+  ; CHECK-NEXT:   [[FADDDrr2:%[0-9]+]]:fpr64 = nofpexcept FADDDrr [[PHI13]], [[LDRDui4]], implicit $fpcr
+  ; CHECK-NEXT:   B %bb.5
+  bb.0:
+    successors: %bb.1(0x80000000)
+    liveins: $x0
+
+    %0:gpr64common = COPY $x0
+    %1:gpr64sp = COPY $xzr
+    %2:gpr64all = COPY %1
+    %3:gpr32 = MOVi32imm 1
+    %4:gpr64all = SUBREG_TO_REG 0, %3, %subreg.sub_32
+
+  bb.1:
+    successors: %bb.2(0x04000000), %bb.1(0x7c000000)
+
+    %5:gpr64sp = PHI %4, %bb.0, %6, %bb.1
+    %7:gpr64 = PHI %2, %bb.0, %8, %bb.1
+    %9:fpr64 = LDRDui %1, 0
+    STRDui killed %9, %0, 0
+    %10:gpr64 = nsw SUBSXri %5, 1, 0, implicit-def $nzcv
+    %6:gpr64all = COPY %10
+    %8:gpr64all = SUBREG_TO_REG 0, %3, %subreg.sub_32
+    Bcc 1, %bb.1, implicit $nzcv
+    B %bb.2
+
+  bb.2:
+    successors: %bb.3(0x80000000)
+
+  bb.3:
+    successors: %bb.4(0x04000000), %bb.3(0x7c000000)
+
+    CBNZX %7, %bb.3
+    B %bb.4
+
+  bb.4:
+    successors: %bb.5(0x80000000)
+
+    %11:fpr64 = FMOVD0
+
+  bb.5:
+    successors: %bb.6(0x80000000)
+
+    %12:fpr64 = PHI %11, %bb.4, %13, %bb.6
+    %14:gpr64all = COPY $xzr
+    %15:gpr64all = COPY %14
+
+  bb.6:
+    successors: %bb.5(0x04000000), %bb.6(0x7c000000)
+
+    %16:gpr64common = PHI %15, %bb.5, %17, %bb.6
+    %18:fpr64 = PHI %12, %bb.5, %13, %bb.6
+    %19:gpr64common = UBFMXri %16, 61, 60
+    %20:fpr64 = LDRDui killed %19, 1
+    %13:fpr64 = nofpexcept FADDDrr %18, killed %20, implicit $fpcr
+    %21:gpr64 = ADDSXri %16, 1, 0, implicit-def $nzcv
+    %17:gpr64all = COPY %21
+    Bcc 2, %bb.5, implicit $nzcv
+    B %bb.6
+
+...

@huaatian huaatian requested a review from arsenm March 24, 2025 06:43
@huaatian
Copy link
Contributor Author

The reason for the crash is that the LIS information in ModuloScheduleExpanderMVE is incomplete, which prevents ModuloScheduleExpander from updating the LIS information.
image

@arsenm arsenm changed the title [llvm][CodeGen] update live intarvals for ModuloScheduleExpanderMVE [llvm][CodeGen] update live intervals for ModuloScheduleExpanderMVE Mar 24, 2025
/// Some registers are generated during the kernel expansion. We calculate the
/// live intervals of these registers after the expansion.
void ModuloScheduleExpanderMVE::calculateIntervals() {
for (Register Reg : NoIntervalRegs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're just creating fresh registers and need to compute the live intervals at the end, NewVRegs would be a better name. Can you seed the intervals from some other register?

Copy link
Contributor Author

@huaatian huaatian Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're just creating fresh registers and need to compute the live intervals at the end, NewVRegs would be a better name.

Updated.

Can you seed the intervals from some other register?

Some NewVRegs will still be used by other newly added instructions in the subsequent processing. It seems that calculating their live intervals before exiting is relatively a simple and reliable approach. I directly used the LIS interface and did not come up with any other methods.

@@ -404,6 +405,9 @@ class ModuloScheduleExpanderMVE {
/// NumUnroll = 1 means no unrolling.
int NumUnroll;

/// Record the registers that need to compute live intervals.
SmallSet<Register, 8> NewVRegs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to make this SmallVector. You can't re-create the same vreg twice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These NewVRegs are primarily used in scenarios where registers are replaced after pipelining in mergeRegUsesAfterPipeline(). In such scenarios, these registers might be used repeatedly. So I use SmallSet to handle this situation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used repeatedly is not created repeatedly. The one case here this is paired with is the direct result of createVirtualRegister

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. SmallVector is indeed more suitable for use here.
Updated.

Copy link
Contributor

@ytmukai ytmukai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix.

Comment on lines 2569 to 2570
if (LIS.hasInterval(OrigReg))
LIS.removeInterval(OrigReg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should remove the interval when it's invalidated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this piece of code to the position where the live interval of OrigReg becomes invalid.
Updated

@huaatian huaatian merged commit b122956 into llvm:main Apr 10, 2025
11 checks passed
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…lvm#132677)

ModuloScheduleExpanderMVE and ModuloScheduleExpander are used sequentially in 
certain use cases. It is necessary to update live intervals for ModuloScheduleExpanderMVE; 
otherwise, crashes may occur.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants