Commit d38b98f

[AArch64] Disable Pre-RA Scheduler for Neoverse V2
We would like to disable the pre-RA machine scheduler for the Neoverse V2 because we have a key workload that benefits massively from this (25% uplift). Although the machine scheduler is register-pressure aware, it results in spills for this workload. Disabling the scheduler seems a lot more attractive than trying to tweak regalloc heuristics:

- We see no benefit from scheduling on this big core anyway, and never have. I.e., when we added the V2 scheduling model, this wasn't for perf reasons, only to enable LLVM-MCA.
- Scheduling can consume significant compile time without resulting in any perf gains. This is a bad deal.

FWIW: the GCC folks realised the same not that long ago and did exactly the same, see also:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/667074.html

I guess other big cores could benefit from this too, but I would like to leave that decision to folks with more experience on those cores, which is why I propose to change this for the V2 only.

Numbers:

* We know the Eigen library is somewhat sensitive to scheduling, but I found one kernel that regresses by ~2% and another that improves by ~2%. They cancel each other out, so overall the result is neutral.
* SPEC FP and INT seem totally unaffected.
* LLVM test-suite: a little bit up and down, all within noise levels I think, so neutral.
* Compile-time numbers: I see a geomean 3% improvement for the LLVM test-suite, and a very decent one for the sqlite amalgamation version.

I haven't looked at post-RA scheduling; maybe that's interesting as a follow-up.
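For readers unfamiliar with how a "geomean improvement" such as the compile-time figure above is typically derived, here is a minimal sketch. It assumes per-benchmark ratios of new compile time over baseline compile time; the helper names `geomean` and `improvementPercent` are hypothetical and not part of this patch or of the LLVM test-suite tooling.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of per-benchmark ratios (new time / baseline time).
// Returns 0.0 for an empty input; all ratios are assumed to be positive.
inline double geomean(const std::vector<double> &Ratios) {
  if (Ratios.empty())
    return 0.0;
  double LogSum = 0.0;
  for (double R : Ratios)
    LogSum += std::log(R); // summing logs avoids overflow of a long product
  return std::exp(LogSum / static_cast<double>(Ratios.size()));
}

// Percentage improvement implied by a geomean ratio below 1.0.
inline double improvementPercent(double GeomeanRatio) {
  return (1.0 - GeomeanRatio) * 100.0;
}
```

A geomean ratio of 0.97 would correspond to the roughly 3% improvement quoted in the message; the actual per-benchmark timings are not given in this commit.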
1 parent 404f94a commit d38b98f

6 files changed, +165 -159 lines changed
llvm/lib/Target/AArch64/AArch64Features.td

Lines changed: 3 additions & 0 deletions
@@ -669,6 +669,9 @@ def FeatureExynosCheapAsMoveHandling : SubtargetFeature<"exynos-cheap-as-move",
   "HasExynosCheapAsMoveHandling", "true",
   "Use Exynos specific handling of cheap instructions">;

+def FeatureDisablePreRAScheduler : SubtargetFeature<"use-prera-scheduler",
+  "DisablePreRAScheduler", "true", "Disable scheduling before register allocation">;
+
 def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
   "UsePostRAScheduler", "true", "Schedule again after register allocation">;

llvm/lib/Target/AArch64/AArch64Processors.td

Lines changed: 1 addition & 0 deletions
@@ -540,6 +540,7 @@ def TuneNeoverseV2 : SubtargetFeature<"neoversev2", "ARMProcFamily", "NeoverseV2
   FeatureCmpBccFusion,
   FeatureFuseAdrpAdd,
   FeatureALULSLFast,
+  FeatureDisablePreRAScheduler,
   FeaturePostRAScheduler,
   FeatureEnableSelectOptimize,
   FeatureUseFixedOverScalableIfEqualCost,

llvm/lib/Target/AArch64/AArch64Subtarget.h

Lines changed: 3 additions & 1 deletion
@@ -156,7 +156,9 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
   const LegalizerInfo *getLegalizerInfo() const override;
   const RegisterBankInfo *getRegBankInfo() const override;
   const Triple &getTargetTriple() const { return TargetTriple; }
-  bool enableMachineScheduler() const override { return true; }
+  bool enableMachineScheduler() const override {
+    return !disablePreRAScheduler();
+  }
   bool enablePostRAScheduler() const override { return usePostRAScheduler(); }
   bool enableSubRegLiveness() const override { return EnableSubregLiveness; }
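To make the effect of the three hunks easier to follow, here is a toy model of the gating logic, not the real LLVM classes. `ToySubtarget` and `makeToySubtarget` are hypothetical names invented for illustration; only the accessors `disablePreRAScheduler()` and `usePostRAScheduler()` and the two `enable*` hooks mirror names that appear in the diff. The model only knows about the Neoverse V2 tuning shown here and treats every other CPU string as having neither feature set, which is a simplification.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the subtarget state that the TableGen
// features in this patch would populate.
struct ToySubtarget {
  std::string CPU;
  bool DisablePreRAScheduler = false; // FeatureDisablePreRAScheduler
  bool UsePostRAScheduler = false;    // FeaturePostRAScheduler

  bool disablePreRAScheduler() const { return DisablePreRAScheduler; }
  bool usePostRAScheduler() const { return UsePostRAScheduler; }

  // Mirrors the new hook: previously this returned true unconditionally.
  bool enableMachineScheduler() const { return !disablePreRAScheduler(); }
  // Unchanged by the patch.
  bool enablePostRAScheduler() const { return usePostRAScheduler(); }
};

// Hypothetical helper: applies the two scheduler-related tuning features
// that the diff lists for Neoverse V2 and nothing else.
inline ToySubtarget makeToySubtarget(const std::string &CPU) {
  ToySubtarget ST;
  ST.CPU = CPU;
  if (CPU == "neoverse-v2") {
    ST.DisablePreRAScheduler = true;
    ST.UsePostRAScheduler = true;
  }
  return ST;
}
```

In this sketch, `makeToySubtarget("neoverse-v2")` reports the pre-RA scheduler as disabled while post-RA scheduling stays on, and any other CPU string keeps the pre-RA scheduler enabled, which matches the intent described in the commit message.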
