-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV] Enable merging of external globals by default #117880
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV] Enable merging of external globals by default #117880
Conversation
@llvm/pr-subscribers-backend-risc-v Author: Alex Bradbury (asb) ChangesThis follows up #115495 by enabling merging of external globals by default, which had been left as a next step in order to make the previous change more incremental and so we can more easily narrow down on any identified regressions. Enabling merging of external globals matches what Arm does (for non mach-o targets), though AArch64 doesn't as there were some concerns it might cause regressions in some cases. Here are the results I have from specrate benchmarks:
This is on the Spacemit X60, with essentially rva22+v. omnetpp is the main one that sees a potentially positive change, though there was an annoyingly high standard deviation on that benchmark. Looking at dhyrstone we see a positive codegen change in Proc0 which translates to a reduced instruction count, though not decreased runtime:
Full diff: https://github.com/llvm/llvm-project/pull/117880.diff 6 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fa507653264ccd..7a2e2a90699553 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -51,12 +51,9 @@ static cl::opt<cl::boolOrDefault>
EnableGlobalMerge("riscv-enable-global-merge", cl::Hidden,
cl::desc("Enable the global merge pass"));
-static cl::opt<bool> ForceEnableGlobalMergeExternalGlobals(
- "riscv-force-enable-global-merge-external-globals", cl::Hidden,
- cl::init(false),
- cl::desc(
- "If the global merge pass is enabled, force enable global merging of "
- "external globals (overriding any logic that might disable it)"));
+static cl::opt<bool> DisableGlobalMergeExternalGlobals(
+ "riscv-disable-global-merge-external-globals", cl::Hidden, cl::init(false),
+ cl::desc("Disable global merging of external globals"));
static cl::opt<bool>
EnableMachineCombiner("riscv-enable-machine-combiner",
@@ -484,7 +481,7 @@ bool RISCVPassConfig::addPreISel() {
addPass(createGlobalMergePass(TM, /* MaxOffset */ 2047,
/* OnlyOptimizeForSize */ false,
/* MergeExternalByDefault */
- ForceEnableGlobalMergeExternalGlobals));
+ !DisableGlobalMergeExternalGlobals));
}
return false;
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
index 156a34db827456..77fe1783bb5d5e 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-nonzero.ll
@@ -1,8 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv64 -riscv-force-enable-global-merge-external-globals \
-; RUN: -verify-machineinstrs < %s | FileCheck %s -check-prefix=SMALL-DATA
-; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=0 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefix=SMALL-DATA
+; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=0 -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefix=MINSIZE
@ig1 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
index e41f14d394b7ca..c29749c17a5b5c 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize-smalldata-zero.ll
@@ -1,8 +1,7 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -riscv-force-enable-global-merge-external-globals \
-; RUN: -verify-machineinstrs < %s | FileCheck %s -check-prefix=SMALL-DATA
-; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=5 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN: | FileCheck %s -check-prefix=SMALL-DATA
+; RUN: llc -mtriple=riscv64 -global-merge-min-data-size=5 -verify-machineinstrs < %s \
; RUN: | FileCheck %s -check-prefix=MINSIZE
@ig1 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-minsize.ll b/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
index 1d65d9d1732ba3..915dde388cffdc 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-minsize.ll
@@ -1,8 +1,8 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -riscv-force-enable-global-merge-external-globals \
+; RUN: llc -mtriple=riscv32 \
; RUN: -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32
; RUN: llc -mtriple=riscv32 -global-merge-min-data-size=5 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32-MINSIZE
+; RUN: -verify-machineinstrs < %s | FileCheck %s -check-prefix=RV32-MINSIZE
@ig1 = internal global i32 0, align 4
@ig2 = internal global i32 0, align 4
diff --git a/llvm/test/CodeGen/RISCV/global-merge-offset.ll b/llvm/test/CodeGen/RISCV/global-merge-offset.ll
index bb8264ee438545..c1074bc8ca97ed 100644
--- a/llvm/test/CodeGen/RISCV/global-merge-offset.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge-offset.ll
@@ -1,12 +1,12 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: sed 's/ArrSize/100/g' %s | llc -mtriple=riscv32 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s
+; RUN: -verify-machineinstrs | FileCheck %s
; RUN: sed 's/ArrSize/100/g' %s | llc -mtriple=riscv64 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s
+; RUN: -verify-machineinstrs | FileCheck %s
; RUN: sed 's/ArrSize/101/g' %s | llc -mtriple=riscv32 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
+; RUN: -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
; RUN: sed 's/ArrSize/101/g' %s | llc -mtriple=riscv64 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
+; RUN: -verify-machineinstrs | FileCheck %s -check-prefix=CHECK-TOOBIG
; This test demonstrates that the MaxOffset is set correctly for RISC-V by
; constructing an input that is at the limit and comparing.
diff --git a/llvm/test/CodeGen/RISCV/global-merge.ll b/llvm/test/CodeGen/RISCV/global-merge.ll
index 327a6b54f4be25..fa0dda3736906b 100644
--- a/llvm/test/CodeGen/RISCV/global-merge.ll
+++ b/llvm/test/CodeGen/RISCV/global-merge.ll
@@ -1,11 +1,11 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s | FileCheck %s
-; RUN: llc -mtriple=riscv32 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv32 -riscv-disable-global-merge-external-globals \
+; RUN: -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv64 -riscv-disable-global-merge-external-globals \
+; RUN: -verify-machineinstrs < %s | FileCheck %s
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
; RUN: | FileCheck -check-prefix=CHECK-WEXTERN %s
-; RUN: llc -mtriple=riscv64 \
-; RUN: -riscv-force-enable-global-merge-external-globals -verify-machineinstrs < %s \
+; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
; RUN: | FileCheck -check-prefix=CHECK-WEXTERN %s
@ig1 = internal global i32 0, align 4
|
…lobal-merge-on-external (thanks Craig!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Thank you for the performance data. I agree the performance is pretty much in the noise, but this makes sense in concept. Given that, I think moving forward here is the right answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…merging-mergeexternalbydefault
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/73/builds/10066 Here is the relevant piece of the build log for the reference
|
This follows up #115495 by enabling merging of external globals by default, which had been left as a next step in order to make the previous change more incremental and so we can more easily narrow down on any identified regressions.
Enabling merging of external globals matches what Arm does (for non mach-o targets), though AArch64 doesn't as there were some concerns it might cause regressions in some cases.
Here are the results I have from specrate benchmarks (train workload, figures are execution time in seconds):
This is on the Spacemit X60, with essentially rva22+v. omnetpp is the main one that sees a potentially positive change, though there was an annoyingly high standard deviation on that benchmark.
Looking at dhyrstone we see a positive codegen change in Proc0 which translates to a reduced instruction count, though not decreased runtime:
EDIT: I should elaborate on my overall view here / what feedback I'm looking for. At least for the tests I've run, the differences seem mainly in the noise and the reduction in instructions in dhrystone is meaningful enough (even if execution time isn't affected for the spacemit x60 microarch) that it seems sensible to move forwards. But I'd welcome any opposing thoughts / concerns, or indeed further input from people who've enabled globals merging downstream. I'd understand this probably included external globals merging too as enabled in this patch, but it would be helpful to confirm.