[SPARC] Prefer RDPC over CALL to implement GETPCX for 64-bit target #77196
Created using spr 1.3.4
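@llvm/pr-subscribers-backend-sparc
Author: Koakuma (koachan)
Changes
On 64-bit targets, prefer using RDPC over CALL to get the value of %pc. This is faster on modern processors (Niagara T1 and newer) and avoids polluting the processor's predictor state.
The old behavior of using a fake CALL is still used when tuning for classic UltraSPARC processors, since RDPC is much slower there.
A quick pgbench test on a SPARC T4 shows about 2% speedup on SELECT loads, and about 7% speedup on INSERT/UPDATE loads.
Full diff: https://github.com/llvm/llvm-project/pull/77196.diff
3 Files Affected: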
diff --git a/llvm/lib/Target/Sparc/Sparc.td b/llvm/lib/Target/Sparc/Sparc.td
index 1a71cfed3128f0..7b103395652433 100644
--- a/llvm/lib/Target/Sparc/Sparc.td
+++ b/llvm/lib/Target/Sparc/Sparc.td
@@ -62,6 +62,13 @@ def UsePopc : SubtargetFeature<"popc", "UsePopc", "true",
def FeatureSoftFloat : SubtargetFeature<"soft-float", "UseSoftFloat", "true",
"Use software emulation for floating point">;
+//===----------------------------------------------------------------------===//
+// SPARC Subtarget tuning features.
+//
+
+def TuneSlowRDPC : SubtargetFeature<"slow-rdpc", "HasSlowRDPC", "true",
+ "rd %pc, %XX is slow", [FeatureV9]>;
+
//==== Features added predmoninantly for LEON subtarget support
include "LeonFeatures.td"
@@ -89,8 +96,9 @@ def SparcAsmParserVariant : AsmParserVariant {
// SPARC processors supported.
//===----------------------------------------------------------------------===//
-class Proc<string Name, list<SubtargetFeature> Features>
- : Processor<Name, NoItineraries, Features>;
+class Proc<string Name, list<SubtargetFeature> Features,
+ list<SubtargetFeature> TuneFeatures = []>
+ : Processor<Name, NoItineraries, Features, TuneFeatures>;
def : Proc<"generic", []>;
def : Proc<"v7", [FeatureSoftMulDiv, FeatureNoFSMULD]>;
@@ -118,9 +126,11 @@ def : Proc<"ma2480", [FeatureLeon, LeonCASA]>;
def : Proc<"ma2485", [FeatureLeon, LeonCASA]>;
def : Proc<"ma2x8x", [FeatureLeon, LeonCASA]>;
def : Proc<"v9", [FeatureV9]>;
-def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
+def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated, FeatureVIS],
+ [TuneSlowRDPC]>;
def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated, FeatureVIS,
- FeatureVIS2]>;
+ FeatureVIS2],
+ [TuneSlowRDPC]>;
def : Proc<"niagara", [FeatureV9, FeatureV8Deprecated, FeatureVIS,
FeatureVIS2]>;
def : Proc<"niagara2", [FeatureV9, FeatureV8Deprecated, UsePopc,
diff --git a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
index cca624e0926796..97abf10b18540d 100644
--- a/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
+++ b/llvm/lib/Target/Sparc/SparcAsmPrinter.cpp
@@ -13,6 +13,7 @@
#include "MCTargetDesc/SparcInstPrinter.h"
#include "MCTargetDesc/SparcMCExpr.h"
+#include "MCTargetDesc/SparcMCTargetDesc.h"
#include "MCTargetDesc/SparcTargetStreamer.h"
#include "Sparc.h"
#include "SparcInstrInfo.h"
@@ -111,6 +112,15 @@ static void EmitCall(MCStreamer &OutStreamer,
OutStreamer.emitInstruction(CallInst, STI);
}
+static void EmitRDPC(MCStreamer &OutStreamer, MCOperand &RD,
+ const MCSubtargetInfo &STI) {
+ MCInst RDPCInst;
+ RDPCInst.setOpcode(SP::RDASR);
+ RDPCInst.addOperand(RD);
+ RDPCInst.addOperand(MCOperand::createReg(SP::ASR5));
+ OutStreamer.emitInstruction(RDPCInst, STI);
+}
+
static void EmitSETHI(MCStreamer &OutStreamer,
MCOperand &Imm, MCOperand &RD,
const MCSubtargetInfo &STI)
@@ -234,8 +244,15 @@ void SparcAsmPrinter::LowerGETPCXAndEmitMCInsts(const MachineInstr *MI,
// add <MO>, %o7, <MO>
OutStreamer->emitLabel(StartLabel);
- MCOperand Callee = createPCXCallOP(EndLabel, OutContext);
- EmitCall(*OutStreamer, Callee, STI);
+ if (!STI.getTargetTriple().isSPARC64() ||
+ STI.hasFeature(Sparc::TuneSlowRDPC)) {
+ MCOperand Callee = createPCXCallOP(EndLabel, OutContext);
+ EmitCall(*OutStreamer, Callee, STI);
+ } else {
+ // TODO make it possible to store PC in other registers
+ // so that leaf function optimization becomes possible.
+ EmitRDPC(*OutStreamer, RegO7, STI);
+ }
OutStreamer->emitLabel(SethiLabel);
MCOperand hiImm = createPCXRelExprOp(SparcMCExpr::VK_Sparc_PC22,
GOTLabel, StartLabel, SethiLabel,
diff --git a/llvm/test/CodeGen/SPARC/tune-getpcx.ll b/llvm/test/CodeGen/SPARC/tune-getpcx.ll
new file mode 100644
index 00000000000000..7454fea0e38d57
--- /dev/null
+++ b/llvm/test/CodeGen/SPARC/tune-getpcx.ll
@@ -0,0 +1,18 @@
+; RUN: llc < %s -relocation-model=pic -mtriple=sparc | FileCheck --check-prefix=CALL %s
+; RUN: llc < %s -relocation-model=pic -mtriple=sparcv9 -mcpu=ultrasparc | FileCheck --check-prefix=CALL %s
+; RUN: llc < %s -relocation-model=pic -mtriple=sparcv9 | FileCheck --check-prefix=RDPC %s
+
+;; SPARC32 and SPARC64 for classic UltraSPARCs implement GETPCX
+;; with a fake `call`.
+;; All other SPARC64 targets implement it with `rd %pc, %o7`.
+
+@value = external global i32
+
+; CALL: call
+; CALL-NOT: rd %pc
+; RDPC: rd %pc
+; RDPC-NOT: call
+define i32 @test() {
+ %1 = load i32, i32* @value
+ ret i32 %1
+}
@@ -118,9 +126,11 @@ def : Proc<"ma2480", [FeatureLeon, LeonCASA]>;
def : Proc<"ma2485", [FeatureLeon, LeonCASA]>;
def : Proc<"ma2x8x", [FeatureLeon, LeonCASA]>;
def : Proc<"v9", [FeatureV9]>;
-def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
+def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated, FeatureVIS],
+ [TuneSlowRDPC]>;
Why is it in TuneFeatures? Sparc doesn't seem to support -mtune.
-mtune enablement is at PR #77195.
clang already accepts, recognises, and passes the flag on to the backend; it's just that the backend hasn't made any use of the provided info yet.
I see. IIUC -mtune is translated into -tune-cpu llc option. If so, the test should use this option.
On 64-bit targets, prefer using RDPC over CALL to get the value of %pc. This is faster on modern processors (Niagara T1 and newer) and avoids polluting the processor's predictor state. The old behavior of using a fake CALL is still used when tuning for classic UltraSPARC processors, since RDPC is much slower there. A quick pgbench test on a SPARC T4 shows about 2% speedup on SELECT loads, and about 7% speedup on INSERT/UPDATE loads. Pull Request: llvm#77196
Merged commit 63f9829 into users/koachan/main.sparc-prefer-rdpc-over-call-to-implement-getpcx-for-64-bit-target
I intentionally restored the branch as I only noticed after the merge that the target branch was set incorrectly. It wasn't merged into the LLVM repo. Is there a way of re-targeting the PR or making a new PR?
Lemme see if I can do it
Uh oh, seems like I couldn't merge it with the tool either...
Yes, sure
Ya, sure. Whatever path is easier and quickest for you.
Okay, new PR is at #78280.
On 64-bit targets, prefer using RDPC over CALL to get the value of %pc.
This is faster on modern processors (Niagara T1 and newer) and avoids polluting
the processor's predictor state.
The old behavior of using a fake CALL is still done when tuning for classic
UltraSPARC processors, since RDPC is much slower there.
A quick pgbench test on a SPARC T4 shows about 2% speedup on SELECT loads,
and about 7% speedup on INSERT/UPDATE loads.
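
The selection rule described above can be sketched in isolation. This is a minimal standalone model, not the LLVM code itself: `SubtargetSketch` and `pickGetPCXInstruction` are hypothetical names introduced here for illustration, and the two booleans stand in for the `STI.getTargetTriple().isSPARC64()` and `STI.hasFeature(Sparc::TuneSlowRDPC)` queries that appear in the diff.

```cpp
#include <string>

// Hypothetical stand-in for the subtarget queries used in the patch.
struct SubtargetSketch {
  bool Is64Bit;     // models STI.getTargetTriple().isSPARC64()
  bool HasSlowRDPC; // models STI.hasFeature(Sparc::TuneSlowRDPC)
};

// Returns the instruction used to obtain the PC when lowering GETPCX.
// The fake CALL is kept for 32-bit targets and for 64-bit targets tuned
// for CPUs where RDPC is slow; every other 64-bit target reads %pc
// directly into %o7.
std::string pickGetPCXInstruction(const SubtargetSketch &ST) {
  if (!ST.Is64Bit || ST.HasSlowRDPC)
    return "call";
  return "rd %pc, %o7";
}
```

Under these assumptions, `pickGetPCXInstruction({true, false})` selects the RDPC form, which corresponds to the `-mtriple=sparcv9` RUN line in the new test, while `{true, true}` (the `-mcpu=ultrasparc` case) and any 32-bit configuration fall back to the fake CALL.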