-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV] Add a tune feature to disable stripping W suffix #86255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We have a hidden option to disable it, but I'd like to make it a tune feature. For some implementations, instructions with W suffix would be less costly as it only performs on 32 bits data. Though we may lose some chances to compress.
@llvm/pr-subscribers-backend-risc-v Author: Wang Pengcheng (wangpc-pp) ChangesWe have a hidden option to disable it, but I'd like to make it a For some implementations, instructions with W suffix would be less Full diff: https://github.com/llvm/llvm-project/pull/86255.diff 3 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index f3e641e250182b..6ef2289bb4beeb 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -1226,6 +1226,10 @@ def TuneNoSinkSplatOperands
"false", "Disable sink splat operands to enable .vx, .vf,"
".wx, and .wf instructions">;
+def TuneNoStripWSuffix
+ : SubtargetFeature<"no-strip-w-suffix", "EnableStripWSuffix", "false",
+ "Disable strip W suffix">;
+
def TuneConditionalCompressedMoveFusion
: SubtargetFeature<"conditional-cmv-fusion", "HasConditionalCompressedMoveFusion",
"true", "Enable branch+c.mv fusion">;
diff --git a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
index dcf70e8cad6408..39d420c2fbf080 100644
--- a/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+++ b/llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
@@ -672,7 +672,7 @@ bool RISCVOptWInstrs::stripWSuffixes(MachineFunction &MF,
const RISCVInstrInfo &TII,
const RISCVSubtarget &ST,
MachineRegisterInfo &MRI) {
- if (DisableStripWSuffix)
+ if (DisableStripWSuffix || !ST.enableStripWSuffix())
return false;
bool MadeChange = false;
diff --git a/llvm/test/CodeGen/RISCV/strip-w-suffix.ll b/llvm/test/CodeGen/RISCV/strip-w-suffix.ll
new file mode 100644
index 00000000000000..b403a5ea3bd898
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/strip-w-suffix.ll
@@ -0,0 +1,74 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=riscv64 -mattr=+m -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=STRIP %s
+; RUN: llc -mtriple=riscv64 -mattr=+m,+no-strip-w-suffix -verify-machineinstrs < %s \
+; RUN: | FileCheck -check-prefixes=NO-STRIP %s
+
+define i32 @addiw(i32 signext %a) nounwind {
+; STRIP-LABEL: addiw:
+; STRIP: # %bb.0:
+; STRIP-NEXT: lui a1, 1
+; STRIP-NEXT: addi a1, a1, -1
+; STRIP-NEXT: addw a0, a0, a1
+; STRIP-NEXT: ret
+;
+; NO-STRIP-LABEL: addiw:
+; NO-STRIP: # %bb.0:
+; NO-STRIP-NEXT: lui a1, 1
+; NO-STRIP-NEXT: addiw a1, a1, -1
+; NO-STRIP-NEXT: addw a0, a0, a1
+; NO-STRIP-NEXT: ret
+ %ret = add i32 %a, 4095
+ ret i32 %ret
+}
+
+define signext i32 @addw(i32 signext %a, i32 signext %b) nounwind {
+; STRIP-LABEL: addw:
+; STRIP: # %bb.0:
+; STRIP-NEXT: add a0, a0, a1
+; STRIP-NEXT: addiw a0, a0, 1024
+; STRIP-NEXT: ret
+;
+; NO-STRIP-LABEL: addw:
+; NO-STRIP: # %bb.0:
+; NO-STRIP-NEXT: addw a0, a0, a1
+; NO-STRIP-NEXT: addiw a0, a0, 1024
+; NO-STRIP-NEXT: ret
+ %add = add i32 %a, %b
+ %ret = add i32 %add, 1024
+ ret i32 %ret
+}
+
+define signext i32 @mulw(i32 signext %a, i32 signext %b) nounwind {
+; STRIP-LABEL: mulw:
+; STRIP: # %bb.0:
+; STRIP-NEXT: mul a0, a0, a1
+; STRIP-NEXT: addiw a0, a0, 1024
+; STRIP-NEXT: ret
+;
+; NO-STRIP-LABEL: mulw:
+; NO-STRIP: # %bb.0:
+; NO-STRIP-NEXT: mulw a0, a0, a1
+; NO-STRIP-NEXT: addiw a0, a0, 1024
+; NO-STRIP-NEXT: ret
+ %mul = mul i32 %a, %b
+ %ret = add i32 %mul, 1024
+ ret i32 %ret
+}
+
+define signext i32 @slliw(i32 signext %a) nounwind {
+; STRIP-LABEL: slliw:
+; STRIP: # %bb.0:
+; STRIP-NEXT: slli a0, a0, 1
+; STRIP-NEXT: addiw a0, a0, 1024
+; STRIP-NEXT: ret
+;
+; NO-STRIP-LABEL: slliw:
+; NO-STRIP: # %bb.0:
+; NO-STRIP-NEXT: slliw a0, a0, 1
+; NO-STRIP-NEXT: addiw a0, a0, 1024
+; NO-STRIP-NEXT: ret
+ %shl = shl i32 %a, 1
+ %ret = add i32 %shl, 1024
+ ret i32 %ret
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Do you know of a real implementation where this true? |
Yes, this requirement is from hardware team actually and I didn't notice this stripping W suffix pass before. |
Is that for all 4 instructions(ADDW, ADDIW, SLLIW, MULW) that are in that pass or just MULW? I know multiply complexity increases a lot when the size is doubled. |
My intent is for all W instructions, not only multiply. Some low resources/low power consumption scenarios may need this too. |
I was asking specifically about your hardware requirement. It's a bit questionable that MULW is in there. It's only useful with the Zcb extension and I think a cycle difference between MUL and MULW is more likely than ADD. I don't know if we're optimal on using W instructions so the existence of such hardware implies we might also need an "add W suffix" pass. |
I can't say too much details here, sorry for that. All I can say is that hardware can do some performance/power optimizations if there are more W instructions.
Yeah, I was thinking about the same thing and it's on my TODO list now. :-) |
Is it OK to you? @topperc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We have a hidden option to disable it, but I'd like to make it a
tune feature.
For some implementations, instructions with W suffix would be less
costly as they only perform on 32 bits data. Though we may lose some
chances to compress.