[GlobalISel][AArch64] Legalize G_SPLAT_VECTOR #114006
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-aarch64

Author: Thorsten Schütt (tschuett)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/114006.diff

3 Files Affected:
diff --git a/llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td b/llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td
index d1d0c5ff873410..79c07bc2fc9204 100644
--- a/llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td
+++ b/llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td
@@ -148,6 +148,7 @@ def : GINodeEquiv<G_INSERT_VECTOR_ELT, vector_insert>;
def : GINodeEquiv<G_CONCAT_VECTORS, concat_vectors>;
def : GINodeEquiv<G_BUILD_VECTOR, build_vector>;
def : GINodeEquiv<G_EXTRACT_SUBVECTOR, extract_subvector>;
+def : GINodeEquiv<G_SPLAT_VECTOR, splat_vector>;
def : GINodeEquiv<G_FCEIL, fceil>;
def : GINodeEquiv<G_FCOS, fcos>;
def : GINodeEquiv<G_FSIN, fsin>;
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index dd65dbe594a634..bcfb8370cb7bec 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -1316,6 +1316,10 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.widenScalarOrEltToNextPow2(0)
.immIdx(0); // Inform verifier imm idx 0 is handled.
+ // TODO: {nxv8s16, s16}
+ getActionDefinitionsBuilder(G_SPLAT_VECTOR)
+ .legalFor(HasSVE, {{nxv4s32, s32}, {nxv2s64, s64}});
+
getLegacyLegalizerInfo().computeTables();
verify(*ST.getInstrInfo());
}
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-splat-vector.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-splat-vector.mir
new file mode 100644
index 00000000000000..1b9e4132a40751
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-splat-vector.mir
@@ -0,0 +1,143 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -O0 -debug -mtriple=aarch64-apple-ios -mattr=+sve -aarch64-enable-gisel-sve=1 -global-isel -start-before=legalizer -stop-after=instruction-select %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-SELECT
+# RUN: llc -O0 -mtriple=aarch64-apple-ios -mattr=+sve -aarch64-enable-gisel-sve=1 -global-isel -start-before=legalizer -stop-after=regbankselect %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-REGBANK
+# RUN: llc -O0 -mtriple=aarch64-apple-ios -mattr=+sve -aarch64-enable-gisel-sve=1 -global-isel -run-pass=legalizer %s -o - | FileCheck %s --check-prefixes=CHECK,CHECK-LEGAL
+
+
+---
+name: test_splat_vector_s64
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s64
+ ; CHECK-SELECT: %imm:gpr64sp = COPY $x0
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_D %imm
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s64
+ ; CHECK-REGBANK: %imm:gpr(s64) = COPY $x0
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s64
+ ; CHECK-LEGAL: %imm:_(s64) = COPY $x0
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ %imm:_(s64) = COPY $x0
+ %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ $z0 = COPY %splat(<vscale x 2 x s64>)
+...
+---
+name: test_splat_vector_s64_const
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s64_const
+ ; CHECK-SELECT: [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 9
+ ; CHECK-SELECT-NEXT: %imm:gpr64sp = SUBREG_TO_REG 0, [[MOVi32imm]], %subreg.sub_32
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_D %imm
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s64_const
+ ; CHECK-REGBANK: %imm:gpr(s64) = G_CONSTANT i64 9
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s64_const
+ ; CHECK-LEGAL: %imm:_(s64) = G_CONSTANT i64 9
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ %imm:_(s64) = G_CONSTANT i64 9
+ %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ $z0 = COPY %splat(<vscale x 2 x s64>)
+...
+---
+name: test_splat_vector_s64_fconst
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s64_fconst
+ ; CHECK-SELECT: %imm:fpr64 = FMOVDi 34
+ ; CHECK-SELECT-NEXT: [[COPY:%[0-9]+]]:gpr64sp = COPY %imm
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_D [[COPY]]
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s64_fconst
+ ; CHECK-REGBANK: %imm:fpr(s64) = G_FCONSTANT double 9.000000e+00
+ ; CHECK-REGBANK-NEXT: [[COPY:%[0-9]+]]:gpr(s64) = COPY %imm(s64)
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 2 x s64>) = G_SPLAT_VECTOR [[COPY]](s64)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s64_fconst
+ ; CHECK-LEGAL: %imm:_(s64) = G_FCONSTANT double 9.000000e+00
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 2 x s64>)
+ %imm:_(s64) = G_FCONSTANT double 9.0
+ %splat:_(<vscale x 2 x s64>) = G_SPLAT_VECTOR %imm(s64)
+ $z0 = COPY %splat(<vscale x 2 x s64>)
+...
+---
+name: test_splat_vector_s32
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s32
+ ; CHECK-SELECT: %imm:gpr32sp = COPY $w0
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_S %imm
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s32
+ ; CHECK-REGBANK: %imm:gpr(s32) = COPY $w0
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s32
+ ; CHECK-LEGAL: %imm:_(s32) = COPY $w0
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ %imm:_(s32) = COPY $w0
+ %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ $z0 = COPY %splat(<vscale x 4 x s32>)
+...
+---
+name: test_splat_vector_s32_const
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s32_const
+ ; CHECK-SELECT: %imm:gpr32common = MOVi32imm 9
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_S %imm
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s32_const
+ ; CHECK-REGBANK: %imm:gpr(s32) = G_CONSTANT i32 9
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s32_const
+ ; CHECK-LEGAL: %imm:_(s32) = G_CONSTANT i32 9
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ %imm:_(s32) = G_CONSTANT i32 9
+ %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ $z0 = COPY %splat(<vscale x 4 x s32>)
+...
+---
+name: test_splat_vector_s32_fconst
+body: |
+ bb.1:
+ ; CHECK-SELECT-LABEL: name: test_splat_vector_s32_fconst
+ ; CHECK-SELECT: %imm:fpr32 = FMOVSi 28
+ ; CHECK-SELECT-NEXT: [[COPY:%[0-9]+]]:gpr32sp = COPY %imm
+ ; CHECK-SELECT-NEXT: %splat:zpr = DUP_ZR_S [[COPY]]
+ ; CHECK-SELECT-NEXT: $z0 = COPY %splat
+ ;
+ ; CHECK-REGBANK-LABEL: name: test_splat_vector_s32_fconst
+ ; CHECK-REGBANK: %imm:fpr(s32) = G_FCONSTANT float 7.000000e+00
+ ; CHECK-REGBANK-NEXT: [[COPY:%[0-9]+]]:gpr(s32) = COPY %imm(s32)
+ ; CHECK-REGBANK-NEXT: %splat:fpr(<vscale x 4 x s32>) = G_SPLAT_VECTOR [[COPY]](s32)
+ ; CHECK-REGBANK-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ ;
+ ; CHECK-LEGAL-LABEL: name: test_splat_vector_s32_fconst
+ ; CHECK-LEGAL: %imm:_(s32) = G_FCONSTANT float 7.000000e+00
+ ; CHECK-LEGAL-NEXT: %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ ; CHECK-LEGAL-NEXT: $z0 = COPY %splat(<vscale x 4 x s32>)
+ %imm:_(s32) = G_FCONSTANT float 7.0
+ %splat:_(<vscale x 4 x s32>) = G_SPLAT_VECTOR %imm(s32)
+ $z0 = COPY %splat(<vscale x 4 x s32>)
+...
Could you make sure there is some proper .ll testing too?
@@ -1316,6 +1316,10 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
.widenScalarOrEltToNextPow2(0)
.immIdx(0); // Inform verifier imm idx 0 is handled.

// TODO: {nxv8s16, s16}
Why leave s16 (and s8) as a todo?
(It is useful to leave reasons like this in the commit summary).
For nxv16i8, I could not find patterns.
Force-pushed from f2150ad to eb2ae8a.
Oh I see. It is related to the legal types in SDAG, and to i8/i16 being promoted to i32. We usually have to widen the variable in regbankselect / pre-select to make sure the patterns can trigger. The other option might be to do it during legalization, but that might not handle h/b fp registers correctly.

Could you change the tests to something that won't eventually turn into another instruction (addimm in this case)? Either change the opcode (although that still feels more complex than it needs to be) or just return the splat value. We will have to start going through extract_elements and adding combines so that more will work properly and we can rely on better tests.
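For illustration, a `.ll` test that "just returns the splat value" might look like the sketch below. It is not taken from the PR: the function names are hypothetical, the RUN line simply reuses the `-mattr=+sve`, `-global-isel`, and `-aarch64-enable-gisel-sve=1` flags from the MIR test, and the CHECK lines are deliberately left out because they would normally be generated with utils/update_llc_test_checks.py. It assumes the IRTranslator turns the insertelement/shufflevector splat idiom on scalable vectors into G_SPLAT_VECTOR.

```llvm
; Hypothetical sketch of a .ll test; CHECK lines are intentionally omitted and
; would be generated with utils/update_llc_test_checks.py.
; RUN: llc -mtriple=aarch64 -mattr=+sve -global-isel -aarch64-enable-gisel-sve=1 < %s | FileCheck %s

; Splat an i32 argument across <vscale x 4 x i32> and return it directly, so
; the splat is not folded into a following instruction.
define <vscale x 4 x i32> @splat_nxv4i32(i32 %x) {
  %ins = insertelement <vscale x 4 x i32> poison, i32 %x, i64 0
  %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
  ret <vscale x 4 x i32> %splat
}

; Same shape for the other type pair made legal here: {nxv2s64, s64}.
define <vscale x 2 x i64> @splat_nxv2i64(i64 %x) {
  %ins = insertelement <vscale x 2 x i64> poison, i64 %x, i64 0
  %splat = shufflevector <vscale x 2 x i64> %ins, <vscale x 2 x i64> poison, <vscale x 2 x i64> zeroinitializer
  ret <vscale x 2 x i64> %splat
}
```

Returning the splat keeps the test focused on the two type pairs the legalizer rule accepts; the s16 and s8 element types are left out because the PR marks them as not yet working.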
This pattern looks fine to select
For the {nxv16s8, s8} case, I would prefer to add patterns instead of doing complicated stuff in passes.
Force-pushed from eb2ae8a to fd1afc5.
For legalization of G_SPLAT_VECTOR, the tests are sufficient. When we add combines, we can extend the tests.
The tests should still be updated to just return a splat.
Thanks for the new tests. LGTM
The MIR tests fail in CI, but they pass locally.
Force-pushed from c9e341c to 6098d91.
{nxv8s16, s16} fails to select. {nxv16s8, s8} no patterns available.