Skip to content

[Hexagon] Add missing pattern for v8i1 type #120703

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 10, 2025
Merged

Conversation

quic-santdas
Copy link
Contributor

HexagonISD::PFALSE and PTRUE patterns do not form independently in general as they are treated like operands of all 0s or all 1s. Eg: i32 = transfer HEXAGONISD::PFALSE.
In this case, v8i1 = HEXAGONISD::PFALSE is formed independently without accompanying opcode.

This patch adds a pattern to transfer all 0s or all 1s to a scalar register and then use that register and this PFALSE/PTRUE opcode to transfer to a predicate register like v8i1.

HexagonISD::PFALSE and PTRUE patterns do not form independently in
general as they are treated like operands of all 0s or all 1s.
Eg: i32 = transfer HEXAGONISD::PFALSE.
In this case, v8i1 = HEXAGONISD::PFALSE is formed independently without
accompanying opcode.

This patch adds a pattern to transfer all 0s or all 1s to a scalar
register and then use that register and this PFALSE/PTRUE opcode
to transfer to a predicate register like v8i1.
@llvmbot
Copy link
Member

llvmbot commented Dec 20, 2024

@llvm/pr-subscribers-backend-hexagon

Author: Santanu Das (quic-santdas)

Changes

HexagonISD::PFALSE and PTRUE patterns do not form independently in general as they are treated like operands of all 0s or all 1s. Eg: i32 = transfer HEXAGONISD::PFALSE.
In this case, v8i1 = HEXAGONISD::PFALSE is formed independently without accompanying opcode.

This patch adds a pattern to transfer all 0s or all 1s to a scalar register and then use that register and this PFALSE/PTRUE opcode to transfer to a predicate register like v8i1.


Full diff: https://github.com/llvm/llvm-project/pull/120703.diff

2 Files Affected:

  • (modified) llvm/lib/Target/Hexagon/HexagonPatterns.td (+3)
  • (added) llvm/test/CodeGen/Hexagon/isel/isel-tfrrp.ll (+15)
diff --git a/llvm/lib/Target/Hexagon/HexagonPatterns.td b/llvm/lib/Target/Hexagon/HexagonPatterns.td
index 9bd45c72b7d4dc..cba5ff1ab0d9b5 100644
--- a/llvm/lib/Target/Hexagon/HexagonPatterns.td
+++ b/llvm/lib/Target/Hexagon/HexagonPatterns.td
@@ -108,6 +108,9 @@ def ptrue:  PatFrag<(ops), (HexagonPTRUE)>;
 def pfalse: PatFrag<(ops), (HexagonPFALSE)>;
 def pnot:   PatFrag<(ops node:$Pu), (xor node:$Pu, ptrue)>;
 
+def: Pat<(v8i1 (HexagonPFALSE)), (C2_tfrrp (A2_tfrsi (i32 0)))>;
+def: Pat<(v8i1 (HexagonPTRUE)), (C2_tfrrp (A2_tfrsi (i32 -1)))>;
+
 def valign: PatFrag<(ops node:$Vt, node:$Vs, node:$Ru),
                     (HexagonVALIGN node:$Vt, node:$Vs, node:$Ru)>;
 def valignaddr: PatFrag<(ops node:$Addr), (HexagonVALIGNADDR node:$Addr)>;
diff --git a/llvm/test/CodeGen/Hexagon/isel/isel-tfrrp.ll b/llvm/test/CodeGen/Hexagon/isel/isel-tfrrp.ll
new file mode 100644
index 00000000000000..b2a9f732bdddc7
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/isel/isel-tfrrp.ll
@@ -0,0 +1,15 @@
+; Check if a C2_tfrrp instruction with constant i32 0 input is generated
+; The constant 0 is generated by a transfer immediate instruction.
+
+; RUN: llc -march=hexagon -debug-only=isel 2>&1 < %s - | FileCheck %s
+
+; CHECK: [[R0:%[0-9]+]]:intregs = A2_tfrsi 0
+; CHECK-NEXT: predregs = C2_tfrrp killed [[R0]]:intregs
+
+define void @test_false(i1 %0) {
+  %2 = insertelement <1024 x i1> zeroinitializer, i1 %0, i64 0
+  tail call void @llvm.masked.store.v1024f32.p0(<1024 x float> zeroinitializer, ptr null, i32 1, <1024 x i1> %2)
+  ret void
+}
+
+declare void @llvm.masked.store.v1024f32.p0(<1024 x float>, ptr nocapture, i32 immarg, <1024 x i1>)

@androm3da androm3da requested a review from iajbar December 20, 2024 14:56
@androm3da
Copy link
Member

Does this fix #59009 ?

@androm3da
Copy link
Member

Does this fix #59009 ?

Kryzsztof's concern there was that the ABI was underspecified, so the fix was not clear, or rather any fix wouldn't be portable.

The issue is that we'd need to amend the ABI, to either allow passing/returning boolean vectors, or specify the conversions. At the moment LLVM is (AFAIK) the only compiler targeting Hexagon with HVX, so we could get away with handling it internally to LLVM, but it would make it harder for other potential future compilers/code generators to be binary-compatible with LLVM.

@quic-santdas
Copy link
Contributor Author

Does this fix #59009 ?

Kryzsztof's concern there was that the ABI was underspecified, so the fix was not clear, or rather any fix wouldn't be portable.

The issue is that we'd need to amend the ABI, to either allow passing/returning boolean vectors, or specify the conversions. At the moment LLVM is (AFAIK) the only compiler targeting Hexagon with HVX, so we could get away with handling it internally to LLVM, but it would make it harder for other potential future compilers/code generators to be binary-compatible with LLVM.

This patch fixes a case of v8i1 mishandling. The issue #59009 is currently being fixed: this one deals with unhandled v8i1 when used as a call operand.

@quic-santdas
Copy link
Contributor Author

@androm3da This actually fixes #64970.

@androm3da
Copy link
Member

@androm3da This actually fixes #64970.

That one seems to resemble #59009. I think this fix seems like an appropriate one but in the longer term we should be mindful of the issues Krzyzstof raised.

Regarding this PR: LGTM but @iajbar or someone else should approve it.

Copy link
Contributor

@iajbar iajbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a different patch to fix handling call operand of type v8i1 in the issue #59009

@iajbar iajbar merged commit 9d7df23 into llvm:main Jan 10, 2025
8 checks passed
BaiXilin pushed a commit to BaiXilin/llvm-fix-vnni-instr-types that referenced this pull request Jan 12, 2025
HexagonISD::PFALSE and PTRUE patterns do not form independently in
general as they are treated like operands of all 0s or all 1s. Eg: i32 =
transfer HEXAGONISD::PFALSE.
In this case, v8i1 = HEXAGONISD::PFALSE is formed independently without
accompanying opcode.

This patch adds a pattern to transfer all 0s or all 1s to a scalar
register and then use that register and this PFALSE/PTRUE opcode to
transfer to a predicate register like v8i1.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants