-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[CG][RISCV]Fix shuffling of odd number of input vectors #125693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CG][RISCV]Fix shuffling of odd number of input vectors #125693
Conversation
Created using spr 1.3.5
@llvm/pr-subscribers-backend-risc-v Author: Alexey Bataev (alexey-bataev) ChangesIf the input contains odd number of shuffled vectors, the 2 last Full diff: https://github.com/llvm/llvm-project/pull/125693.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 07e3390f3fbb22f..75fb909c503ec08 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -5213,13 +5213,27 @@ static SDValue lowerShuffleViaVRegSplitting(ShuffleVectorSDNode *SVN,
for (unsigned I : seq<unsigned>(Data.size())) {
const auto &[Idx1, Idx2, _] = Data[I];
if (Values.contains(Idx1)) {
- assert(Idx2 != UINT_MAX && Values.contains(Idx2) &&
- "Expected both indices to be extracted already.");
- break;
+ // If the shuffle contains permutation of odd number of elements,
+ // Idx1 might be used already in the first iteration.
+ //
+ // Idx1 = shuffle Idx1, Idx2
+ // Idx1 = shuffle Idx1, Idx3
+ if (const auto It = Values.find(Idx1);
+ I > 0 && std::get<0>(Data[I - 1]) == Idx1) {
+ assert(Idx2 != UINT_MAX && It != Values.end() &&
+ !Values.contains(Idx2) &&
+ "Expected only first index to be extracted already.");
+ ;
+ } else {
+ assert(Idx2 != UINT_MAX && Values.contains(Idx2) &&
+ "Expected both indices to be extracted already.");
+ break;
+ }
+ } else {
+ SDValue V = ExtractValue(Idx1 >= NumOfSrcRegs ? V2 : V1,
+ (Idx1 % NumOfSrcRegs) * NumOpElts);
+ Values[Idx1] = V;
}
- SDValue V = ExtractValue(Idx1 >= NumOfSrcRegs ? V2 : V1,
- (Idx1 % NumOfSrcRegs) * NumOpElts);
- Values[Idx1] = V;
if (Idx2 != UINT_MAX)
Values[Idx2] = ExtractValue(Idx2 >= NumOfSrcRegs ? V2 : V1,
(Idx2 % NumOfSrcRegs) * NumOpElts);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
index afd560fd74d16ab..c0c17d4e0623e7b 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
@@ -431,3 +431,31 @@ define void @shuffle_i256_ldst(ptr %p) vscale_range(2,2) {
store <4 x i256> %res, ptr %p
ret void
}
+
+define void @shuffle_3_input_vectors() vscale_range(4,4) {
+; CHECK-LABEL: shuffle_3_input_vectors:
+; CHECK: # %bb.0:
+; CHECK-NEXT: vsetvli a0, zero, e64, m8, ta, ma
+; CHECK-NEXT: vmv.v.i v8, 1
+; CHECK-NEXT: vsetivli zero, 1, e8, mf8, ta, ma
+; CHECK-NEXT: vmv.v.i v0, 6
+; CHECK-NEXT: vsetvli a0, zero, e64, m8, ta, ma
+; CHECK-NEXT: vmv.v.i v16, 0
+; CHECK-NEXT: vsetivli zero, 4, e64, m1, ta, mu
+; CHECK-NEXT: vslidedown.vi v20, v8, 1, v0.t
+; CHECK-NEXT: vslideup.vi v20, v9, 3
+; CHECK-NEXT: vslidedown.vi v21, v9, 1
+; CHECK-NEXT: vmv1r.v v22, v8
+; CHECK-NEXT: vsetvli a0, zero, e64, m8, ta, ma
+; CHECK-NEXT: vmsgt.vi v8, v16, 0
+; CHECK-NEXT: vsetvli zero, zero, e32, m4, ta, ma
+; CHECK-NEXT: vmv.x.s a0, v8
+; CHECK-NEXT: sb a0, 0(zero)
+; CHECK-NEXT: ret
+ %1 = shufflevector <32 x i64> zeroinitializer, <32 x i64> splat (i64 1), <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 34, i32 35, i32 36, i32 37, i32 38, i32 39, i32 poison, i32 poison, i32 33, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison, i32 poison>
+ %2 = icmp slt <32 x i64> zeroinitializer, %1
+ %3 = bitcast <32 x i1> %2 to i32
+ %4 = trunc i32 %3 to i8
+ store i8 %4, ptr null, align 1
+ ret void
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
Created using spr 1.3.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really sure the assert is worth the complexity here. I'd be tempted to just make the call to ExtractValue for Idx1 conditional on whether it's already in the set, and move on with life.
// | ||
// Idx1 = shuffle Idx1, Idx2 | ||
// Idx1 = shuffle Idx1, Idx3 | ||
if (const auto It = Values.find(Idx1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The It variable is only used in an assert, not the condition. Needs to either be folded into the assert, or #ifndef NDEBUG guards added to prevent non-assertions enabled build break.
Hm, looking at the check this is used for, it's entirely redundant with the contains check above. I think you can just remove this,
assert(Idx2 != UINT_MAX && It != Values.end() && | ||
!Values.contains(Idx2) && | ||
"Expected only first index to be extracted already."); | ||
; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stray semi?
!Values.contains(Idx2) && | ||
"Expected only first index to be extracted already."); | ||
; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reverse the if-clause so this becomes an early break for readability? (And reduces nesting.)
It is safe, generally speaking, I just wanted to add some extra checks here. I can remove all these checks |
Created using spr 1.3.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
If the input contains odd number of shuffled vectors, the 2 last shuffles are shuffled with the same first vector. Need to correctly process such situation: when the first vector is requested for the first time - extract it from the source vector, when it is requested the second time - reuse previous result. The second vector should be extracted in both cases. Fixes #125269 Reviewers: topperc, preames Reviewed By: preames Pull Request: llvm/llvm-project#125693
/cherry-pick 23b6a05 |
/pull-request #125910 |
If the input contains odd number of shuffled vectors, the 2 last shuffles are shuffled with the same first vector. Need to correctly process such situation: when the first vector is requested for the first time - extract it from the source vector, when it is requested the second time - reuse previous result. The second vector should be extracted in both cases. Fixes llvm#125269 Reviewers: topperc, preames Reviewed By: preames Pull Request: llvm#125693 (cherry picked from commit 23b6a05)
If the input contains odd number of shuffled vectors, the 2 last shuffles are shuffled with the same first vector. Need to correctly process such situation: when the first vector is requested for the first time - extract it from the source vector, when it is requested the second time - reuse previous result. The second vector should be extracted in both cases. Fixes #125269 Reviewers: topperc, preames Reviewed By: preames Pull Request: llvm/llvm-project#125693 (cherry picked from commit 23b6a05)
If the input contains odd number of shuffled vectors, the 2 last shuffles are shuffled with the same first vector. Need to correctly process such situation: when the first vector is requested for the first time - extract it from the source vector, when it is requested the second time - reuse previous result. The second vector should be extracted in both cases. Fixes llvm#125269 Reviewers: topperc, preames Reviewed By: preames Pull Request: llvm#125693
If the input contains odd number of shuffled vectors, the 2 last
shuffles are shuffled with the same first vector. Need to correctly
process such situation: when the first vector is requested for the first
time - extract it from the source vector, when it is requested the
second time - reuse previous result. The second vector should be
extracted in both cases.
Fixes #125269