-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SDAG] Harden assumption in getMemsetStringVal #126207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add missing test coverage for codepaths touched by llvm#126207.
@llvm/pr-subscribers-backend-aarch64 @llvm/pr-subscribers-backend-arm Author: Cullen Rhodes (c-rhodes) ChangesIn 5235973, an ICE was fixed in This part of getMemsetStringVal relates to memcpy operations where the For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 For other targets, ARM and x86 are the only ones that return a FP vector To defend this, I considered adding an assert as mentioned in [1], but For test coverage I added unreachables to both of the branches handling Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in AArch64TargetLowering::getOptimalMemOpType was updated in For the final branch noted as suspect in [1], as far as I can tell this Fixes: #20521 [1] Full diff: https://github.com/llvm/llvm-project/pull/126207.diff 3 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 16c3b295426c648..445edf1dea41d24 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8015,16 +8015,8 @@ static SDValue getMemsetStringVal(EVT VT, const SDLoc &dl, SelectionDAG &DAG,
if (Slice.Array == nullptr) {
if (VT.isInteger())
return DAG.getConstant(0, dl, VT);
- if (VT == MVT::f32 || VT == MVT::f64 || VT == MVT::f128)
+ if (VT.isFloatingPoint())
return DAG.getConstantFP(0.0, dl, VT);
- if (VT.isVector()) {
- unsigned NumElts = VT.getVectorNumElements();
- MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
- return DAG.getNode(ISD::BITCAST, dl, VT,
- DAG.getConstant(0, dl,
- EVT::getVectorVT(*DAG.getContext(),
- EltVT, NumElts)));
- }
llvm_unreachable("Expected type!");
}
diff --git a/llvm/test/CodeGen/AArch64/memcpy-f128.ll b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
index 5b354dd23e01dfa..229c0034b6f636c 100644
--- a/llvm/test/CodeGen/AArch64/memcpy-f128.ll
+++ b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=-neon | FileCheck %s
%structA = type { i128 }
@stubA = internal unnamed_addr constant %structA zeroinitializer, align 8
diff --git a/llvm/test/CodeGen/ARM/memcpy-inline.ll b/llvm/test/CodeGen/ARM/memcpy-inline.ll
index 596a58afe46e5bb..89db22d1f0ed53c 100644
--- a/llvm/test/CodeGen/ARM/memcpy-inline.ll
+++ b/llvm/test/CodeGen/ARM/memcpy-inline.ll
@@ -12,6 +12,7 @@
@.str4 = private unnamed_addr constant [18 x i8] c"DHRYSTONE PROGR \00", align 1
@.str5 = private unnamed_addr constant [7 x i8] c"DHRYST\00", align 1
@.str6 = private unnamed_addr constant [14 x i8] c"/tmp/rmXXXXXX\00", align 1
+@empty = private unnamed_addr constant [31 x i8] zeroinitializer, align 1
@spool.splbuf = internal global [512 x i8] zeroinitializer, align 16
define i32 @t0() {
@@ -282,5 +283,31 @@ entry:
ret void
}
+define void @copy_from_zero_constant(ptr nocapture %C) nounwind {
+; CHECK-LABEL: copy_from_zero_constant:
+; CHECK: @ %bb.0: @ %entry
+; CHECK-NEXT: vmov.i32 q8, #0x0
+; CHECK-NEXT: movs r1, #15
+; CHECK-NEXT: vst1.8 {d16, d17}, [r0], r1
+; CHECK-NEXT: vst1.8 {d16, d17}, [r0]
+; CHECK-NEXT: bx lr
+;
+; CHECK-T1-LABEL: copy_from_zero_constant:
+; CHECK-T1: @ %bb.0: @ %entry
+; CHECK-T1-NEXT: .save {r7, lr}
+; CHECK-T1-NEXT: push {r7, lr}
+; CHECK-T1-NEXT: ldr r1, .LCPI8_0
+; CHECK-T1-NEXT: movs r2, #31
+; CHECK-T1-NEXT: bl __aeabi_memcpy
+; CHECK-T1-NEXT: pop {r7, pc}
+; CHECK-T1-NEXT: .p2align 2
+; CHECK-T1-NEXT: @ %bb.1:
+; CHECK-T1-NEXT: .LCPI8_0:
+; CHECK-T1-NEXT: .long .Lempty
+entry:
+ tail call void @llvm.memcpy.p0.p0.i64(ptr %C, ptr @empty, i64 31, i1 false)
+ ret void
+}
+
declare void @llvm.memcpy.p0.p0.i32(ptr nocapture, ptr nocapture, i32, i1) nounwind
declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1) nounwind
|
@llvm/pr-subscribers-llvm-selectiondag Author: Cullen Rhodes (c-rhodes) ChangesIn 5235973, an ICE was fixed in This part of getMemsetStringVal relates to memcpy operations where the For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 For other targets, ARM and x86 are the only ones that return a FP vector To defend this, I considered adding an assert as mentioned in [1], but For test coverage I added unreachables to both of the branches handling Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in AArch64TargetLowering::getOptimalMemOpType was updated in For the final branch noted as suspect in [1], as far as I can tell this Fixes: #20521 [1] Full diff: https://github.com/llvm/llvm-project/pull/126207.diff 3 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 16c3b295426c648..445edf1dea41d24 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8015,16 +8015,8 @@ static SDValue getMemsetStringVal(EVT VT, const SDLoc &dl, SelectionDAG &DAG,
if (Slice.Array == nullptr) {
if (VT.isInteger())
return DAG.getConstant(0, dl, VT);
- if (VT == MVT::f32 || VT == MVT::f64 || VT == MVT::f128)
+ if (VT.isFloatingPoint())
return DAG.getConstantFP(0.0, dl, VT);
- if (VT.isVector()) {
- unsigned NumElts = VT.getVectorNumElements();
- MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
- return DAG.getNode(ISD::BITCAST, dl, VT,
- DAG.getConstant(0, dl,
- EVT::getVectorVT(*DAG.getContext(),
- EltVT, NumElts)));
- }
llvm_unreachable("Expected type!");
}
diff --git a/llvm/test/CodeGen/AArch64/memcpy-f128.ll b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
index 5b354dd23e01dfa..229c0034b6f636c 100644
--- a/llvm/test/CodeGen/AArch64/memcpy-f128.ll
+++ b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=-neon | FileCheck %s
%structA = type { i128 }
@stubA = internal unnamed_addr constant %structA zeroinitializer, align 8
diff --git a/llvm/test/CodeGen/ARM/memcpy-inline.ll b/llvm/test/CodeGen/ARM/memcpy-inline.ll
index 596a58afe46e5bb..89db22d1f0ed53c 100644
--- a/llvm/test/CodeGen/ARM/memcpy-inline.ll
+++ b/llvm/test/CodeGen/ARM/memcpy-inline.ll
@@ -12,6 +12,7 @@
@.str4 = private unnamed_addr constant [18 x i8] c"DHRYSTONE PROGR \00", align 1
@.str5 = private unnamed_addr constant [7 x i8] c"DHRYST\00", align 1
@.str6 = private unnamed_addr constant [14 x i8] c"/tmp/rmXXXXXX\00", align 1
+@empty = private unnamed_addr constant [31 x i8] zeroinitializer, align 1
@spool.splbuf = internal global [512 x i8] zeroinitializer, align 16
define i32 @t0() {
@@ -282,5 +283,31 @@ entry:
ret void
}
+define void @copy_from_zero_constant(ptr nocapture %C) nounwind {
+; CHECK-LABEL: copy_from_zero_constant:
+; CHECK: @ %bb.0: @ %entry
+; CHECK-NEXT: vmov.i32 q8, #0x0
+; CHECK-NEXT: movs r1, #15
+; CHECK-NEXT: vst1.8 {d16, d17}, [r0], r1
+; CHECK-NEXT: vst1.8 {d16, d17}, [r0]
+; CHECK-NEXT: bx lr
+;
+; CHECK-T1-LABEL: copy_from_zero_constant:
+; CHECK-T1: @ %bb.0: @ %entry
+; CHECK-T1-NEXT: .save {r7, lr}
+; CHECK-T1-NEXT: push {r7, lr}
+; CHECK-T1-NEXT: ldr r1, .LCPI8_0
+; CHECK-T1-NEXT: movs r2, #31
+; CHECK-T1-NEXT: bl __aeabi_memcpy
+; CHECK-T1-NEXT: pop {r7, pc}
+; CHECK-T1-NEXT: .p2align 2
+; CHECK-T1-NEXT: @ %bb.1:
+; CHECK-T1-NEXT: .LCPI8_0:
+; CHECK-T1-NEXT: .long .Lempty
+entry:
+ tail call void @llvm.memcpy.p0.p0.i64(ptr %C, ptr @empty, i64 31, i1 false)
+ ret void
+}
+
declare void @llvm.memcpy.p0.p0.i32(ptr nocapture, ptr nocapture, i32, i1) nounwind
declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1) nounwind
|
if (VT.isVector()) { | ||
unsigned NumElts = VT.getVectorNumElements(); | ||
MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64; | ||
return DAG.getNode(ISD::BITCAST, dl, VT, | ||
DAG.getConstant(0, dl, | ||
EVT::getVectorVT(*DAG.getContext(), | ||
EltVT, NumElts))); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can just bitcast to VT.changeVectorElementTypeToInteger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would also work I guess, although it's not obvious to me that's any better than calling getConstantFP?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean instead of having this unreachable case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure if my understanding is correct, but I've updated to what I believe you're suggesting, except I've used changeTypeToInteger
since it handles both scalar/vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds OK, but could it use a fp type that doesn't have a representation of 0.0, or where 0.0 != 0x0000? I doubt that would be used in practice, but is it worth having an assert to check?
In 5235973, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64. This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType. For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached. For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine. To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust. For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973 to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 2920061, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit. For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this. Fixes: llvm#20521 [1]
7fbd52b
to
2dfa6fe
Compare
any pointers on how i can check that? |
We don't have any of those in the IR, if they exist. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could have kept the constant fp case, I guess. Also could unconditionally go through the bitcast
Thanks for review. By unconditionally go through the bitcast, do you mean remove the int case? |
} | ||
llvm_unreachable("Expected type!"); | ||
return DAG.getNode(ISD::BITCAST, dl, VT, | ||
DAG.getConstant(0, dl, VT.changeTypeToInteger())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this can be used for the isInteger() path above as well, but I'm not sure if that just gets confusing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that crossed my mind also and is perhaps what @arsenm was referring to as well, I don't mind either way.
Add missing test coverage for codepaths touched by llvm#126207.
I'll land this tomorrow if there's no further comments by then. |
In 5235973, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64. This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType. For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached. For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine. To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust. For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973 to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 2920061, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit. For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this. Fixes: llvm#20521 [1]
In 5235973, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64. This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType. For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached. For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine. To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust. For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973 to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 2920061, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit. For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this. Fixes: llvm#20521 [1]
In 5235973, an ICE was fixed in getMemsetStringVal where f128 wasn't handled. It was noted at the time [1] that the code below this also looks suspect, since it assumes the element type of VT is either an f32 or f64. This part of getMemsetStringVal relates to memcpy operations where the source is a copy from a zero constant. The VT in question is determined by TargetLowering::findOptimalMemOpLowering, which in turn calls a further TLI hook getOptimalMemOpType. For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32 or Other. For Other, TargetLowering::findOptimalMemOpLowering will then pick an integer VT. So on AArch64 at least, I don't believe the suspect code can be reached. For other targets, ARM and x86 are the only ones that return a FP vector type from getOptimalMemOpType. For both targets, the only such type is v2f64, but given f64 is already handled it should also be fine. To defend this, I considered adding an assert as mentioned in [1], but given getConstantFP handles vector types, I figured using this to fully handle the FP types makes the code simpler and more robust. For test coverage I added unreachables to both of the branches handling FP types in this code, but found neither fired with check-llvm across all targets. Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in 5235973 to defend ICE on f128, but at some point it stopped hitting this code. AArch64TargetLowering::getOptimalMemOpType was updated in 2920061, so I suspect this is when it happened, although I haven't verified this. Although I did find by updating the test to disable NEON, getOptimalMemOpType returns an f128 and the branch is once again hit. For the final branch noted as suspect in [1], as far as I can tell this has never had any test coverage, so I've added a test to the ARM backend for this. Fixes: llvm#20521 [1]
In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.
This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.
For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.
For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.
To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.
For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.
Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.
AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.
For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.
Fixes: #20521 [1]