[SDAG] Harden assumption in getMemsetStringVal #126207

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Merged

c-rhodes merged 2 commits into llvm:main from c-rhodes:gh-issue-#20521

Feb 13, 2025

Collaborator

c-rhodes commented Feb 7, 2025

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: #20521 [1]

c-rhodes added a commit to c-rhodes/llvm-project that referenced this pull request


          [SDAG] Precommit tests for llvm#126207 (NFC)

1f68bf1

Add missing test coverage for codepaths touched by llvm#126207.

c-rhodes mentioned this pull request

[SDAG] Precommit tests for #126207 (NFC) #126208

Merged

c-rhodes marked this pull request as ready for review

February 7, 2025 09:38

c-rhodes requested a review from davemgreen

February 7, 2025 09:38

llvmbot added backend:ARM backend:AArch64 llvm:SelectionDAG labels

Member

llvmbot commented Feb 7, 2025 •

edited

Loading

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-arm

Author: Cullen Rhodes (c-rhodes)

Changes

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: #20521 [1]

Full diff: https://github.com/llvm/llvm-project/pull/126207.diff

3 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+1-9)
(modified) llvm/test/CodeGen/AArch64/memcpy-f128.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/memcpy-inline.ll (+27)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 16c3b295426c648..445edf1dea41d24 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8015,16 +8015,8 @@ static SDValue getMemsetStringVal(EVT VT, const SDLoc &dl, SelectionDAG &DAG,
   if (Slice.Array == nullptr) {
     if (VT.isInteger())
       return DAG.getConstant(0, dl, VT);
-    if (VT == MVT::f32 || VT == MVT::f64 || VT == MVT::f128)
+    if (VT.isFloatingPoint())
       return DAG.getConstantFP(0.0, dl, VT);
-    if (VT.isVector()) {
-      unsigned NumElts = VT.getVectorNumElements();
-      MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
-      return DAG.getNode(ISD::BITCAST, dl, VT,
-                         DAG.getConstant(0, dl,
-                                         EVT::getVectorVT(*DAG.getContext(),
-                                                          EltVT, NumElts)));
-    }
     llvm_unreachable("Expected type!");
   }
 
diff --git a/llvm/test/CodeGen/AArch64/memcpy-f128.ll b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
index 5b354dd23e01dfa..229c0034b6f636c 100644
--- a/llvm/test/CodeGen/AArch64/memcpy-f128.ll
+++ b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=-neon | FileCheck %s
 
 %structA = type { i128 }
 @stubA = internal unnamed_addr constant %structA zeroinitializer, align 8
diff --git a/llvm/test/CodeGen/ARM/memcpy-inline.ll b/llvm/test/CodeGen/ARM/memcpy-inline.ll
index 596a58afe46e5bb..89db22d1f0ed53c 100644
--- a/llvm/test/CodeGen/ARM/memcpy-inline.ll
+++ b/llvm/test/CodeGen/ARM/memcpy-inline.ll
@@ -12,6 +12,7 @@
 @.str4 = private unnamed_addr constant [18 x i8] c"DHRYSTONE PROGR  \00", align 1
 @.str5 = private unnamed_addr constant [7 x i8] c"DHRYST\00", align 1
 @.str6 = private unnamed_addr constant [14 x i8] c"/tmp/rmXXXXXX\00", align 1
+@empty = private unnamed_addr constant [31 x i8] zeroinitializer, align 1
 @spool.splbuf = internal global [512 x i8] zeroinitializer, align 16
 
 define i32 @t0() {
@@ -282,5 +283,31 @@ entry:
   ret void
 }
 
+define void @copy_from_zero_constant(ptr nocapture %C) nounwind {
+; CHECK-LABEL: copy_from_zero_constant:
+; CHECK:       @ %bb.0: @ %entry
+; CHECK-NEXT:    vmov.i32 q8, #0x0
+; CHECK-NEXT:    movs r1, #15
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0], r1
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0]
+; CHECK-NEXT:    bx lr
+;
+; CHECK-T1-LABEL: copy_from_zero_constant:
+; CHECK-T1:       @ %bb.0: @ %entry
+; CHECK-T1-NEXT:    .save {r7, lr}
+; CHECK-T1-NEXT:    push {r7, lr}
+; CHECK-T1-NEXT:    ldr r1, .LCPI8_0
+; CHECK-T1-NEXT:    movs r2, #31
+; CHECK-T1-NEXT:    bl __aeabi_memcpy
+; CHECK-T1-NEXT:    pop {r7, pc}
+; CHECK-T1-NEXT:    .p2align 2
+; CHECK-T1-NEXT:  @ %bb.1:
+; CHECK-T1-NEXT:  .LCPI8_0:
+; CHECK-T1-NEXT:    .long .Lempty
+entry:
+  tail call void @llvm.memcpy.p0.p0.i64(ptr %C, ptr @empty, i64 31, i1 false)
+  ret void
+}
+
 declare void @llvm.memcpy.p0.p0.i32(ptr nocapture, ptr nocapture, i32, i1) nounwind
 declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1) nounwind

Member

llvmbot commented Feb 7, 2025

@llvm/pr-subscribers-llvm-selectiondag

Author: Cullen Rhodes (c-rhodes)

Changes

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: #20521 [1]

Full diff: https://github.com/llvm/llvm-project/pull/126207.diff

3 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+1-9)
(modified) llvm/test/CodeGen/AArch64/memcpy-f128.ll (+1-1)
(modified) llvm/test/CodeGen/ARM/memcpy-inline.ll (+27)

diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 16c3b295426c648..445edf1dea41d24 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -8015,16 +8015,8 @@ static SDValue getMemsetStringVal(EVT VT, const SDLoc &dl, SelectionDAG &DAG,
   if (Slice.Array == nullptr) {
     if (VT.isInteger())
       return DAG.getConstant(0, dl, VT);
-    if (VT == MVT::f32 || VT == MVT::f64 || VT == MVT::f128)
+    if (VT.isFloatingPoint())
       return DAG.getConstantFP(0.0, dl, VT);
-    if (VT.isVector()) {
-      unsigned NumElts = VT.getVectorNumElements();
-      MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
-      return DAG.getNode(ISD::BITCAST, dl, VT,
-                         DAG.getConstant(0, dl,
-                                         EVT::getVectorVT(*DAG.getContext(),
-                                                          EltVT, NumElts)));
-    }
     llvm_unreachable("Expected type!");
   }
 
diff --git a/llvm/test/CodeGen/AArch64/memcpy-f128.ll b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
index 5b354dd23e01dfa..229c0034b6f636c 100644
--- a/llvm/test/CodeGen/AArch64/memcpy-f128.ll
+++ b/llvm/test/CodeGen/AArch64/memcpy-f128.ll
@@ -1,4 +1,4 @@
-; RUN: llc < %s -mtriple=aarch64-linux-gnu | FileCheck %s
+; RUN: llc < %s -mtriple=aarch64-linux-gnu -mattr=-neon | FileCheck %s
 
 %structA = type { i128 }
 @stubA = internal unnamed_addr constant %structA zeroinitializer, align 8
diff --git a/llvm/test/CodeGen/ARM/memcpy-inline.ll b/llvm/test/CodeGen/ARM/memcpy-inline.ll
index 596a58afe46e5bb..89db22d1f0ed53c 100644
--- a/llvm/test/CodeGen/ARM/memcpy-inline.ll
+++ b/llvm/test/CodeGen/ARM/memcpy-inline.ll
@@ -12,6 +12,7 @@
 @.str4 = private unnamed_addr constant [18 x i8] c"DHRYSTONE PROGR  \00", align 1
 @.str5 = private unnamed_addr constant [7 x i8] c"DHRYST\00", align 1
 @.str6 = private unnamed_addr constant [14 x i8] c"/tmp/rmXXXXXX\00", align 1
+@empty = private unnamed_addr constant [31 x i8] zeroinitializer, align 1
 @spool.splbuf = internal global [512 x i8] zeroinitializer, align 16
 
 define i32 @t0() {
@@ -282,5 +283,31 @@ entry:
   ret void
 }
 
+define void @copy_from_zero_constant(ptr nocapture %C) nounwind {
+; CHECK-LABEL: copy_from_zero_constant:
+; CHECK:       @ %bb.0: @ %entry
+; CHECK-NEXT:    vmov.i32 q8, #0x0
+; CHECK-NEXT:    movs r1, #15
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0], r1
+; CHECK-NEXT:    vst1.8 {d16, d17}, [r0]
+; CHECK-NEXT:    bx lr
+;
+; CHECK-T1-LABEL: copy_from_zero_constant:
+; CHECK-T1:       @ %bb.0: @ %entry
+; CHECK-T1-NEXT:    .save {r7, lr}
+; CHECK-T1-NEXT:    push {r7, lr}
+; CHECK-T1-NEXT:    ldr r1, .LCPI8_0
+; CHECK-T1-NEXT:    movs r2, #31
+; CHECK-T1-NEXT:    bl __aeabi_memcpy
+; CHECK-T1-NEXT:    pop {r7, pc}
+; CHECK-T1-NEXT:    .p2align 2
+; CHECK-T1-NEXT:  @ %bb.1:
+; CHECK-T1-NEXT:  .LCPI8_0:
+; CHECK-T1-NEXT:    .long .Lempty
+entry:
+  tail call void @llvm.memcpy.p0.p0.i64(ptr %C, ptr @empty, i64 31, i1 false)
+  ret void
+}
+
 declare void @llvm.memcpy.p0.p0.i32(ptr nocapture, ptr nocapture, i32, i1) nounwind
 declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1) nounwind

arsenm reviewed

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Comment on lines -8020 to -8027

-                  if (VT.isVector()) {
-                    unsigned NumElts = VT.getVectorNumElements();
-                    MVT EltVT = (VT.getVectorElementType() == MVT::f32) ? MVT::i32 : MVT::i64;
-                    return DAG.getNode(ISD::BITCAST, dl, VT,
-                                       DAG.getConstant(0, dl,
-                                                       EVT::getVectorVT(*DAG.getContext(),
-                                                                        EltVT, NumElts)));
-                  }

Contributor

arsenm Feb 7, 2025

This can just bitcast to VT.changeVectorElementTypeToInteger?

Collaborator Author

c-rhodes Feb 7, 2025

that would also work I guess, although it's not obvious to me that's any better than calling getConstantFP?

Contributor

arsenm Feb 10, 2025

I mean instead of having this unreachable case

Collaborator Author

c-rhodes Feb 10, 2025

not sure if my understanding is correct, but I've updated to what I believe you're suggesting, except I've used changeTypeToInteger since it handles both scalar/vector.

davemgreen reviewed

View reviewed changes

Collaborator

davemgreen left a comment

This sounds OK, but could it use a fp type that doesn't have a representation of 0.0, or where 0.0 != 0x0000? I doubt that would be used in practice, but is it worth having an assert to check?

c-rhodes added a commit that referenced this pull request


          [SDAG] Precommit tests for #126207 (NFC) (#126208)

317a644

Add missing test coverage for codepaths touched by #126207.

c-rhodes added 2 commits

February 10, 2025 09:59


          [SDAG] Harden assumption in getMemsetStringVal

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: llvm#20521 [1]


          address comments

2dfa6fe

c-rhodes force-pushed the gh-issue-#20521 branch from 7fbd52b to 2dfa6fe Compare

February 10, 2025 11:02

Collaborator Author

c-rhodes commented Feb 10, 2025

This sounds OK, but could it use a fp type that doesn't have a representation of 0.0, or where 0.0 != 0x0000? I doubt that would be used in practice, but is it worth having an assert to check?

any pointers on how i can check that?

Contributor

arsenm commented Feb 10, 2025

This sounds OK, but could it use a fp type that doesn't have a representation of 0.0, or where 0.0 != 0x0000? I doubt that would be used in practice, but is it worth having an assert to check?

any pointers on how i can check that?

We don't have any of those in the IR, if they exist.

arsenm approved these changes

View reviewed changes

Contributor

arsenm left a comment

Could have kept the constant fp case, I guess. Also could unconditionally go through the bitcast

Collaborator Author

c-rhodes commented Feb 10, 2025

Could have kept the constant fp case, I guess. Also could unconditionally go through the bitcast

Thanks for review. By unconditionally go through the bitcast, do you mean remove the int case?

RKSimon reviewed

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

-                  }
-                  llvm_unreachable("Expected type!");
+                  return DAG.getNode(ISD::BITCAST, dl, VT,
+                                     DAG.getConstant(0, dl, VT.changeTypeToInteger()));

Collaborator

RKSimon Feb 10, 2025

Technically this can be used for the isInteger() path above as well, but I'm not sure if that just gets confusing?

Collaborator Author

c-rhodes Feb 10, 2025 •

edited

Loading

that crossed my mind also and is perhaps what @arsenm was referring to as well, I don't mind either way.

Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request


          [SDAG] Precommit tests for llvm#126207 (NFC) (llvm#126208)

3edfa00

Add missing test coverage for codepaths touched by llvm#126207.

Collaborator Author

c-rhodes commented Feb 12, 2025

I'll land this tomorrow if there's no further comments by then.

c-rhodes merged commit 9b2fc66 into llvm:main

8 checks passed

flovent pushed a commit to flovent/llvm-project that referenced this pull request


          [SDAG] Harden assumption in getMemsetStringVal (llvm#126207)

659d771

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: llvm#20521 [1]

joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request


          [SDAG] Harden assumption in getMemsetStringVal (llvm#126207)

4ace703

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: llvm#20521 [1]

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request


          [SDAG] Harden assumption in getMemsetStringVal (llvm#126207)

506e6d1

In 5235973, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.

This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.

For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.

For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.

To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.

For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.

Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973 to defend ICE on f128, but at
some point it stopped hitting this code.

AArch64TargetLowering::getOptimalMemOpType was updated in
2920061, so I suspect this is when it
happened, although I haven't verified this. Although I did find by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.

For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.

Fixes: llvm#20521 [1]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AArch64 backend:ARM llvm:SelectionDAG