Skip to content

[NFCI][msan] Show that shadow for partially undefined vectors is computed as fully initialized #143823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 12, 2025

Conversation

thurstond
Copy link
Contributor

@thurstond thurstond commented Jun 12, 2025

This happens because getShadow(Value *V) has a special case for fully undefined/poisoned values, but partially undefined values fall-through and are given a clean shadow. This leads to false negatives (no false positives).

Note: MSan correctly handles InsertElementInst, but the shadow of the initial vector may still be wrong.

Showing that the same approximation happens for other composite types is left as an exercise for the reader.

… vectors

This happens because `getShadow(Value *V)` has a special case for fully
undefined/poisoned values, but partially undefined values fall-through
and are given a clean shadow.

Showing that the same flaw happens for other composite types is left as
an exercise for the reader.
@thurstond thurstond requested review from fmayer and vitalybuka June 12, 2025 02:23
Copy link

github-actions bot commented Jun 12, 2025

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp

The following files introduce new uses of undef:

  • llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

This happens because getShadow(Value *V) has a special case for fully undefined/poisoned values, but partially undefined values fall-through and are given a clean shadow.

Showing that the same flaw happens for other composite types is left as an exercise for the reader.


Full diff: https://github.com/llvm/llvm-project/pull/143823.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+4)
  • (added) llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll (+77)
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index c2315d5de7041..4a618cf42c559 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -2085,6 +2085,10 @@ struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
       assert(ShadowPtr && "Could not find shadow for an argument");
       return ShadowPtr;
     }
+
+    // TODO: partially undefined vectors are not correctly handled
+    //       (see partial-poison.ll)
+
     // For everything else the shadow is zero.
     return getCleanShadow(V);
   }
diff --git a/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll b/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll
new file mode 100644
index 0000000000000..89ceda67a04d0
--- /dev/null
+++ b/llvm/test/Instrumentation/MemorySanitizer/partial-poison.ll
@@ -0,0 +1,77 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -S -passes='msan' 2>&1 | FileCheck %s
+;
+; Test case to show that MSan incorrectly computes shadows for partially poisoned vectors
+
+target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define <2 x i64> @left_poison(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @left_poison(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> <i64 poison, i64 42>
+;
+  ret <2 x i64> <i64 poison, i64 42>
+}
+
+define <2 x i64> @right_poison(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @right_poison(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> <i64 42, i64 poison>
+;
+  ret <2 x i64> <i64 42, i64 poison>
+}
+
+define <2 x i64> @full_poison(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @full_poison(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> splat (i64 -1), ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> poison
+;
+  ret <2 x i64> <i64 poison, i64 poison>
+}
+
+define <2 x i64> @no_poison_or_undef(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @no_poison_or_undef(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> splat (i64 42)
+;
+  ret <2 x i64> <i64 42, i64 42>
+}
+
+define <2 x i64> @left_undef(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @left_undef(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> <i64 undef, i64 42>
+;
+  ret <2 x i64> <i64 undef, i64 42>
+}
+
+define <2 x i64> @right_undef(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @right_undef(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> zeroinitializer, ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> <i64 42, i64 undef>
+;
+  ret <2 x i64> <i64 42, i64 undef>
+}
+
+define <2 x i64> @full_undef(ptr %add.ptr) sanitize_memory {
+; CHECK-LABEL: define <2 x i64> @full_undef(
+; CHECK-SAME: ptr [[ADD_PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    call void @llvm.donothing()
+; CHECK-NEXT:    store <2 x i64> splat (i64 -1), ptr @__msan_retval_tls, align 8
+; CHECK-NEXT:    ret <2 x i64> undef
+;
+  ret <2 x i64> <i64 undef, i64 undef>
+}

@thurstond thurstond changed the title [NFCI][msan] Show shadow incorrectly computed for partially undefined vectors [NFCI][msan] Show that shadow for partially undefined vectors is computed as fully initialized Jun 12, 2025
@thurstond thurstond merged commit 2efff47 into llvm:main Jun 12, 2025
6 of 7 checks passed
thurstond added a commit to thurstond/llvm-project that referenced this pull request Jun 12, 2025
…length vectors

For each element of the vector, the shadow is initialized iff the
element is not undefined/poisoned. Previously, partially undefined
constant fixed-length vectors always had fully initialized shadows,
which leads to false negatives.

Updates the tests from llvm#143823

Note: since this patch corrects a false negative, MSan can now detect more bugs (corollary: code/tests that passed MSan may start failing).
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…s is computed as fully initialized (llvm#143823)

This happens because `getShadow(Value *V)` has a special case for fully undefined/poisoned values, but partially undefined values fall-through and are given a clean shadow. This leads to false negatives (no false positives).

Note: MSan correctly handles InsertElementInst, but the shadow of the initial constant vector may still be wrong and be propagated.

Showing that the same approximation happens for other composite types is left as an exercise for the reader.
thurstond added a commit that referenced this pull request Jun 20, 2025
…undefined constant fixed-length vectors (#143837)

This patch adds an off-by-default flag which, when enabled via `-mllvm -msan-poison-undef-vectors=true`, fixes a false negative in MSan (partially-undefined constant fixed-length vectors). It is currently off by default since, by fixing the false positive, code/tests that previously passed MSan may start failing. The default will be changed in a future patch.

Prior to this patch, MSan computes that partially-undefined constant fixed-length vectors are fully initialized, which leads to false negatives; moreover, benign vector rewriting could theoretically flip MSan's shadow computation from initialized to uninitialized or vice-versa (*). `-msan-poison-undef-vectors=true` calculates the shadow precisely: for each element of the vector, the corresponding shadow is fully uninitialized if the element is undefined/poisoned, otherwise it is fully initialized.

Updates the test from #143823

(*) For example:
  ```
  %x = insertelement <2 x i64> <i64 0, i64 poison>, i64 42, i64 0
  %y = insertelement <2 x i64> <i64 poison, i64 poison>, i64 42, i64 0
  ```
%x and %y are equivalent but, prior to this patch, MSan incorrectly computes the shadow of %x as <0, 0> rather than <0, -1>.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 20, 2025
… partially undefined constant fixed-length vectors (#143837)

This patch adds an off-by-default flag which, when enabled via `-mllvm -msan-poison-undef-vectors=true`, fixes a false negative in MSan (partially-undefined constant fixed-length vectors). It is currently off by default since, by fixing the false positive, code/tests that previously passed MSan may start failing. The default will be changed in a future patch.

Prior to this patch, MSan computes that partially-undefined constant fixed-length vectors are fully initialized, which leads to false negatives; moreover, benign vector rewriting could theoretically flip MSan's shadow computation from initialized to uninitialized or vice-versa (*). `-msan-poison-undef-vectors=true` calculates the shadow precisely: for each element of the vector, the corresponding shadow is fully uninitialized if the element is undefined/poisoned, otherwise it is fully initialized.

Updates the test from llvm/llvm-project#143823

(*) For example:
  ```
  %x = insertelement <2 x i64> <i64 0, i64 poison>, i64 42, i64 0
  %y = insertelement <2 x i64> <i64 poison, i64 poison>, i64 42, i64 0
  ```
%x and %y are equivalent but, prior to this patch, MSan incorrectly computes the shadow of %x as <0, 0> rather than <0, -1>.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants