Skip to content

[win/asan] Don't intercept memset etc. in ntdll #120397

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 20, 2024

Conversation

zmodem
Copy link
Collaborator

@zmodem zmodem commented Dec 18, 2024

When ntdll was added to the list of of "interesting DLLs" list (in
d58230b), the intention was not to
intercept the "mini CRT" functions it exports. OverrideFunction would
only intercept the first function it found when searching the list of
DLLs, and ntdll was put last in that list.

However, after 42cdfbc,
OverrideFunction intercepts all matching functions in those DLLs. As
a side-effect, the runtime would now intercept functions like memset
etc. also in ntdll.

This causes a problem when ntdll-internal functions like
RtlDispatchException call the intercepted memset, which tries to
inspect uncommitted shadow memory, raising an exception, and getting
stuck in that loop until the stack overflows.

Since we never intended to intercept ntdll's memset etc., the simplest
fix seems to be to actively ignore ntdll when intercepting those
functions.

Fixes #114793

@llvmbot
Copy link
Member

llvmbot commented Dec 18, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Hans Wennborg (zmodem)

Changes

When ntdll was added to the list of of "interesting DLLs" list (in
d58230b), the intention was not to
intercept the "mini CRT" functions it exports. OverrideFunction would
only intercept the first function it found when searching the list of
DLLs, and ntdll was put last in that list.

However, after 42cdfbc,
OverrideFunction intercepts all matching functions in those DLLs. As
a side-effect, the runtime would now intercept functions like memset
etc. also in ntdll.

This causes a problem when ntdll-internal functions like
RtlDispatchException call the intercepted memset, which tries to
inspect uncommitted shadow memory, raising an exception, and getting
stuck in that loop until the stack overflows.

Since we never intended to intercept ntdll's memset etc., the simplest
fix seems to be to actively ignore ntdll when intercepting those
functions.

Fixes #114793


Full diff: https://github.com/llvm/llvm-project/pull/120397.diff

1 Files Affected:

  • (modified) compiler-rt/lib/interception/interception_win.cpp (+14-2)
diff --git a/compiler-rt/lib/interception/interception_win.cpp b/compiler-rt/lib/interception/interception_win.cpp
index a5897274521e92..094953d3544fba 100644
--- a/compiler-rt/lib/interception/interception_win.cpp
+++ b/compiler-rt/lib/interception/interception_win.cpp
@@ -1177,8 +1177,7 @@ static void **InterestingDLLsAvailable() {
     "libc++.dll",     // libc++
     "libunwind.dll",  // libunwind
 #  endif
-    // NTDLL should go last as it exports some functions that we should
-    // override in the CRT [presumably only used internally].
+    // NTDLL must go last as it gets special treatment in OverrideFunction.
     "ntdll.dll",
     NULL
   };
@@ -1281,9 +1280,22 @@ uptr InternalGetProcAddress(void *module, const char *func_name) {
 
 bool OverrideFunction(
     const char *func_name, uptr new_func, uptr *orig_old_func) {
+  static const char *kNtDllIgnore[] = {
+    "memcmp", "memcpy", "memmove", "memset"
+  };
+
   bool hooked = false;
   void **DLLs = InterestingDLLsAvailable();
   for (size_t i = 0; DLLs[i]; ++i) {
+    if (DLLs[i + 1] == nullptr) {
+      // This is the last DLL, i.e. NTDLL. It exports some functions that
+      // we only want to override in the CRT.
+      for (const char *ignored : kNtDllIgnore) {
+        if (strcmp(func_name, ignored) == 0)
+          return hooked;
+      }
+    }
+
     uptr func_addr = InternalGetProcAddress(DLLs[i], func_name);
     if (func_addr &&
         OverrideFunction(func_addr, new_func, orig_old_func)) {

Copy link

github-actions bot commented Dec 18, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 90968794e26709957d49dd660e4e453235d393e8 8959e640613d189bb0aad09905e8fe534f5fc3ba --extensions cpp -- compiler-rt/lib/interception/interception_win.cpp
View the diff from clang-format here.
diff --git a/compiler-rt/lib/interception/interception_win.cpp b/compiler-rt/lib/interception/interception_win.cpp
index bd85c50a08..ef19d0d3cf 100644
--- a/compiler-rt/lib/interception/interception_win.cpp
+++ b/compiler-rt/lib/interception/interception_win.cpp
@@ -217,8 +217,10 @@ static int _strcmp(const char *s1, const char *s2) {
   while (true) {
     unsigned c1 = *s1;
     unsigned c2 = *s2;
-    if (c1 != c2) return (c1 < c2) ? -1 : 1;
-    if (c1 == 0) break;
+    if (c1 != c2)
+      return (c1 < c2) ? -1 : 1;
+    if (c1 == 0)
+      break;
     s1++;
     s2++;
   }
@@ -1174,25 +1176,23 @@ bool OverrideFunction(
 
 static void **InterestingDLLsAvailable() {
   static const char *InterestingDLLs[] = {
-    "kernel32.dll",
-    "msvcr100d.dll",      // VS2010
-    "msvcr110d.dll",      // VS2012
-    "msvcr120d.dll",      // VS2013
-    "vcruntime140d.dll",  // VS2015
-    "ucrtbased.dll",      // Universal CRT
-    "msvcr100.dll",       // VS2010
-    "msvcr110.dll",       // VS2012
-    "msvcr120.dll",       // VS2013
-    "vcruntime140.dll",   // VS2015
-    "ucrtbase.dll",       // Universal CRT
+      "kernel32.dll",
+      "msvcr100d.dll",      // VS2010
+      "msvcr110d.dll",      // VS2012
+      "msvcr120d.dll",      // VS2013
+      "vcruntime140d.dll",  // VS2015
+      "ucrtbased.dll",      // Universal CRT
+      "msvcr100.dll",       // VS2010
+      "msvcr110.dll",       // VS2012
+      "msvcr120.dll",       // VS2013
+      "vcruntime140.dll",   // VS2015
+      "ucrtbase.dll",       // Universal CRT
 #  if (defined(__MINGW32__) && defined(__i386__))
-    "libc++.dll",     // libc++
-    "libunwind.dll",  // libunwind
+      "libc++.dll",     // libc++
+      "libunwind.dll",  // libunwind
 #  endif
-    // NTDLL must go last as it gets special treatment in OverrideFunction.
-    "ntdll.dll",
-    NULL
-  };
+      // NTDLL must go last as it gets special treatment in OverrideFunction.
+      "ntdll.dll", NULL};
   static void *result[ARRAY_SIZE(InterestingDLLs)] = { 0 };
   if (!result[0]) {
     for (size_t i = 0, j = 0; InterestingDLLs[i]; ++i) {
@@ -1292,9 +1292,7 @@ uptr InternalGetProcAddress(void *module, const char *func_name) {
 
 bool OverrideFunction(
     const char *func_name, uptr new_func, uptr *orig_old_func) {
-  static const char *kNtDllIgnore[] = {
-    "memcmp", "memcpy", "memmove", "memset"
-  };
+  static const char *kNtDllIgnore[] = {"memcmp", "memcpy", "memmove", "memset"};
 
   bool hooked = false;
   void **DLLs = InterestingDLLsAvailable();

@zmodem
Copy link
Collaborator Author

zmodem commented Dec 18, 2024

I think this strikes the right balance between hacky and pragmatic, and is in line with @rnk's advice in #114793 (comment)

@rnk, @zacklj89, @barcharcraz, @yjugl What do you think?

@zacklj89
Copy link
Contributor

@zmodem firstly, thanks for the review on #120110 and tagging here! I work on ASan at MS, and issues/conversations like these are helpful.

I think it's a reasonable implementation and tradeoff. My initial impression was that there are programs that use ntdll as their sole C runtime and thus would lose significant coverage by omitting these functions as candidates for interception. However, there's nothing stopping further changes to the EH machinery / ntdll stack which aggravates interception again.

If this is preferred over #120110, let me know.

@zmodem
Copy link
Collaborator Author

zmodem commented Dec 18, 2024

My initial impression was that there are programs that use ntdll as their sole C runtime and thus would lose significant coverage by omitting these functions as candidates for interception.

Are such programs common, and do they avoid the regular CRT for size or other reasons? And do we know if they use ASan?

I think it would be fair to say that ASan only supports programs using the regular CRT (https://learn.microsoft.com/en-us/cpp/sanitizers/asan-building?view=msvc-170#fsanitizeaddress-compiler-option kind of says so already: "The code produced with this option works with static and dynamic CRTs (for example, /MD, /MDd, /MT, and /MTd).")

If we can make that assumption, I think we should intercept ntdll as little as possible, and then I think my patch is preferable.

@zacklj89
Copy link
Contributor

Are such programs common, and do they avoid the regular CRT for size or other reasons? And do we know if they use ASan?

I think they're uncommon or nearly(?) nonexistent.

If we can make that assumption, I think we should intercept ntdll as little as possible, and then I think my patch is preferable.

Sounds good to me!

@yjugl
Copy link

yjugl commented Dec 18, 2024

For Mozilla, firefox.exe imports and uses ntdll's memset, whereas our DLLs like mozglue.dll and xul.dll import and use the one from VCRUNTIME140.dll (the vast majority of our code lives in xul.dll). @glandium, would you know why we have this difference and whether it is intended? If yes, do you think our reasons for using ntdll's memset here could apply to other projects?

Nonetheless this patch sounds like a reasonable compromise to me. I got progressively convinced that it's probably just best to not have any instrumentation for these two memset calls, like this patch proposes. As you mention, the uses of ntdll's memset should be fairly marginal, and this instrumentation could be considered out-of-scope in the same way that, more generally, ASAN cannot and will not instrument every memory read/write operation performed by ntdll.

For the record here is a summary of my own attempts at fixing this issue:

  • force-committing the shadow memory in memset instrumentation, which resulted in too much performance overhead;
  • detecting memset reentrancy, which seemed to have a more reasonable overhead for non-24H2 platforms with Firefox, but could have had a more significant impact on 24H2 performance, where the reentrant calls would actually occur -- often.

#120110 by @zacklj89 is another attempt at force-committing the shadow memory, this time in a smarter way since it only does that for callers that live in ntdll. This should perform better than my own attempt -- perhaps similar to my memset reentrancy attempt on non-24H2 platforms.

As far as I am concerned, the correct way forward for Mozilla is probably to stop relying on ntdll's memset at all here. And if we have a good reason to keep using it, the tradeoff from this patch seems acceptable -- there are not that many actual calls to this import.

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good way to go, but in the long run I think we'll need to have flags or special macros on the interceptors to communicate which functions need ntdll interception.

// This is the last DLL, i.e. NTDLL. It exports some functions that
// we only want to override in the CRT.
for (const char *ignored : kNtDllIgnore) {
if (strcmp(func_name, ignored) == 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some internal_strcmp function we should be calling instead in order to avoid libc re-entrancy, like hitting the ASan strcmp interceptor, for example?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// we only want to override in the CRT.
for (const char *ignored : kNtDllIgnore) {
if (strcmp(func_name, ignored) == 0)
return hooked;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, let's just break or continue so the reader doesn't have to think about what hooked is, we just hit the shared return codepath.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nested loop, I think returning is the most straight-forward.

@glandium
Copy link
Contributor

@glandium, would you know why we have this difference and whether it is intended?

Probably the consequence of how things are ordered on the linker command line. No specific intent there, just happenstance.

@zmodem
Copy link
Collaborator Author

zmodem commented Dec 20, 2024

It sounds like we have consensus on going with this approach. I'll push the button.

@zmodem zmodem merged commit 69ebac7 into llvm:main Dec 20, 2024
6 of 7 checks passed
@mstorsjo
Copy link
Member

mstorsjo commented Feb 4, 2025

This patch seems to have broken asan, when running on top of msvcrt.dll (rather than UCRT); this came up when trying to build a release of 20.1.0 RC 1 with msvcrt.dll, at https://github.com/mstorsjo/llvm-mingw/actions/runs/13123676548/job/36623780321. Whenever a process is started, it errors out with this message:

AddressSanitizer: CHECK failed: sanitizer_common_interceptors_memintrinsics.inc:239 "((__interception::real_memcpy)) != (0)" (0x0, 0x0) (tid=1128)

Given the list of potential DLLs that are attempted to intercept at

static void **InterestingDLLsAvailable() {
static const char *InterestingDLLs[] = {
"kernel32.dll",
"msvcr100d.dll", // VS2010
"msvcr110d.dll", // VS2012
"msvcr120d.dll", // VS2013
"vcruntime140d.dll", // VS2015
"ucrtbased.dll", // Universal CRT
"msvcr100.dll", // VS2010
"msvcr110.dll", // VS2012
"msvcr120.dll", // VS2013
"vcruntime140.dll", // VS2015
"ucrtbase.dll", // Universal CRT
# if (defined(__MINGW32__) && defined(__i386__))
"libc++.dll", // libc++
"libunwind.dll", // libunwind
# endif
// NTDLL must go last as it gets special treatment in OverrideFunction.
"ntdll.dll",
NULL
};
, I see that we don't include msvcrt.dll in there. I guess that means that previously, the only case of e.g. memcpy that was intercepted, was from ntdll.dll. For e.g. UCRT builds, we'd also find one in ucrtbase.dll, but here we no longer find any at all.

Intercepting msvcrt.dll is a bit special though, because even if the user code uses UCRT, most processes end up loading msvcrt.dll as well, because this is used by most system DLLs. And I guess in this case, it may have already been used before we get to intercepting it?

I did try adding msvcrt.dll to this list, but that doesn't seem to fix the issue entirely either - I'm ending up with errors like this instead:

==5236==ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned:

For further context; asan wasn't entirely working in msvcrt.dll builds anyway, as far as I know (I've had a couple of reports where it doesn't, while it works fine in UCRT builds). But it seems to have been working well enough to pass my very minimal smoke tests at least.

I'm wondering what others think about this, which way we should go:

  • Reinstate intercepting ntdll.dll, if building for mingw + msvcrt.dll. While it isn't the entirely right thing to do, we should reach the same level of functionality as before
  • Try to intercept msvcrt.dll, possibly only in the mingw + msvcrt.dll case. But that apparently seems like it would require more work than just adding it to the list.
  • Just give up on asan on msvcrt.dll - skip testing those aspects in such builds (and maybe skip shipping those DLLs entirely in such builds?)

What do e.g. @alvinhochun @bernhardu @cjacek @lazka @mati865 think about that?

@zacklj89
Copy link
Contributor

zacklj89 commented Feb 5, 2025

@mstorsjo just adding some comments here to help with the conversation.

I did try adding msvcrt.dll to this list, but that doesn't seem to fix the issue entirely either - I'm ending up with errors like this instead:
==5236==ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned:

Presumably this is happening as a result of calls to _Crt* APIs or msvcrt initialization routines that are dealing with memory that ASan is not aware of that end up making calls to intercepted APIs?

Reinstate intercepting ntdll.dll, if building for mingw + msvcrt.dll. While it isn't the entirely right thing to do, we should reach the same level of functionality as before

Perhaps there should be a runtime option (defaulting off) that allows this, as if I remember correctly Mozilla was having both reentrancy and performance issues as a result of this. FWIW, MSVC behaves very similarly to #120110.

@rnk
Copy link
Collaborator

rnk commented Feb 5, 2025

First, this is just venting, I think everyone who works on Windows already knows this and accepts this, but: The Windows C runtime ecosystem is such a nightmare. ⚡🧠⚡

At a really high level, I think the best direction here is to de-escalate the hotpatching wars.

Regarding msvcrt.dll ASan compatibility, I cannot remember any attempt to intentionally support WinASan usage in this configuration, so I'm somewhat inclined to say that it was an accident that it ever worked at all. Contributions to support it would be welcome, but we should keep this change in to prioritize the users (Chromium & Firefox) who benefit from this change.

If we want to intentionally support msvcrt.dll, there is prior art for relaxing the CHECK failure in malloc_usable_size. See these checks for OWNED_BY_RTL:
https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/asan_malloc_win.cpp#L204

Adding checks for heap ownership on the hot malloc/free codepaths are prohibitively expensive, so we want to minimize those, but these malloc_usable_size APIs are usually much colder and its probably acceptable to add a wide variety of defensive checks there in the name of compatibility and ease of adoption.

My assumption is that calls to malloc from /instrumented DSOs/ are going to be very hot, while malloc calls from precompiled DLLs, system or third party, are going to be much colder. If we could find a way to separate the malloc entry points between instrumented and non-instrumented code, that could allow us to make the non-instrumented memory allocator entry points more flexible, meaning we have fewer CHECK failures and more recovery paths.

This would all be worth doing if it makes future Windows updates smoother.

@mati865
Copy link
Contributor

mati865 commented Feb 6, 2025

My personal stance regarding msvcrt doesn't change - there is no real reason to use it when building for any somewhat modern Windows.
If somebody wants modern features with it, they should be ready to invest their own time and effort into it.

@mstorsjo
Copy link
Member

mstorsjo commented Feb 6, 2025

I did try adding msvcrt.dll to this list, but that doesn't seem to fix the issue entirely either - I'm ending up with errors like this instead:
==5236==ERROR: AddressSanitizer: attempting to call malloc_usable_size() for pointer which is not owned:

Presumably this is happening as a result of calls to _Crt* APIs or msvcrt initialization routines that are dealing with memory that ASan is not aware of that end up making calls to intercepted APIs?

I'm not entirely sure; surprisingly enough, if I include the same change (to intercept functions in msvcrt.dll) in an UCRT build, I'm not hitting the same issue, so the issue is at least not triggered by the background use of msvcrt.dll in other system DLLs.

Reinstate intercepting ntdll.dll, if building for mingw + msvcrt.dll. While it isn't the entirely right thing to do, we should reach the same level of functionality as before

Perhaps there should be a runtime option (defaulting off) that allows this, as if I remember correctly Mozilla was having both reentrancy and performance issues as a result of this. FWIW, MSVC behaves very similarly to #120110.

TBH, I'm not really sure if it is worth it.

And as @rnk says:

Regarding msvcrt.dll ASan compatibility, I cannot remember any attempt to intentionally support WinASan usage in this configuration, so I'm somewhat inclined to say that it was an accident that it ever worked at all. Contributions to support it would be welcome, but we should keep this change in to prioritize the users (Chromium & Firefox) who benefit from this change.

Yeah - ASAN is already very complex, so I also agree that we should be wary of adding any extra complexity for something so unimportant as this target.

Especially as it already is known to not really be entirely working; see e.g. mstorsjo/llvm-mingw#224 for a user case. I guess it's mostly surprising that I have been able to include this case in my smoke tests and having those tests passing up until now.

FWIW, to judge the tradeoff a bit better, to what extent it is worth spending effort on it, I wanted to have some sort of measurement on how well it was working before; was it working up to 95% of cases (then perhaps it would be worth spending some more effort on keeping it alive), or did it only manage to work for trivial cases?

I tried running the compiler-rt testsuite with such a toolchain. In mingw UCRT builds, I get clean runs. With msvcrt, I got the following results:

check-asan-dynamic:

Total Discovered Tests: 500
  Unsupported      : 243 (48.60%)
  Passed           : 155 (31.00%)
  Expectedly Failed:  19 (3.80%)
  Failed           :  83 (16.60%)

check-interception:

Total Discovered Tests: 14
  Passed: 14 (100.00%)

check-builtins:

Total Discovered Tests: 221
  Unsupported      :  63 (28.51%)
  Passed           : 157 (71.04%)
  Expectedly Failed:   1 (0.45%)

check-profile:

Total Discovered Tests: 145
  Unsupported      : 91 (62.76%)
  Passed           : 52 (35.86%)
  Expectedly Failed:  2 (1.38%)

check-sanitizer:

Total Discovered Tests: 804
  Unsupported: 456 (56.72%)
  Passed     : 348 (43.28%)

check-ubsan:

Total Discovered Tests: 88
  Unsupported      : 16 (18.18%)
  Passed           : 65 (73.86%)
  Expectedly Failed:  2 (2.27%)
  Failed           :  5 (5.68%)

So ubsan is mostly working, only a curious handful of testcases that seem to fail for some reason. Asan has got a much higher percentage of tests failing, and I had to disable the gtest based unit tests entirely as they crashed to the point of not getting any json output, so the lit test runner couldn't produce a summary.

So therefore, I guess I'll work around this in my smoke tests, to stop trying to use asan in msvcrt builds. And I'll consider entirely stripping out the asan libraries from those toolchains, so users get a clearer linker error, rather than more obscure runtime errors.

mstorsjo added a commit to mstorsjo/llvm-mingw that referenced this pull request Feb 11, 2025
Asan doesn't really run properly on msvcrt.dll; skip testing it
in that configuration.

See discussion on llvm/llvm-project#120397
and #224 for context
around this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

memset interception in compiler-rt asan is incompatible with ntdll.dll from Windows 11 24H2
8 participants