Skip to content

[win/asan] Ensure errno gets set correctly for strtol #109258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 22, 2024

Conversation

zmodem
Copy link
Collaborator

@zmodem zmodem commented Sep 19, 2024

This fixes two problems with asan's interception of strtol on Windows:

  1. In the dynamic runtime, the strtol interceptor calls out to ntdll's strtol to perform the string conversion. Unfortunately, that function doesn't set errno. This has been a long-standing problem (ASan strtol interceptor breaks errno on Windows #34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (Reland [asan][windows] Eliminate the static asan runtime on windows #107899), the problem became more urgent.

  2. A module linked against the static CRT will have a different instance of errno than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets errno correctly, the calling module will not see it.

This patch fixes the first problem by making the strtol interceptor call out to strtoll instead, and do 32-bit range checks on the result.

I can't think of any reasonable way to fix the second problem, so we should stop intercepting strtol in the static runtime thunk. I checked the list of functions in the thunk, and strtol and strtoll are the only ones that set errno. (strtoll was already missing, probably by mistake.)

@llvmbot
Copy link
Member

llvmbot commented Sep 19, 2024

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Hans (zmodem)

Changes

This fixes two problems with asan's interception of strtol on Windows:

  1. In the dynamic runtime, the strtol interceptor calls out to ntdll's strol to perform the string conversion. Unfortunately, that function doesn't set errno. This has been a long-standing problem (#34485), but it was not an issue when using the static runtime. After the static runtime was removed recently (#107899), the problem became more urgent.

  2. A module linked against the static CRT will have a different instance of errno than the ASan runtime, since that's now always linked against the dynamic CRT. That means even if the ASan runtime sets errno correctly, the calling module will not see it.

This patch fixes the first problem by making the strtol interceptor call out to strtoll instead, and do 32-bit range checks on the result.

I can't think of any reasonable way to fix the second problem, so we should stop intercepting strtol in the static runtime thunk. I checked the list of functions in the thunk, and strtol and strtoll are the only ones that set errno. (strtoll was already missing, probably by mistake.)


Full diff: https://github.com/llvm/llvm-project/pull/109258.diff

6 Files Affected:

  • (modified) compiler-rt/lib/asan/asan_interceptors.cpp (+26-1)
  • (modified) compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp (+4-1)
  • (modified) compiler-rt/lib/asan/tests/asan_str_test.cpp (+24)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_errno.cpp (+1)
  • (modified) compiler-rt/lib/sanitizer_common/sanitizer_errno_codes.h (+1)
  • (modified) compiler-rt/test/asan/TestCases/strtol_strict.c (+3)
diff --git a/compiler-rt/lib/asan/asan_interceptors.cpp b/compiler-rt/lib/asan/asan_interceptors.cpp
index 74af2e65e9bfa6..afde42e677a62d 100644
--- a/compiler-rt/lib/asan/asan_interceptors.cpp
+++ b/compiler-rt/lib/asan/asan_interceptors.cpp
@@ -650,9 +650,34 @@ static ALWAYS_INLINE auto StrtolImpl(void *ctx, Fn real, const char *nptr,
       return StrtolImpl(ctx, REAL(func), nptr, endptr, base);                \
     }
 
-INTERCEPTOR_STRTO_BASE(long, strtol)
 INTERCEPTOR_STRTO_BASE(long long, strtoll)
 
+#if SANITIZER_WINDOWS
+INTERCEPTOR(long, strtol, const char *nptr, char **endptr, int base) {
+  // REAL(strol) may be ntdll!strol, which doesn't set errno. Instead,
+  // call REAL(strtoll) and do the range check ourselves.
+  COMPILER_CHECK(sizeof(long) == sizeof(u32));
+
+  void *ctx;
+  ASAN_INTERCEPTOR_ENTER(ctx, strtol);
+  AsanInitFromRtl();
+
+  long long result = StrtolImpl(ctx, REAL(strtoll), nptr, endptr, base);
+
+  if (result > INT32_MAX) {
+    errno = errno_ERANGE;
+    return INT32_MAX;
+  }
+  if (result < INT32_MIN) {
+    errno = errno_ERANGE;
+    return INT32_MIN;
+  }
+  return (long)result;
+}
+#else
+INTERCEPTOR_STRTO_BASE(long, strtol)
+#endif
+
 #  if SANITIZER_GLIBC
 INTERCEPTOR_STRTO_BASE(long, __isoc23_strtol)
 INTERCEPTOR_STRTO_BASE(long long, __isoc23_strtoll)
diff --git a/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp b/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
index dec50a5e1d4d9e..82aaadb9db1038 100644
--- a/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
+++ b/compiler-rt/lib/asan/asan_win_static_runtime_thunk.cpp
@@ -63,10 +63,13 @@ INTERCEPT_LIBRARY_FUNCTION_ASAN(strpbrk);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strspn);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strstr);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(strtok);
-INTERCEPT_LIBRARY_FUNCTION_ASAN(strtol);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(wcslen);
 INTERCEPT_LIBRARY_FUNCTION_ASAN(wcsnlen);
 
+// Note: Don't intercept strtol(l). They are supposed to set errno for out-of-
+// range values, but since the ASan runtime is linked against the dynamic CRT,
+// its errno is different from the one in the current module.
+
 #  if defined(_MSC_VER) && !defined(__clang__)
 #    pragma warning(pop)
 #  endif
diff --git a/compiler-rt/lib/asan/tests/asan_str_test.cpp b/compiler-rt/lib/asan/tests/asan_str_test.cpp
index 8b1c1dc459d74f..b28143db860f5e 100644
--- a/compiler-rt/lib/asan/tests/asan_str_test.cpp
+++ b/compiler-rt/lib/asan/tests/asan_str_test.cpp
@@ -638,3 +638,27 @@ TEST(AddressSanitizer, StrtolOOBTest) {
   RunStrtolOOBTest(&CallStrtol);
 }
 #endif
+
+TEST(AddressSanitizer, StrtolOverflow) {
+  if (sizeof(long) == 4) {
+    // Check that errno gets set correctly on 32-bit strtol overflow.
+    long res;
+    errno = 0;
+    res = Ident(strtol("2147483647", NULL, 0));
+    EXPECT_EQ(errno, 0);
+    EXPECT_EQ(res, 2147483647);
+
+    res = Ident(strtol("2147483648", NULL, 0));
+    EXPECT_EQ(errno, ERANGE);
+    EXPECT_EQ(res, 2147483647);
+
+    errno = 0;
+    res = Ident(strtol("-2147483648", NULL, 0));
+    EXPECT_EQ(errno, 0);
+    EXPECT_EQ(res, -2147483648);
+
+    res = Ident(strtol("-2147483649", NULL, 0));
+    EXPECT_EQ(errno, ERANGE);
+    EXPECT_EQ(res, -2147483648);
+  }
+}
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_errno.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_errno.cpp
index cbadf4d924a7e0..a7cdf3256757c7 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_errno.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_errno.cpp
@@ -23,6 +23,7 @@ namespace __sanitizer {
 COMPILER_CHECK(errno_ENOMEM == ENOMEM);
 COMPILER_CHECK(errno_EBUSY == EBUSY);
 COMPILER_CHECK(errno_EINVAL == EINVAL);
+COMPILER_CHECK(errno_ERANGE == ERANGE);
 
 // EOWNERDEAD is not present in some older platforms.
 #if defined(EOWNERDEAD)
diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_errno_codes.h b/compiler-rt/lib/sanitizer_common/sanitizer_errno_codes.h
index 3917b2817f2e99..9e6e71ec80c15c 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_errno_codes.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_errno_codes.h
@@ -24,6 +24,7 @@ namespace __sanitizer {
 #define errno_ENOMEM 12
 #define errno_EBUSY 16
 #define errno_EINVAL 22
+#define errno_ERANGE 34
 #define errno_ENAMETOOLONG 36
 #define errno_ENOSYS 38
 
diff --git a/compiler-rt/test/asan/TestCases/strtol_strict.c b/compiler-rt/test/asan/TestCases/strtol_strict.c
index bc30c87b18cb30..aa37af07b63dfc 100644
--- a/compiler-rt/test/asan/TestCases/strtol_strict.c
+++ b/compiler-rt/test/asan/TestCases/strtol_strict.c
@@ -23,6 +23,9 @@
 // RUN: %env_asan_opts=strict_string_checks=true not %run %t test7 2>&1 | FileCheck %s --check-prefix=CHECK7
 // REQUIRES: shadow-scale-3
 
+// On Windows, strtol cannot be intercepted when statically linked against the CRT.
+// UNSUPPORTED: win32-static-asan
+
 #include <assert.h>
 #include <stdlib.h>
 #include <string.h>

Copy link

github-actions bot commented Sep 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

This fixes two problems with asan's interception of `strtol` on Windows:

1. In the dynamic runtime, the `strtol` interceptor calls out to ntdll's
   `strtol` to perform the string conversion. Unfortunately, that
   function doesn't set `errno`. This has been a long-standing problem
   (llvm#34485), but it was not an issue when using the static runtime.
   After the static runtime was removed recently (llvm#107899), the problem
   became more urgent.

2. A module linked against the static CRT will have a different instance
   of `errno` than the ASan runtime, since that's now always linked
   against the dynamic CRT. That means even if the ASan runtime sets
   `errno` correctly, the calling module will not see it.

This patch fixes the first problem by making the `strtol` interceptor
call out to `strtoll` instead, and do 32-bit range checks on the result.

I can't think of any reasonable way to fix the second problem, so we
should stop intercepting `strtol` in the static runtime thunk. I checked
the list of functions in the thunk, and `strtol` and `strtoll` are the
only ones that set `errno`. (`strtoll` was already missing, probably by
mistake.)
INTERCEPTOR_STRTO_BASE(long long, strtoll)

# if SANITIZER_WINDOWS
INTERCEPTOR(long, strtol, const char *nptr, char **endptr, int base) {
// REAL(strtol) may be ntdll!strtol, which doesn't set errno. Instead,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term, we should stop patching ntdll for anything other than the purpose of intercepting heap allocation. For maintenance reasons, it's best not to touch the string routines in the minicrt inside of ntdll.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@zmodem zmodem merged commit 9f3d083 into llvm:main Sep 22, 2024
7 checks passed
@barcharcraz
Copy link
Contributor

We fixed this internally in a similar (but somewhat more hacky) way.

For the errno issue it's actually a problem for mixed debug/release mode CRTs generally, and we've fixed a few instances of it by stamping out separate interceptors for each DLL.

One possible fix is to thunk through to an implementation that returns the right errno, or to one that takes the errno to update as a parameter.

Another idea is to have the static runtime thunk register errno upon startup and then have the interceptor figure out which one to update based on the return address or by walking the stack (we could enable frame pointers for the relevant code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants