-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Don't pass null pointers to memcmp and memcpy in libFuzzer #96775
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-compiler-rt-sanitizer Author: David Benjamin (davidben) ChangesDue to a C language bug, it is UB to call memcmp(NULL, NULL, 0), memcpy(NULL, NULL, 0), etc. This is a problem because (NULL, 0) is the natural representation of the empty slice and extremely common in real world code. As a result, all C code, and C++ code which calls into C functions, must carefully guard all calls to memcpy. This is horrible and should be fixed in C (see #49459), but in the meantime, avoid tripping UBSan in libFuzzer. Fixes #96772 Full diff: https://github.com/llvm/llvm-project/pull/96775.diff 2 Files Affected:
diff --git a/compiler-rt/lib/fuzzer/FuzzerDictionary.h b/compiler-rt/lib/fuzzer/FuzzerDictionary.h
index 48f063c7ee4e6..b447f3de6a591 100644
--- a/compiler-rt/lib/fuzzer/FuzzerDictionary.h
+++ b/compiler-rt/lib/fuzzer/FuzzerDictionary.h
@@ -29,7 +29,9 @@ template <size_t kMaxSizeT> class FixedWord {
static_assert(kMaxSizeT <= std::numeric_limits<uint8_t>::max(),
"FixedWord::kMaxSizeT cannot fit in a uint8_t.");
assert(S <= kMaxSize);
- memcpy(Data, B, S);
+ // memcpy cannot be passed empty slices when the pointers are null.
+ if (S)
+ memcpy(Data, B, S);
Size = static_cast<uint8_t>(S);
}
diff --git a/compiler-rt/lib/fuzzer/FuzzerLoop.cpp b/compiler-rt/lib/fuzzer/FuzzerLoop.cpp
index 935dd2342e18f..8e229bd0ef322 100644
--- a/compiler-rt/lib/fuzzer/FuzzerLoop.cpp
+++ b/compiler-rt/lib/fuzzer/FuzzerLoop.cpp
@@ -579,6 +579,9 @@ void Fuzzer::CrashOnOverwrittenData() {
// Compare two arrays, but not all bytes if the arrays are large.
static bool LooseMemeq(const uint8_t *A, const uint8_t *B, size_t Size) {
const size_t Limit = 64;
+ // memcmp cannot be passed empty slices when the pointers are null.
+ if (!Size)
+ return 1;
if (Size <= 64)
return !memcmp(A, B, Size);
// Compare first and last Limit/2 bytes.
@@ -596,7 +599,9 @@ ATTRIBUTE_NOINLINE bool Fuzzer::ExecuteCallback(const uint8_t *Data,
// We copy the contents of Unit into a separate heap buffer
// so that we reliably find buffer overflows in it.
uint8_t *DataCopy = new uint8_t[Size];
- memcpy(DataCopy, Data, Size);
+ // memcpy cannot be passed empty slices when the pointers are null.
+ if (Size)
+ memcpy(DataCopy, Data, Size);
if (EF->__msan_unpoison)
EF->__msan_unpoison(DataCopy, Size);
if (EF->__msan_unpoison_param)
|
An alternate fix would be to give up on the libc functions and instead use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an expert here, but these changes LGTM.
I was a little suspicious of the memcpy
in MutateAndTestOne()
, but it turns out that ChooseUnitToMutate()
asserts that the returned unit is not empty.
LLVM folks: Friendly ping. Is there anything else needed to land this fix for undefined behavior in libFuzzer? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The subject line can be named [fuzzer] xxx
or [Fuzzer] xxx
.
"work around a C language bug" is not a good way to describe this usability issue. The function signature is unfortunate and should be fixed, but I am not sure calling it a language bug is useful.
I mean, the non-null precondition does actually come from C itself, so that is where it'd need to be fixed. I think it's also demonstrably a problem on the spec side, seeing as it's inconsistent with how anyone anywhere writes C code. (Chrome for Linux currently cannot even compile with the relevant UBSan bit turned on because some tool used during the build computes |
In C, it is UB to call memcmp(NULL, NULL, 0), memcpy(NULL, NULL, 0), etc. Unfortunately, (NULL, 0) is the natural representation of an empty sequence of objects and extremely common in real world code. As a result, all C code, and C++ code which calls into C functions, must carefully guard all calls to memcpy. This is a serious, real world usability issue in C and should be fixed in the language (see llvm#49459). In the meantime, pay the cost of the extra branch to avoid tripping UBSan in libFuzzer. Once the usability problem in C has been fixed, these checks can be removed. Fixes llvm#96772
7df3637
to
6ddda79
Compare
Comment addressed and commit message / PR description updated. |
Anything else left to do here? (I don't have commit privileges, so someone else would need to merge this for me.) |
It says you are a "member". I believe you can press "squash and merge" above. |
Like I said above, I do not have access to that. I think "member" just came about because I had a bugtracker account. Will ping in chat. |
In C, it is UB to call
memcmp(NULL, NULL, 0)
,memcpy(NULL, NULL, 0)
, etc. Unfortunately,(NULL, 0)
is the natural representation of an empty sequence of objects and extremely common in real world code. As a result, all C code, and C++ code which calls into C functions, must carefully guard all calls tomemcpy
.This is a serious, real world usability issue in C and should be fixed in the language (see #49459). In the meantime, pay the cost of the extra branch to avoid tripping UBSan in libFuzzer. Once the usability problem in C has been fixed, these checks can be removed.
Fixes #96772