Skip to content

[ubsan] Use internal_memcpy to copy ubsan bits size #121586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 7, 2025
Merged

Conversation

earnol
Copy link
Contributor

@earnol earnol commented Jan 3, 2025

While fetching amounts of bits used to correctly display ubsan value reinterpret_cast was used, however as noted by Jakub Jelínek in #96240 discussion it might cause issues due to potentially unaligned memory access. The patch addresses this problem.

@llvmbot
Copy link
Member

llvmbot commented Jan 3, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: None (earnol)

Changes

While fetching amounts of bits used to correctly display ubsan value reinterpret_cast was used, however as noted by Jakub Jelínek in #96240 discussion it might cause issues due to potentially unaligned memory access. The patch addresses this problem.


Full diff: https://github.com/llvm/llvm-project/pull/121586.diff

1 Files Affected:

  • (modified) compiler-rt/lib/ubsan/ubsan_value.h (+5-2)
diff --git a/compiler-rt/lib/ubsan/ubsan_value.h b/compiler-rt/lib/ubsan/ubsan_value.h
index 430c9ea0dc8d15..058de943ba7fd8 100644
--- a/compiler-rt/lib/ubsan/ubsan_value.h
+++ b/compiler-rt/lib/ubsan/ubsan_value.h
@@ -150,8 +150,11 @@ class TypeDescriptor {
 
   unsigned getIntegerBitCount() const {
     DCHECK(isIntegerTy());
-    if (isSignedBitIntTy())
-      return *reinterpret_cast<const u32 *>(getBitIntBitCountPointer());
+    if (isSignedBitIntTy()) {
+      u32 BitCountValue;
+      internal_memcpy(&BitCountValue, getBitIntBitCountPointer(), sizeof(BitCountValue)); 
+      return BitCountValue;
+    }
     else
       return getIntegerBitWidth();
   }

Copy link
Contributor

@fmayer fmayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. out of interest, did you run into this in practice anywhere?

Copy link

github-actions bot commented Jan 3, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@earnol
Copy link
Contributor Author

earnol commented Jan 3, 2025

lgtm. out of interest, did you run into this in practice anywhere?

Nope. Just noticed that nobody still addressed this comment and thought it might be reasonable to handle it in appropriate way. The reason is some platforms (Sony Emotion Engine for example) works in those situations, but actually return incorrect value.
For example, you request u32 value from address 0x5672, but actually you are getting value from the address 0x5670 because it is nearest aligned address.
So, Jakub comment is totally valid and needs to be addressed.
The proper way would be to use ifdef to separate platforms working with non-aligned values from platforms behaving improperly, but i was not able to locate such define.
Another way to ensure the value is always 32-bit aligned, but it is quite complex without demanding a more global alignment and wasting more bytes.
Considering the performance is not an issue here, i went with this approach.

While fetching amounts of bits used to correctly display ubsan
value reinterpret_cast was used, however as noted by Jakub
Jelínek in llvm#96240
discussion it might cause issues due to potentially unaligned
memory access. The patch addresses this problem.
Copy link
Contributor

@ilovepi ilovepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@fmayer
Copy link
Contributor

fmayer commented Jan 6, 2025

The proper way would be to use ifdef to separate platforms working with non-aligned values from platforms behaving improperly, but i was not able to locate such define.

shouldn't the compiler be able to optimize the memcpy with target specific knowledge?

@earnol
Copy link
Contributor Author

earnol commented Jan 7, 2025

shouldn't the compiler be able to optimize the memcpy with target specific knowledge?

It should. However i do prefer to do stuff explicitly: lower failure chances.

@earnol earnol merged commit b7a6e9d into llvm:main Jan 7, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants