Skip to content

[libunwind] Remove unnecessary strcpy dependency #72043

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 15, 2023

Conversation

michael-kenzel
Copy link
Contributor

@michael-kenzel michael-kenzel commented Nov 12, 2023

libunwind uses a minimal set of necessary standard library functions, basically just memset and memcpy. There is a single use of strcpy to copy the bytes "CLNGUNW" into a uint64_t object. This is both an arguably odd use of the strcpy function as well as it unnecessarily widens the set of library functions that must be available to build libunwind, which can be an obstacle in baremetal scenarios. This change simply replaces this one strcpy with the more fundamental memcpy.

@michael-kenzel michael-kenzel requested a review from a team as a code owner November 12, 2023 01:59
@llvmbot
Copy link
Member

llvmbot commented Nov 12, 2023

@llvm/pr-subscribers-libunwind

Author: Michael Kenzel (michael-kenzel)

Changes

libunwind uses a minimum set of necessary standard library functions, basically just memset and memcpy. There is a single use of strcpy to copy the bytes "CLNGUNW" into a uint64_t object. This is both an arguably odd use of the strcpy function as well as it unnecessarily widens the set of library functions that must be available to build libunwind, which can be an obstacle in baremetal scenarios. This change simply replaces this one strcpy with the more fundamental memcpy.


Full diff: https://github.com/llvm/llvm-project/pull/72043.diff

1 Files Affected:

  • (modified) libunwind/src/UnwindLevel1-gcc-ext.c (+1-1)
diff --git a/libunwind/src/UnwindLevel1-gcc-ext.c b/libunwind/src/UnwindLevel1-gcc-ext.c
index d343f4e6e9cc837..32c872ffade1fd0 100644
--- a/libunwind/src/UnwindLevel1-gcc-ext.c
+++ b/libunwind/src/UnwindLevel1-gcc-ext.c
@@ -143,7 +143,7 @@ _Unwind_Backtrace(_Unwind_Trace_Fn callback, void *ref) {
   // Create a mock exception object for force unwinding.
   _Unwind_Exception ex;
   memset(&ex, '\0', sizeof(ex));
-  strcpy((char *)&ex.exception_class, "CLNGUNW");
+  memcpy(&ex.exception_class, "CLNGUNW", sizeof(ex.exception_class));
 #endif
 
   // walk each frame

@michael-kenzel michael-kenzel force-pushed the libunwind-remove-strcpy branch from 63b2013 to c423c5b Compare November 12, 2023 03:00
@michael-kenzel michael-kenzel changed the title [libunwind] remove unnecessary strcpy dependency [libunwind] Remove unnecessary strcpy dependency Nov 12, 2023
Copy link
Member

@arichardson arichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume this is already transformed to memcpy but I guess it won't be for -O0

@@ -143,7 +143,7 @@ _Unwind_Backtrace(_Unwind_Trace_Fn callback, void *ref) {
// Create a mock exception object for force unwinding.
_Unwind_Exception ex;
memset(&ex, '\0', sizeof(ex));
strcpy((char *)&ex.exception_class, "CLNGUNW");
memcpy(&ex.exception_class, "CLNGUNW", sizeof(ex.exception_class));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use the string length here instead of the size of the exception class?

Copy link
Contributor Author

@michael-kenzel michael-kenzel Nov 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning was that if the string happened to be too long for some reason, we'd want to ensure that we don't write past the member. Using the string length would require duplicating the string literal, or introducing a constant, which would probably require a macro since this is C and so we can't just use constexpr (I doubt we can assume C23?) or some helper function/macro to wrap the copy. Either option, using the string length or member length, has the downside of not considering the other length. Ideally, we'd have a _Static_assert to ensure the sizes match. But that then puts us back to the problem of needing to duplicate the string literal. I couldn't find a way to put any such improvements into plain C without making this code significantly more complex. I was also not quite sure what versions of C this needs to build under, which would have implications regarding what features can be used.

One very simple solution (at least code-wise) would be to just do

  _Unwind_Exception ex = {
    .exception_class = 0x574E55474E4C43  // "CLNGUNW"
  };

That would take care of both the memset and the memcpy. But this then assumes little endian and so would be a potential ABI break on big-endian systems?

Overall, I think it's reasonable to assume that this string isn't just gonna change in a way that would make it not match the size of the member anymore. The current implementation was already broken in that case as well (at least if the string would happen to be longer). Unfortunately, no compiler seems to issue a warning if you make the string not match the size of the member.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree this is fine as is. Assigning the integer directly will cause endian issues, so using memcpy is better.

If we want to be extra safe, we could #define the string value and add a static_assert().

@michael-kenzel
Copy link
Contributor Author

michael-kenzel commented Nov 12, 2023

I would assume this is already transformed to memcpy but I guess it won't be for -O0

I did some testing on this: it seems that gcc replaces the strcpy with a simple mov even under -O0, but not the memcpy. clang does the reverse: it replaces the memcpy with a mov even under -O0 but not the strcpy. So it's a bit of a wash (I didn't expect either compiler to transform any of these with -O0 tbqh). Though since libunwind is primarily meant to be used with LLVM, this change then arguably presents the tiniest codegen improvement. On either compiler, turning on optimizations (even just -O1) results in the exact same codegen from either version (as one would expect).

@michael-kenzel michael-kenzel force-pushed the libunwind-remove-strcpy branch from c423c5b to d5adc24 Compare November 13, 2023 00:06
@michael-kenzel
Copy link
Contributor Author

I found another use of the same string copying in tests/forceunwind.pass.cpp and updated that to match.

@ldionne
Copy link
Member

ldionne commented Nov 15, 2023

This seems reasonable. I read the discussion about the size of the member but I think this is unlikely to change much -- plus I think the compiler should be able to diagnose in case the string becomes smaller and we'd be copying from a past-the-end location into the data structure: https://godbolt.org/z/qP736q5je

Clang doesn't warn right now but GCC does. I filed #72455 about that.

@ldionne ldionne merged commit b7a249d into llvm:main Nov 15, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
libunwind uses a minimal set of necessary standard library functions,
basically just `memset` and `memcpy`. There is a single use of `strcpy`
to copy the bytes `"CLNGUNW"` into a `uint64_t` object. This is both an
arguably odd use of the `strcpy` function as well as it unnecessarily
widens the set of library functions that must be available to build
libunwind, which can be an obstacle in baremetal scenarios. This change
simply replaces this one `strcpy` with the more fundamental `memcpy`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants