Skip to content

[llvm-cxxfilt][macOS] Don't strip underscores on macOS by default #106233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 28, 2024

Conversation

Michael137
Copy link
Member

Currently, llvm-cxxfilt will strip the leading underscore of its input on macOS. Historically MachO symbols were prefixed with an extra underscore and this is why this default exists. However, nowadays, the ItaniumDemangler supports all of the following mangling prefixes:
_Z, __Z, ___Z, ____Z. So really llvm-cxxfilt can simply forward the mangled name to the demangler and let the library decide whether it's a valid encoding.

Compiling C++ on macOS nowadays will generate symbols with _Z and ___Z prefixes. So users trying to demangle these symbols will have to know that they need to add the -n prefix. This routinely catches people off-guard.

This patch removes the -n default for macOS and allows calling into the ItaniumDemangler with all the _Z prefixes that the demangler supports (1-4 underscores).

rdar://132714940

Currently, `llvm-cxxfilt` will strip the leading underscore of
its input on macOS. Historically MachO symbols were prefixed
with an extra underscore and this is why this default exists.
However, since this default was introduced the `ItaniumDemangler`
supports all of the following mangling prefixes:
`_Z`, `__Z`, `___Z`, `____Z`. So really `llvm-cxxfilt` can
simply forward the mangled name to the demangler and let the
library decide whether it's a valid encoding.

Compiling C++ on macOS nowadays will generate symbols with `_Z`
and `___Z` prefixes. So users trying to demangle these symbols
will have to know that they need to add the `-n` prefix. This
routinely catches people off-guard.

This patch removes the `-n` default for macOS and allows calling
into the `ItaniumDemangler` with all the `_Z` prefixes that the
demangler supports (1-4 underscores).

rdar://132714940
@llvmbot
Copy link
Member

llvmbot commented Aug 27, 2024

@llvm/pr-subscribers-llvm-binary-utilities

Author: Michael Buch (Michael137)

Changes

Currently, llvm-cxxfilt will strip the leading underscore of its input on macOS. Historically MachO symbols were prefixed with an extra underscore and this is why this default exists. However, nowadays, the ItaniumDemangler supports all of the following mangling prefixes:
_Z, __Z, ___Z, ____Z. So really llvm-cxxfilt can simply forward the mangled name to the demangler and let the library decide whether it's a valid encoding.

Compiling C++ on macOS nowadays will generate symbols with _Z and ___Z prefixes. So users trying to demangle these symbols will have to know that they need to add the -n prefix. This routinely catches people off-guard.

This patch removes the -n default for macOS and allows calling into the ItaniumDemangler with all the _Z prefixes that the demangler supports (1-4 underscores).

rdar://132714940


Full diff: https://github.com/llvm/llvm-project/pull/106233.diff

5 Files Affected:

  • (modified) llvm/lib/Demangle/Demangle.cpp (+3-2)
  • (modified) llvm/test/tools/llvm-cxxfilt/invalid.test (+4-2)
  • (removed) llvm/test/tools/llvm-cxxfilt/strip-underscore-default-darwin.test (-7)
  • (modified) llvm/test/tools/llvm-cxxfilt/strip-underscore.test (+7-5)
  • (modified) llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp (+1-5)
diff --git a/llvm/lib/Demangle/Demangle.cpp b/llvm/lib/Demangle/Demangle.cpp
index 117b849d1c7848..f0f7eacac98e64 100644
--- a/llvm/lib/Demangle/Demangle.cpp
+++ b/llvm/lib/Demangle/Demangle.cpp
@@ -38,8 +38,9 @@ std::string llvm::demangle(std::string_view MangledName) {
 }
 
 static bool isItaniumEncoding(std::string_view S) {
-  // Itanium encoding requires 1 or 3 leading underscores, followed by 'Z'.
-  return starts_with(S, "_Z") || starts_with(S, "___Z");
+  // Itanium demangler supports prefixes with 1-4 underscores.
+  const size_t Pos = S.find_first_not_of('_');
+  return Pos > 0 && Pos <= 4 && S[Pos] == 'Z';
 }
 
 static bool isRustEncoding(std::string_view S) { return starts_with(S, "_R"); }
diff --git a/llvm/test/tools/llvm-cxxfilt/invalid.test b/llvm/test/tools/llvm-cxxfilt/invalid.test
index 13866f3fb90eb9..f1cf583936bcf8 100644
--- a/llvm/test/tools/llvm-cxxfilt/invalid.test
+++ b/llvm/test/tools/llvm-cxxfilt/invalid.test
@@ -1,6 +1,8 @@
-RUN: llvm-cxxfilt -n _Z1fi __Z1fi f ___ZSt1ff_block_invoke | FileCheck %s
+RUN: llvm-cxxfilt -n _Z1fi __Z1fi _____Z1fi f ___ZSt1ff_block_invoke ____ZSt1ff_block_invoke | FileCheck %s
 
 CHECK: f(int)
-CHECK-NEXT: __Z1fi
+CHECK-NEXT: f(int)
+CHECK-NEXT: _____Z1fi
 CHECK-NEXT: f
 CHECK-NEXT: invocation function for block in std::f(float)
+CHECK-NEXT: invocation function for block in std::f(float)
diff --git a/llvm/test/tools/llvm-cxxfilt/strip-underscore-default-darwin.test b/llvm/test/tools/llvm-cxxfilt/strip-underscore-default-darwin.test
deleted file mode 100644
index 7dcffa05a04887..00000000000000
--- a/llvm/test/tools/llvm-cxxfilt/strip-underscore-default-darwin.test
+++ /dev/null
@@ -1,7 +0,0 @@
-REQUIRES: system-darwin
-
-## Show that on darwin, the default is to strip the leading underscore.
-
-RUN: llvm-cxxfilt __Z1fv _Z2bav | FileCheck %s
-CHECK: f()
-CHECK: _Z2bav
diff --git a/llvm/test/tools/llvm-cxxfilt/strip-underscore.test b/llvm/test/tools/llvm-cxxfilt/strip-underscore.test
index 2b7057fbfbae82..6eb0de8302c3a4 100644
--- a/llvm/test/tools/llvm-cxxfilt/strip-underscore.test
+++ b/llvm/test/tools/llvm-cxxfilt/strip-underscore.test
@@ -1,17 +1,19 @@
 ## Show the behaviour of --[no-]strip-underscore. This test does not test
 ## the platform-specific default behaviour. This is tested elsewhere.
 
-RUN: llvm-cxxfilt -_ __ZN2ns1fE _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-STRIPPED
-RUN: llvm-cxxfilt --strip-underscore __ZN2ns1fE _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-STRIPPED
-RUN: llvm-cxxfilt -n __ZN2ns1fE _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-UNSTRIPPED
-RUN: llvm-cxxfilt --no-strip-underscore __ZN2ns1fE _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-UNSTRIPPED
+RUN: llvm-cxxfilt -_ ___ZN2ns1fE _____Z1fi_block_invoke _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-STRIPPED
+RUN: llvm-cxxfilt --strip-underscore __ZN2ns1fE _____Z1fi_block_invoke _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-STRIPPED
+RUN: llvm-cxxfilt -n ___ZN2ns1fE _____Z1fi_block_invoke _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-UNSTRIPPED
+RUN: llvm-cxxfilt --no-strip-underscore ___ZN2ns1fE _____Z1fi_block_invoke _ZSt1f _f _._Z3f.0v | FileCheck %s -check-prefix CHECK-UNSTRIPPED
 
 CHECK-STRIPPED: ns::f
+CHECK-STRIPPED: invocation function for block in f(int)
 CHECK-STRIPPED: _ZSt1f
 CHECK-STRIPPED: _f
 CHECK-STRIPPED: ._Z3f.0v
 
-CHECK-UNSTRIPPED: __ZN2ns1fE
+CHECK-UNSTRIPPED: ___ZN2ns1fE
+CHECK-UNSTRIPPED: _____Z1fi_block_invoke
 CHECK-UNSTRIPPED: std::f
 CHECK-UNSTRIPPED: _f
 CHECK-UNSTRIPPED: _._Z3f.0v
diff --git a/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp b/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
index d63f507619a0e9..0ce2f4b2e62d42 100644
--- a/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
+++ b/llvm/tools/llvm-cxxfilt/llvm-cxxfilt.cpp
@@ -54,7 +54,7 @@ class CxxfiltOptTable : public opt::GenericOptTable {
 } // namespace
 
 static bool ParseParams;
-static bool StripUnderscore;
+static bool StripUnderscore = false;
 static bool Types;
 
 static StringRef ToolName;
@@ -165,13 +165,9 @@ int llvm_cxxfilt_main(int argc, char **argv, const llvm::ToolContext &) {
     return 0;
   }
 
-  // The default value depends on the default triple. Mach-O has symbols
-  // prefixed with "_", so strip by default.
   if (opt::Arg *A =
           Args.getLastArg(OPT_strip_underscore, OPT_no_strip_underscore))
     StripUnderscore = A->getOption().matches(OPT_strip_underscore);
-  else
-    StripUnderscore = Triple(sys::getProcessTriple()).isOSBinFormatMachO();
 
   ParseParams = !Args.hasArg(OPT_no_params);
 

@jroelofs jroelofs requested a review from urnathan August 27, 2024 15:48
@@ -165,13 +165,9 @@ int llvm_cxxfilt_main(int argc, char **argv, const llvm::ToolContext &) {
return 0;
}

// The default value depends on the default triple. Mach-O has symbols
// prefixed with "_", so strip by default.
if (opt::Arg *A =
Args.getLastArg(OPT_strip_underscore, OPT_no_strip_underscore))
StripUnderscore = A->getOption().matches(OPT_strip_underscore);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this can be simplified without if statement.
StripUnderscore = Args.hasFlag(OPT_strip_underscore, OPT_no_strip_underscore, false)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, thanks!

Copy link
Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Michael137 Michael137 merged commit 0b554dd into llvm:main Aug 28, 2024
8 checks passed
@Michael137 Michael137 deleted the tools/cxxfilt-macos-underscores branch August 28, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants