Skip to content

[lldb] Expand background symbol lookup #80890

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 8, 2024

Conversation

JDevlieghere
Copy link
Member

@JDevlieghere JDevlieghere commented Feb 6, 2024

LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.download) and changes its values from a boolean to an enum.
Users can now specify "off", "background" and "foreground". The default
remains "off" although I'll probably change that in the near future.

@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2024

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

LLDB has a setting (symbols.enable-background-lookup) that calls dsymForUUID on a background thread for images as they appear in the current backtrace. Originally, the laziness of only looking up symbols for images in the backtrace only existed to bring the number of dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives them the same user experience as enabling dsymForUUID globally, but without the massive upfront cost of having to download all the images, the majority of which they'll likely not need.

This patch renames the setting to have a more generic name (symbols.lazy-lookup) and changes its values from a boolean to an enum. Users can now specify lazy-lookup as "off", "background" and "foreground". The default remains "off" although I'll probably change that in the near future.


Full diff: https://github.com/llvm/llvm-project/pull/80890.diff

5 Files Affected:

  • (modified) lldb/include/lldb/Core/ModuleList.h (+22-1)
  • (modified) lldb/include/lldb/lldb-enumerations.h (+6)
  • (modified) lldb/source/Core/CoreProperties.td (+4-3)
  • (modified) lldb/source/Core/ModuleList.cpp (+5-4)
  • (modified) lldb/source/Symbol/SymbolLocator.cpp (+14-5)
diff --git a/lldb/include/lldb/Core/ModuleList.h b/lldb/include/lldb/Core/ModuleList.h
index d78f7c5ef3f70..5c19cbc1e799c 100644
--- a/lldb/include/lldb/Core/ModuleList.h
+++ b/lldb/include/lldb/Core/ModuleList.h
@@ -47,6 +47,26 @@ class UUID;
 class VariableList;
 struct ModuleFunctionSearchOptions;
 
+static constexpr OptionEnumValueElement g_lazy_lookup_enum_values[] = {
+    {
+        lldb::eLazyLookupOff,
+        "off",
+        "Disable lazy symbol lookup.",
+    },
+    {
+        lldb::eLazyLookupBackground,
+        "background",
+        "Lazily look up symbols in the background without blocking the "
+        "debugger.",
+    },
+    {
+        lldb::eLazyLookupForeground,
+        "foreground",
+        "Lazily look up symbols in the foreground and block the debugger until "
+        "they're found.",
+    },
+};
+
 class ModuleListProperties : public Properties {
   mutable llvm::sys::RWMutex m_symlink_paths_mutex;
   PathMappingList m_symlink_paths;
@@ -60,7 +80,6 @@ class ModuleListProperties : public Properties {
   bool SetClangModulesCachePath(const FileSpec &path);
   bool GetEnableExternalLookup() const;
   bool SetEnableExternalLookup(bool new_value);
-  bool GetEnableBackgroundLookup() const;
   bool GetEnableLLDBIndexCache() const;
   bool SetEnableLLDBIndexCache(bool new_value);
   uint64_t GetLLDBIndexCacheMaxByteSize();
@@ -71,6 +90,8 @@ class ModuleListProperties : public Properties {
 
   bool GetLoadSymbolOnDemand();
 
+  lldb::LazySymbolLookup GetLazySymbolLookup() const;
+
   PathMappingList GetSymlinkMappings() const;
 };
 
diff --git a/lldb/include/lldb/lldb-enumerations.h b/lldb/include/lldb/lldb-enumerations.h
index 392d333c23a44..c1f9fab524499 100644
--- a/lldb/include/lldb/lldb-enumerations.h
+++ b/lldb/include/lldb/lldb-enumerations.h
@@ -1305,6 +1305,12 @@ enum CompletionType {
   eTerminatorCompletion = (1ul << 27)
 };
 
+enum LazySymbolLookup {
+  eLazyLookupOff = 0,
+  eLazyLookupBackground = 1,
+  eLazyLookupForeground = 2,
+};
+
 } // namespace lldb
 
 #endif // LLDB_LLDB_ENUMERATIONS_H
diff --git a/lldb/source/Core/CoreProperties.td b/lldb/source/Core/CoreProperties.td
index 8d81967bdb50a..fcf89bd412b84 100644
--- a/lldb/source/Core/CoreProperties.td
+++ b/lldb/source/Core/CoreProperties.td
@@ -5,10 +5,11 @@ let Definition = "modulelist" in {
     Global,
     DefaultTrue,
     Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched. If all other methods fail there may be symbol-locator plugins that, if configured properly, will also attempt to acquire symbols. The debuginfod plugin defaults to the DEGUFINFOD_URLS environment variable which is configurable through the 'plugin.symbol-locator.debuginfod.server_urls' setting.">;
-  def EnableBackgroundLookup: Property<"enable-background-lookup", "Boolean">,
+  def LazySymbolLookup: Property<"lazy-lookup", "Enum">,
     Global,
-    DefaultFalse,
-    Desc<"On macOS, enable calling dsymForUUID (or an equivalent script/binary) in the background to locate symbol files that weren't found.">;
+    DefaultEnumValue<"eLazyLookupOff">,
+    EnumValues<"OptionEnumValues(g_lazy_lookup_enum_values)">,
+    Desc<"On macOS, lazily look up symbols with dsymForUUID (or an equivalent script/binary) as images appear in the current backtrace.">;
   def ClangModulesCachePath: Property<"clang-modules-cache-path", "FileSpec">,
     Global,
     DefaultStringValue<"">,
diff --git a/lldb/source/Core/ModuleList.cpp b/lldb/source/Core/ModuleList.cpp
index b7f393636ba28..a7ab7c56eb7a1 100644
--- a/lldb/source/Core/ModuleList.cpp
+++ b/lldb/source/Core/ModuleList.cpp
@@ -104,10 +104,11 @@ bool ModuleListProperties::SetEnableExternalLookup(bool new_value) {
   return SetPropertyAtIndex(ePropertyEnableExternalLookup, new_value);
 }
 
-bool ModuleListProperties::GetEnableBackgroundLookup() const {
-  const uint32_t idx = ePropertyEnableBackgroundLookup;
-  return GetPropertyAtIndexAs<bool>(
-      idx, g_modulelist_properties[idx].default_uint_value != 0);
+LazySymbolLookup ModuleListProperties::GetLazySymbolLookup() const {
+  const uint32_t idx = ePropertyLazySymbolLookup;
+  return GetPropertyAtIndexAs<lldb::LazySymbolLookup>(
+      idx, static_cast<lldb::LazySymbolLookup>(
+               g_modulelist_properties[idx].default_uint_value));
 }
 
 FileSpec ModuleListProperties::GetClangModulesCachePath() const {
diff --git a/lldb/source/Symbol/SymbolLocator.cpp b/lldb/source/Symbol/SymbolLocator.cpp
index 918f13ed9c193..288c986cbdc28 100644
--- a/lldb/source/Symbol/SymbolLocator.cpp
+++ b/lldb/source/Symbol/SymbolLocator.cpp
@@ -18,12 +18,10 @@ using namespace lldb;
 using namespace lldb_private;
 
 void SymbolLocator::DownloadSymbolFileAsync(const UUID &uuid) {
-  if (!ModuleList::GetGlobalModuleListProperties().GetEnableBackgroundLookup())
-    return;
-
   static llvm::SmallSet<UUID, 8> g_seen_uuids;
   static std::mutex g_mutex;
-  Debugger::GetThreadPool().async([=]() {
+
+  auto lookup = [=]() {
     {
       std::lock_guard<std::mutex> guard(g_mutex);
       if (g_seen_uuids.count(uuid))
@@ -43,5 +41,16 @@ void SymbolLocator::DownloadSymbolFileAsync(const UUID &uuid) {
       return;
 
     Debugger::ReportSymbolChange(module_spec);
-  });
+  };
+
+  switch (ModuleList::GetGlobalModuleListProperties().GetLazySymbolLookup()) {
+  case eLazyLookupOff:
+    break;
+  case eLazyLookupBackground:
+    Debugger::GetThreadPool().async(lookup);
+    break;
+  case eLazyLookupForeground:
+    lookup();
+    break;
+  };
 }

@@ -5,10 +5,11 @@ let Definition = "modulelist" in {
Global,
DefaultTrue,
Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched. If all other methods fail there may be symbol-locator plugins that, if configured properly, will also attempt to acquire symbols. The debuginfod plugin defaults to the DEGUFINFOD_URLS environment variable which is configurable through the 'plugin.symbol-locator.debuginfod.server_urls' setting.">;
def EnableBackgroundLookup: Property<"enable-background-lookup", "Boolean">,
def LazySymbolLookup: Property<"lazy-lookup", "Enum">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name should probably be different. "LazySymbolLookup" being set to off makes me think it will look things up eagerly. Looking up the possible values of "off", "background", and "foreground" also don't really illustrate what they mean either.

I would propose the setting name "external-symbol-load-behavior" with the enumeration values "off", "background", and "blocking" or something to this effect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is entirely orthogonal to the usual external symbol lookup. The latter uses DebugSymbols/Spotlight which may or may not use dsymForUUID under the hood. I considered on-demand but that's already taken by SymbolFileOnDemand. I'm open to other names but I don't have a better idea :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

symbol-loading-behavior? symbol-lookup-strategy? 😄

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep coming back to synonyms for "on demand" but not clear why they are different, like "as-needed-symbol-lookup". Or "backtrace-driven-symbol-lookup".

I don't know if there's any really clear name that could help a first-time user to distinguish between the different mechanisms. We have the existing symbols.load-on-demand which is really avoiding loading the DWARF for any binaries until criteria are met (if I'm reading the docs/use/ondemand.rst correctly). In this case, lldb has the DWARF for the binaries locally, it's just not reading it until the debug info is needed (with a few caveats like "breakpoints by name on inlined functions won't work").

This mechanism is instead "Download the DWARF (dSYM) and executable (if possible) from an external server, and load it immediately in to lldb". For instance, you connect to a process on a remote system and none of the binaries running on the remote system, e.g. libsystem_c.dylib, exist on the host computer. lldb can use this mechanism to download the correct libsystem_c.dylib and libsystem_c.dylib.dSYM to the host computer and load them in to lldb. Instead of doing that for all binaries on connect, this allows us to avoid hitting the external servers until specific binaries are needed.

I might try to work 'symbol-download' into the setting name, that is a real difference between the two.

Copy link
Member Author

@JDevlieghere JDevlieghere Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your suggestion. @bulbazord How do you feel about symbols.download={off, background,foreground}?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

symbols.download-on-demand lol it's actually accurate but probably too overly clever compared to symbols.load-on-demand.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just to be clear, my fear of symbols.download-on-demand in addition to the existing symbols.load-on-demand is that no one except us will ever understand what the difference is. I can sit here and say "it's very clear if you think about it!" but I think it's not worth it to be this subtle, IMO.)

@jasonmolenda
Copy link
Collaborator

I like this change, this will be a very handy capability for people debugging corefiles & live sessions on remote computers when they have a mechanism to find the binaries and/or debug info for those processes, without needing to download all of them. One change I would also roll in in SymbolLocator::DownloadSymbolFileAsync() is to request copy_executable=true here,

    if (!PluginManager::DownloadObjectAndSymbolFile(module_spec, error,
                                                    /*force_lookup=*/true,
                                                    /*copy_executable=*/false))

So we download both the dSYM and binary for situations where we don't have a local binary.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy w/ the name. Thanks!

@@ -5,10 +5,11 @@ let Definition = "modulelist" in {
Global,
DefaultTrue,
Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched. If all other methods fail there may be symbol-locator plugins that, if configured properly, will also attempt to acquire symbols. The debuginfod plugin defaults to the DEGUFINFOD_URLS environment variable which is configurable through the 'plugin.symbol-locator.debuginfod.server_urls' setting.">;
def EnableBackgroundLookup: Property<"enable-background-lookup", "Boolean">,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we leave this setting around and adapt it to use the new setting with the correct defaults set? If someone has a settings set enable-background-lookup true in their ~/.lldbinit file, this will cause the .lldbinit file to stop executing lines if this setting fails to be set. This is kind of like a backward compatibility in our API kind of thing. We can make this setting as deprecated and tell people to use the download setting instead, but it would be nice if it still worked.

If this was used for dsymForUUID only and there weren't many customers it could be ok to remove, but I would err on the side of caution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was on the fence, as I've been encouraging users to use settings set symbols.enable-background-lookup true in their .lldbinit (because older versions don't know about the setting either). On the other hand it's very little work to keep the old setting working, so why not. I've pushed a commit that makes the old setting an alias.

Desc<"On macOS, enable calling dsymForUUID (or an equivalent script/binary) in the background to locate symbol files that weren't found.">;
DefaultEnumValue<"eSymbolDownloadOff">,
EnumValues<"OptionEnumValues(g_download_enum_values)">,
Desc<"On macOS, lazily download symbols with dsymForUUID (or an equivalent script/binary) as images appear in the current backtrace.">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it truly only for backtraces? Seems like an address lookup in a shared library might be a good place to also get the symbols.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now yes, but we could extend this in the future.

Comment on lines +59 to +60
"Download symbols in the background for images as they appear in the "
"backtrace.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this truly only for backtraces? Wouldn't an address lookup in a binary also qualify as a way to force debug info to be loaded?

Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@clayborg
Copy link
Collaborator

clayborg commented Feb 7, 2024

Does this setting only apply to the downloading of symbols that are auto downloaded for stack traces? Or does it apply to any other areas that auto download stuff? Do we want a setting to control the auto downloading of symbols to clarify? Something like:

(lldb) settings set symbols.auto-download-stack-symbols true

Right now your setting seems to be targeting stack auto symbol downloads, but this setting is pretty generic and could apply to other things in the future like auto downloading symbols when an address lookup happens on a module, finding the real definition for a type when we know a type lives in a shared library (like ObjC and C++ vtables can easily tell us).

@JDevlieghere
Copy link
Member Author

@clayborg The idea behind the feature is to get symbols for things that are relevant to the user. Right now, that's only hooked up for images that appear in the stack trace, but there are certainly other places this would be useful. So yeah, I absolutely expect this to apply to other things in the future as well.

We currently check the setting right before downloading, so in a generic place that would be shared by all the other places that could trigger a download. Therefore I think the "mode" should be generic. If we have different triggers we could have separate settings to turn those things off (e.g. only download symbols for images in the stack trace, not for address lookups). So down the line it could look something like this:

(lldb) settings set symbols.auto-download background
(lldb) settings set symbols.auto-download-stack-symbols true
(lldb) settings set symbols.auto-download-address-lookup false

Let me know if you think it's critical to add symbols.auto-download-stack-symbols now in addition to symbols.auto-download (which I like better than symbols.download, thanks for the suggestion!) .

LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.download) and changes its values from a boolean to an enum.
Users can now specify "off", "background" and "foreground". The default
remains "off" although I'll probably change that in the near future.
@JDevlieghere
Copy link
Member Author

Rebased & renamed the property to symbols.auto-download. I think all the review comments have been addressed. The only oustanding question is if @clayborg wants to have an additional setting for symbols.auto-download-stack-symbols.

@clayborg
Copy link
Collaborator

clayborg commented Feb 8, 2024

@clayborg The idea behind the feature is to get symbols for things that are relevant to the user. Right now, that's only hooked up for images that appear in the stack trace, but there are certainly other places this would be useful. So yeah, I absolutely expect this to apply to other things in the future as well.

We currently check the setting right before downloading, so in a generic place that would be shared by all the other places that could trigger a download. Therefore I think the "mode" should be generic. If we have different triggers we could have separate settings to turn those things off (e.g. only download symbols for images in the stack trace, not for address lookups). So down the line it could look something like this:

(lldb) settings set symbols.auto-download background
(lldb) settings set symbols.auto-download-stack-symbols true
(lldb) settings set symbols.auto-download-address-lookup false

Let me know if you think it's critical to add symbols.auto-download-stack-symbols now in addition to symbols.auto-download (which I like better than symbols.download, thanks for the suggestion!) .

Thanks for the update. Not critical to this patch, just wondering where we are going with this in the future and auto downloading of symbols.

@JDevlieghere JDevlieghere merged commit 74fc16a into llvm:main Feb 8, 2024
@JDevlieghere JDevlieghere deleted the lazy-lookup branch February 8, 2024 19:24
@nico
Copy link
Contributor

nico commented Feb 8, 2024

This doesn't build on Linux: http://45.33.8.238/linux/130329/step_4.txt

Please take a look and revert for now if it takes a while to fix.

(Looks like you added a definition on non apple; maybe the declaration is just still missing)

JDevlieghere added a commit that referenced this pull request Feb 8, 2024
JDevlieghere added a commit that referenced this pull request Feb 8, 2024
JDevlieghere added a commit that referenced this pull request Feb 8, 2024
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Feb 8, 2024
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.

(cherry picked from commit 5f4b40c)
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Feb 8, 2024
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.

(cherry picked from commit 5f4b40c)
jasonmolenda pushed a commit to jasonmolenda/llvm-project that referenced this pull request Feb 16, 2024
LLDB has a setting (symbols.enable-background-lookup) that calls
dsymForUUID on a background thread for images as they appear in the
current backtrace. Originally, the laziness of only looking up symbols
for images in the backtrace only existed to bring the number of
dsymForUUID calls down to a manageable number.

Users have requesting the same functionality but blocking. This gives
them the same user experience as enabling dsymForUUID globally, but
without the massive upfront cost of having to download all the images,
the majority of which they'll likely not need.

This patch renames the setting to have a more generic name
(symbols.auto-download) and changes its values from a boolean to an
enum. Users can now specify "off", "background" and "foreground". The
default remains "off" although I'll probably change that in the near
future.

(cherry picked from commit 5f4b40c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants