Skip to content

[lldb][progress] Add discrete boolean flag to progress reports #69516

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

chelcassanova
Copy link
Contributor

This commit adds a boolean flag is_discrete is to progress reports in LLDB. The flag is set to false by default and indicates if a progress event is discrete, i.e. an operation that has no clear start and end and can happen multiple times during the course of a debug session. Operations that happen in this manner will report multiple individual progress events as they happen, so this flag gives the functionality to group multiple progress events so they can be reported in a less haphazard manner.

@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2023

@llvm/pr-subscribers-lldb

Author: Chelsea Cassanova (chelcassanova)

Changes

This commit adds a boolean flag is_discrete is to progress reports in LLDB. The flag is set to false by default and indicates if a progress event is discrete, i.e. an operation that has no clear start and end and can happen multiple times during the course of a debug session. Operations that happen in this manner will report multiple individual progress events as they happen, so this flag gives the functionality to group multiple progress events so they can be reported in a less haphazard manner.


Full diff: https://github.com/llvm/llvm-project/pull/69516.diff

11 Files Affected:

  • (modified) lldb/include/lldb/Core/Debugger.h (+1-1)
  • (modified) lldb/include/lldb/Core/DebuggerEvents.h (+5-2)
  • (modified) lldb/include/lldb/Core/Progress.h (+9-1)
  • (modified) lldb/source/Core/Debugger.cpp (+9-7)
  • (modified) lldb/source/Core/DebuggerEvents.cpp (+1)
  • (modified) lldb/source/Core/Progress.cpp (+2-2)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+2-2)
  • (modified) lldb/source/Symbol/LocateSymbolFile.cpp (+1-1)
  • (modified) lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py (+7-4)
diff --git a/lldb/include/lldb/Core/Debugger.h b/lldb/include/lldb/Core/Debugger.h
index 5532cace606bfed..8e21502dac6dee2 100644
--- a/lldb/include/lldb/Core/Debugger.h
+++ b/lldb/include/lldb/Core/Debugger.h
@@ -618,7 +618,7 @@ class Debugger : public std::enable_shared_from_this<Debugger>,
   static void ReportProgress(uint64_t progress_id, std::string title,
                              std::string details, uint64_t completed,
                              uint64_t total,
-                             std::optional<lldb::user_id_t> debugger_id);
+                             std::optional<lldb::user_id_t> debugger_id, bool is_discrete);
 
   static void ReportDiagnosticImpl(DiagnosticEventData::Type type,
                                    std::string message,
diff --git a/lldb/include/lldb/Core/DebuggerEvents.h b/lldb/include/lldb/Core/DebuggerEvents.h
index 982b22229701f89..88455d8d60bb488 100644
--- a/lldb/include/lldb/Core/DebuggerEvents.h
+++ b/lldb/include/lldb/Core/DebuggerEvents.h
@@ -21,10 +21,11 @@ class Stream;
 class ProgressEventData : public EventData {
 public:
   ProgressEventData(uint64_t progress_id, std::string title, std::string update,
-                    uint64_t completed, uint64_t total, bool debugger_specific)
+                    uint64_t completed, uint64_t total, bool debugger_specific,
+                    bool is_discrete)
       : m_title(std::move(title)), m_details(std::move(update)),
         m_id(progress_id), m_completed(completed), m_total(total),
-        m_debugger_specific(debugger_specific) {}
+        m_debugger_specific(debugger_specific), m_is_discrete(is_discrete) {}
 
   static llvm::StringRef GetFlavorString();
 
@@ -52,6 +53,7 @@ class ProgressEventData : public EventData {
   const std::string &GetTitle() const { return m_title; }
   const std::string &GetDetails() const { return m_details; }
   bool IsDebuggerSpecific() const { return m_debugger_specific; }
+  bool IsDiscrete() const { return m_is_discrete; }
 
 private:
   /// The title of this progress event. The value is expected to remain stable
@@ -68,6 +70,7 @@ class ProgressEventData : public EventData {
   uint64_t m_completed;
   const uint64_t m_total;
   const bool m_debugger_specific;
+  const bool m_is_discrete;
   ProgressEventData(const ProgressEventData &) = delete;
   const ProgressEventData &operator=(const ProgressEventData &) = delete;
 };
diff --git a/lldb/include/lldb/Core/Progress.h b/lldb/include/lldb/Core/Progress.h
index b2b8781a43b0591..a48255fc88cf69b 100644
--- a/lldb/include/lldb/Core/Progress.h
+++ b/lldb/include/lldb/Core/Progress.h
@@ -69,8 +69,13 @@ class Progress {
   ///
   /// @param [in] debugger An optional debugger pointer to specify that this
   /// progress is to be reported only to specific debuggers.
+  ///
+  /// @param [in] is_discrete Boolean indicating whether or not
+  /// this progress report will happen once during a debug session or multiple
+  /// times as individual progress reports.
   Progress(std::string title, uint64_t total = UINT64_MAX,
-           lldb_private::Debugger *debugger = nullptr);
+           lldb_private::Debugger *debugger = nullptr,
+           bool is_discrete = false);
 
   /// Destroy the progress object.
   ///
@@ -110,6 +115,9 @@ class Progress {
   /// to ensure that we don't send progress updates after progress has
   /// completed.
   bool m_complete = false;
+  /// Set to true if the progress event is discrete; meaning it will happen
+  /// multiple times during a debug session as individual progress events
+  bool m_is_discrete = false;
 };
 
 } // namespace lldb_private
diff --git a/lldb/source/Core/Debugger.cpp b/lldb/source/Core/Debugger.cpp
index 7ec1efc64fe9383..027b01cb2297e31 100644
--- a/lldb/source/Core/Debugger.cpp
+++ b/lldb/source/Core/Debugger.cpp
@@ -1402,22 +1402,23 @@ void Debugger::SetDestroyCallback(
 static void PrivateReportProgress(Debugger &debugger, uint64_t progress_id,
                                   std::string title, std::string details,
                                   uint64_t completed, uint64_t total,
-                                  bool is_debugger_specific) {
+                                  bool is_debugger_specific, bool is_discrete) {
   // Only deliver progress events if we have any progress listeners.
   const uint32_t event_type = Debugger::eBroadcastBitProgress;
   if (!debugger.GetBroadcaster().EventTypeHasListeners(event_type))
     return;
   EventSP event_sp(new Event(
-      event_type,
-      new ProgressEventData(progress_id, std::move(title), std::move(details),
-                            completed, total, is_debugger_specific)));
+      event_type, new ProgressEventData(progress_id, std::move(title),
+                                        std::move(details), completed, total,
+                                        is_debugger_specific, is_discrete)));
   debugger.GetBroadcaster().BroadcastEvent(event_sp);
 }
 
 void Debugger::ReportProgress(uint64_t progress_id, std::string title,
                               std::string details, uint64_t completed,
                               uint64_t total,
-                              std::optional<lldb::user_id_t> debugger_id) {
+                              std::optional<lldb::user_id_t> debugger_id,
+                              bool is_discrete) {
   // Check if this progress is for a specific debugger.
   if (debugger_id) {
     // It is debugger specific, grab it and deliver the event if the debugger
@@ -1426,7 +1427,8 @@ void Debugger::ReportProgress(uint64_t progress_id, std::string title,
     if (debugger_sp)
       PrivateReportProgress(*debugger_sp, progress_id, std::move(title),
                             std::move(details), completed, total,
-                            /*is_debugger_specific*/ true);
+                            /*is_debugger_specific*/ true,
+                            /*is_discrete*/ is_discrete);
     return;
   }
   // The progress event is not debugger specific, iterate over all debuggers
@@ -1436,7 +1438,7 @@ void Debugger::ReportProgress(uint64_t progress_id, std::string title,
     DebuggerList::iterator pos, end = g_debugger_list_ptr->end();
     for (pos = g_debugger_list_ptr->begin(); pos != end; ++pos)
       PrivateReportProgress(*(*pos), progress_id, title, details, completed,
-                            total, /*is_debugger_specific*/ false);
+                            total, /*is_debugger_specific*/ false, is_discrete);
   }
 }
 
diff --git a/lldb/source/Core/DebuggerEvents.cpp b/lldb/source/Core/DebuggerEvents.cpp
index dd77fff349a64a7..6720d84131bfc39 100644
--- a/lldb/source/Core/DebuggerEvents.cpp
+++ b/lldb/source/Core/DebuggerEvents.cpp
@@ -67,6 +67,7 @@ ProgressEventData::GetAsStructuredData(const Event *event_ptr) {
   dictionary_sp->AddIntegerItem("total", progress_data->GetTotal());
   dictionary_sp->AddBooleanItem("debugger_specific",
                                 progress_data->IsDebuggerSpecific());
+  dictionary_sp->AddBooleanItem("is_discrete", progress_data->IsDiscrete());
 
   return dictionary_sp;
 }
diff --git a/lldb/source/Core/Progress.cpp b/lldb/source/Core/Progress.cpp
index 08be73f1470f349..543933af4c53fdd 100644
--- a/lldb/source/Core/Progress.cpp
+++ b/lldb/source/Core/Progress.cpp
@@ -17,7 +17,7 @@ using namespace lldb_private;
 std::atomic<uint64_t> Progress::g_id(0);
 
 Progress::Progress(std::string title, uint64_t total,
-                   lldb_private::Debugger *debugger)
+                   lldb_private::Debugger *debugger, bool is_discrete)
     : m_title(title), m_id(++g_id), m_completed(0), m_total(total) {
   assert(total > 0);
   if (debugger)
@@ -55,6 +55,6 @@ void Progress::ReportProgress(std::string update) {
     // complete.
     m_complete = m_completed == m_total;
     Debugger::ReportProgress(m_id, m_title, std::move(update), m_completed,
-                             m_total, m_debugger_id);
+                             m_total, m_debugger_id, m_is_discrete);
   }
 }
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index 3e52c9e3c042811..f50a3be75a7f79b 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -2219,7 +2219,7 @@ void ObjectFileMachO::ParseSymtab(Symtab &symtab) {
   const FileSpec &file = m_file ? m_file : module_sp->GetFileSpec();
   const char *file_name = file.GetFilename().AsCString("<Unknown>");
   LLDB_SCOPED_TIMERF("ObjectFileMachO::ParseSymtab () module = %s", file_name);
-  Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name));
+  Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name), UINT64_MAX, nullptr, true);
 
   llvm::MachO::symtab_command symtab_load_command = {0, 0, 0, 0, 0, 0};
   llvm::MachO::linkedit_data_command function_starts_load_command = {0, 0, 0, 0};
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
index 57b962ff60df03d..96c0cb1e69c7562 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
@@ -76,7 +76,7 @@ void ManualDWARFIndex::Index() {
   const uint64_t total_progress = units_to_index.size() * 2 + 8;
   Progress progress(
       llvm::formatv("Manually indexing DWARF for {0}", module_desc.GetData()),
-      total_progress);
+      total_progress, nullptr, true);
 
   std::vector<IndexSet> sets(units_to_index.size());
 
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 02037e7403cd9ad..c4db5f1d1ee5b11 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -457,7 +457,7 @@ void SymbolFileDWARF::InitializeObject() {
     if (apple_names.GetByteSize() > 0 || apple_namespaces.GetByteSize() > 0 ||
         apple_types.GetByteSize() > 0 || apple_objc.GetByteSize() > 0) {
       Progress progress(llvm::formatv("Loading Apple DWARF index for {0}",
-                                      module_desc.GetData()));
+                                      module_desc.GetData()), UINT64_MAX, nullptr, true);
       m_index = AppleDWARFIndex::Create(
           *GetObjectFile()->GetModule(), apple_names, apple_namespaces,
           apple_types, apple_objc, m_context.getOrLoadStrData());
@@ -470,7 +470,7 @@ void SymbolFileDWARF::InitializeObject() {
     LoadSectionData(eSectionTypeDWARFDebugNames, debug_names);
     if (debug_names.GetByteSize() > 0) {
       Progress progress(
-          llvm::formatv("Loading DWARF5 index for {0}", module_desc.GetData()));
+        llvm::formatv("Loading DWARF5 index for {0}", module_desc.GetData()), UINT64_MAX, nullptr, true);
       llvm::Expected<std::unique_ptr<DebugNamesDWARFIndex>> index_or =
           DebugNamesDWARFIndex::Create(*GetObjectFile()->GetModule(),
                                        debug_names,
diff --git a/lldb/source/Symbol/LocateSymbolFile.cpp b/lldb/source/Symbol/LocateSymbolFile.cpp
index 66ee7589ac60499..ce1149370539a6e 100644
--- a/lldb/source/Symbol/LocateSymbolFile.cpp
+++ b/lldb/source/Symbol/LocateSymbolFile.cpp
@@ -269,7 +269,7 @@ Symbols::LocateExecutableSymbolFile(const ModuleSpec &module_spec,
 
   Progress progress(llvm::formatv(
       "Locating external symbol file for {0}",
-      module_spec.GetFileSpec().GetFilename().AsCString("<Unknown>")));
+      module_spec.GetFileSpec().GetFilename().AsCString("<Unknown>")), UINT64_MAX, nullptr, true);
 
   FileSpecList debug_file_search_paths = default_search_paths;
 
diff --git a/lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py b/lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py
index 0e72770e350366d..b71984772bb1fd1 100644
--- a/lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py
+++ b/lldb/test/API/functionalities/progress_reporting/TestProgressReporting.py
@@ -25,8 +25,8 @@ def test_dwarf_symbol_loading_progress_report(self):
         event = lldbutil.fetch_next_event(self, self.listener, self.broadcaster)
         ret_args = lldb.SBDebugger.GetProgressFromEvent(event)
         self.assertGreater(len(ret_args), 0)
-        message = ret_args[0]
-        self.assertGreater(len(message), 0)
+        title = ret_args[0]
+        self.assertGreater(len(title), 0)
 
     def test_dwarf_symbol_loading_progress_report_structured_data(self):
         """Test that we are able to fetch dwarf symbol loading progress events
@@ -37,5 +37,8 @@ def test_dwarf_symbol_loading_progress_report_structured_data(self):
 
         event = lldbutil.fetch_next_event(self, self.listener, self.broadcaster)
         progress_data = lldb.SBDebugger.GetProgressDataFromEvent(event)
-        message = progress_data.GetValueForKey("message").GetStringValue(100)
-        self.assertGreater(len(message), 0)
+        title = progress_data.GetValueForKey("title").GetStringValue(100)
+        self.assertGreater(len(title), 0)
+
+        is_discrete = progress_data.GetValueForKey("is_discrete")
+        self.assertTrue(is_discrete, "is_discrete should be true")

@@ -1426,7 +1427,8 @@ void Debugger::ReportProgress(uint64_t progress_id, std::string title,
if (debugger_sp)
PrivateReportProgress(*debugger_sp, progress_id, std::move(title),
std::move(details), completed, total,
/*is_debugger_specific*/ true);
/*is_debugger_specific*/ true,
/*is_discrete*/ is_discrete);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: no need for the comment here, since the argument is not a boolean literal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove it, left in there by mistake.

@@ -2219,7 +2219,7 @@ void ObjectFileMachO::ParseSymtab(Symtab &symtab) {
const FileSpec &file = m_file ? m_file : module_sp->GetFileSpec();
const char *file_name = file.GetFilename().AsCString("<Unknown>");
LLDB_SCOPED_TIMERF("ObjectFileMachO::ParseSymtab () module = %s", file_name);
Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name));
Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name), UINT64_MAX, nullptr, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unfortunate that you have to specify all the other default parameters to set is_discret to true ... may be it would be worth to make the first default argument so you can omit the 2 others default arguments ?

Copy link
Contributor Author

@chelcassanova chelcassanova Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that would make sense since the first argument right now is the completion amount which not as many progress reports use.

@@ -25,8 +25,8 @@ def test_dwarf_symbol_loading_progress_report(self):
event = lldbutil.fetch_next_event(self, self.listener, self.broadcaster)
ret_args = lldb.SBDebugger.GetProgressFromEvent(event)
self.assertGreater(len(ret_args), 0)
message = ret_args[0]
self.assertGreater(len(message), 0)
title = ret_args[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the point of this change ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In older versions of Progress the message that got displayed to the user was called message but it was changed to title when progress reports got updated to have details as a field, so I changed the name in the test to match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, my point was that this change is unrelated to this PR and also NFC, which basically means we're re-writing the git history for no good reason. We try to avoid that in general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, I'll remove this change then.

Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine but it doesn't look like is_discrete is being propagated correctly everywhere?

@@ -16,7 +16,7 @@ using namespace lldb_private;

std::atomic<uint64_t> Progress::g_id(0);

Progress::Progress(std::string title, uint64_t total,
Progress::Progress(std::string title, bool is_discrete, uint64_t total,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You added this parameter but it doesn't look like it's used anywhere in Progress? Was this supposed to go to m_is_discrete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it was, left it out by mistake.

self.assertGreater(len(title), 0)

is_discrete = progress_data.GetValueForKey("is_discrete")
self.assertTrue(is_discrete, "is_discrete should be true")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion message isn't very helpful or actionable. I would probably do something like:

Progress event data is missing `is_discrete` key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you could check if the dictionary has the key is_discrete AND if its value is true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll update the test to add this.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking more about this last night, I'm not sure the word "discrete" is the right word for what this describes. It's not a bad word per se, but if a ProgressEvent isn't discrete, it's not really "continuous" either (or any of the other antonyms of discrete that I can think of). It seems like the dichotomy is less "discrete vs continuous" and more like "one vs many".

That being said, what do you think of the name is_aggregate?

@chelcassanova
Copy link
Contributor Author

"Aggregate" can work. These are moreso status updates than than they are progress reports and I need to work on the documentation in the code that explains what I'm trying to accomplish.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already discussed this offline with Chelsea, but yeah I agree that "discrete" isn't very informative. Regardless of the name we pick, we should include an example too to help clarify its meaning. I'd also prefer to use an enum rather than a boolean for this, so that we don't have naked true/false all over the place or avoid inline comments.

Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with comments.

Copy link

github-actions bot commented Nov 29, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@chelcassanova chelcassanova force-pushed the progress_reports branch 2 times, most recently from 004e0e6 to 83ca6c8 Compare November 30, 2023 19:31
Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There looks to be a ton of unrelated changes in ObjectFileMachO.cpp. Mostly formatting from what I can tell. Please revert to preserve blame history.

@chelcassanova chelcassanova force-pushed the progress_reports branch 2 times, most recently from 417d99f to 028d5b9 Compare December 1, 2023 21:19
@chelcassanova
Copy link
Contributor Author

@clayborg Just pinging on this PR (when you have the chance) :)

@clayborg
Copy link
Collaborator

clayborg commented Dec 5, 2023

So we have a function in SBDebugger:

  static const char *GetProgressFromEvent(const lldb::SBEvent &event,
                                          uint64_t &progress_id,
                                          uint64_t &completed, uint64_t &total,
                                          bool &is_debugger_specific);

Do we need an overload to allow access to the Progress::ProgressReportType? If so, then we need to move the Progress::ProgressReportType enum into lldb-enumerations. In our public API we can never remove or change the arguments in a function call, so we would need to add another overload if we want this.

We also have:

  static lldb::SBStructuredData lldb::SBDebugger::GetProgressDataFromEvent(const lldb::SBEvent &event);

I am guessing that the report type shows up in here?

Comment on lines +59 to +62
enum class ProgressReportType {
eAggregateProgressReport,
eNonAggregateProgressReport
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need to belong in lldb-enumerations if we want SBDebugger::GetProgressFromEvent(...) to have an overload where this is supplied.

@chelcassanova
Copy link
Contributor Author

The report type has been added as a key in the dictionary that gets returned from SBDebugger::GetProgressDataFromEvent (specifically it's the key is_aggregate) and that key is added in DebuggerEvents.cpp. To my understanding since we can get the report type from GetProgressDataFromEvent we shouldn't have to overload GetProgressFromEvent.

@@ -63,13 +68,30 @@ class Progress {
///
/// @param [in] title The title of this progress activity.
///
/// @param [in] total The total units of work to be done if specified, if
/// set to UINT64_MAX then an indeterminate progress indicator should be
/// @param [in] report_type Enum value indicating how the progress is being
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally these start with @param [in] variable_name, and then on the next line the description starts. I see @clayborg wrote the current header in this style, maybe it was a clang-format artifact? Or possibly the preferred style has shifted while I wasn't paying attention.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've clang-formatted these files and that didn't change this comment style so I guess clang-format is just going with was already here, strange 🤔

/// reported. Progress reports considered "aggregate" are reports done for
/// operations that may happen multiple times during a debug session.
///
/// For example, when a debug session is first started it needs to parse the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the case that I attach to a running process and I find 100 binaries loaded in the process. I create Module/ObjectFile's for each binary, including parsing the symbol table, possibly looking for a dSYM or a dwp symbol file with debug information for each of those. If dSYMs are an example of additional binaries that may need to be scanned (and report progress as they're being scanned), is it sufficient to simply mention how each of the hundred binaries in the process will report a status update as its Module is created, and not mention dSYMs specifically?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd want to use this enum to indicate that parsing symbol tables and loading DWARF indices. I used the term dSYM in this example which makes it look like this is a macOS specific issue when it's more generic so I'll change this, thanks for noticing!

@@ -97,12 +119,17 @@ class Progress {
/// The title of the progress activity.
std::string m_title;
std::mutex m_mutex;
/// Set to eNonAggregateProgressReport if the progress event is aggregate;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this comment. Set to eNonAggregateProgressReport if the progress report is aggregate?

Copy link
Contributor Author

@chelcassanova chelcassanova Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mistake by me 😓 , it should say eAggregateProgressReport

/// Enum that indicates the type of progress report
enum class ProgressReportType {
eAggregateProgressReport,
eNonAggregateProgressReport
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it might be a little easier to intuit if the enums were eAggregateProgressReport and eIndividualProgressReport? Then someone might look at this and say "oh I have a single thing I'm providing progress on, I probably want eIndividualProgressReport". I don't feel strongly that I'm correct here, just throwing it out.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe eSolitaryProgressReport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eUmbrellaProgressReport may work here since that's what's happening when we use a single progress report that's being incrementally updated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eUmbrellaProgressReport may work here since that's what's happening when we use a single progress report that's being incrementally updated.

What is happening multiple times a session? We only index the DWARF one time where we have N DWARF units that we index, but that doesn't happen at all for dSYM files. Symbol tables are parsed only once, but we do parse the symbol tables for each library. What is happening multiple times and how are we going to be able to group these?

Would it be better to use progress category as a string instead of a bool? I will make some inline comments to clarify this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my understanding something like symbol table parsing can happen multiple times per session if new symbol tables need to be parsed due to new libraries being added. We want to send the progress events to the IDE team who can then filter the events based on this enum.

This commit adds a boolean flag `is_discrete` is to progress reports in
LLDB. The flag is set to false by default and indicates if a progress event is discrete, i.e. an
operation that has no clear start and end and can happen multiple times
during the course of a debug session. Operations that happen
in this manner will report multiple individual progress events as they
happen, so this flag gives the functionality to group multiple progress
events so they can be reported in a less haphazard manner.
Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am just trying to understand how this will be used. Because right now we are grouping all symbol table and debug info indexing into a single Progress::ProgressReportType::eAggregateProgressReport category and I don't see how this helps a UI show something more useful

Comment on lines +2228 to +2229
Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name),
Progress::ProgressReportType::eAggregateProgressReport);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a progress category as a string instead of a report type? Then this would be easy to group all of symbol table parsing into one section.

std::string category("Symbol table parsing");
Progress progress(llvm::formatv("Parsing symbol table for {0}", file_name), category);

I used a local variable for the category for clarity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the functionality to add details to a progress report was added in LLDB we wanted to group operations under large progress reports and then incrementally update them but we found that there was no easy place to do that for symbol table and debug info operations so for now, we want to us a flag to indicate to a UI how these operations are reported.

We did find that it was possible to group Swift-based operations into their own large progress reports (swiftlang#7769) and so we'd want to have these operations be the "non-aggregate/umbrella" operations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a string to represent the category (one of a possible vector of strings for each operation even?) looks like a good idea though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a string to represent the category (one of a possible vector of strings for each operation even?) looks like a good idea though.

Seems to me that a category string that could be anything ("Parsing symbol tables" or "Indexing DWARF" or "Parsing DWARF") would allow us to group similar notifications together in an IDE better than a single ProgressReportType::eAggregateProgressReport vs ProgressReportType::eNonAggregateProgressReport enum value. With this solution, does this mean that all notifications that with is_aggregate == true would show up in one area and is_aggregate == false would show up another area? If we use a category string, any concurrent notifications that share the same category could show up under some UI that displayed the category string and allowed it to be expanded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention is that if is_aggregate == false then the progress report type can be grouped together in LLDB under one large umbrella progress report for that operation whereas if is_aggregate == true then the IDE could potentially filter out those progress reports since each progress report happening individually means that they look spammy in an IDE.

Instead of using a string as a category field we could have each progress report type/category be an enum in lldb_enumerations.h i.e. ProgressReportType/Category == eProgressSymbolTableParse or eProgressDWARFIndexing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we looked at how to group progress notifications by is_aggregate we found that although the symbol table/DWARF progress notifications would be grouped under is_aggregate == true, long-running operations related to Swift (importing/loading Swift modules etc.) could actually be grouped under is_aggregate == false which allows the IDE team to display and group those notifications. (swiftlang#7769). We want to integrate this enum flag into the Swift progress reports once landed.

That being said, this does group long-running LLDB notifications as all being too spammy to show since we couldn't group them together under umbrella progress reports. We don't want to keep it this way forever as ultimately we'd like to group all categories of progress reports under their own umbrella Progress objects instead of sending several individual progress reports, but this is what we're doing right now to help the IDE have more streamlined UI progress reports.

We also thought about going with the timeout approach that you suggested but ran into similar issues as you mentioned wherein if many reports were too short then they wouldn't get displayed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we do want all of the reports to be sent, this seems like a GUI / display issue that can be solved with category names. For you guys the DWARF isn't a big contributor and we want to always see those since on linux a lot of DWARF is not indexed and that is a major contributor to our startup times. Symbol table parsing takes time on mac and linux as well. So everyone is going to classify what progress is interesting in their own way. With a single enum for what Apple users might consider "aggregate" doesn't match up with what Linux or other users consider aggregate and we are now going to be hard coded this into the progress objectsd and that is what is worrying me about the current approach.

The user interface is what needs to figure out how to display these things efficiently and the category string seems to allow user interfaces to do more grouping in more efficient ways. The boolean of is_aggregate is a bit too coarse. With the category strings like "Parsing symbol tables" or "Manually Indexing DWARF" or "Loading DWARF accelerator tables", it would allow spammy stuff to be noticed on the, and then instead of displaying individual entries, it would display a single entry with the category name and a spinning progress GUI. Of course you could hard code things based on the category name all the time in Xcode so that "Parsing symbol tables" and "Loading DWARF accelerator tables" would always be grouped if desired to keep the noise down.

If no one else agrees with me here, please let me know

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clayborg I understand your point but how would a category string be different from filtering out the progress events based on the progress title string (or parts of it) ?

I see how certain progress reports could be relevant to linux where we wouldn't want to surface it on macOS. The reason we came up with this approach is precisely because we prefer having LLDB tell its clients what progress should be displayed, rather than having the filtering logic be done in the client.

May we could keep a list of blacklisted progress title / categories for every platform and either submit progress reports if they're not in that list or have an SBAPI to fetch the list of blacklisted progresses from the client ? What do you think about this ?

(also cc @JDevlieghere since he will probably have some opinions about this).

Copy link
Collaborator

@clayborg clayborg Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clayborg I understand your point but how would a category string be different from filtering out the progress events based on the progress title string (or parts of it) ?

It would stop clients from having to add logic to try and look for common substrings, but other than that not much different. It would allow progress titles to vary quite a bit more, but they don't right now.

I see how certain progress reports could be relevant to linux where we wouldn't want to surface it on macOS. The reason we came up with this approach is precisely because we prefer having LLDB tell its clients what progress should be displayed, rather than having the filtering logic be done in the client.

Not sure I follow? Are you saying if is_aggregate == true that we are letting users know they should ignore these progress reports? And we are setting to true for all progress reports except for Swift and a few other rarely used ones?

May we could keep a list of blacklisted progress title / categories for every platform and either submit progress reports if they're not in that list or have an SBAPI to fetch the list of blacklisted progresses from the client ? What do you think about this ?

One ideas is if we have categories, we could have settings that could control if the progress reports for a given category should be displayed. Those could easily be set after initializing a debugger object.

(also cc @JDevlieghere since he will probably have some opinions about this).

Sounds good

The entire idea of the progress dialogs is to provide insight for when the debugger seems to be stalled, but it is actually doing work. If we have any progress items that are not useful, we should remove them. But parsing symbol tables, even on Mac, can sometimes take a few seconds on larger binaries, so I find this useful when running LLDB from the command line. Indexing DWARF is a huge time sync for really large projects which often can take many minutes to complete and when I load them on the command line, seeing the progress in the terminal is very useful.

So a few things that seem to stand out to me:

  • Darwin will never see the Manually indexing DWARF for <path> progress dialogs, and these are needed and should always be displayed.
  • Darwin does see the Parsing symbol table for <path> progress dialogs, but many are fast and probably are part of the problem you want to fix?
  • Loading Apple DWARF index for <path> has no real work being done here, just initializing a memory buffer, seems like it should probably be removed to cut down on the noise?
  • Loading DWARF5 index for <path> is the same kind of thing for both Darwin and linux (no real work to be done, the accelerator tables are "ready to use") so maybe we should remove them?
  • Locating external symbol file for <path> seems useful if the download can take a while

I don't see any other uses in the upstream LLDB sources.

But my main point is we make Progress dialogs for a reason and we should make sure they are useful and that everyone wants them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if we want to allow different platforms to say what is useful and what isn't we can provide commands to control these if needed. We can have an internal enumeration for the categories if needed, or we can add a string as the category to help say "please disable all symbol table parsing" progress dialogs.

@chelcassanova
Copy link
Contributor Author

It looks we have kind of an impasse here, I've posted an RFC to try and clear up any misunderstandings and get some input about how we can move forward with this: https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717

Looking forward to everyone's feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants