Skip to content

[lldb] Add ability to rate-limit progress reports #119377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 16, 2024
Merged

Conversation

labath
Copy link
Collaborator

@labath labath commented Dec 10, 2024

For high-frequency multithreaded progress reports, the contention of taking the progress mutex and the overhead of generating event can significantly slow down the operation whose progress we are reporting.

This patch adds an (optional) capability to rate-limit them. It's optional because this approach has one drawback: if the progress reporting has a pattern where it generates a burst of activity and then blocks (without reporting anything) for a large amount of time, it can appear as if less progress was made that it actually was (because we only reported the first event from the burst and dropped the other ones).

I've also made a small refactor of the Progress class to better distinguish between const (don't need protection), atomic (are used on the hot path) and other (need mutex protection) members.

@labath labath requested a review from clayborg December 10, 2024 13:34
@labath labath requested a review from JDevlieghere as a code owner December 10, 2024 13:34
@llvmbot llvmbot added the lldb label Dec 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 10, 2024

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

For high-frequency multithreaded progress reports, the contention of taking the progress mutex and the overhead of generating event can significantly slow down the operation whose progress we are reporting.

This patch adds an (optional) capability to rate-limit them. It's optional because this approach has one drawback: if the progress reporting has a pattern where it generates a burst of activity and then blocks (without reporting anything) for a large amount of time, it can appear as if less progress was made that it actually was (because we only reported the first event from the burst and dropped the other ones).

I've also made a small refactor of the Progress class to better distinguish between const (don't need protection), atomic (are used on the hot path) and other (need mutex protection) members.


Full diff: https://github.com/llvm/llvm-project/pull/119377.diff

4 Files Affected:

  • (modified) lldb/include/lldb/Core/Progress.h (+26-13)
  • (modified) lldb/source/Core/Progress.cpp (+53-30)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp (+2-1)
  • (modified) lldb/unittests/Core/ProgressReportTest.cpp (+106)
diff --git a/lldb/include/lldb/Core/Progress.h b/lldb/include/lldb/Core/Progress.h
index 421e435a9e685a..f6cea282842e1c 100644
--- a/lldb/include/lldb/Core/Progress.h
+++ b/lldb/include/lldb/Core/Progress.h
@@ -10,6 +10,7 @@
 #define LLDB_CORE_PROGRESS_H
 
 #include "lldb/Host/Alarm.h"
+#include "lldb/Utility/Timeout.h"
 #include "lldb/lldb-forward.h"
 #include "lldb/lldb-types.h"
 #include "llvm/ADT/StringMap.h"
@@ -81,7 +82,8 @@ class Progress {
   /// progress is to be reported only to specific debuggers.
   Progress(std::string title, std::string details = {},
            std::optional<uint64_t> total = std::nullopt,
-           lldb_private::Debugger *debugger = nullptr);
+           lldb_private::Debugger *debugger = nullptr,
+           Timeout<std::nano> minimum_report_time = std::nullopt);
 
   /// Destroy the progress object.
   ///
@@ -121,21 +123,32 @@ class Progress {
 private:
   void ReportProgress();
   static std::atomic<uint64_t> g_id;
-  /// More specific information about the current file being displayed in the
-  /// report.
-  std::string m_details;
-  /// How much work ([0...m_total]) that has been completed.
-  uint64_t m_completed;
+
   /// Total amount of work, use a std::nullopt in the constructor for non
   /// deterministic progress.
-  uint64_t m_total;
-  std::mutex m_mutex;
-  /// Set to true when progress has been reported where m_completed == m_total
-  /// to ensure that we don't send progress updates after progress has
-  /// completed.
-  bool m_complete = false;
+  const uint64_t m_total;
+
+  // Minimum amount of time between two progress reports.
+  const Timeout<std::nano> m_minimum_report_time;
+
   /// Data needed by the debugger to broadcast a progress event.
-  ProgressData m_progress_data;
+  const ProgressData m_progress_data;
+
+  /// How much work ([0...m_total]) that has been completed.
+  std::atomic<uint64_t> m_completed = 0;
+
+  /// Time (in nanoseconds since epoch) of the last progress report.
+  std::atomic<uint64_t> m_last_report_time_ns;
+
+  /// Guards non-const non-atomic members of the class.
+  std::mutex m_mutex;
+
+  /// More specific information about the current file being displayed in the
+  /// report.
+  std::string m_details;
+
+  /// The "completed" value of the last reported event.
+  std::optional<uint64_t> m_prev_completed;
 };
 
 /// A class used to group progress reports by category. This is done by using a
diff --git a/lldb/source/Core/Progress.cpp b/lldb/source/Core/Progress.cpp
index c9a556472c06b6..14b05a3d96ab43 100644
--- a/lldb/source/Core/Progress.cpp
+++ b/lldb/source/Core/Progress.cpp
@@ -11,7 +11,8 @@
 #include "lldb/Core/Debugger.h"
 #include "lldb/Utility/StreamString.h"
 #include "llvm/Support/Signposts.h"
-
+#include <atomic>
+#include <chrono>
 #include <cstdint>
 #include <mutex>
 #include <optional>
@@ -26,17 +27,18 @@ static llvm::ManagedStatic<llvm::SignpostEmitter> g_progress_signposts;
 
 Progress::Progress(std::string title, std::string details,
                    std::optional<uint64_t> total,
-                   lldb_private::Debugger *debugger)
-    : m_details(details), m_completed(0),
-      m_total(Progress::kNonDeterministicTotal),
+                   lldb_private::Debugger *debugger,
+                   Timeout<std::nano> minimum_report_time)
+    : m_total(total.value_or(Progress::kNonDeterministicTotal)),
+      m_minimum_report_time(minimum_report_time),
       m_progress_data{title, ++g_id,
-                      /*m_progress_data.debugger_id=*/std::nullopt} {
-  if (total)
-    m_total = *total;
-
-  if (debugger)
-    m_progress_data.debugger_id = debugger->GetID();
-
+                      debugger ? std::optional<user_id_t>(debugger->GetID())
+                               : std::nullopt},
+      m_last_report_time_ns(
+          std::chrono::nanoseconds(
+              std::chrono::steady_clock::now().time_since_epoch())
+              .count()),
+      m_details(std::move(details)) {
   std::lock_guard<std::mutex> guard(m_mutex);
   ReportProgress();
 
@@ -65,29 +67,50 @@ Progress::~Progress() {
 
 void Progress::Increment(uint64_t amount,
                          std::optional<std::string> updated_detail) {
-  if (amount > 0) {
-    std::lock_guard<std::mutex> guard(m_mutex);
-    if (updated_detail)
-      m_details = std::move(updated_detail.value());
-    // Watch out for unsigned overflow and make sure we don't increment too
-    // much and exceed the total.
-    if (m_total && (amount > (m_total - m_completed)))
-      m_completed = m_total;
-    else
-      m_completed += amount;
-    ReportProgress();
+  if (amount == 0)
+    return;
+
+  m_completed.fetch_add(amount, std::memory_order_relaxed);
+
+  if (m_minimum_report_time) {
+    using namespace std::chrono;
+
+    nanoseconds now;
+    uint64_t last_report_time_ns =
+        m_last_report_time_ns.load(std::memory_order_relaxed);
+
+    do {
+      now = steady_clock::now().time_since_epoch();
+      if (now < nanoseconds(last_report_time_ns) + *m_minimum_report_time)
+        return; // Too little time has passed since the last report.
+
+    } while (!m_last_report_time_ns.compare_exchange_weak(
+        last_report_time_ns, now.count(), std::memory_order_relaxed,
+        std::memory_order_relaxed));
   }
+
+
+  std::lock_guard<std::mutex> guard(m_mutex);
+  if (updated_detail)
+    m_details = std::move(updated_detail.value());
+  ReportProgress();
 }
 
 void Progress::ReportProgress() {
-  if (!m_complete) {
-    // Make sure we only send one notification that indicates the progress is
-    // complete
-    m_complete = m_completed == m_total;
-    Debugger::ReportProgress(m_progress_data.progress_id, m_progress_data.title,
-                             m_details, m_completed, m_total,
-                             m_progress_data.debugger_id);
-  }
+  // NB: Comparisons with optional<T> rely on the fact that std::nullopt is
+  // "smaller" than zero.
+  if (m_prev_completed >= m_total)
+    return; // We've reported completion already.
+
+  uint64_t completed =
+      std::min(m_completed.load(std::memory_order_relaxed), m_total);
+  if (completed < m_prev_completed)
+    return; // An overflow in the m_completed counter. Just ignore these events.
+
+  Debugger::ReportProgress(m_progress_data.progress_id, m_progress_data.title,
+                           m_details, completed, m_total,
+                           m_progress_data.debugger_id);
+  m_prev_completed = completed;
 }
 
 ProgressManager::ProgressManager()
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
index 5b325e30bef430..6f2c45e74132c1 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
@@ -80,7 +80,8 @@ void ManualDWARFIndex::Index() {
   // indexing the unit, and then 8 extra entries for finalizing each index set.
   const uint64_t total_progress = units_to_index.size() * 2 + 8;
   Progress progress("Manually indexing DWARF", module_desc.GetData(),
-                    total_progress);
+                    total_progress, /*debugger=*/nullptr,
+                    /*minimum_report_time=*/std::chrono::milliseconds(20));
 
   // Share one thread pool across operations to avoid the overhead of
   // recreating the threads.
diff --git a/lldb/unittests/Core/ProgressReportTest.cpp b/lldb/unittests/Core/ProgressReportTest.cpp
index 0149b1de77add7..e99b1b84a2e3e6 100644
--- a/lldb/unittests/Core/ProgressReportTest.cpp
+++ b/lldb/unittests/Core/ProgressReportTest.cpp
@@ -18,6 +18,7 @@
 #include "gtest/gtest.h"
 #include <memory>
 #include <mutex>
+#include <thread>
 
 using namespace lldb;
 using namespace lldb_private;
@@ -208,6 +209,111 @@ TEST_F(ProgressReportTest, TestReportDestructionWithPartialProgress) {
   EXPECT_EQ(data->GetMessage(), "Infinite progress: Report 2");
 }
 
+TEST_F(ProgressReportTest, TestFiniteOverflow) {
+  ListenerSP listener_sp = CreateListenerFor(lldb::eBroadcastBitProgress);
+  EventSP event_sp;
+  const ProgressEventData *data;
+
+  // Increment the report beyond its limit and make sure we only get one
+  // completed event.
+  {
+    Progress progress("Finite progress", "Report 1", 10);
+    progress.Increment(11);
+    progress.Increment(47);
+  }
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_TRUE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 0);
+  EXPECT_EQ(data->GetTotal(), 10);
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_TRUE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 10);
+  EXPECT_EQ(data->GetTotal(), 10);
+
+  ASSERT_FALSE(listener_sp->GetEvent(event_sp, TIMEOUT));
+}
+
+TEST_F(ProgressReportTest, TestNonDeterministicOverflow) {
+  ListenerSP listener_sp = CreateListenerFor(lldb::eBroadcastBitProgress);
+  EventSP event_sp;
+  const ProgressEventData *data;
+  constexpr uint64_t max_minus_1 = std::numeric_limits<uint64_t>::max() - 1;
+
+  // Increment the report beyond its limit and make sure we only get one
+  // completed event. The event which overflows the counter should be ignored.
+  {
+    Progress progress("Finite progress", "Report 1");
+    progress.Increment(max_minus_1);
+    progress.Increment(max_minus_1);
+  }
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_FALSE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 0);
+  EXPECT_EQ(data->GetTotal(), Progress::kNonDeterministicTotal);
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_FALSE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), max_minus_1);
+  EXPECT_EQ(data->GetTotal(), Progress::kNonDeterministicTotal);
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_FALSE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), Progress::kNonDeterministicTotal);
+  EXPECT_EQ(data->GetTotal(), Progress::kNonDeterministicTotal);
+
+  ASSERT_FALSE(listener_sp->GetEvent(event_sp, TIMEOUT));
+}
+
+TEST_F(ProgressReportTest, TestMinimumReportTime) {
+  ListenerSP listener_sp = CreateListenerFor(lldb::eBroadcastBitProgress);
+  EventSP event_sp;
+  const ProgressEventData *data;
+
+  {
+    Progress progress("Finite progress", "Report 1", /*total=*/20,
+                      m_debugger_sp.get(),
+                      /*minimum_report_time=*/std::chrono::seconds(1));
+    // Send 10 events in quick succession. These should not generate any events.
+    for (int i = 0; i < 10; ++i)
+      progress.Increment();
+
+    // Sleep, then send 10 more. This should generate one event for the first
+    // increment, and then another for completion.
+    std::this_thread::sleep_for(std::chrono::seconds(1));
+    for (int i = 0; i < 10; ++i)
+      progress.Increment();
+  }
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_TRUE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 0);
+  EXPECT_EQ(data->GetTotal(), 20);
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_TRUE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 11);
+  EXPECT_EQ(data->GetTotal(), 20);
+
+  ASSERT_TRUE(listener_sp->GetEvent(event_sp, TIMEOUT));
+  data = ProgressEventData::GetEventDataFromEvent(event_sp.get());
+  EXPECT_TRUE(data->IsFinite());
+  EXPECT_EQ(data->GetCompleted(), 20);
+  EXPECT_EQ(data->GetTotal(), 20);
+
+  ASSERT_FALSE(listener_sp->GetEvent(event_sp, TIMEOUT));
+}
+
+
 TEST_F(ProgressReportTest, TestProgressManager) {
   ListenerSP listener_sp =
       CreateListenerFor(lldb::eBroadcastBitProgressCategory);

Copy link

github-actions bot commented Dec 10, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

For high-frequency multithreaded progress reports, the contention of
taking the progress mutex and the overhead of generating event can
significantly slow down the operation whose progress we are reporting.

This patch adds an (optional) capability to rate-limit them. It's
optional because this approach has one drawback: if the progress
reporting has a pattern where it generates a burst of activity and then
blocks (without reporting anything) for a large amount of time, it can
appear as if less progress was made that it actually was (because we
only reported the first event from the burst and dropped the other
ones).

I've also made a small refactor of the Progress class to better
distinguish between const (don't need protection), atomic (are used on
the hot path) and other (need mutex protection) members.
Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this!

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the cleanup and the implementation turned out simpler than I what I expected. LGTM.

// Increment the report beyond its limit and make sure we only get one
// completed event. The event which overflows the counter should be ignored.
{
Progress progress("Finite progress", "Report 1");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super tiny nit: should this be "non-deterministic progress" or something of the sort since we don't have a total set here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should. Thanks for catching that.

Copy link
Contributor

@chelcassanova chelcassanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@medismailben medismailben left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM!

@labath labath merged commit 0dbdc23 into llvm:main Dec 16, 2024
7 checks passed
@labath labath deleted the atomic branch December 16, 2024 10:35
JDevlieghere pushed a commit to swiftlang/llvm-project that referenced this pull request Jan 27, 2025
For high-frequency multithreaded progress reports, the contention of
taking the progress mutex and the overhead of generating event can
significantly slow down the operation whose progress we are reporting.

This patch adds an (optional) capability to rate-limit them. It's
optional because this approach has one drawback: if the progress
reporting has a pattern where it generates a burst of activity and then
blocks (without reporting anything) for a large amount of time, it can
appear as if less progress was made that it actually was (because we
only reported the first event from the burst and dropped the other
ones).

I've also made a small refactor of the Progress class to better
distinguish between const (don't need protection), atomic (are used on
the hot path) and other (need mutex protection) members.

(cherry picked from commit 0dbdc23)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants