-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb][progress][NFC] Add groundwork to keep track of progress reports #81026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lldb Author: Chelsea Cassanova (chelcassanova) ChangesAs part of the effort to improve progress reporting in LLDB (https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717) we want a way to keep track of progress reports to see if they're ongoing. The ultimate goal is to use this information to essentially keep these reports alive using timeouts to more gracefully deliver these reports. To lay the groundwork to do this, this commit adds a map to progress reports that keeps track of each category of reports (using the changes from #77547) and uses a refcount to check if these reports are still considered active or not. A refcount of at least one indicates that the report is still active. I've added an enum to Full diff: https://github.com/llvm/llvm-project/pull/81026.diff 2 Files Affected:
diff --git a/lldb/include/lldb/Core/Progress.h b/lldb/include/lldb/Core/Progress.h
index 5d882910246054..fc921482ee12e0 100644
--- a/lldb/include/lldb/Core/Progress.h
+++ b/lldb/include/lldb/Core/Progress.h
@@ -12,6 +12,7 @@
#include "lldb/Utility/ConstString.h"
#include "lldb/lldb-types.h"
#include <atomic>
+#include <map>
#include <mutex>
#include <optional>
@@ -55,6 +56,10 @@ namespace lldb_private {
class Progress {
public:
+ enum {
+ eProgressLinearReports,
+ eProgressCoalecseReports,
+ };
/// Construct a progress object that will report information.
///
/// The constructor will create a unique progress reporting object and
@@ -71,7 +76,7 @@ class Progress {
/// progress is to be reported only to specific debuggers.
Progress(std::string title, std::string details = {},
std::optional<uint64_t> total = std::nullopt,
- lldb_private::Debugger *debugger = nullptr);
+ lldb_private::Debugger *debugger = nullptr, bool type = Progress::eProgressLinearReports);
/// Destroy the progress object.
///
@@ -99,6 +104,9 @@ class Progress {
private:
void ReportProgress();
static std::atomic<uint64_t> g_id;
+ static std::atomic<uint64_t> g_refcount;
+ /// Map that tracks each progress object and if we've seen its start and stop events
+ static std::unordered_map<std::string, uint64_t> g_map;
/// The title of the progress activity.
std::string m_title;
std::string m_details;
@@ -117,6 +125,7 @@ class Progress {
/// to ensure that we don't send progress updates after progress has
/// completed.
bool m_complete = false;
+ bool m_type;
};
} // namespace lldb_private
diff --git a/lldb/source/Core/Progress.cpp b/lldb/source/Core/Progress.cpp
index 732efbc342b450..e79836033efe38 100644
--- a/lldb/source/Core/Progress.cpp
+++ b/lldb/source/Core/Progress.cpp
@@ -9,7 +9,6 @@
#include "lldb/Core/Progress.h"
#include "lldb/Core/Debugger.h"
-#include "lldb/Utility/StreamString.h"
#include <optional>
@@ -17,18 +16,27 @@ using namespace lldb;
using namespace lldb_private;
std::atomic<uint64_t> Progress::g_id(0);
+std::atomic<uint64_t> Progress::g_refcount(1);
+std::unordered_map<std::string, uint64_t> Progress::g_map = {};
Progress::Progress(std::string title, std::string details,
std::optional<uint64_t> total,
- lldb_private::Debugger *debugger)
+ lldb_private::Debugger *debugger, bool type)
: m_title(title), m_details(details), m_id(++g_id), m_completed(0),
- m_total(Progress::kNonDeterministicTotal) {
+ m_total(Progress::kNonDeterministicTotal), m_type(type) {
if (total)
m_total = *total;
if (debugger)
m_debugger_id = debugger->GetID();
std::lock_guard<std::mutex> guard(m_mutex);
+
+ if (m_type == Progress::eProgressCoalecseReports) {
+ g_map.emplace(title, g_refcount);
+
+ if (g_map.at(title) >= 1)
+ ++g_map.at(title);
+ }
ReportProgress();
}
@@ -38,6 +46,13 @@ Progress::~Progress() {
std::lock_guard<std::mutex> guard(m_mutex);
if (!m_completed)
m_completed = m_total;
+
+ if (m_type == Progress::eProgressCoalecseReports) {
+ --g_map.at(m_title);
+ if (g_map.at(m_title) == 0)
+ g_map.erase(m_title);
+ }
+
ReportProgress();
}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
As part of the effort to improve progress reporting in LLDB (https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717) we want a way to keep track of progress reports to see if they're ongoing. The ultimate goal is to use this information to essentially keep these reports alive using timeouts to more gracefully deliver these reports. To lay the groundwork to do this, this commit adds a map to progress reports that keeps track of each category of reports (using the changes from llvm#77547) and uses a refcount to check if these reports are still considered active or not. A refcount of at least one indicates that the report is still active. I've added an enum to `Progress.h` to toggle whether this behaviour is used or not. By default it is not used, so this commit is marked as NFC.
d10c1c4
to
a80637f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the progress map should be held by the debugger instance, not a static member of the Progress class, since some progress reports could be targeting a specific debugger.
@@ -55,6 +56,10 @@ namespace lldb_private { | |||
|
|||
class Progress { | |||
public: | |||
enum { | |||
eProgressLinearReports, | |||
eProgressCoalecseReports, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eProgressCoalecseReports, | |
eProgressCoalesceReports, |
@@ -55,6 +56,10 @@ namespace lldb_private { | |||
|
|||
class Progress { | |||
public: | |||
enum { | |||
eProgressLinearReports, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please comment what each enum value means
@@ -117,6 +127,7 @@ class Progress { | |||
/// to ensure that we don't send progress updates after progress has | |||
/// completed. | |||
bool m_complete = false; | |||
bool m_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing, why would you store your enum value in a bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holdover from when I used a bool for this value before switching an enum when I put the patch up.
static std::atomic<uint64_t> g_refcount; | ||
/// Map that tracks each progress object and if we've seen its start and stop | ||
/// events | ||
static std::unordered_map<std::string, uint64_t> g_map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this to be tied to the debugger instance instead of making it static here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We spoke about this a little offline but at the moment although Progress reports hold on to debugger instances, these instances are null by default and most Progress reports don't give a debugger instance in their instantiation so we wouldn't be able to have a debugger instance hold on to their own map of ongoing progress reports.
@clayborg Having a progress report holding on to a debugger instance is a good idea, but if we want to use a map to keep track of ongoing reports we need to know what will own the map? Do we want the map to be owned by a debugger instance or could the progress class hold on to its own map of ongoing reports? In the case of the former we would then want to enforce that progress reports have valid debugger instances and in the latter we could remove the debugger
field from the reports altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We spoke about this a little offline but at the moment although Progress reports hold on to debugger instances, these instances are null by default and most Progress reports don't give a debugger instance in their instantiation so we wouldn't be able to have a debugger instance hold on to their own map of ongoing progress reports.
@clayborg Having a progress report holding on to a debugger instance is a good idea, but if we want to use a map to keep track of ongoing reports we need to know what will own the map? Do we want the map to be owned by a debugger instance or could the progress class hold on to its own map of ongoing reports? In the case of the former we would then want to enforce that progress reports have valid debugger instances and in the latter we could remove the
debugger
field from the reports altogether.
The main question is if we want all progress reports to follow the currently selected mode (coalesce or individual) or if we want them to be able to pick which one they are. I would vote to have a global setting that controls if we deliver things, but would love to hear what others think.
Each Progress instance can have a debugger, but most don't, and those will get delivered to all debugger instances. So I would say that this map should live in each debugger object because Progress
objects can have a debugger set, and if they do, they will only get delivered to that debugger.
So my suggestion here is to:
- not change anything in Progress.h/cpp
- move the progress categorization code into each debugger object
- in
PrivateReportProgress
we check if anyone is listening to the neweBroadcastBitProgressByCategory
bit, and then and only then do we do any of the category counting.
So the code in PrivateReportProgress
will now look something like:
static void PrivateReportProgress(Debugger &debugger, uint64_t progress_id,
std::string title, std::string details,
uint64_t completed, uint64_t total,
bool is_debugger_specific) {
// Only deliver progress events if we have any progress listeners.
if (debugger.GetBroadcaster().EventTypeHasListeners(Debugger::eBroadcastBitProgressByCategory))
PrivateReportProgressByCategory(debugger, progress_id, is_debugger_specific, completed, total, is_debugger_specific);
const uint32_t event_type = Debugger::eBroadcastBitProgress;
if (!debugger.GetBroadcaster().EventTypeHasListeners(event_type))
return;
EventSP event_sp(new Event(
event_type,
new ProgressEventData(progress_id, std::move(title), std::move(details),
completed, total, is_debugger_specific)));
debugger.GetBroadcaster().BroadcastEvent(event_sp);
}
Then we write a new PrivateReportProgressByCategory()
function that will do all this book keeping and coordingate with the debugger which will own the category ref count map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the debugger instance keep track of a map of progress reports falls in line with what I spoke about with Ismail offline and is the direction I'd vote to go in so that the bookkeeping isn't done in the progress objects (I'm sure this isn't impossible, but given that LLDB is hierarchical it would make more sense that debuggers have progress reports rather than progress reports looking after themselves).
So we still want to keep is_debugger_specific
/ having a debugger pointer in a progress object, but do all the bookkeeping in all active debuggers and using that debugger pointer to only update the specific reports as needed if I'm understanding your approach correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. Because we have to just in case the debugger pointer was set in the Progress, so we have to track it in each debugger just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of doing this at the debugger level? The way I imagined this to work is like the other subsystems (similar to what Greg described inline). The downside of doing the bookkeeping in the debugger, is that we'll have to duplicate all this work across debuggers for progress events that are not tied to a single debugger. I don't know how easy it would be to change that (if we have access to the debugger everywhere we report progress today). Either way I'd like to understand the motivation behind it.
|
||
#include <optional> | ||
|
||
using namespace lldb; | ||
using namespace lldb_private; | ||
|
||
std::atomic<uint64_t> Progress::g_id(0); | ||
std::atomic<uint64_t> Progress::g_refcount(1); | ||
std::unordered_map<std::string, uint64_t> Progress::g_map = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you move it to the debugger instance, you don't need this anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I added comments on fixes to the current implementation in inline comments.
A few questions I have:
- do we really want each progress to select if it should be coalesced as a
Progress
constructor arguments? Or do we want a global setting on how progress events should be delivered? - The current code adds a way to increment and decrement refcounts of ongoing progress categories (titles), but doesn't do anything with them, what do we envision happening with these ref counts?
But overall I would think we would want to keep track of this at a higher layer than in this class. Like in the code that would be delivering the progress events. If we do it that way, then we could have a global setting that controls if users want events delivered by category or individually. We actually won't need a setting for this as we can just see who is listening to the process events. Anyone listening to SBDebugger::eBroadcastBitProgress
would get progress events delivered on a one by one basis, and anyone listening to a new progress bit like SBDebugger::eBroadcastBitProgressByCategory
would just get different events delivered and all of this ref counting by title code from here would be moved into Debugger.cpp where the other progress events are being handled already.
enum { | ||
eProgressLinearReports, | ||
eProgressCoalecseReports, | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a new setting for this? Something like:
(lldb) settings set progress-report [individual|grouped]
std::atomic<uint64_t> Progress::g_id(0); | ||
std::atomic<uint64_t> Progress::g_refcount(1); | ||
std::unordered_map<std::string, uint64_t> Progress::g_map = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When something isn't just an integer, like the map here, we can run into issues with this in a multi-threaded environment. Why? These variables are part of the global C++ constructor / destructor chain. If any thread uses these after the main thread shuts down it can cause a crash. So for global objects we typically will create a pointer to one and leak it with a note. See lldb/source/Core/Debugger.cpp
and look for g_debugger_list_mutex_ptr
and other variables right next to them and then see Debugger::Initialize()
.
This map also need to be threadsafe, so it needs a mutex. Might be a good idea to create a class to contain these and use an accessor to provide access to it. The solution below will make something that is thread safe and also something that will survive the global destructor chain being called and letting other threads still be able to use Progress objects without crashing the process:
namespace {
class ProgressCategoryMap {
std::mutex m_mutex;
std::unordered_map<std::string, uint64_t> m_map;
void Increment(std::string title) {
std::lock_guard<std::mutex> lock(m_mutex);
auto pair = g_map.emplace(title, 1); // No need to use g_refcount as it is just a constant 1
// pair.first will be true if insertion happened.
if (pair.first == false)
++pair.second; // Key already in map, increment the ref count.
}
void Decrement(std::string title) {
std::lock_guard<std::mutex> lock(m_mutex);
auto pos = g_map.find(title);
if (pos == g_map.end())
return;
if (--pos->second == 0)
g_map.erase(pos->second); // Decremented to zero
}
};
// Now make a global accessor to get a threadsafe copy of ProgressCategoryMap
ProgressCategoryMap *GetProgressCategoryMap() {
static std::once_flag g_once;
static ProgressCategoryMap *g_category_map_ptr = nullptr;
std::call_once(g_once, []{
// NOTE: intentional leak to avoid issues with C++ destructor chain
g_category_map_ptr = new ProgressCategoryMap();
});
return g_category_map_ptr;
}
} // anonymous namespace
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better implementation of using a global map, but if we go with having the map being kept track of in each debugger object we might not need to have this implementation of a global map anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, might not be needed. Still needs to be threadsafe though if we do move it into the Debugger.h/.cpp
|
||
#include <optional> | ||
|
||
using namespace lldb; | ||
using namespace lldb_private; | ||
|
||
std::atomic<uint64_t> Progress::g_id(0); | ||
std::atomic<uint64_t> Progress::g_refcount(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't needed. This value doesn't change and is always 1, so you can either make it a constexpr in the class description, or it isn't really needed at all actually.
g_map.emplace(title, g_refcount); | ||
|
||
if (g_map.at(title) >= 1) | ||
++g_map.at(title); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the threadsafe ProgressCategoryMap
here:
GetProgressCategoryMap()->Increment(title);
--g_map.at(m_title); | ||
if (g_map.at(m_title) == 0) | ||
g_map.erase(m_title); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the threadsafe ProgressCategoryMap
here:
GetProgressCategoryMap()->Decrement(title);
if (m_type == Progress::eProgressCoalecseReports) { | ||
g_map.emplace(title, g_refcount); | ||
|
||
if (g_map.at(title) >= 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be strictly greater than 1, otherwise you'll increment the refcount twice in the first progress report instantiation.
@@ -99,6 +105,10 @@ class Progress { | |||
private: | |||
void ReportProgress(); | |||
static std::atomic<uint64_t> g_id; | |||
static std::atomic<uint64_t> g_refcount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By making the refcount static, it's shared across all the Progress
instances, but IIUC, you want this refcount to be different for each progress category, right ? I actually I don't even think the Progress
object needs to know about its refcount, since only the code that will handle the progress event (in Debugger::HandleProgressEvent
) would need it.
if (total) | ||
m_total = *total; | ||
|
||
if (debugger) | ||
m_debugger_id = debugger->GetID(); | ||
std::lock_guard<std::mutex> guard(m_mutex); | ||
|
||
if (m_type == Progress::eProgressCoalecseReports) { | ||
g_map.emplace(title, g_refcount); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a global refcount shouldn't be necessary here, we can just initialize the value to 1 and increment/decrement as needed.
My understanding is that the constructor conveys whether something is an "aggregate" progress event or not and they're broadcast differently so that the listener can decide how they want to receive these "aggregate" events.
With the previous idea in mind, you need the refcount in case there are overlapping events:
So basically if you have two events |
My idea is the user decides how to listen to the events by which
Yeah, with category, we will see "Indexing symbol table" and then we might have both "a.out" and "b.out" details from two different categories needing to be displayed. The debugger event that gets sent for |
I think we're saying the same thing?
That's a great suggestion. I hadn't considered also doing grouping by category for events that have details, but you're right that there's really no reason not to. In that case we don't need to change the constructor at all, and this is all controlled by the broadcast bit, which makes things even simpler. |
I'll start with this by changing this so that bookkeeping is done with the new bit instead of being done in the constructor for |
The discussions happening here are talking about 2 major things, how to do the bookkeeping of the map that keeps track of progress reports and where to do that bookkeeping. I think it makes sense to split up this work into smaller patches as such:
I'll put up a patch to add the bit now and start work on parts 2 and 3. I'm also going to close this PR since the work is now being split up. |
Sounds great. Thanks for putting in the work on this! Also, I do have the PR that adds settings and throttles progress events, do we still want that patch? I do believe that will improve our progress events and keep them from overwhelming clients. |
static std::atomic<uint64_t> g_refcount; | ||
/// Map that tracks each progress object and if we've seen its start and stop | ||
/// events | ||
static std::unordered_map<std::string, uint64_t> g_map; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm::StringMap?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -12,6 +12,7 @@ | |||
#include "lldb/Utility/ConstString.h" | |||
#include "lldb/lldb-types.h" | |||
#include <atomic> | |||
#include <map> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you need the sorted iteration property you probably want to use something more efficient instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I shouldn't need a sorted map so I'll go with StringMap for the updated patch with the class that will handle the progress report map. 👍🏾
Per discussions from llvm#81026, it was decided that having a class that manages a map of progress reports would be beneficial in order to categorize them. This class is a part of the overall `Progress` class and utilizes a map that keeps a count of how many times a progress report category has been sent. This would be used with the new debugger broadcast bit added in llvm#81169 so that a user listening with that bit will receive grouped progress reports.
Per discussions from llvm#81026, it was decided that having a class that manages a map of progress reports would be beneficial in order to categorize them. This class is a part of the overall `Progress` class and utilizes a map that keeps a count of how many times a progress report category has been sent. This would be used with the new debugger broadcast bit added in llvm#81169 so that a user listening with that bit will receive grouped progress reports.
Per discussions from #81026, it was decided that having a class that manages a map of progress reports would be beneficial in order to categorize them. This class is a part of the overall `Progress` class and utilizes a map that keeps a count of how many times a progress report category has been sent. This would be used with the new debugger broadcast bit added in #81169 so that a user listening with that bit will receive grouped progress reports.
Per discussions from llvm#81026, it was decided that having a class that manages a map of progress reports would be beneficial in order to categorize them. This class is a part of the overall `Progress` class and utilizes a map that keeps a count of how many times a progress report category has been sent. This would be used with the new debugger broadcast bit added in llvm#81169 so that a user listening with that bit will receive grouped progress reports. (cherry picked from commit 327d2f8)
Per discussions from llvm#81026, it was decided that having a class that manages a map of progress reports would be beneficial in order to categorize them. This class is a part of the overall `Progress` class and utilizes a map that keeps a count of how many times a progress report category has been sent. This would be used with the new debugger broadcast bit added in llvm#81169 so that a user listening with that bit will receive grouped progress reports. (cherry picked from commit 327d2f8)
As part of the effort to improve progress reporting in LLDB (https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717) we want a way to keep track of progress reports to see if they're ongoing. The ultimate goal is to use this information to essentially keep these reports alive using timeouts to more gracefully deliver these reports.
To lay the groundwork to do this, this commit adds a map to progress reports that keeps track of each category of reports (using the changes from #77547) and uses a refcount to check if these reports are still considered active or not. A refcount of at least one indicates that the report is still active.
I've added an enum to
Progress.h
to toggle whether this behaviour is used or not. By default it is not used, so this commit is marked as NFC.