Skip to content

🍒 Progress Report Cherrypicks #8483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 28, 2024

Conversation

JDevlieghere
Copy link

No description provided.

Currently, progress events reported by the ProgressManager and broadcast
to eBroadcastBitProgressCategory always specify they're complete. The
problem is that the ProgressManager reports kNonDeterministicTotal for
both the total and the completed number of (sub)events. Because the
values are the same, the event reports itself as complete.

This patch fixes the issue by reporting 0 as the completed value for the
start event and kNonDeterministicTotal for the end event.

(cherry picked from commit ea49e04)
- Factor our common setup code.
- Split the ProgressManager test into separate tests as they test
separate things.
- Fix usage of EXPECT (which continues on failure) and ASSERT (which
halts on failure). We must use the latter when calling GetEvent as
otherwise we'll try to dereference a null EventSP.

(cherry picked from commit 4586366)
The commit introduces a new, generic, Alarm class. The class lets you to
schedule functions (callbacks) that will execute after a predefined
timeout. Once scheduled, you can cancel and reset a callback, given the
timeout hasn't expired yet.

The alarm class worker thread that sleeps until the next timeout
expires. When the thread wakes up, it checks for all the callbacks that
have expired and calls them in order. Because the callback is called
from the worker thread, the only guarantee is that a callback is called
no sooner than the timeout. A long running callback could potentially
block the worker threads and delay other callbacks from getting called.

I intentionally kept the implementation as simple as possible while
addressing the needs for the use case of coalescing progress events as
discussed in [1]. If we want to rely on this somewhere else, we can
reassess whether we need to address this class' limitations.

[1] https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717/

(cherry picked from commit f01a32f)
This implements coalescing of progress events using a timeout, as
discussed in the RFC on Discourse [1]. This PR consists of two commits
which, depending on the feedback, I may split up into two PRs. For now,
I think it's easier to review this as a whole.

1. The first commit introduces a new generic `Alarm` class. The class
lets you to schedule a function (callback) to be executed after a given
timeout expires. You can cancel and reset a callback before its
corresponding timeout expires. It achieves this with the help of a
worker thread that sleeps until the next timeout expires. The only
guarantee it provides is that your function is called no sooner than the
requested timeout. Because the callback is called directly from the
worker thread, a long running callback could potentially block the
worker thread. I intentionally kept the implementation as simple as
possible while addressing the needs for the `ProgressManager` use case.
If we want to rely on this somewhere else, we can reassess whether we
need to address those limitations.

2. The second commit uses the Alarm class to coalesce progress events.
To recap the Discourse discussion, when multiple progress events with
the same title execute in close succession, they get broadcast as one to
`eBroadcastBitProgressCategory`. The `ProgressManager` keeps track of
the in-flight progress events and when the refcount hits zero, the Alarm
class is used to schedule broadcasting the event. If a new progress
event comes in before the alarm fires, the alarm is reset (and the
process repeats when the new progress event ends). If no new event comes
in before the timeout expires, the progress event is broadcast.

[1]
https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717/

(cherry picked from commit 156c290)
@JDevlieghere
Copy link
Author

@swift-ci test

1 similar comment
@JDevlieghere
Copy link
Author

@swift-ci test

JDevlieghere and others added 3 commits March 27, 2024 20:11
llvm-project/lldb/source/Host/common/Alarm.cpp:37:5:
error: 'lock_guard' may not intend to support class template argument deduction [-Werror,-Wctad-maybe-unsupported]
    std::lock_guard alarm_guard(m_alarm_mutex);
    ^

(cherry picked from commit ba97dc8)
)

Avoid deadlocks in the Alarm class by releasing the lock before invoking
callbacks. This deadlock manifested itself in the ProgressManager:

1. On the main thread, the ProgressManager acquires its lock in
ProgressManager::Decrement and calls Alarm::Create.
  2. On the main thread, the Alarm acquires its lock in Alarm::Create.
3. On the alarm thread, the Alarm acquires its lock after waiting on the
condition variable and calls ProgressManager::Expire.
4. On the alarm thread, the ProgressManager acquires its lock in
ProgressManager::Expire.

Note how the two threads are acquiring the locks in different orders.
Deadlocks can be avoided by always acquiring locks in the same order,
but since the two mutexes here are private implementation details,
belong to different classes, that's not straightforward. Luckily, we
don't need to have the Alarm mutex locked when invoking the callbacks.
That exactly how this patch solves the issue.

(cherry picked from commit b76fd1e)
@JDevlieghere JDevlieghere force-pushed the jdevlieghere/progress-reports branch from 72cfb83 to 0067c04 Compare March 28, 2024 03:12
@JDevlieghere
Copy link
Author

@swift-ci test

@JDevlieghere
Copy link
Author

@swift-ci test windows

@JDevlieghere JDevlieghere merged commit 6a55169 into swift/release/6.0 Mar 28, 2024
@JDevlieghere JDevlieghere deleted the jdevlieghere/progress-reports branch March 28, 2024 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants