-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb] Added an exponential algorithm for the sleep time in connect_to_debug_monitor() #118222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] Added an exponential algorithm for the sleep time in connect_to_debug_monitor() #118222
Conversation
See llvm#118032 for details.
@llvm/pr-subscribers-lldb Author: Dmitry Vasilyev (slydiman) ChangesSee #118032 for details. Full diff: https://github.com/llvm/llvm-project/pull/118222.diff 1 Files Affected:
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
index 8c8e4abed0b454..67b07ff4ddd998 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
@@ -388,7 +388,7 @@ def connect_to_debug_monitor(self, attach_pid=None):
# We're using a random port algorithm to try not to collide with other ports,
# and retry a max # times.
attempts = 0
- MAX_ATTEMPTS = 20
+ MAX_ATTEMPTS = 30
while attempts < MAX_ATTEMPTS:
server = self.launch_debug_monitor(attach_pid=attach_pid)
|
20 attempts is already more than what I would consider reasonable. Increasing it doesn't sound like a good idea. If the problem is lingering sockets, then I think a better approach is to slow down the rate of socket creation, not churn out more of them. While this code tries to back off, it doesn't implement the (standard?) exponential backoff algorithm. I think that's important, as without it, it just ends up churning out more and more connections at a relative fast rate ( So, the thing I'd do is use a proper exponential algorithm and reduce the retry count (so that the total wait time ends up being slightly more than what it is now), and also increase the range of ports that we're picking from (I don't know how the (Of course, a much better solution would be to either let lldb-server pick a free port (and communicate it back to the test somehow) or to reuse the socket multiplexing approach we (you) recently implemented. That may not work for all tests (we still need to test the user-provided-port scenario somehow), but 99% of the tests don't care about the connection method, o if we can solve the problem for the majority of tests, then this alleviates the pressure on the minority that does need it.) |
…TS to 16. The sleep time will be 3, 3.6, 4.3, ..., 38, 46. The total sleep time is 262 + 0.5*MAX_CONNECT_ATTEMPTS*MAX_ATTEMPTS = 342. Note the timeout is 600, so the rest time must be enough. Increased the port range (12000..20000).
Updated the patch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the general idea, although 16 attempts @ 262s sounds a bit too much (that's like almost three times as long as the first version with 30*3s). I worry that the tests will take too long to time out if they not able to establish a connection due to some other reason (like, imagine a bad patch causes lldb-server to never accept a connection). I'd suggest starting out with 10attempts@77 seconds. That's still higher than the initial ~60 seconds, and I really hope that should be enough since doubling the port range sort of doubles the number of attempts as well (by halving the chance of collision).
Thanks, I suggested something similar in #118032. |
API tests llvm-project/lldb/test/API/tools/lldb-server/* with the category `lldb-server` run lldb-server locally on the host with a random port. Sometimes we got errors like https://lab.llvm.org/buildbot/#/builders/195/builds/3145 https://lab.llvm.org/buildbot/#/builders/197/builds/312 The patch llvm/llvm-project#118222 improved the situation but did not fix it completely.
) API tests llvm-project/lldb/test/API/tools/lldb-server/* with the category `lldb-server` run lldb-server locally on the host with a random port. Sometimes we got errors like https://lab.llvm.org/buildbot/#/builders/195/builds/3145 https://lab.llvm.org/buildbot/#/builders/197/builds/312 The patch llvm/llvm-project#118222 improved the situation but did not fix it completely.
See #118032 for details.