-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb-dap] Fix TestDap_attach.py flakiness #137278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lldb Author: Wanyi (kusmour) ChangesThis patch makes the Recently
I took a look at the DAP msg from that test case and noticed:
NOTE: The Full diff: https://github.com/llvm/llvm-project/pull/137278.diff 1 Files Affected:
diff --git a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
index dadf6b1f8774c..20f5286da3203 100644
--- a/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
+++ b/lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
@@ -612,7 +612,13 @@ def request_attach(
if gdbRemoteHostname is not None:
args_dict["gdb-remote-hostname"] = gdbRemoteHostname
command_dict = {"command": "attach", "type": "request", "arguments": args_dict}
- return self.send_recv(command_dict)
+ response = self.send_recv(command_dict)
+
+ if response["success"]:
+ # Wait for a 'process' and 'initialized' event in any order
+ self.wait_for_event(filter=["process", "initialized"])
+ self.wait_for_event(filter=["process", "initialized"])
+ return response
def request_breakpointLocations(
self, file_path, line, end_line=None, column=None, end_column=None
|
956f2af
to
2c5ff68
Compare
lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
Outdated
Show resolved
Hide resolved
b87f1fe
to
0e6d419
Compare
lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
Outdated
Show resolved
Hide resolved
0e6d419
to
aa7540d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I'm not too familiar with the dap protocol, so would be good to get another approver.
events = events[:] # Make a copy to avoid modifying the input | ||
while events: | ||
event = self.wait_for_event(filter=events, timeout=timeout) | ||
events.remove(event["event"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we hit a timeout, wont event
be None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah it looks like its not handling the timeout correctly.
events = events[:] # Make a copy to avoid modifying the input | ||
while events: | ||
event = self.wait_for_event(filter=events, timeout=timeout) | ||
events.remove(event["event"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah it looks like its not handling the timeout correctly.
We (Linaro) are seeing the same error message with a different test on our AArch64 Linux bot:
https://lab.llvm.org/buildbot/#/builders/59/builds/16771 Likely same cause. |
Looks like these 2 test cases should be re-enabled in llvm@0b8dfb5 But only the comments got removed. The skip statement survived the change.
Summary: Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change. Test Plan: ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py All tests passed locally on my mac
aa7540d
to
f5f4ff8
Compare
lldb/packages/Python/lldbsuite/test/tools/lldb-dap/dap_server.py
Outdated
Show resolved
Hide resolved
f5f4ff8
to
e747925
Compare
events = events[:] # Make a copy to avoid modifying the input | ||
while events: | ||
event_dict = self.wait_for_event(filter=events, timeout=timeout) | ||
if event_dict is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want an explicit check that event_dict is None
? Or would it be better to have a truth test like
if event_dict:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
# Summary This patch makes the `request_attach` wait for events `process` and `initialized` just like `request_launch`. This ensure the DAP session can move forward somewhat correctly. Recently `TestDap_attach.test_terminate_commands` became flaky. It's hitting: ``` lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv raise ValueError(desc) ValueError: no response for "disconnect" ``` I took a look at the DAP msg from that test case and noticed: - It's not using the regular attaching, instead it's using the `attachCommands` to launch debug the binary and it will stop at entry. - The `initialized` event returned after the `disconnect` request. Which means lldb-dap didn't really get ready yet. ### NOTE The `dap_server.py` is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the `configurationDone` request and response because it relies on a continue action to trigger the `configurationDone` request. # Test Plan ``` ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py ``` To test the `wait_for_events` timeout case ``` events = self.wait_for_events(["process", "initialized", "fake", "event"], 1) if events: raise ValueError(f'no events {",".join(events)} found for within timeout 1') ``` Observed <img width="696" alt="image" src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6" /> ### Also Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change.
# Summary This patch makes the `request_attach` wait for events `process` and `initialized` just like `request_launch`. This ensure the DAP session can move forward somewhat correctly. Recently `TestDap_attach.test_terminate_commands` became flaky. It's hitting: ``` lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv raise ValueError(desc) ValueError: no response for "disconnect" ``` I took a look at the DAP msg from that test case and noticed: - It's not using the regular attaching, instead it's using the `attachCommands` to launch debug the binary and it will stop at entry. - The `initialized` event returned after the `disconnect` request. Which means lldb-dap didn't really get ready yet. ### NOTE The `dap_server.py` is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the `configurationDone` request and response because it relies on a continue action to trigger the `configurationDone` request. # Test Plan ``` ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py ``` To test the `wait_for_events` timeout case ``` events = self.wait_for_events(["process", "initialized", "fake", "event"], 1) if events: raise ValueError(f'no events {",".join(events)} found for within timeout 1') ``` Observed <img width="696" alt="image" src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6" /> ### Also Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change.
# Summary This patch makes the `request_attach` wait for events `process` and `initialized` just like `request_launch`. This ensure the DAP session can move forward somewhat correctly. Recently `TestDap_attach.test_terminate_commands` became flaky. It's hitting: ``` lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv raise ValueError(desc) ValueError: no response for "disconnect" ``` I took a look at the DAP msg from that test case and noticed: - It's not using the regular attaching, instead it's using the `attachCommands` to launch debug the binary and it will stop at entry. - The `initialized` event returned after the `disconnect` request. Which means lldb-dap didn't really get ready yet. ### NOTE The `dap_server.py` is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the `configurationDone` request and response because it relies on a continue action to trigger the `configurationDone` request. # Test Plan ``` ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py ``` To test the `wait_for_events` timeout case ``` events = self.wait_for_events(["process", "initialized", "fake", "event"], 1) if events: raise ValueError(f'no events {",".join(events)} found for within timeout 1') ``` Observed <img width="696" alt="image" src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6" /> ### Also Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change.
# Summary This patch makes the `request_attach` wait for events `process` and `initialized` just like `request_launch`. This ensure the DAP session can move forward somewhat correctly. Recently `TestDap_attach.test_terminate_commands` became flaky. It's hitting: ``` lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv raise ValueError(desc) ValueError: no response for "disconnect" ``` I took a look at the DAP msg from that test case and noticed: - It's not using the regular attaching, instead it's using the `attachCommands` to launch debug the binary and it will stop at entry. - The `initialized` event returned after the `disconnect` request. Which means lldb-dap didn't really get ready yet. ### NOTE The `dap_server.py` is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the `configurationDone` request and response because it relies on a continue action to trigger the `configurationDone` request. # Test Plan ``` ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py ``` To test the `wait_for_events` timeout case ``` events = self.wait_for_events(["process", "initialized", "fake", "event"], 1) if events: raise ValueError(f'no events {",".join(events)} found for within timeout 1') ``` Observed <img width="696" alt="image" src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6" /> ### Also Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change.
# Summary This patch makes the `request_attach` wait for events `process` and `initialized` just like `request_launch`. This ensure the DAP session can move forward somewhat correctly. Recently `TestDap_attach.test_terminate_commands` became flaky. It's hitting: ``` lldbsuite/test/tools/lldb-dap/dap_server.py", line 350, in send_recv raise ValueError(desc) ValueError: no response for "disconnect" ``` I took a look at the DAP msg from that test case and noticed: - It's not using the regular attaching, instead it's using the `attachCommands` to launch debug the binary and it will stop at entry. - The `initialized` event returned after the `disconnect` request. Which means lldb-dap didn't really get ready yet. ### NOTE The `dap_server.py` is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing the `configurationDone` request and response because it relies on a continue action to trigger the `configurationDone` request. # Test Plan ``` ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py ./bin/llvm-lit -va /Users/wanyi/llvm-upstream/llvm-project/lldb/test/API/tools/lldb-dap/launch/TestDAP_launch.py ``` To test the `wait_for_events` timeout case ``` events = self.wait_for_events(["process", "initialized", "fake", "event"], 1) if events: raise ValueError(f'no events {",".join(events)} found for within timeout 1') ``` Observed <img width="696" alt="image" src="https://github.com/user-attachments/assets/bc97c0ef-d91f-4561-8272-4d36f5f5d4e6" /> ### Also Looks like some test cases should be re-enabled in llvm@0b8dfb5 But only comments was removed. The skip statements survived the change.
Summary
This patch makes the
request_attach
wait for eventsprocess
andinitialized
just likerequest_launch
. This ensure the DAP session can move forward somewhat correctly.Recently
TestDap_attach.test_terminate_commands
became flaky.It's hitting:
I took a look at the DAP msg from that test case and noticed:
attachCommands
to launch debug the binary and it will stop at entry.initialized
event returned after thedisconnect
request. Which means lldb-dap didn't really get ready yet.NOTE
The
dap_server.py
is doing things to mimic the VSCode (or other dap clients) but it had some assumptions. For example, it's still missing theconfigurationDone
request and response because it relies on a continue action to trigger theconfigurationDone
request.Test Plan
To test the
wait_for_events
timeout caseObserved

Also
Looks like some test cases should be re-enabled in 0b8dfb5
But only comments was removed. The skip statements survived the change.