-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
closes bpo-38692: Add a pidfd child process watcher to asyncio. #17069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @benjaminp. I have a few recommendations (mostly minor):
Doc/library/asyncio-policy.rst
Outdated
threads, doesn't interfere with any processes launched outside the event | ||
loop, and should scale linearly with the number of subprocesses launched by | ||
the event loop. The main disadvantage is that pidfds are specific to Linux, | ||
and only work on recent (5.3+) kernels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could remove recent, as this will likely become outdated in the relatively near future.
and only work on recent (5.3+) kernels. | |
and only work on 5.3+ kernels. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The time period before a Linux kernel becomes non-"recent" is long. There will be RHEL users in 5 years that can't use PidfdChildWatcher
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There will be RHEL users in 5 years that can't use PidfdChildWatcher.
Good point. I hadn't considered that it would be quite that long, especially as someone who primarily uses the faster to update distros (such as Arch and Fedora).
return self._loop is not None and self._loop.is_running() | ||
|
||
def close(self): | ||
self.attach_loop(None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it seems like close()
should clear the callbacks rather than attach_loop()
.
self.attach_loop(None) | |
self._callbacks.clear() | |
self.attach_loop(None) |
This is the current behavior in all three of the other child watchers that use a callbacks dict:
SafeChildWatcher
:
cpython/Lib/asyncio/unix_events.py
Lines 959 to 960 in 5c0c325
def close(self): | |
self._callbacks.clear() |
FastChildWatcher
:
cpython/Lib/asyncio/unix_events.py
Lines 1038 to 1039 in 5c0c325
def close(self): | |
self._callbacks.clear() |
MultiLoopChildWatcher
:
cpython/Lib/asyncio/unix_events.py
Lines 1153 to 1154 in 5c0c325
def close(self): | |
self._callbacks.clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I'm aware, the examples you list don't keep state in the event loop. Closing a PidfdChildWatcher
should remove it completely from the loop.
self._callbacks[pid] = pidfd, callback, args | ||
|
||
def _do_wait(self, pid): | ||
pidfd, callback, args = self._callbacks.pop(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A warning is also logged in MultiLoopWatcher
if a pid with the given value does not exist in the callbacks. This can be helpful for debugging purposes:
pidfd, callback, args = self._callbacks.pop(pid) | |
try: | |
pidfd, callback, args = self._callbacks.pop(pid) | |
except KeyError: | |
logger.warning("Child watcher got an unexpected pid: %r", | |
pid, exc_info=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_do_wait
being called multiple times indicates a serious programming error, and I don't think knowing the affected pid is going to be more interesting that the traceback from an exception.
_, status = os.waitpid(pid, 0) | ||
os.close(pidfd) | ||
returncode = _compute_returncode(status) | ||
callback(pid, returncode, *args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me if this should be callback(pid, returncode, *args)
or self._loop.call_soon_threadsafe(callback, pid, returncode, *args)
, based on: https://github.com/python/cpython/blob/7725352d5ae92e018e6e590e50bc0f465b404807/Lib/asyncio/unix_events.py#L825-L834
Is it inherently thread-safe because PidfdChildWatcher
doesn't utilize threads (outside of the main thread)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thread safety requirement is on callback
not my invocation of it. There also aren't any threads in PidfdChildWatcher
, and _do_wait
runs from the event loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thread safety requirement is on callback not my invocation of it
Ah, I see. I had thought it was on both ends for some reason. Thanks for the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm just reading the docstring you quoted. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I'm just reading the docstring you quoted. :)
Yeah, fair point. I had just misunderstood "callback() must be thread-safe", as the invocation and the callback itself must be thread-safe, rather than just the callback. I'm also the most familiar with ThreadedChildWatcher
(since that one is the default), where multiple threads are utilized and loop.call_soon_threadsafe()
is a must (within its _do_waitpid()
method).
cpython/Lib/asyncio/unix_events.py
Line 1325 in 5c0c325
loop.call_soon_threadsafe(callback, pid, returncode, *args) |
|
||
def remove_child_handler(self, pid): | ||
try: | ||
pidfd, _, _ = self._callbacks.pop(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure which is preferable (it doesn't come up very often), but I find this slightly easier to read when showing that the remaining items in the tuple aren't significant and there's more than one. Not particularly important though.
pidfd, _, _ = self._callbacks.pop(pid) | |
pidfd, *_ = self._callbacks.pop(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that syntax adds value here. If someone changes the shape of self._callbacks
, it's probably good for them to consider this unpacking site.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that syntax adds value here.
Yeah it's fine as it is then, it's not a big deal either way.
Thanks @benjaminp! Meant to approve this after the most recent commit, everything looks good to me. |
https://bugs.python.org/issue38692