Skip to content

closes bpo-38692: Add a pidfd child process watcher to asyncio. #17069

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 14, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions Doc/library/asyncio-policy.rst
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,18 @@ implementation used by the asyncio event loop:
This solution requires a running event loop in the main thread to work, as
:class:`SafeChildWatcher`.

.. class:: PidfdChildWatcher

This implementation polls process file descriptors (pidfds) to await child
process termination. In some respects, :class:`PidfdChildWatcher` is a
"Goldilocks" child watcher implementation. It doesn't require signals or
threads, doesn't interfere with any processes launched outside the event
loop, and scales linearly with the number of subprocesses launched by the
event loop. The main disadvantage is that pidfds are specific to Linux, and
only work on recent (5.3+) kernels.

.. versionadded:: 3.9


Custom Policies
===============
Expand Down
3 changes: 3 additions & 0 deletions Doc/whatsnew/3.9.rst
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,9 @@ that schedules a shutdown for the default executor that waits on the
:func:`asyncio.run` has been updated to use the new :term:`coroutine`.
(Contributed by Kyle Stanley in :issue:`34037`.)

Added :class:`asyncio.PidfdChildWatcher`, a Linux-specific child watcher
implementation that polls process file descriptors. (:issue:`38692`)

curses
------

Expand Down
67 changes: 67 additions & 0 deletions Lib/asyncio/unix_events.py
Original file line number Diff line number Diff line change
Expand Up @@ -878,6 +878,73 @@ def __exit__(self, a, b, c):
raise NotImplementedError()


class PidfdChildWatcher(AbstractChildWatcher):
"""Child watcher implementation using Linux's pid file descriptors.

This child watcher polls process file descriptors (pidfds) to await child
process termination. In some respects, PidfdChildWatcher is a "Goldilocks"
child watcher implementation. It doesn't require signals or threads, doesn't
interfere with any processes launched outside the event loop, and scales
linearly with the number of subprocesses launched by the event loop. The
main disadvantage is that pidfds are specific to Linux, and only work on
recent (5.3+) kernels.
"""

def __init__(self):
self._loop = None
self._callbacks = {}

def __enter__(self):
return self

def __exit__(self, exc_type, exc_value, exc_traceback):
pass

def is_active(self):
return self._loop is not None and self._loop.is_running()

def close(self):
self.attach_loop(None)
Copy link
Contributor

@aeros aeros Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it seems like close() should clear the callbacks rather than attach_loop().

Suggested change
self.attach_loop(None)
self._callbacks.clear()
self.attach_loop(None)

This is the current behavior in all three of the other child watchers that use a callbacks dict:

SafeChildWatcher:

def close(self):
self._callbacks.clear()

FastChildWatcher:

def close(self):
self._callbacks.clear()

MultiLoopChildWatcher:

def close(self):
self._callbacks.clear()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I'm aware, the examples you list don't keep state in the event loop. Closing a PidfdChildWatcher should remove it completely from the loop.


def attach_loop(self, loop):
if self._loop is not None and loop is None and self._callbacks:
warnings.warn(
'A loop is being detached '
'from a child watcher with pending handlers',
RuntimeWarning)
for pidfd, _, _ in self._callbacks.values():
self._loop._remove_reader(pidfd)
os.close(pidfd)
self._callbacks.clear()
self._loop = loop

def add_child_handler(self, pid, callback, *args):
existing = self._callbacks.get(pid)
if existing is not None:
self._callbacks[pid] = existing[0], callback, args
else:
pidfd = os.pidfd_open(pid)
self._loop._add_reader(pidfd, self._do_wait, pid)
self._callbacks[pid] = pidfd, callback, args

def _do_wait(self, pid):
pidfd, callback, args = self._callbacks.pop(pid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A warning is also logged in MultiLoopWatcher if a pid with the given value does not exist in the callbacks. This can be helpful for debugging purposes:

Suggested change
pidfd, callback, args = self._callbacks.pop(pid)
try:
pidfd, callback, args = self._callbacks.pop(pid)
except KeyError:
logger.warning("Child watcher got an unexpected pid: %r",
pid, exc_info=True)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_do_wait being called multiple times indicates a serious programming error, and I don't think knowing the affected pid is going to be more interesting that the traceback from an exception.

self._loop._remove_reader(pidfd)
_, status = os.waitpid(pid, 0)
os.close(pidfd)
returncode = _compute_returncode(status)
callback(pid, returncode, *args)
Copy link
Contributor

@aeros aeros Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me if this should be callback(pid, returncode, *args) or self._loop.call_soon_threadsafe(callback, pid, returncode, *args), based on: https://github.com/python/cpython/blob/7725352d5ae92e018e6e590e50bc0f465b404807/Lib/asyncio/unix_events.py#L825-L834

Is it inherently thread-safe because PidfdChildWatcher doesn't utilize threads (outside of the main thread)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thread safety requirement is on callback not my invocation of it. There also aren't any threads in PidfdChildWatcher, and _do_wait runs from the event loop.

Copy link
Contributor

@aeros aeros Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjaminp

The thread safety requirement is on callback not my invocation of it

Ah, I see. I had thought it was on both ends for some reason. Thanks for the clarification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'm just reading the docstring you quoted. :)

Copy link
Contributor

@aeros aeros Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjaminp

Well, I'm just reading the docstring you quoted. :)

Yeah, fair point. I had just misunderstood "callback() must be thread-safe", as the invocation and the callback itself must be thread-safe, rather than just the callback. I'm also the most familiar with ThreadedChildWatcher (since that one is the default), where multiple threads are utilized and loop.call_soon_threadsafe() is a must (within its _do_waitpid() method).

loop.call_soon_threadsafe(callback, pid, returncode, *args)


def remove_child_handler(self, pid):
try:
pidfd, _, _ = self._callbacks.pop(pid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure which is preferable (it doesn't come up very often), but I find this slightly easier to read when showing that the remaining items in the tuple aren't significant and there's more than one. Not particularly important though.

Suggested change
pidfd, _, _ = self._callbacks.pop(pid)
pidfd, *_ = self._callbacks.pop(pid)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that syntax adds value here. If someone changes the shape of self._callbacks, it's probably good for them to consider this unpacking site.

Copy link
Contributor

@aeros aeros Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjaminp

I don't think that syntax adds value here.

Yeah it's fine as it is then, it's not a big deal either way.

except KeyError:
return False
self._loop._remove_reader(pidfd)
os.close(pidfd)
return True


def _compute_returncode(status):
if os.WIFSIGNALED(status):
# The child process died because of a signal.
Expand Down
18 changes: 18 additions & 0 deletions Lib/test/test_asyncio/test_subprocess.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
import os
import signal
import sys
import unittest
Expand Down Expand Up @@ -691,6 +692,23 @@ class SubprocessFastWatcherTests(SubprocessWatcherMixin,

Watcher = unix_events.FastChildWatcher

def has_pidfd_support():
if not hasattr(os, 'pidfd_open'):
return False
try:
os.close(os.pidfd_open(os.getpid()))
except OSError:
return False
return True

@unittest.skipUnless(
has_pidfd_support(),
"operating system does not support pidfds",
)
class SubprocessPidfdWatcherTests(SubprocessWatcherMixin,
test_utils.TestCase):
Watcher = unix_events.PidfdChildWatcher

else:
# Windows
class SubprocessProactorTests(SubprocessMixin, test_utils.TestCase):
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
Add :class:`asyncio.PidfdChildWatcher`, a Linux-specific child watcher
implementation that polls process file descriptors.