bpo-45274: Fix Thread._wait_for_tstate_lock() race condition #28532

Merged
merged 1 commit into from
Sep 27, 2021

@vstinner vstinner commented Sep 23, 2021

Fix a race condition in the Thread.join() method of the threading
module. If the function is interrupted by a signal and the signal
handler raises an exception, make sure that the thread remains in a
consistent state to prevent a deadlock.

https://bugs.python.org/issue45274
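
To make the description above concrete, here is a minimal sketch of the pattern it describes: if an exception arrives after `lock.acquire()` has succeeded but before `lock.release()` runs, the `except` branch releases the lock and marks the object as stopped before re-raising. The `Joinable` and `InterruptingLock` classes are hypothetical stand-ins written for this illustration, not the `threading.Thread` internals; `InterruptingLock` only simulates a signal handler raising right after the acquire.

```python
import threading


class InterruptingLock:
    """Hypothetical test double: acquires a real lock, then raises,
    mimicking a signal handler that raises just after acquire() returns."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True, timeout=-1):
        self._lock.acquire(blocking, timeout)
        raise KeyboardInterrupt("simulated Ctrl+C after acquire()")

    def release(self):
        self._lock.release()

    def locked(self):
        return self._lock.locked()


class Joinable:
    """Hypothetical object whose lock becomes acquirable once work is done."""

    def __init__(self, lock):
        self._done_lock = lock
        self._is_stopped = False

    def _stop(self):
        self._is_stopped = True
        self._done_lock = None

    def wait(self, blocking=True, timeout=-1):
        lock = self._done_lock
        if lock is None:
            return  # already stopped
        try:
            if lock.acquire(blocking, timeout):
                lock.release()
                self._stop()
        except BaseException:
            # The acquire may have succeeded before the exception arrived.
            # Restore a consistent state, then let the exception propagate.
            if lock.locked():
                lock.release()
                self._stop()
            raise
```

One caveat of this sketch: `locked()` reports that the lock is held by someone, not necessarily by the interrupted caller, so the cleanup branch is a best-effort repair rather than a guarantee.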

# was interrupted with an exception before reaching the
# lock.release(). It can happen if a signal handler raises an
# exception, like CTRL+C which raises KeyboardInterrupt.
lock.release()
Member Author

I don't think this code is 100% reliable if lock.release() is itself interrupted by a second exception (e.g. raised by a second signal: fatality!). Maybe a context manager could be used. But I'm exhausted; I will think about this code after a good night's sleep :-)

Member

I believe we've still got bugs in the context manager exit implementation where even it can miss something in this scenario and not call __exit__. https://bugs.python.org/issue29988
This came up during the dev sprint in 2017 when working on the interpreter loop.

So this PR, while not technically a fix, is at least an improvement and should help the single Ctrl-C KeyboardInterrupt case.
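
For reference, the context-manager idea raised in this thread could be sketched roughly as follows. The `acquired` helper is a hypothetical name introduced here, not part of the `threading` module or of this PR. As noted above, it does not close the race: an exception can still arrive between `lock.acquire()` returning and the `try` block being entered.

```python
import contextlib
import threading


@contextlib.contextmanager
def acquired(lock, blocking=True, timeout=-1):
    """Hypothetical helper: acquire `lock` with an optional timeout and
    release it on exit, but only if the acquire actually succeeded."""
    got = lock.acquire(blocking, timeout)
    try:
        yield got
    finally:
        if got:
            lock.release()


# Usage: the lock is released even if the body raises.
demo_lock = threading.Lock()
try:
    with acquired(demo_lock) as got:
        raise KeyboardInterrupt("simulated interruption inside the body")
except KeyboardInterrupt:
    pass
```

The helper yields the result of the acquire so that callers using a timeout can tell whether they actually hold the lock.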

Member Author

The whole function should be reimplemented in C to better control how signals are handled.

In pure Python, I don't think it's possible to fully handle every possible exception at any line number.

@serhiy-storchaka proposed rewriting acquire()+release() in C to make sure that at least the lock remains consistent: https://bugs.python.org/issue45274#msg402532 However, that would not handle exceptions in the _stop() method.

Member Author

But I only wanted to improve the code; I'm not interested in rewriting threading.Thread in C.

@vstinner
Member Author

cc @pablogsal @ambv @corona10 @serhiy-storchaka: Here is an interesting race condition in the threading module ;-) Making sure that we restore the Thread object to a consistent state is challenging. My fix should make this code less bad (more reliable, but not 100% reliable).

@vstinner
Member Author

cc @gpshead

Member

@pablogsal pablogsal left a comment

I would prefer to devise a general solution to this, maybe by adding a private method in C, but I am OK with this solution for the time being.

@vstinner vstinner merged commit a22be49 into python:main Sep 27, 2021
@vstinner vstinner deleted the wait_for_tstate_lock branch September 27, 2021 12:20
@vstinner vstinner added the "needs backport to 3.9" and "needs backport to 3.10" labels Sep 27, 2021
@miss-islington
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@bedevere-bot bedevere-bot removed the "needs backport to 3.9" label Sep 27, 2021
@bedevere-bot

GH-28579 is a backport of this pull request to the 3.9 branch.

@bedevere-bot

GH-28580 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the "needs backport to 3.10" label Sep 27, 2021
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Sep 27, 2021
…H-28532)

Fix a race condition in the Thread.join() method of the threading
module. If the function is interrupted by a signal and the signal
handler raises an exception, make sure that the thread remains in a
consistent state to prevent a deadlock.
(cherry picked from commit a22be49)

Co-authored-by: Victor Stinner <[email protected]>
miss-islington added a commit that referenced this pull request Sep 27, 2021
Fix a race condition in the Thread.join() method of the threading
module. If the function is interrupted by a signal and the signal
handler raises an exception, make sure that the thread remains in a
consistent state to prevent a deadlock.
(cherry picked from commit a22be49)

Co-authored-by: Victor Stinner <[email protected]>
vstinner added a commit that referenced this pull request Sep 27, 2021
… (GH-28580)

Fix a race condition in the Thread.join() method of the threading
module. If the function is interrupted by a signal and the signal
handler raises an exception, make sure that the thread remains in a
consistent state to prevent a deadlock.
(cherry picked from commit a22be49)

Co-authored-by: Victor Stinner <[email protected]>
6 participants