GH-125603: Don't count executing generators and coroutines as referrers in gc.gc_referrers. #125640
Open: markshannon wants to merge 2 commits into python:main from faster-cpython:hide-executing-generator-refs
I think this is okay, but it also adds some additional constraints that will make the GC in the free threading build more fragile. I think we have to make sure that any gen/coro marked as `FRAME_EXECUTING` is on a `PyThreadState`'s frame stack -- otherwise some deferred references may not be visible to the GC and collected while still in use.
[...] `gi_frame_state = FRAME_EXECUTING` and pushing the frame to the thread's stack.
[...] `gi_frame_state` to some other value (i.e., `exit_unwind` -> `clear_gen_frame`).
All those things are already true, because we rely on `gi_frame_state == FRAME_EXECUTING` to guard against sending to an already executing generator.
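The guard mentioned above is observable from Python. The following is a minimal sketch, not code from the PR: the names `self_resuming` and `current` are made up for illustration. A generator that tries to resume itself while it is running is rejected with `ValueError`, and `gi_running` reports the executing state.

```python
def self_resuming():
    # While this body runs, the generator is in the executing state,
    # so resuming it from inside itself must be rejected.
    try:
        next(current)
    except ValueError as exc:
        # Report the error text and the running flag seen from inside.
        yield ("rejected", str(exc), current.gi_running)


current = self_resuming()
outcome = next(current)

print(outcome)             # status, error message, gi_running while executing
print(current.gi_running)  # the generator is suspended at the yield now
```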
At shutdown we delete all `PyThreadState`s except for the main thread. Those threads could be running generators or coroutines stuck in some long running call (like a `time.sleep()`). We run the GC after that. That seems like it would be unsafe (because we're now hiding deferred `_PyStackRef` from the GC).
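The situation described here, a generator that is mid-execution on a non-main thread while the main thread carries on, can be reproduced in a small sketch. This is only an illustration of the state involved and does not reproduce the shutdown crash. The names `blocked_gen`, `started`, and `release` are hypothetical, and an `Event.wait()` stands in for the long-running call so the script terminates promptly.

```python
import threading

started = threading.Event()
release = threading.Event()


def blocked_gen():
    # Signal that the generator body is running, then block, standing in
    # for a long-running call such as time.sleep().
    started.set()
    release.wait(timeout=5)
    yield "done"


gen = blocked_gen()
results = []
worker = threading.Thread(target=lambda: results.append(next(gen)), daemon=True)
worker.start()

started.wait(timeout=5)
# Seen from the main thread, the generator is executing on the worker thread.
running_while_blocked = gen.gi_running

release.set()
worker.join(timeout=5)
# Once the worker's next() call has returned, the generator is suspended.
running_after_yield = gen.gi_running

print(running_while_blocked, running_after_yield, results)
```

If the interpreter shut down while the worker was still blocked, the generator would remain marked as executing even though its thread state is being torn down, which is the case the comment is concerned about.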
This test program segfaults in the free threading build with this PR:
https://gist.github.com/colesbury/11d59b9987e881a3c016b086bb4ba1ff
If the generator is executing, it is part of the stack, not the heap. So, I think `tp_traverse` should not be traversing executing generators.
@colesbury How does the free-threading GC find references that are on the stack (in normal frames)? We should do that for executing generators.
Yes, during shutdown, we delete all `PyThreadState`s except for the main thread.
Let me rephrase:
When deleting the thread state, how do we clean up the references?
Presumably, we should be changing the state of the generator when we clean up its frame but we aren't.
To answer my own question: we don't clean up the references 🙁
We call `PyThreadState_Clear()` on the deleted thread states, but that doesn't clean up `current_frame`. We don't call `PyThreadState_Delete()` for reasons that are not clear to me. Even if `PyThreadState_Clear()` cleaned up `current_frame`, that wouldn't be sufficient because we unlink all the thread states before calling `PyThreadState_Clear()` and (by your definition) that already puts them in an invalid state.
This is all longstanding CPython behavior, as far as I can tell. Changing the shutdown behavior seems likely to cause different shutdown-related bugs than what we experience today.
I think we can work around this by encoding more knowledge about generators in `gc_free_threading.c`. I don't love that, but it seems less risky than messing with the shutdown behavior. Let me know if you want to go that route -- I can help with the `gc_free_threading.c` changes.
I'd like to investigate fixing up thread state cleanup. While it may well risk introducing bugs in the short term, I think it would be better long term. It is hard to optimize anything if we can't trust our supposed invariants.
A quick fix, until we decide on how to handle this long term, would be to traverse the frame list when deleting threads and mark all generators as suspended.
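The quick fix could look roughly like the toy model below. This is a Python sketch of the idea only, not CPython's C implementation: `ToyFrame`, `ToyGenerator`, `FrameState`, and `suspend_generators_on_stack` are all hypothetical names, and the real frame and generator structures differ.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FrameState(Enum):
    SUSPENDED = "suspended"
    EXECUTING = "executing"


@dataclass
class ToyGenerator:
    name: str
    state: FrameState = FrameState.SUSPENDED


@dataclass
class ToyFrame:
    # Link to the caller's frame, forming the thread's frame list.
    previous: Optional["ToyFrame"] = None
    # Set only when the frame belongs to a generator or coroutine.
    generator: Optional[ToyGenerator] = None


def suspend_generators_on_stack(current_frame: Optional[ToyFrame]) -> int:
    """Walk the frame list and mark every executing generator as suspended.

    Returns the number of generators whose state was changed.
    """
    changed = 0
    frame = current_frame
    while frame is not None:
        gen = frame.generator
        if gen is not None and gen.state is FrameState.EXECUTING:
            gen.state = FrameState.SUSPENDED
            changed += 1
        frame = frame.previous
    return changed


# A thread stack: plain frame -> generator frame -> plain frame -> generator frame.
outer = ToyGenerator("outer", FrameState.EXECUTING)
inner = ToyGenerator("inner", FrameState.EXECUTING)
bottom = ToyFrame()
outer_frame = ToyFrame(previous=bottom, generator=outer)
middle = ToyFrame(previous=outer_frame)
top = ToyFrame(previous=middle, generator=inner)

changed = suspend_generators_on_stack(top)
print(changed, outer.state, inner.state)
```

After the walk, no generator reachable from the deleted thread's frame list is left in the executing state, so a traversal that skips executing generators would no longer skip them.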