-
-
Notifications
You must be signed in to change notification settings - Fork 32.2k
[2.7] bpo-31530: Stop crashes when iterating over a file on multiple threads. #3672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[2.7] bpo-31530: Stop crashes when iterating over a file on multiple threads. #3672
Conversation
@@ -652,6 +652,36 @@ def io_func(): | |||
self.f.writelines('') | |||
self._test_close_open_io(io_func) | |||
|
|||
def test_iteration_torture(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment with a least "bpo-31530".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pass | ||
self._run_workers(iterate, 10) | ||
|
||
def test_iteration_seek(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment with a least "bpo-31530".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but I would prefer to see reviews of other core developers before seeing this change merged. Especially because @benjaminp proposed PR #3670 with a different approach.
Objects/fileobject.c
Outdated
if (f->unlocked_count > 0) { | ||
PyErr_SetString(PyExc_IOError, | ||
"seek() called during concurrent " | ||
"operation on the same file object."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the final ".", it's uncommon in Python error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. I looked quickly at fileobject.c error messages, but I didn't notice that the error message already existed in close(). That's a good reason to generalize the check for concurrent operation :-)
In this case, it's fine, keep the final dot :-)
@@ -0,0 +1 @@ | |||
Fixed crashes when iterating over a file on multiple threads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to complete the description:
seek() and next() methods of file objects now raise an exception during concurrent operation on the same file object. A lock can be used to prevent the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and I would want to see reviews of @benjaminp.
Objects/fileobject.c
Outdated
if (f->unlocked_count > 0) { | ||
PyErr_SetString(PyExc_IOError, | ||
"seek() called during concurrent " | ||
"operation on the same file object."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. But this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?
Oh, you removed the final dot. I'm fine with that too. It's just a minor thing, I don't care much :-) (But I like the idea of being consistent in error messages.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
(But again, I would like to see a review of at least another core dev, especially @benjaminp.)
@@ -652,6 +652,36 @@ def io_func(): | |||
self.f.writelines('') | |||
self._test_close_open_io(io_func) | |||
|
|||
def test_iteration_torture(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pass | ||
self._run_workers(iterate, 10) | ||
|
||
def test_iteration_seek(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -0,0 +1 @@ | |||
Fixed crashes when iterating over a file on multiple threads. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I wrote my patch is that it seems weird that concurrent f.read() works but not iteration. This is probably fine, though, since no one can have been relying on the current behavior.
drop_readahead
could also use a assert(f->unlocked_count == 0);
assert
for l in f: | ||
pass | ||
except IOError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably a better stress test to continue looping even when you get an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the test tooooooo slow. I have killed it after 8 minutes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decrease the size of the file then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2**12
lines is not enough for provoking a crash, 2**13
to 2**14
lines provoke random crashes, and 2**15
lines is needed for stable crashing. After patching the test with 2**13
lines takes about 40 sec, and 2**15
lines need 3 minutes to finish. This is still too long.
for i in range(100): | ||
f.seek(i*100, 0) | ||
except IOError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Please put an assert
in drop_readahead
, though.
This looks unsafe. |
… file on multiple threads. (#3672)"
https://bugs.python.org/issue31530