
[2.7] bpo-31530: Stop crashes when iterating over a file on multiple threads. #3672


Merged

Conversation

serhiy-storchaka (Member) commented Sep 20, 2017

@serhiy-storchaka serhiy-storchaka added the type-bug An unexpected behavior, bug, or error label Sep 20, 2017
@@ -652,6 +652,36 @@ def io_func():
self.f.writelines('')
self._test_close_open_io(io_func)

def test_iteration_torture(self):
Member: Please add a comment with at least "bpo-31530".

Member: done

pass
self._run_workers(iterate, 10)

def test_iteration_seek(self):
Member: Please add a comment with at least "bpo-31530".

Member: done

vstinner (Member) left a comment:

LGTM, but I would prefer to see reviews from other core developers before this change is merged, especially since @benjaminp proposed PR #3670 with a different approach.

if (f->unlocked_count > 0) {
PyErr_SetString(PyExc_IOError,
"seek() called during concurrent "
"operation on the same file object.");
Member: I would remove the final "."; it's uncommon in Python error messages.

Member Author: I agree, but this copies the error message at line 431. Do you want to remove the final "." from all error messages in this file?

Member: Oh. I looked quickly at the fileobject.c error messages, but I didn't notice that the error message already existed in close(). That's a good reason to generalize the check for concurrent operation :-)

In this case it's fine, keep the final dot :-)

@@ -0,0 +1 @@
Fixed crashes when iterating over a file on multiple threads.
Member: I suggest completing the description:

seek() and next() methods of file objects now raise an exception during concurrent operation on the same file object. A lock can be used to prevent the error.
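The "a lock can be used" advice from the suggested NEWS entry could be sketched as follows. This is a Python 3 illustration (the 2.7 code is analogous); the file contents, thread count, and names are illustrative, not taken from the patch:

```python
import os
import tempfile
import threading

# Build a small file to share between threads.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.writelines("line %d\n" % i for i in range(1000))

lock = threading.Lock()  # serializes every access to the shared file object
seen = []

def worker(f):
    while True:
        with lock:                # only one thread touches the file at a time
            line = f.readline()
        if not line:
            break
        seen.append(line)

with open(path) as f:
    threads = [threading.Thread(target=worker, args=(f,)) for _ in range(10)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

os.unlink(path)
print(len(seen))  # every line is read exactly once across the 10 threads
```

Because the lock serializes each readline() call, no thread ever performs I/O on the file object while another thread is mid-operation, so neither the old crash nor the new IOError can occur.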

Member: done

serhiy-storchaka (Member Author) left a comment:

Yes, and I would want to see a review from @benjaminp.


vstinner (Member): Oh, you removed the final dot. I'm fine with that too; it's just a minor thing, I don't care much :-) (But I like the idea of being consistent in error messages.)

vstinner (Member) left a comment:

LGTM.

(But again, I would like to see a review from at least one other core dev, especially @benjaminp.)


benjaminp (Contributor) left a comment:

The reason I wrote my patch is that it seems weird that concurrent f.read() works but iteration does not. This is probably fine, though, since no one can have been relying on the current behavior.

drop_readahead could also use an assert(f->unlocked_count == 0) assertion.

for l in f:
pass
except IOError:
pass
Contributor: It's probably a better stress test to continue looping even when you get an exception.
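The suggested keep-looping-after-an-exception pattern might look like the sketch below. This is a self-contained Python 3 illustration using an in-memory StringIO and a bounded `rounds` count so it terminates quickly; the PR's actual test uses an on-disk file with 2**15 lines and the `_run_workers` harness:

```python
import io
import threading

# Shared in-memory "file" standing in for the real on-disk test file.
data = "".join("line %d\n" % i for i in range(1000))

def iterate(f, rounds=5):
    # Keep looping even when a concurrent operation raises IOError,
    # as suggested in the review, instead of giving up on the first error.
    for _ in range(rounds):
        try:
            for _line in f:
                pass
            f.seek(0)
        except IOError:
            pass  # swallow the error and keep stressing the file object

f = io.StringIO(data)
workers = [threading.Thread(target=iterate, args=(f,)) for _ in range(10)]
for t in workers:
    t.start()
for t in workers:
    t.join()
print("survived")
```

The trade-off discussed below is runtime: retrying after every error multiplies the amount of I/O each worker performs.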

Member Author: This makes the test far too slow; I killed it after 8 minutes.

Contributor: Decrease the size of the file, then?

serhiy-storchaka (Member Author), Oct 4, 2017: 2**12 lines is not enough to provoke a crash, 2**13 to 2**14 lines provoke random crashes, and 2**15 lines are needed for stable crashing. After patching, the test with 2**13 lines takes about 40 seconds, and 2**15 lines need 3 minutes to finish. This is still too long.

for i in range(100):
f.seek(i*100, 0)
except IOError:
pass
Contributor: Similarly here.

benjaminp (Contributor) left a comment:

Okay. Please put an assert in drop_readahead, though.

serhiy-storchaka (Member Author):

drop_readahead could also use an assert(f->unlocked_count == 0) assertion.

This looks unsafe. f->unlocked_count can be non-zero if another thread performs a non-read operation like tell(). Moreover, it seems to me that the check should be moved inside readahead_get_line_skip(), because it can be called recursively after releasing and reacquiring the GIL, which allows another thread to increase f->unlocked_count.
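The unlocked_count guard being debated can be modeled in a few lines. This is a toy Python sketch, not the C implementation: in the real fileobject.c, f->unlocked_count is incremented while a thread has released the GIL for an I/O call, and the patch makes seek() refuse to run while that counter is non-zero. The class and method names here are invented for illustration:

```python
class GuardedFile:
    """Toy model of the f->unlocked_count check from this PR (bpo-31530)."""

    def __init__(self):
        self._pos = 0
        self.unlocked_count = 0   # models f->unlocked_count in fileobject.c

    def begin_unlocked_io(self):
        # Models FILE_BEGIN_ALLOW_THREADS: another thread is doing I/O
        # on this object with the GIL released.
        self.unlocked_count += 1

    def end_unlocked_io(self):
        # Models FILE_END_ALLOW_THREADS.
        self.unlocked_count -= 1

    def seek(self, pos):
        # The guarded operation: refuse to run concurrently with other I/O.
        if self.unlocked_count > 0:
            raise IOError("seek() called during concurrent "
                          "operation on the same file object")
        self._pos = pos

f = GuardedFile()
f.begin_unlocked_io()             # pretend another thread is mid-read
try:
    f.seek(100)
except IOError as exc:
    print(exc)                    # the guard fires instead of crashing
f.end_unlocked_io()
f.seek(100)                       # allowed again once the count drops to 0
```

This also shows why a bare assert(f->unlocked_count == 0) in drop_readahead would be wrong: the counter is legitimately non-zero whenever any other thread is in the middle of an unlocked call such as tell(), so it must be treated as an error condition, not an invariant.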

@serhiy-storchaka serhiy-storchaka merged commit 6401e56 into python:2.7 Nov 10, 2017
@serhiy-storchaka serhiy-storchaka deleted the concurrent-file-iteration-2.7 branch November 10, 2017 10:59
benjaminp added a commit that referenced this pull request Dec 31, 2017
Labels
type-bug An unexpected behavior, bug, or error
5 participants