Skip to content

[2.7] bpo-32186: Release the GIL during fstat and lseek calls #4651

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 7, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Creating io.FileIO() and builtin file() objects now release the GIL when
checking the file descriptor. io.FileIO.readall(), io.FileIO.read(), and
file.read() now release the GIL when getting the file size. Fixed hang of all
threads with inaccessible NFS server. Patch by Nir Soffer.
55 changes: 44 additions & 11 deletions Modules/_io/fileio.c
Original file line number Diff line number Diff line change
Expand Up @@ -146,9 +146,15 @@ dircheck(fileio* self, PyObject *nameobj)
{
#if defined(HAVE_FSTAT) && defined(S_IFDIR) && defined(EISDIR)
struct stat buf;
int res;
if (self->fd < 0)
return 0;
if (fstat(self->fd, &buf) == 0 && S_ISDIR(buf.st_mode)) {

Py_BEGIN_ALLOW_THREADS
res = fstat(self->fd, &buf);
Py_END_ALLOW_THREADS

if (res == 0 && S_ISDIR(buf.st_mode)) {
errno = EISDIR;
PyErr_SetFromErrnoWithFilenameObject(PyExc_IOError, nameobj);
return -1;
Expand All @@ -162,17 +168,34 @@ check_fd(int fd)
{
#if defined(HAVE_FSTAT)
struct stat buf;
if (!_PyVerify_fd(fd) || (fstat(fd, &buf) < 0 && errno == EBADF)) {
PyObject *exc;
char *msg = strerror(EBADF);
exc = PyObject_CallFunction(PyExc_OSError, "(is)",
EBADF, msg);
PyErr_SetObject(PyExc_OSError, exc);
Py_XDECREF(exc);
return -1;
int res;
PyObject *exc;
char *msg;

if (!_PyVerify_fd(fd)) {
goto badfd;
}
#endif

Py_BEGIN_ALLOW_THREADS
res = fstat(fd, &buf);
Py_END_ALLOW_THREADS

if (res < 0 && errno == EBADF) {
goto badfd;
}

return 0;

badfd:
msg = strerror(EBADF);
exc = PyObject_CallFunction(PyExc_OSError, "(is)",
EBADF, msg);
PyErr_SetObject(PyExc_OSError, exc);
Py_XDECREF(exc);
return -1;
#else
return 0;
#endif
}


Expand Down Expand Up @@ -519,9 +542,19 @@ new_buffersize(fileio *self, size_t currentsize)
#ifdef HAVE_FSTAT
off_t pos, end;
struct stat st;
if (fstat(self->fd, &st) == 0) {
int res;

Py_BEGIN_ALLOW_THREADS
res = fstat(self->fd, &st);
Py_END_ALLOW_THREADS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

wouldn't it be slightly better to make the 2 sys calls in the same ALLOW_THREADS block, given they follow each other in the usual case ?

like this:

Py_BEGIN_ALLOW_THREADS
res = fstat(self->fd, &st);
if (res == 0) {
    pos = lseek(self->fd, 0L, SEEK_CUR);
}
Py_END_ALLOW_THREADS
if (res == 0) {
    end = st.st_size;
    # etc..

regards,

gst.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @gst. The code is less clear this way, and we are not calling this in a tight loop, so I don't think this optimization is needed.

Copy link
Contributor

@gst gst Dec 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed.
but now I see that the lseek call isn't checked for error (return value == -1)..
shouldn't it be actually ?

EDIT: well, it falls back to the returned default value in such case, so guess it's good enough..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current error handling is fragile. But I want to make the minimal change needed for this issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best would be to backport _Py_fstat() from master: it returns the file size properly on Windows by using GetFileInformationByHandle(). Be honestly, I don't think that it's worth it and it's perfectly fine to release/acquire the GIL twice in this short function.


if (res == 0) {
end = st.st_size;

Py_BEGIN_ALLOW_THREADS
pos = lseek(self->fd, 0L, SEEK_CUR);
Py_END_ALLOW_THREADS

/* Files claiming a size smaller than SMALLCHUNK may
actually be streaming pseudo-files. In this case, we
apply the more aggressive algorithm below.
Expand Down
21 changes: 18 additions & 3 deletions Objects/fileobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -121,10 +121,15 @@ dircheck(PyFileObject* f)
{
#if defined(HAVE_FSTAT) && defined(S_IFDIR) && defined(EISDIR)
struct stat buf;
int res;
if (f->f_fp == NULL)
return f;
if (fstat(fileno(f->f_fp), &buf) == 0 &&
S_ISDIR(buf.st_mode)) {

Py_BEGIN_ALLOW_THREADS
res = fstat(fileno(f->f_fp), &buf);
Py_END_ALLOW_THREADS

if (res == 0 && S_ISDIR(buf.st_mode)) {
char *msg = strerror(EISDIR);
PyObject *exc = PyObject_CallFunction(PyExc_IOError, "(isO)",
EISDIR, msg, f->f_name);
Expand Down Expand Up @@ -1010,7 +1015,13 @@ new_buffersize(PyFileObject *f, size_t currentsize)
#ifdef HAVE_FSTAT
off_t pos, end;
struct stat st;
if (fstat(fileno(f->f_fp), &st) == 0) {
int res;

Py_BEGIN_ALLOW_THREADS
res = fstat(fileno(f->f_fp), &st);
Py_END_ALLOW_THREADS

if (res == 0) {
end = st.st_size;
/* The following is not a bug: we really need to call lseek()
*and* ftell(). The reason is that some stdio libraries
Expand All @@ -1021,7 +1032,11 @@ new_buffersize(PyFileObject *f, size_t currentsize)
works. We can't use the lseek() value either, because we
need to take the amount of buffered data into account.
(Yet another reason why stdio stinks. :-) */

Py_BEGIN_ALLOW_THREADS
pos = lseek(fileno(f->f_fp), 0L, SEEK_CUR);
Py_END_ALLOW_THREADS

if (pos >= 0) {
pos = ftell(f->f_fp);
}
Expand Down