Commit 4efaa5a

epoll: be better about file lifetimes
epoll can call out to vfs_poll() with a file pointer that may race with
the last 'fput()'. That would make f_count go down to zero, and while
the ep->mtx locking means that the resulting file pointer tear-down will
be blocked until the poll returns, it means that f_count is already
dead, and any use of it won't actually get a reference to the file any
more: it's dead regardless.

Make sure we have a valid ref on the file pointer before we call down to
vfs_poll() from the epoll routines.

Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: [email protected]
Reviewed-by: Jens Axboe <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
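
The "already dead" refcount described above can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming C11 atomics as a stand-in for the kernel's atomic_long_t; `struct obj`, `obj_get_not_zero()` and `obj_put()` are hypothetical names that do not appear in the patch. It only approximates the contract of atomic_long_inc_not_zero(): a count that has reached zero can never be raised again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the refcount matters here. */
struct obj {
	atomic_long count;
};

/*
 * Take a reference only if the count is still non-zero.
 * Userspace approximation of the increment-unless-zero contract.
 */
static bool obj_get_not_zero(struct obj *o)
{
	long old = atomic_load(&o->count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->count, &old, old + 1))
			return true;
	}
	return false;	/* count already hit zero: the object is dead */
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	long old = atomic_fetch_sub(&o->count, 1);

	assert(old > 0);	/* dropping a reference we do not hold is a bug */
	return old == 1;
}
```

Once `obj_put()` has reported the last reference gone, every later `obj_get_not_zero()` fails, which is the condition the patch checks for before calling into vfs_poll().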
1 parent f462ae0 commit 4efaa5a

1 file changed, +37 -1 lines changed

fs/eventpoll.c

Lines changed: 37 additions & 1 deletion
@@ -979,6 +979,34 @@ static __poll_t __ep_eventpoll_poll(struct file *file, poll_table *wait, int dep
 	return res;
 }

+/*
+ * The ffd.file pointer may be in the process of being torn down due to
+ * being closed, but we may not have finished eventpoll_release() yet.
+ *
+ * Normally, even with the atomic_long_inc_not_zero, the file may have
+ * been free'd and then gotten re-allocated to something else (since
+ * files are not RCU-delayed, they are SLAB_TYPESAFE_BY_RCU).
+ *
+ * But for epoll, users hold the ep->mtx mutex, and as such any file in
+ * the process of being free'd will block in eventpoll_release_file()
+ * and thus the underlying file allocation will not be free'd, and the
+ * file re-use cannot happen.
+ *
+ * For the same reason we can avoid a rcu_read_lock() around the
+ * operation - 'ffd.file' cannot go away even if the refcount has
+ * reached zero (but we must still not call out to ->poll() functions
+ * etc).
+ */
+static struct file *epi_fget(const struct epitem *epi)
+{
+	struct file *file;
+
+	file = epi->ffd.file;
+	if (!atomic_long_inc_not_zero(&file->f_count))
+		file = NULL;
+	return file;
+}
+
 /*
  * Differs from ep_eventpoll_poll() in that internal callers already have
  * the ep->mtx so we need to start from depth=1, such that mutex_lock_nested()
@@ -987,14 +1015,22 @@ static __poll_t __ep_eventpoll_poll(struct file *file, poll_table *wait, int dep
 static __poll_t ep_item_poll(const struct epitem *epi, poll_table *pt,
 				 int depth)
 {
-	struct file *file = epi->ffd.file;
+	struct file *file = epi_fget(epi);
 	__poll_t res;

+	/*
+	 * We could return EPOLLERR | EPOLLHUP or something, but let's
+	 * treat this more as "file doesn't exist, poll didn't happen".
+	 */
+	if (!file)
+		return 0;
+
 	pt->_key = epi->event.events;
 	if (!is_file_epoll(file))
 		res = vfs_poll(file, pt);
 	else
 		res = __ep_eventpoll_poll(file, pt, depth);
+	fput(file);
 	return res & epi->event.events;
 }
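
The get / poll / put shape that ep_item_poll() takes after this change can be sketched in userspace. This is an illustration only, assuming C11 atomics in place of the kernel primitives; `struct fake_file`, `struct fake_item`, `item_fget()`, `fake_fput()` and `item_poll()` are hypothetical names, and the `ready` field merely stands in for whatever vfs_poll() would report. Tear-down on the final put is deliberately omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct file and struct epitem. */
struct fake_file {
	atomic_long f_count;
	unsigned int ready;	/* events a poll callback would report */
};

struct fake_item {
	struct fake_file *file;
	unsigned int events;	/* events the caller asked for */
};

/* Return the file with an extra reference, or NULL if it is already dead. */
static struct fake_file *item_fget(const struct fake_item *it)
{
	struct fake_file *f = it->file;
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return f;
	}
	return NULL;
}

/* Drop the temporary reference (no tear-down in this sketch). */
static void fake_fput(struct fake_file *f)
{
	long old = atomic_fetch_sub(&f->f_count, 1);

	assert(old > 0);
}

/* Poll only while holding a valid reference; a dead file reports nothing. */
static unsigned int item_poll(const struct fake_item *it)
{
	struct fake_file *f = item_fget(it);
	unsigned int res;

	if (!f)
		return 0;	/* "file doesn't exist, poll didn't happen" */
	res = f->ready;		/* stands in for the vfs_poll() call */
	fake_fput(f);
	return res & it->events;
}
```

With a live file, `item_poll()` returns the requested subset of ready events and leaves the count unchanged; with a count of zero it returns 0 without touching the file's poll state.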