
Commit b75a2bf

Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:

"So, it turns out we had a silly bug in the most fundamental part of workqueue for a very long time. AFAICS, this dates back to the pre-git era and has quite likely been there from the time workqueue was first introduced.

A work item uses its PENDING bit to synchronize multiple queuers. Anyone who wins the PENDING bit owns the pending state of the work item. Whether a queuer wins or loses the race, one thing should be guaranteed - there will soon be at least one execution of the work item after the queueing attempt - where "after" means that the execution instance would be able to see all the changes that the queuer has made prior to the queueing attempt.

Unfortunately, we were missing a smp_mb() after clearing PENDING for execution, so nothing guaranteed visibility of the changes that a queueing loser has made, which manifested as a reproducible blk-mq stall.

Lots of kudos to Roman for debugging the problem. The patch for -stable is the minimal one. For v4.7, Peter is working on a patch to make the code path slightly more efficient and less fragile"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix ghost PENDING flag while doing MQ IO
2 parents: 763cfc8 + 346c09f
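
To make the race in the pull message concrete, here is a minimal userspace sketch of the same pattern, using C11 atomics in place of the kernel's work_struct machinery. Everything here is illustrative, not kernel API: pending stands in for WORK_STRUCT_PENDING, event_indicated for the data a queuer publishes, and queue_side()/execute_side() for the two CPUs in the diagram below.

#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool pending;          /* stands in for WORK_STRUCT_PENDING */
static atomic_bool event_indicated;  /* data published by the queuer      */

/* Queuer (CPU#0): publish the event, then try to claim PENDING. */
void queue_side(void)
{
        /* "line 1": the store the work function must eventually observe */
        atomic_store_explicit(&event_indicated, true, memory_order_relaxed);

        /* "lines 2-4": like test_and_set_bit(PENDING), a value-returning
         * atomic that implies a full barrier */
        if (!atomic_exchange(&pending, true)) {
                /* won the race: actually queue an execution (elided) */
        }
        /* lost the race: rely on the in-flight execution seeing line 1 */
}

/* Executor (CPU#1): clear PENDING, then run the work function. */
void execute_side(void)
{
        /* "line 5": clear PENDING with a plain store, as set_work_data()
         * does in the kernel */
        atomic_store_explicit(&pending, false, memory_order_relaxed);

        /* "line 6": the fix; a full barrier so the LOAD below cannot be
         * reordered before the clear of PENDING */
        atomic_thread_fence(memory_order_seq_cst);

        /* "lines 7-8": the work function's LOAD of event_indicated */
        if (atomic_load_explicit(&event_indicated, memory_order_relaxed)) {
                /* handle the event */
        }
}

With the fence in place this is the classic store-buffering pattern with full barriers on both sides: if the executor's LOAD still misses the STORE, the clear of PENDING must already be visible to the queuer, so its test-and-set wins and a fresh execution is queued. Either way the update cannot be lost.
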

File tree

1 file changed: +29 −0 lines

kernel/workqueue.c

Lines changed: 29 additions & 0 deletions
@@ -666,6 +666,35 @@ static void set_work_pool_and_clear_pending(struct work_struct *work,
 	 */
 	smp_wmb();
 	set_work_data(work, (unsigned long)pool_id << WORK_OFFQ_POOL_SHIFT, 0);
+	/*
+	 * The following mb guarantees that previous clear of a PENDING bit
+	 * will not be reordered with any speculative LOADS or STORES from
+	 * work->current_func, which is executed afterwards.  This possible
+	 * reordering can lead to a missed execution on an attempt to queue
+	 * the same @work.  E.g. consider this case:
+	 *
+	 *   CPU#0                         CPU#1
+	 *   ----------------------------  --------------------------------
+	 *
+	 * 1  STORE event_indicated
+	 * 2  queue_work_on() {
+	 * 3    test_and_set_bit(PENDING)
+	 * 4 }                             set_..._and_clear_pending() {
+	 * 5                                 set_work_data() # clear bit
+	 * 6                                 smp_mb()
+	 * 7                               work->current_func() {
+	 * 8                                 LOAD event_indicated
+	 *                                 }
+	 *
+	 * Without an explicit full barrier the speculative LOAD on line 8
+	 * can be executed before CPU#0 does the STORE on line 1.  If that
+	 * happens, CPU#0 observes that the PENDING bit is still set and a
+	 * new execution of @work is not queued, in the hope that CPU#1
+	 * will eventually finish the queued @work.  Meanwhile CPU#1 does
+	 * not see that event_indicated is set, because the speculative
+	 * LOAD was executed before the actual STORE.
+	 */
+	smp_mb();
 }
 
 static void clear_work_data(struct work_struct *work)
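
For context on where this barrier sits: set_work_pool_and_clear_pending() is called from process_one_work() just before the work callback is invoked, so the new smp_mb() lands exactly between "clear PENDING" and "execute", pairing with the full barrier implied by a failed test_and_set_bit() on the queueing side. Roughly (heavily abridged, not the verbatim kernel source):

	/* process_one_work(), abridged */
	set_work_pool_and_clear_pending(work, pool->id);  /* clears PENDING; now ends in smp_mb() */
	...
	worker->current_func(work);                       /* runs the work item's function */
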
