Commit b49773e
damien-lemoal authored and axboe committed
block: Disable write plugging for zoned block devices

Simultaneously writing to a sequential zone of a zoned block device
from multiple contexts requires mutual exclusion for BIO issuing to
ensure that writes happen sequentially. However, even for a well-behaved
user correctly implementing such synchronization, BIO plugging may
interfere and cause BIOs from the different contexts to be reordered if
plugging is done outside of the mutual exclusion section, e.g. when the
plug was started by a function higher in the call chain than the
function issuing BIOs.

  Context A                             Context B

   | blk_start_plug()
   | ...
   | seq_write_zone()
     | mutex_lock(zone)
     | bio-0->bi_iter.bi_sector = zone->wp
     | zone->wp += bio_sectors(bio-0)
     | submit_bio(bio-0)
     | bio-1->bi_iter.bi_sector = zone->wp
     | zone->wp += bio_sectors(bio-1)
     | submit_bio(bio-1)
     | mutex_unlock(zone)
     | return
   | -----------------------> | seq_write_zone()
                                | mutex_lock(zone)
                                | bio-2->bi_iter.bi_sector = zone->wp
                                | zone->wp += bio_sectors(bio-2)
                                | submit_bio(bio-2)
                                | mutex_unlock(zone)
   | <------------------------- |
   | blk_finish_plug()

In the above example, despite the mutex synchronization ensuring the
correct BIO issuing order 0, 1, 2, context A BIOs 0 and 1 end up being
issued after BIO 2 of context B, when the plug is released with
blk_finish_plug().

While this problem can be addressed using the blk_flush_plug_list()
function (in the above example, the call must be inserted before the
zone mutex lock is released), a simple generic solution in the block
layer avoids this additional code in all zoned block device user code.
The simple generic solution implemented with this patch is to introduce
the internal helper function blk_mq_plug() to access the current
context plug on BIO submission. This helper returns the current plug
only if the target device is not a zoned block device or if the BIO to
be plugged is not a write operation. Otherwise, the caller context plug
is ignored and NULL is returned, so writes to zoned block devices are
never plugged.

Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
1 parent 9305d5d commit b49773e
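
The reordering described in the commit message can be replayed in user space. The following is only a minimal sketch, not kernel code: every `toy_*` name is hypothetical, the zone mutex is left out because the timeline is replayed on a single thread in the exact order of the diagram, and each BIO is assumed to be 8 sectors long. A write whose start sector does not match the device write pointer is counted as failed, which is how a sequential zone is assumed to behave here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_BIOS 8

/* Hypothetical stand-ins for kernel objects; not real kernel types. */
struct toy_bio {
	int id;
	unsigned int sector;
	unsigned int nr_sectors;
};

struct toy_plug {
	struct toy_bio held[TOY_MAX_BIOS];
	size_t count;
};

struct toy_zone {
	unsigned int wp;         /* write pointer tracked by the submitters */
	unsigned int dev_wp;     /* write pointer as seen by the device */
	int order[TOY_MAX_BIOS]; /* BIO ids in the order they reach the device */
	size_t dispatched;
	unsigned int errors;     /* writes that missed the device write pointer */
};

/* The device accepts a write only if it starts at its write pointer. */
static void toy_dispatch(struct toy_zone *z, const struct toy_bio *b)
{
	assert(z->dispatched < TOY_MAX_BIOS);
	z->order[z->dispatched++] = b->id;
	if (b->sector != z->dev_wp) {
		z->errors++;
		return;
	}
	z->dev_wp += b->nr_sectors;
}

/* Models submit_bio(): hold the BIO in the plug if there is one. */
static void toy_submit(struct toy_zone *z, struct toy_plug *plug,
		       int id, unsigned int nr_sectors)
{
	struct toy_bio b = { id, z->wp, nr_sectors };

	z->wp += nr_sectors;
	if (plug) {
		assert(plug->count < TOY_MAX_BIOS);
		plug->held[plug->count++] = b;
	} else {
		toy_dispatch(z, &b);
	}
}

/* Models blk_finish_plug(): release everything held, in submission order. */
static void toy_finish_plug(struct toy_zone *z, struct toy_plug *plug)
{
	for (size_t i = 0; i < plug->count; i++)
		toy_dispatch(z, &plug->held[i]);
	plug->count = 0;
}

/*
 * Replays the timeline from the commit message. Context A submits BIOs 0
 * and 1 (plugged only if plug_zoned_writes is true), context B submits
 * BIO 2 without a plug, then context A finishes its plug.
 * Returns the number of failed writes and fills in the dispatch order.
 */
static unsigned int toy_replay(bool plug_zoned_writes, int order_out[3])
{
	struct toy_zone z = { 0 };
	struct toy_plug plug_a = { 0 };
	struct toy_plug *a = plug_zoned_writes ? &plug_a : NULL;

	toy_submit(&z, a, 0, 8);    /* context A, inside the zone lock */
	toy_submit(&z, a, 1, 8);
	toy_submit(&z, NULL, 2, 8); /* context B, inside the zone lock */
	toy_finish_plug(&z, &plug_a);

	for (size_t i = 0; i < 3; i++)
		order_out[i] = z.order[i];
	return z.errors;
}
```

In this model, `toy_replay(true, order)` dispatches BIO 2 ahead of BIOs 0 and 1 and reports one failed write, while `toy_replay(false, order)` keeps the submission order and reports none. The second case corresponds to what the patch achieves by never plugging writes to zoned devices.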

3 files changed: +34 -2 lines changed

block/blk-core.c

Lines changed: 1 addition & 1 deletion
@@ -688,7 +688,7 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 	struct request *rq;
 	struct list_head *plug_list;
 
-	plug = current->plug;
+	plug = blk_mq_plug(q, bio);
 	if (!plug)
 		return false;
 

block/blk-mq.c

Lines changed: 1 addition & 1 deletion
@@ -1973,7 +1973,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 
 	blk_mq_bio_to_request(rq, bio, nr_segs);
 
-	plug = current->plug;
+	plug = blk_mq_plug(q, bio);
 	if (unlikely(is_flush_fua)) {
 		/* bypass scheduler for flush rq */
 		blk_insert_flush(rq);

block/blk-mq.h

Lines changed: 32 additions & 0 deletions
@@ -233,4 +233,36 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
 		qmap->mq_map[cpu] = 0;
 }
 
+/*
+ * blk_mq_plug() - Get caller context plug
+ * @q: request queue
+ * @bio : the bio being submitted by the caller context
+ *
+ * Plugging, by design, may delay the insertion of BIOs into the elevator in
+ * order to increase BIO merging opportunities. This however can cause BIO
+ * insertion order to change from the order in which submit_bio() is being
+ * executed in the case of multiple contexts concurrently issuing BIOs to a
+ * device, even if these context are synchronized to tightly control BIO issuing
+ * order. While this is not a problem with regular block devices, this ordering
+ * change can cause write BIO failures with zoned block devices as these
+ * require sequential write patterns to zones. Prevent this from happening by
+ * ignoring the plug state of a BIO issuing context if the target request queue
+ * is for a zoned block device and the BIO to plug is a write operation.
+ *
+ * Return current->plug if the bio can be plugged and NULL otherwise
+ */
+static inline struct blk_plug *blk_mq_plug(struct request_queue *q,
+					   struct bio *bio)
+{
+	/*
+	 * For regular block devices or read operations, use the context plug
+	 * which may be NULL if blk_start_plug() was not executed.
+	 */
+	if (!blk_queue_is_zoned(q) || !op_is_write(bio_op(bio)))
+		return current->plug;
+
+	/* Zoned block device write operation case: do not plug the BIO */
+	return NULL;
+}
+
 #endif
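
The decision made by blk_mq_plug() can be checked as a small truth table outside the kernel. The sketch below is a user-space model under stated assumptions, not the kernel implementation: the `model_*` types and names are hypothetical, the current context plug is passed in explicitly instead of being read from `current->plug`, and operations are reduced to a plain read/write pair.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real code uses struct request_queue, struct
 * bio and struct blk_plug. */
struct model_plug { int unused; };
struct model_queue { bool zoned; };
enum model_op { MODEL_READ, MODEL_WRITE };
struct model_bio { enum model_op op; };

/*
 * Mirrors the condition in blk_mq_plug(): hand back the caller's plug
 * (which may itself be NULL) unless the BIO is a write to a zoned device.
 */
static struct model_plug *model_mq_plug(struct model_plug *current_plug,
					const struct model_queue *q,
					const struct model_bio *bio)
{
	if (!q->zoned || bio->op != MODEL_WRITE)
		return current_plug;

	/* Write to a zoned device: behave as if no plug was started. */
	return NULL;
}
```

Of the four queue/operation combinations, only a write to a zoned queue yields NULL. Reads from zoned devices and all I/O to regular devices keep using the caller's plug, so merging opportunities are lost only where ordering matters.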
