Commit a1e1cb7

dm: fix redundant IO accounting for bios that need splitting
The risk of redundant IO accounting was not taken into consideration when
commit 18a25da ("dm: ensure bio submission follows a depth-first tree walk")
introduced IO splitting in terms of recursion via generic_make_request().

Fix this by subtracting the split bio's payload from the IO stats that were
already accounted for by start_io_acct() upon dm_make_request() entry. This
repeat oscillation of the IO accounting, up then down, isn't ideal but
refactoring DM core's IO splitting to pre-split bios _before_ they are
accounted turned out to be an excessive amount of change that will need a
full development cycle to refine and verify.

Before this fix:

  /dev/mapper/stripe_dev is a 4-way stripe using a 32k chunksize, so
  bios are split on 32k boundaries.

  # fio --name=16M --filename=/dev/mapper/stripe_dev --rw=write --bs=64k --size=16M \
        --iodepth=1 --ioengine=libaio --direct=1 --refill_buffers

  with debugging added:
  [103898.310264] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=0 len=128
  [103898.318704] device-mapper: core: __split_and_process_bio: recursing for following split bio:
  [103898.329136] device-mapper: core: start_io_acct: dm-2 WRITE bio->bi_iter.bi_sector=64 len=64
  ...

  16M written yet 136M (278528 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  278528

After this fix:

  16M written and 16M (32768 * 512b) accounted:
  # cat /sys/block/dm-2/stat | awk '{ print $7 }'
  32768

Fixes: 18a25da ("dm: ensure bio submission follows a depth-first tree walk")
Cc: [email protected] # 4.16+
Reported-by: Bryan Gurney <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
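
The size figures quoted in the message can be checked with a short conversion. This is only an illustrative sketch: `sectors_to_mib` and `SECTOR_BYTES` are hypothetical names, not kernel code, and the 512-byte sector size is taken from the "* 512b" arithmetic in the message itself.

```c
#include <stdint.h>

/* Sector size assumed by the "* 512b" arithmetic in the commit message. */
#define SECTOR_BYTES 512ULL

/*
 * Hypothetical helper: convert a sector count, such as the seventh field
 * of /sys/block/<dev>/stat (sectors written), to whole MiB.
 */
static uint64_t sectors_to_mib(uint64_t sectors)
{
	return (sectors * SECTOR_BYTES) >> 20;
}
```

Under that assumption, `sectors_to_mib(278528)` gives the 136M reported before the fix and `sectors_to_mib(32768)` gives the 16M reported after it.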
1 parent 57c3651 commit a1e1cb7

1 file changed: +16 -0 lines changed

drivers/md/dm.c

Lines changed: 16 additions & 0 deletions
@@ -1584,6 +1584,9 @@ static void init_clone_info(struct clone_info *ci, struct mapped_device *md,
 	ci->sector = bio->bi_iter.bi_sector;
 }
 
+#define __dm_part_stat_sub(part, field, subnd)	\
+	(part_stat_get(part, field) -= (subnd))
+
 /*
  * Entry point to split a bio into clones and submit them to the targets.
  */
@@ -1638,6 +1641,19 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 				struct bio *b = bio_split(bio, bio_sectors(bio) - ci.sector_count,
 							  GFP_NOIO, &md->queue->bio_split);
 				ci.io->orig_bio = b;
+
+				/*
+				 * Adjust IO stats for each split, otherwise upon queue
+				 * reentry there will be redundant IO accounting.
+				 * NOTE: this is a stop-gap fix, a proper fix involves
+				 * significant refactoring of DM core's bio splitting
+				 * (by eliminating DM's splitting and just using bio_split)
+				 */
+				part_stat_lock();
+				__dm_part_stat_sub(&dm_disk(md)->part0,
+						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
+				part_stat_unlock();
+
 				bio_chain(b, bio);
 				ret = generic_make_request(bio);
 				break;
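
The "up then down" adjustment in the hunk above can be illustrated with a small user-space model. This is a simplified sketch, not DM core code: `model_stats`, `model_submit`, and `model_accounted` are hypothetical names, the model assumes every bio is split at fixed chunk boundaries, and it makes no attempt to reproduce the exact totals quoted in the commit message. It only shows why accounting the whole bio on each entry double-counts the remainder, and how subtracting the remainder before re-entry cancels that.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-device "sectors" counter. */
struct model_stats {
	uint64_t sectors;
};

/*
 * Model one bio passing through a submit path that accounts the whole bio
 * on entry, processes only the part up to the next chunk boundary, and
 * re-enters with the remainder. With apply_fix set, the remainder is
 * subtracted before re-entry, mirroring the __dm_part_stat_sub() call.
 */
static void model_submit(struct model_stats *st, uint64_t sector,
			 uint64_t len, uint64_t chunk, int apply_fix)
{
	while (len > 0) {
		uint64_t to_boundary = chunk - (sector % chunk);
		uint64_t now = len < to_boundary ? len : to_boundary;
		uint64_t remainder = len - now;

		st->sectors += len;		/* accounting on entry */
		if (remainder && apply_fix)
			st->sectors -= remainder;	/* undo what re-entry will re-add */

		sector += now;			/* remainder re-enters the path */
		len = remainder;
	}
}

/* Convenience wrapper: total sectors accounted for a single bio. */
static uint64_t model_accounted(uint64_t sector, uint64_t len,
				uint64_t chunk, int apply_fix)
{
	struct model_stats st = { 0 };

	model_submit(&st, sector, len, chunk, apply_fix);
	return st.sectors;
}
```

For the first bio in the debug log (sector 0, len 128, split on a 64-sector boundary), the model accounts 192 sectors without the adjustment and 128 with it.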
