Commit c06dfd1

krisman-at-collabora authored and Mike Snitzer committed
dm mpath: provide high-resolution timer to HST for bio-based
The precision loss of reading IO start_time with jiffies_to_nsecs
instead of using a high resolution timer degrades HST path prediction
for BIO-based mpath on high load workloads.

Below, I show the utilization share of each disk in a 10-disk multipath
with asymmetrical disk access cost, while being exercised by a randwrite
FIO benchmark with high submission queue depth (depth=64). It is
possible to see that the HST path selection degrades heavily at high
IOPS in BIO-mpath, underutilizing the slower paths far more than
expected. This seems to be caused by the start_time truncation, which
makes some IO seem much slower than it actually is. In this scenario ST
outperforms HST for bio-mpath, but not for mq-mpath, which already uses
ktime_get_ns().

The third column shows utilization with this patch applied. It is easy
to see that now HST prediction is much closer to the ideal distribution
(calculated considering the real cost of each path).

|     |   ST | HST (orig) | HST(ktime) | Best |
| sdd | 0.17 |       0.20 |       0.17 | 0.18 |
| sde | 0.17 |       0.20 |       0.17 | 0.18 |
| sdf | 0.17 |       0.20 |       0.17 | 0.18 |
| sdg | 0.06 |       0.00 |       0.06 | 0.04 |
| sdh | 0.03 |       0.00 |       0.03 | 0.02 |
| sdi | 0.03 |       0.00 |       0.03 | 0.02 |
| sdj | 0.02 |       0.00 |       0.01 | 0.01 |
| sdk | 0.02 |       0.00 |       0.01 | 0.01 |
| sdl | 0.17 |       0.20 |       0.17 | 0.18 |
| sdm | 0.17 |       0.20 |       0.17 | 0.18 |

This issue was originally discussed [1] when we first merged HST, and
this patch was left as low-hanging fruit to be solved later.

Regarding the implementation, as suggested by Mike in that mail thread,
in order to avoid the overhead of ktime_get_ns for other selectors, this
patch adds a flag for the selector code to request the high-resolution
timer.

I tested this using the same benchmark used in the original HST
submission.

Full test and benchmark scripts are available here:

https://people.collabora.com/~krisman/HST-BIO-MPATH/

[1] https://lore.kernel.org/lkml/[email protected]/T/

Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
[snitzer: cleaned up various implementation details]
Signed-off-by: Mike Snitzer <[email protected]>
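
The start_time truncation described in the message can be illustrated with a small userspace sketch. It is not kernel code: `coarse_start_ns` and `service_time_ns` are hypothetical helpers, the tick rate is passed in as a parameter (any HZ value used with it is an illustrative assumption), and the sketch assumes the end time is sampled with nanosecond precision on the same clock origin as the start time.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: round a nanosecond timestamp down to the start of
 * its tick, mimicking a start time that was recorded in jiffies and later
 * converted back to nanoseconds.
 */
static uint64_t coarse_start_ns(uint64_t start_ns, unsigned int hz)
{
	assert(hz > 0);
	uint64_t tick_ns = NSEC_PER_SEC / hz;

	return (start_ns / tick_ns) * tick_ns;
}

/* Service time as a selector would see it: end minus the recorded start. */
static uint64_t service_time_ns(uint64_t recorded_start_ns, uint64_t end_ns)
{
	return end_ns - recorded_start_ns;
}
```

With a 4 ms tick (hz = 250), an IO that starts at 7.9 ms and completes at 8.0 ms really took 100 us, but the truncated start time of 4.0 ms makes it look like 4 ms. Errors of this size, applied unevenly across paths, are the kind of distortion the message attributes to the coarse timer.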
1 parent 567dd8f commit c06dfd1

3 files changed: +23 -1 lines changed


drivers/md/dm-mpath.c

Lines changed: 7 additions & 1 deletion
@@ -105,6 +105,7 @@ struct multipath {
 struct dm_mpath_io {
 	struct pgpath *pgpath;
 	size_t nr_bytes;
+	u64 start_time_ns;
 };
 
 typedef int (*action_fn) (struct pgpath *pgpath);
@@ -295,6 +296,7 @@ static void multipath_init_per_bio_data(struct bio *bio, struct dm_mpath_io **mp
 
 	mpio->nr_bytes = bio->bi_iter.bi_size;
 	mpio->pgpath = NULL;
+	mpio->start_time_ns = 0;
 	*mpio_p = mpio;
 
 	dm_bio_record(bio_details, bio);
@@ -647,6 +649,9 @@ static int __multipath_map_bio(struct multipath *m, struct bio *bio,
 
 	mpio->pgpath = pgpath;
 
+	if (dm_ps_use_hr_timer(pgpath->pg->ps.type))
+		mpio->start_time_ns = ktime_get_ns();
+
 	bio->bi_status = 0;
 	bio_set_dev(bio, pgpath->path.dev->bdev);
 	bio->bi_opf |= REQ_FAILFAST_TRANSPORT;
@@ -1713,7 +1718,8 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
 
 		if (ps->type->end_io)
 			ps->type->end_io(ps, &pgpath->path, mpio->nr_bytes,
-					 dm_start_time_ns_from_clone(clone));
+					 (mpio->start_time_ns ?:
+					  dm_start_time_ns_from_clone(clone)));
 	}
 
 	return r;
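
The last hunk relies on `a ?: b`, a GNU C extension that behaves like `a ? a : b` with `a` evaluated only once. A minimal portable sketch of the same fallback follows; `struct io_sketch` and `pick_start_time_ns` are hypothetical names that stand in for `struct dm_mpath_io` and the expression passed to `end_io`, and the coarse timestamp is passed in precomputed rather than read from a clone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dm_mpath_io; only the field used here. */
struct io_sketch {
	uint64_t start_time_ns; /* 0 means no high-resolution stamp was taken */
};

/*
 * Prefer the high-resolution stamp when one was recorded at map time,
 * otherwise fall back to the coarse, jiffies-derived value.
 */
static uint64_t pick_start_time_ns(const struct io_sketch *io,
				   uint64_t coarse_ns)
{
	assert(io != NULL);
	return io->start_time_ns ? io->start_time_ns : coarse_ns;
}
```

Using 0 as the "not set" sentinel is what lets multipath_init_per_bio_data() reset the field unconditionally while only selectors that request the timer pay for ktime_get_ns().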

drivers/md/dm-path-selector.h

Lines changed: 15 additions & 0 deletions
@@ -26,11 +26,26 @@ struct path_selector {
 	void *context;
 };
 
+/*
+ * If a path selector uses this flag, a high resolution timer is used
+ * (via ktime_get_ns) to account for IO start time in BIO-based mpath.
+ * This improves performance of some path selectors (i.e. HST), in
+ * exchange for slightly higher overhead when submitting the BIO.
+ * The extra cost is usually offset by improved path selection for
+ * some benchmarks.
+ *
+ * This has no effect for request-based mpath, since it already uses a
+ * higher precision timer by default.
+ */
+#define DM_PS_USE_HR_TIMER		0x00000001
+#define dm_ps_use_hr_timer(type)	((type)->features & DM_PS_USE_HR_TIMER)
+
 /* Information about a path selector type */
 struct path_selector_type {
 	char *name;
 	struct module *module;
 
+	unsigned int features;
 	unsigned int table_args;
 	unsigned int info_args;
 
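
The opt-in mechanism added in this header is a plain feature bitmask test. The sketch below reproduces it in userspace under stated assumptions: `struct ps_type_sketch` is a hypothetical, trimmed-down stand-in for `struct path_selector_type`, and `ps_wants_hr_timer` mirrors what the `dm_ps_use_hr_timer()` macro checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DM_PS_USE_HR_TIMER 0x00000001

/* Hypothetical, trimmed-down stand-in for struct path_selector_type. */
struct ps_type_sketch {
	const char *name;
	unsigned int features; /* bitmask of DM_PS_* feature flags */
};

/* True when the selector type has asked for high-resolution start times. */
static bool ps_wants_hr_timer(const struct ps_type_sketch *type)
{
	assert(type != NULL);
	return (type->features & DM_PS_USE_HR_TIMER) != 0;
}
```

Because selector types are defined with designated initializers, any selector that does not set `.features` gets 0 there and keeps the existing behaviour without being modified.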
3651

drivers/md/dm-ps-historical-service-time.c

Lines changed: 1 addition & 0 deletions
@@ -523,6 +523,7 @@ static int hst_end_io(struct path_selector *ps, struct dm_path *path,
 static struct path_selector_type hst_ps = {
 	.name = "historical-service-time",
 	.module = THIS_MODULE,
+	.features = DM_PS_USE_HR_TIMER,
 	.table_args = 1,
 	.info_args = 3,
 	.create = hst_create,
