
Commit 81f1ba5

urezki authored and torvalds committed
mm/vmalloc: remove preempt_disable/enable when doing preloading
Some background. The preemption was disabled before to guarantee that a preloaded object is available for the CPU it was stored for. That was achieved by combining the disabling of preemption with taking the spin lock while ne_fit_preload_node is checked.

The aim was to not allocate in atomic context when the spinlock is taken later, for regular vmap allocations. But that approach conflicts with the CONFIG_PREEMPT_RT philosophy: calling spin_lock() with preemption disabled is forbidden in a CONFIG_PREEMPT_RT kernel.

Therefore, get rid of preempt_disable() and preempt_enable() when the preload is done for splitting purpose. As a result we no longer guarantee that a CPU is preloaded; instead we minimize the case when it is not, by populating the per-CPU preload pointer under the vmap_area_lock. This implies that at least each caller that has done the preallocation will not fall back to an atomic allocation later.

It is possible that the preallocation is pointless, or that no preallocation is done because of the race, but the data shows this is really rare. For example, I ran a special test case that follows the preload pattern and path: 20 "unbind" threads run it and each does 1000000 allocations. Only 3.5 times among 1000000 allocations was a CPU not preloaded. So it can happen, but the number is negligible.

[[email protected]: changelog additions]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 82dd23e ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: Steven Rostedt (VMware) <[email protected]>
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Daniel Wagner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent dcf61ff · commit 81f1ba5

File tree

1 file changed: +20 -17 lines


mm/vmalloc.c

Lines changed: 20 additions & 17 deletions
@@ -1077,31 +1077,34 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 retry:
         /*
-         * Preload this CPU with one extra vmap_area object to ensure
-         * that we have it available when fit type of free area is
-         * NE_FIT_TYPE.
+         * Preload this CPU with one extra vmap_area object. It is used
+         * when fit type of free area is NE_FIT_TYPE. Please note, it
+         * does not guarantee that an allocation occurs on a CPU that
+         * is preloaded, instead we minimize the case when it is not.
+         * It can happen because of cpu migration, because there is a
+         * race until the below spinlock is taken.
          *
          * The preload is done in non-atomic context, thus it allows us
          * to use more permissive allocation masks to be more stable under
-         * low memory condition and high memory pressure.
+         * low memory condition and high memory pressure. In rare case,
+         * if not preloaded, GFP_NOWAIT is used.
          *
-         * Even if it fails we do not really care about that. Just proceed
-         * as it is. "overflow" path will refill the cache we allocate from.
+         * Set "pva" to NULL here, because of "retry" path.
          */
-        preempt_disable();
-        if (!__this_cpu_read(ne_fit_preload_node)) {
-                preempt_enable();
-                pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-                preempt_disable();
+        pva = NULL;
 
-                if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
-                        if (pva)
-                                kmem_cache_free(vmap_area_cachep, pva);
-                }
-        }
+        if (!this_cpu_read(ne_fit_preload_node))
+                /*
+                 * Even if it fails we do not really care about that.
+                 * Just proceed as it is. If needed "overflow" path
+                 * will refill the cache we allocate from.
+                 */
+                pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
 
         spin_lock(&vmap_area_lock);
-        preempt_enable();
+
+        if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
+                kmem_cache_free(vmap_area_cachep, pva);
 
         /*
          * If an allocation fails, the "vend" address is
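
To make the pattern easier to follow outside kernel context, here is a minimal user-space sketch of the idea the hunk above adopts: allocate a spare object before taking the lock, publish it into a shared preload slot under the lock with a compare-and-exchange, and free the spare if another path already filled the slot. The names (struct area, preload_slot, big_lock, alloc_area) and the use of C11 atomics with a pthread mutex are illustrative stand-ins for the kernel's ne_fit_preload_node per-CPU variable and vmap_area_lock, not kernel APIs.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

struct area { int dummy; };

/* Stand-ins for vmap_area_lock and the ne_fit_preload_node per-CPU slot. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static _Atomic(struct area *) preload_slot;

static void alloc_area(void)
{
        struct area *pva = NULL;

        /*
         * Preload outside the lock: a blocking, permissive allocation is
         * fine here precisely because no lock is held yet.
         */
        if (!atomic_load(&preload_slot))
                pva = malloc(sizeof(*pva));

        pthread_mutex_lock(&big_lock);

        /*
         * Publish under the lock. If another thread filled the slot in the
         * meantime, we lost the race and simply drop our spare object.
         */
        struct area *expected = NULL;
        if (pva && !atomic_compare_exchange_strong(&preload_slot, &expected, pva))
                free(pva);

        /* ... the real allocation work would consume preload_slot here ... */

        pthread_mutex_unlock(&big_lock);
}

int main(void)
{
        alloc_area();
        free(atomic_exchange(&preload_slot, (struct area *)NULL));
        printf("preload pattern demo done\n");
        return 0;
}

Because preemption (here, thread scheduling) is not disabled, the context that performs the preload is not necessarily the one that consumes it; as the commit message notes, losing that race is rare, and the in-kernel code falls back to a GFP_NOWAIT allocation when no preloaded object is available.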
