Commit 9f53197

urezki authored and torvalds committed
mm/vmalloc: do not adjust the search size for alignment overhead
We used to include the alignment overhead in the search length, which
guarantees that a found area will definitely fit after applying the
alignment that the user specifies. On the other hand, we do not guarantee
that the area has the lowest address if the alignment is >= PAGE_SIZE.

This means that when a user specifies a special alignment together with a
range that corresponds exactly to the requested size, the allocation
fails. This is what happens to KASAN: it wants a free block that exactly
matches a specified range while onlining memory banks:

    [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory82/state
    [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory83/state
    [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory85/state
    [root@vm-0 fedora]# echo online > /sys/devices/system/memory/memory84/state
    vmap allocation for size 16777216 failed: use vmalloc=<size> to increase size
    bash: vmalloc: allocation failure: 16777216 bytes, mode:0x6000c0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0
    CPU: 4 PID: 1644 Comm: bash Kdump: loaded Not tainted 4.18.0-339.el8.x86_64+debug #1
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
    Call Trace:
     dump_stack+0x8e/0xd0
     warn_alloc.cold.90+0x8a/0x1b2
     ? zone_watermark_ok_safe+0x300/0x300
     ? slab_free_freelist_hook+0x85/0x1a0
     ? __get_vm_area_node+0x240/0x2c0
     ? kfree+0xdd/0x570
     ? kmem_cache_alloc_node_trace+0x157/0x230
     ? notifier_call_chain+0x90/0x160
     __vmalloc_node_range+0x465/0x840
     ? mark_held_locks+0xb7/0x120

Fix it by making sure that find_vmap_lowest_match() returns the lowest
start address for any given alignment value, i.e. for alignments bigger
than PAGE_SIZE the algorithm rolls back toward parent nodes, checking
right sub-trees if the leftmost free block did not fit due to alignment
overhead.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 68ad4a3 ("mm/vmalloc.c: keep track of free blocks for vmap allocation")
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reported-by: Ping Fang <[email protected]>
Tested-by: David Hildenbrand <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

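The failure mode described in the commit message can be reproduced with plain arithmetic. The following is a minimal userspace sketch, not kernel code: old_rule_accepts() mirrors the removed "length = size + align - 1" pruning test, and block_fits() is modeled loosely on is_within_this_va(). The helper names, the align_up() stand-in for ALIGN(), and the 16 MiB alignment value are assumptions chosen for illustration; only the 16777216-byte size comes from the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Round x up to a power-of-two boundary a (stand-in for the kernel's ALIGN()). */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Removed pruning rule: a free block [start, end) is only considered
 * when it can hold the request plus worst-case alignment padding.
 */
static bool old_rule_accepts(unsigned long start, unsigned long end,
			     unsigned long size, unsigned long align)
{
	return end - start >= size + align - 1;
}

/*
 * Exact test: align the first usable address inside the block and
 * check that the request still ends within the block.
 */
static bool block_fits(unsigned long start, unsigned long end,
		       unsigned long size, unsigned long align,
		       unsigned long vstart)
{
	unsigned long addr = align_up(start > vstart ? start : vstart, align);

	/* Reject wrap-around caused by a large size or alignment. */
	if (addr + size < addr || addr < vstart)
		return false;

	return addr + size <= end;
}
```

For a free block [0x1000000, 0x2000000) and a request of size 0x1000000 with a hypothetical alignment of 0x1000000, block_fits() reports a fit because the block start is already aligned, while old_rule_accepts() rejects the block since it asks for 0x1ffffff bytes. That is the exact-range case the commit message attributes to KASAN.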
1 parent 7cc7913 commit 9f53197

1 file changed, +13 -9 lines changed

mm/vmalloc.c

Lines changed: 13 additions & 9 deletions
@@ -1195,18 +1195,14 @@ find_vmap_lowest_match(unsigned long size,
 {
 	struct vmap_area *va;
 	struct rb_node *node;
-	unsigned long length;
 
 	/* Start from the root. */
 	node = free_vmap_area_root.rb_node;
 
-	/* Adjust the search size for alignment overhead. */
-	length = size + align - 1;
-
 	while (node) {
 		va = rb_entry(node, struct vmap_area, rb_node);
 
-		if (get_subtree_max_size(node->rb_left) >= length &&
+		if (get_subtree_max_size(node->rb_left) >= size &&
 				vstart < va->va_start) {
 			node = node->rb_left;
 		} else {
@@ -1216,25 +1212,33 @@ find_vmap_lowest_match(unsigned long size,
 			/*
 			 * Does not make sense to go deeper towards the right
 			 * sub-tree if it does not have a free block that is
-			 * equal or bigger to the requested search length.
+			 * equal or bigger to the requested search size.
 			 */
-			if (get_subtree_max_size(node->rb_right) >= length) {
+			if (get_subtree_max_size(node->rb_right) >= size) {
 				node = node->rb_right;
 				continue;
 			}
 
 			/*
 			 * OK. We roll back and find the first right sub-tree,
 			 * that will satisfy the search criteria. It can happen
-			 * only once due to "vstart" restriction.
+			 * due to "vstart" restriction or an alignment overhead
+			 * that is bigger then PAGE_SIZE.
 			 */
 			while ((node = rb_parent(node))) {
 				va = rb_entry(node, struct vmap_area, rb_node);
 				if (is_within_this_va(va, size, align, vstart))
 					return va;
 
-				if (get_subtree_max_size(node->rb_right) >= length &&
+				if (get_subtree_max_size(node->rb_right) >= size &&
 						vstart <= va->va_start) {
+					/*
+					 * Shift the vstart forward. Please note, we update it with
+					 * parent's start address adding "1" because we do not want
+					 * to enter same sub-tree after it has already been checked
+					 * and no suitable free block found there.
+					 */
+					vstart = va->va_start + 1;
 					node = node->rb_right;
 					break;
 				}
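
To see the roll-back and the "vstart" shift in action, the patched search can be modeled in userspace on a hand-built three-node tree. This is a sketch under stated assumptions, not the kernel implementation: struct free_block, max_size(), fits(), lowest_match() and sample_tree() are hypothetical stand-ins for struct vmap_area with its rb_node, get_subtree_max_size(), is_within_this_va(), find_vmap_lowest_match() and the real augmented red-black tree. The lines between the two hunks are not shown in the diff, so the sketch assumes the current node itself is tested there before descending right.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a free vmap area and its tree linkage. */
struct free_block {
	unsigned long start, end;	/* free range [start, end) */
	unsigned long subtree_max;	/* largest block size in this subtree */
	struct free_block *left, *right, *parent;
};

static unsigned long max_size(const struct free_block *n)
{
	return n ? n->subtree_max : 0;
}

static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

static bool fits(const struct free_block *b, unsigned long size,
		 unsigned long align, unsigned long vstart)
{
	unsigned long addr = align_up(b->start > vstart ? b->start : vstart, align);

	if (addr + size < addr || addr < vstart)
		return false;
	return addr + size <= b->end;
}

/* Lowest-address block that fits, pruning subtrees by size only. */
static struct free_block *lowest_match(struct free_block *root,
				       unsigned long size,
				       unsigned long align,
				       unsigned long vstart)
{
	struct free_block *node = root;

	while (node) {
		if (max_size(node->left) >= size && vstart < node->start) {
			node = node->left;
			continue;
		}
		if (fits(node, size, align, vstart))
			return node;
		if (max_size(node->right) >= size) {
			node = node->right;
			continue;
		}
		/* Roll back toward the root, looking for an unvisited right subtree. */
		while ((node = node->parent)) {
			if (fits(node, size, align, vstart))
				return node;
			if (max_size(node->right) >= size && vstart <= node->start) {
				/* Never re-enter the subtree that was just rejected. */
				vstart = node->start + 1;
				node = node->right;
				break;
			}
		}
	}
	return NULL;
}

/* Tree: blk[1] is the root, blk[0] its left child, blk[2] its right child. */
static struct free_block *sample_tree(struct free_block blk[3])
{
	blk[0] = (struct free_block){ .start = 0x1000, .end = 0x3000,
				      .subtree_max = 0x2000, .parent = &blk[1] };
	blk[2] = (struct free_block){ .start = 0x8000, .end = 0xa000,
				      .subtree_max = 0x2000, .parent = &blk[1] };
	blk[1] = (struct free_block){ .start = 0x5000, .end = 0x6000,
				      .subtree_max = 0x2000,
				      .left = &blk[0], .right = &blk[2] };
	return &blk[1];
}
```

With size 0x2000 and alignment 0x4000, the leftmost block passes the size test but fails once aligned, so the search rolls back to the root, shifts vstart past it, and finds the right child, which is an exact aligned fit. Under the removed rule the same request would have been pruned at the root, because no subtree holds 0x2000 + 0x4000 - 1 bytes.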
