Commit 407f1d8

melver authored and torvalds committed
kfence: await for allocation using wait_event
Patch series "kfence: optimize timer scheduling", v2.

We have observed that mostly-idle systems with KFENCE enabled wake up otherwise idle CPUs, preventing them from entering a lower power state. Debugging revealed that KFENCE spends too much active time in toggle_allocation_gate().

While the first version of KFENCE used all the right bits to be scheduling-optimal, and thus power-efficient, by simply using wait_event() + wake_up(), that code was unfortunately removed. As KFENCE was exposed to various configs and tests, the scheduling-optimal code slowly disappeared: first because of hung task warnings, and finally because of deadlocks when an allocation is made by timer code with debug objects enabled. Clearly, those "fixes" were not friendly to devices that want to be power-efficient.

Therefore, let's try a little harder to fix the hung task and deadlock problems that we have with wait_event() + wake_up(), while remaining as scheduling-friendly and power-efficient as possible. Crucially, we need to defer the wake_up() to an irq_work, avoiding any potential for deadlock.

The result with this series is that on the devices where we observed a power regression, power usage returns to baseline levels.

This patch (of 3):

On mostly-idle systems, we have observed that toggle_allocation_gate() is a cause of frequent wake-ups, preventing an otherwise idle CPU from going into a lower power state. A late change in KFENCE's development, due to a potential deadlock [1], required changing the scheduling-friendly wait_event_timeout() and wake_up() to an open-coded wait-loop using schedule_timeout().

[1] https://lkml.kernel.org/r/[email protected]

To avoid unnecessary wake-ups, switch to using wait_event_timeout(). Unfortunately, we still cannot use a version with direct wake_up() in __kfence_alloc() due to the same potential for deadlock as in [1]. Instead, add a level of indirection via an irq_work that is scheduled if we determine that the kfence_timer requires a wake_up().

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 0ce20dd ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Marco Elver <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Hillf Danton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
1 parent 94868a1 commit 407f1d8

File tree

2 files changed: +30 −16 lines changed

lib/Kconfig.kfence (1 addition, 0 deletions)

```diff
@@ -7,6 +7,7 @@ menuconfig KFENCE
 	bool "KFENCE: low-overhead sampling-based memory safety error detector"
 	depends on HAVE_ARCH_KFENCE && (SLAB || SLUB)
 	select STACKTRACE
+	select IRQ_WORK
 	help
 	  KFENCE is a low-overhead sampling-based detector of heap out-of-bounds
 	  access, use-after-free, and invalid-free errors. KFENCE is designed
```
mm/kfence/core.c (29 additions, 16 deletions)

```diff
@@ -10,6 +10,7 @@
 #include <linux/atomic.h>
 #include <linux/bug.h>
 #include <linux/debugfs.h>
+#include <linux/irq_work.h>
 #include <linux/kcsan-checks.h>
 #include <linux/kfence.h>
 #include <linux/kmemleak.h>
@@ -587,6 +588,17 @@ late_initcall(kfence_debugfs_init);
 
 /* === Allocation Gate Timer ================================================ */
 
+#ifdef CONFIG_KFENCE_STATIC_KEYS
+/* Wait queue to wake up allocation-gate timer task. */
+static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
+
+static void wake_up_kfence_timer(struct irq_work *work)
+{
+	wake_up(&allocation_wait);
+}
+static DEFINE_IRQ_WORK(wake_up_kfence_timer_work, wake_up_kfence_timer);
+#endif
+
 /*
  * Set up delayed work, which will enable and disable the static key. We need to
  * use a work queue (rather than a simple timer), since enabling and disabling a
@@ -604,25 +616,13 @@ static void toggle_allocation_gate(struct work_struct *work)
 	if (!READ_ONCE(kfence_enabled))
 		return;
 
-	/* Enable static key, and await allocation to happen. */
 	atomic_set(&kfence_allocation_gate, 0);
 #ifdef CONFIG_KFENCE_STATIC_KEYS
+	/* Enable static key, and await allocation to happen. */
 	static_branch_enable(&kfence_allocation_key);
-	/*
-	 * Await an allocation. Timeout after 1 second, in case the kernel stops
-	 * doing allocations, to avoid stalling this worker task for too long.
-	 */
-	{
-		unsigned long end_wait = jiffies + HZ;
-
-		do {
-			set_current_state(TASK_UNINTERRUPTIBLE);
-			if (atomic_read(&kfence_allocation_gate) != 0)
-				break;
-			schedule_timeout(1);
-		} while (time_before(jiffies, end_wait));
-		__set_current_state(TASK_RUNNING);
-	}
+
+	wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate), HZ);
+
 	/* Disable static key and reset timer. */
 	static_branch_disable(&kfence_allocation_key);
 #endif
@@ -729,6 +729,19 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
 	 */
 	if (atomic_read(&kfence_allocation_gate) || atomic_inc_return(&kfence_allocation_gate) > 1)
 		return NULL;
+#ifdef CONFIG_KFENCE_STATIC_KEYS
+	/*
+	 * waitqueue_active() is fully ordered after the update of
+	 * kfence_allocation_gate per atomic_inc_return().
+	 */
+	if (waitqueue_active(&allocation_wait)) {
+		/*
+		 * Calling wake_up() here may deadlock when allocations happen
+		 * from within timer code. Use an irq_work to defer it.
+		 */
+		irq_work_queue(&wake_up_kfence_timer_work);
+	}
+#endif
 
 	if (!READ_ONCE(kfence_enabled))
 		return NULL;
```
