Skip to content

Commit 07a971f

Browse files
Gautham R. Shenoygregkh
authored andcommitted
powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt()
[ Upstream commit c784be4 ] The calls to arch_add_memory()/arch_remove_memory() are always made with the read-side cpu_hotplug_lock acquired via memory_hotplug_begin(). On pSeries, arch_add_memory()/arch_remove_memory() eventually call resize_hpt() which in turn calls stop_machine() which acquires the read-side cpu_hotplug_lock again, thereby resulting in the recursive acquisition of this lock. In the absence of CONFIG_PROVE_LOCKING, we hadn't observed a system lockup during a memory hotplug operation because cpus_read_lock() is a per-cpu rwsem read, which, in the fast-path (in the absence of the writer, which in our case is a CPU-hotplug operation) simply increments the read_count on the semaphore. Thus a recursive read in the fast-path doesn't cause any problems. However, we can hit this problem in practice if there is a concurrent CPU-Hotplug operation in progress which is waiting to acquire the write-side of the lock. This will cause the second recursive read to block until the writer finishes. While the writer is blocked since the first read holds the lock. Thus both the reader as well as the writers fail to make any progress thereby blocking both CPU-Hotplug as well as Memory Hotplug operations. Memory-Hotplug CPU-Hotplug CPU 0 CPU 1 ------ ------ 1. down_read(cpu_hotplug_lock.rw_sem) [memory_hotplug_begin] 2. down_write(cpu_hotplug_lock.rw_sem) [cpu_up/cpu_down] 3. down_read(cpu_hotplug_lock.rw_sem) [stop_machine()] Lockdep complains as follows in these code-paths. swapper/0/1 is trying to acquire lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x2c/0x60 but task is already holding lock: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(cpu_hotplug_lock.rw_sem); lock(cpu_hotplug_lock.rw_sem); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by swapper/0/1: #0: (____ptrval____) (&dev->mutex){....}, at: __driver_attach+0x12c/0x1b0 #1: (____ptrval____) (cpu_hotplug_lock.rw_sem){++++}, at: mem_hotplug_begin+0x20/0x50 #2: (____ptrval____) (mem_hotplug_lock.rw_sem){++++}, at: percpu_down_write+0x54/0x1a0 stack backtrace: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc5-58373-gbc99402235f3-dirty #166 Call Trace: dump_stack+0xe8/0x164 (unreliable) __lock_acquire+0x1110/0x1c70 lock_acquire+0x240/0x290 cpus_read_lock+0x64/0xf0 stop_machine+0x2c/0x60 pseries_lpar_resize_hpt+0x19c/0x2c0 resize_hpt_for_hotplug+0x70/0xd0 arch_add_memory+0x58/0xfc devm_memremap_pages+0x5e8/0x8f0 pmem_attach_disk+0x764/0x830 nvdimm_bus_probe+0x118/0x240 really_probe+0x230/0x4b0 driver_probe_device+0x16c/0x1e0 __driver_attach+0x148/0x1b0 bus_for_each_dev+0x90/0x130 driver_attach+0x34/0x50 bus_add_driver+0x1a8/0x360 driver_register+0x108/0x170 __nd_driver_register+0xd0/0xf0 nd_pmem_driver_init+0x34/0x48 do_one_initcall+0x1e0/0x45c kernel_init_freeable+0x540/0x64c kernel_init+0x2c/0x160 ret_from_kernel_thread+0x5c/0x68 Fix this issue by 1) Requiring all the calls to pseries_lpar_resize_hpt() be made with cpu_hotplug_lock held. 2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked() as a consequence of 1) 3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt() with cpu_hotplug_lock held. Fixes: dbcf929 ("powerpc/pseries: Add support for hash table resizing") Cc: [email protected] # v4.11+ Reported-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Gautham R. Shenoy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
1 parent 0b584bf commit 07a971f

File tree

2 files changed

+14
-3
lines changed

2 files changed

+14
-3
lines changed

arch/powerpc/mm/hash_utils_64.c

Lines changed: 8 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@
3535
#include <linux/memblock.h>
3636
#include <linux/context_tracking.h>
3737
#include <linux/libfdt.h>
38+
#include <linux/cpu.h>
3839

3940
#include <asm/debugfs.h>
4041
#include <asm/processor.h>
@@ -1852,10 +1853,16 @@ static int hpt_order_get(void *data, u64 *val)
18521853

18531854
static int hpt_order_set(void *data, u64 val)
18541855
{
1856+
int ret;
1857+
18551858
if (!mmu_hash_ops.resize_hpt)
18561859
return -ENODEV;
18571860

1858-
return mmu_hash_ops.resize_hpt(val);
1861+
cpus_read_lock();
1862+
ret = mmu_hash_ops.resize_hpt(val);
1863+
cpus_read_unlock();
1864+
1865+
return ret;
18591866
}
18601867

18611868
DEFINE_SIMPLE_ATTRIBUTE(fops_hpt_order, hpt_order_get, hpt_order_set, "%llu\n");

arch/powerpc/platforms/pseries/lpar.c

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -643,7 +643,10 @@ static int pseries_lpar_resize_hpt_commit(void *data)
643643
return 0;
644644
}
645645

646-
/* Must be called in user context */
646+
/*
647+
* Must be called in process context. The caller must hold the
648+
* cpus_lock.
649+
*/
647650
static int pseries_lpar_resize_hpt(unsigned long shift)
648651
{
649652
struct hpt_resize_state state = {
@@ -699,7 +702,8 @@ static int pseries_lpar_resize_hpt(unsigned long shift)
699702

700703
t1 = ktime_get();
701704

702-
rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);
705+
rc = stop_machine_cpuslocked(pseries_lpar_resize_hpt_commit,
706+
&state, NULL);
703707

704708
t2 = ktime_get();
705709

0 commit comments

Comments
 (0)