Skip to content

Commit cba6cfd

Browse files
diandersbroonie
authored andcommitted
regulator: core: Avoid lockdep reports when resolving supplies
An automated bot told me that there was a potential lockdep problem with regulators. This was on the chromeos-5.15 kernel, but I see nothing that would be different downstream compared to upstream. The bot said: ============================================ WARNING: possible recursive locking detected 5.15.104-lockdep-17461-gc1e499ed6604 #1 Not tainted -------------------------------------------- kworker/u16:4/115 is trying to acquire lock: ffffff8083110170 (regulator_ww_class_mutex){+.+.}-{3:3}, at: create_regulator+0x398/0x7ec but task is already holding lock: ffffff808378e170 (regulator_ww_class_mutex){+.+.}-{3:3}, at: ww_mutex_trylock+0x3c/0x7b8 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(regulator_ww_class_mutex); lock(regulator_ww_class_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by kworker/u16:4/115: #0: ffffff808006a948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x520/0x1348 #1: ffffffc00e0a7cc0 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x55c/0x1348 #2: ffffff80828a2260 (&dev->mutex){....}-{3:3}, at: __device_attach_async_helper+0xd0/0x2a4 #3: ffffff808378e170 (regulator_ww_class_mutex){+.+.}-{3:3}, at: ww_mutex_trylock+0x3c/0x7b8 stack backtrace: CPU: 2 PID: 115 Comm: kworker/u16:4 Not tainted 5.15.104-lockdep-17461-gc1e499ed6604 #1 9292e52fa83c0e23762b2b3aa1bacf5787a4d5da Hardware name: Google Quackingstick (rev0+) (DT) Workqueue: events_unbound async_run_entry_fn Call trace: dump_backtrace+0x0/0x4ec show_stack+0x34/0x50 dump_stack_lvl+0xdc/0x11c dump_stack+0x1c/0x48 __lock_acquire+0x16d4/0x6c74 lock_acquire+0x208/0x750 __mutex_lock_common+0x11c/0x11f8 ww_mutex_lock+0xc0/0x440 create_regulator+0x398/0x7ec regulator_resolve_supply+0x654/0x7c4 regulator_register_resolve_supply+0x30/0x120 class_for_each_device+0x1b8/0x230 regulator_register+0x17a4/0x1f40 devm_regulator_register+0x60/0xd0 reg_fixed_voltage_probe+0x728/0xaec platform_probe+0x150/0x1c8 really_probe+0x274/0xa20 __driver_probe_device+0x1dc/0x3f4 driver_probe_device+0x78/0x1c0 __device_attach_driver+0x1ac/0x2c8 bus_for_each_drv+0x11c/0x190 __device_attach_async_helper+0x1e4/0x2a4 async_run_entry_fn+0xa0/0x3ac process_one_work+0x638/0x1348 worker_thread+0x4a8/0x9c4 kthread+0x2e4/0x3a0 ret_from_fork+0x10/0x20 The problem was first reported soon after we made many of the regulators probe asynchronously, though nothing I've seen implies that the problems couldn't have also happened even without that. I haven't personally been able to reproduce the lockdep issue, but the issue does look somewhat legitimate. Specifically, it looks like in regulator_resolve_supply() we are holding a "rdev" lock while calling set_supply() -> create_regulator() which grabs the lock of a _different_ "rdev" (the one for our supply). This is not necessarily safe from a lockdep perspective since there is no documented ordering between these two locks. In reality, we should always be locking a regulator before the supplying regulator, so I don't expect there to be any real deadlocks in practice. However, the regulator framework in general doesn't express this to lockdep. Let's fix the issue by simply grabbing the two locks involved in the same way we grab multiple locks elsewhere in the regulator framework: using the "wound/wait" mechanisms. Fixes: eaa7995 ("regulator: core: avoid regulator_resolve_supply() race condition") Signed-off-by: Douglas Anderson <[email protected]> Link: https://lore.kernel.org/r/20230329143317.RFC.v2.2.I30d8e1ca10cfbe5403884cdd192253a2e063eb9e@changeid Signed-off-by: Mark Brown <[email protected]>
1 parent b83a177 commit cba6cfd

File tree

1 file changed

+83
-8
lines changed

1 file changed

+83
-8
lines changed

drivers/regulator/core.c

Lines changed: 83 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -207,6 +207,78 @@ static void regulator_unlock(struct regulator_dev *rdev)
207207
mutex_unlock(&regulator_nesting_mutex);
208208
}
209209

210+
/**
211+
* regulator_lock_two - lock two regulators
212+
* @rdev1: first regulator
213+
* @rdev2: second regulator
214+
* @ww_ctx: w/w mutex acquire context
215+
*
216+
* Locks both rdevs using the regulator_ww_class.
217+
*/
218+
static void regulator_lock_two(struct regulator_dev *rdev1,
219+
struct regulator_dev *rdev2,
220+
struct ww_acquire_ctx *ww_ctx)
221+
{
222+
struct regulator_dev *tmp;
223+
int ret;
224+
225+
ww_acquire_init(ww_ctx, &regulator_ww_class);
226+
227+
/* Try to just grab both of them */
228+
ret = regulator_lock_nested(rdev1, ww_ctx);
229+
WARN_ON(ret);
230+
ret = regulator_lock_nested(rdev2, ww_ctx);
231+
if (ret != -EDEADLOCK) {
232+
WARN_ON(ret);
233+
goto exit;
234+
}
235+
236+
while (true) {
237+
/*
238+
* Start of loop: rdev1 was locked and rdev2 was contended.
239+
* Need to unlock rdev1, slowly lock rdev2, then try rdev1
240+
* again.
241+
*/
242+
regulator_unlock(rdev1);
243+
244+
ww_mutex_lock_slow(&rdev2->mutex, ww_ctx);
245+
rdev2->ref_cnt++;
246+
rdev2->mutex_owner = current;
247+
ret = regulator_lock_nested(rdev1, ww_ctx);
248+
249+
if (ret == -EDEADLOCK) {
250+
/* More contention; swap which needs to be slow */
251+
tmp = rdev1;
252+
rdev1 = rdev2;
253+
rdev2 = tmp;
254+
} else {
255+
WARN_ON(ret);
256+
break;
257+
}
258+
}
259+
260+
exit:
261+
ww_acquire_done(ww_ctx);
262+
}
263+
264+
/**
265+
* regulator_unlock_two - unlock two regulators
266+
* @rdev1: first regulator
267+
* @rdev2: second regulator
268+
* @ww_ctx: w/w mutex acquire context
269+
*
270+
* The inverse of regulator_lock_two().
271+
*/
272+
273+
static void regulator_unlock_two(struct regulator_dev *rdev1,
274+
struct regulator_dev *rdev2,
275+
struct ww_acquire_ctx *ww_ctx)
276+
{
277+
regulator_unlock(rdev2);
278+
regulator_unlock(rdev1);
279+
ww_acquire_fini(ww_ctx);
280+
}
281+
210282
static bool regulator_supply_is_couple(struct regulator_dev *rdev)
211283
{
212284
struct regulator_dev *c_rdev;
@@ -1627,8 +1699,8 @@ static int set_machine_constraints(struct regulator_dev *rdev)
16271699

16281700
/**
16291701
* set_supply - set regulator supply regulator
1630-
* @rdev: regulator name
1631-
* @supply_rdev: supply regulator name
1702+
* @rdev: regulator (locked)
1703+
* @supply_rdev: supply regulator (locked))
16321704
*
16331705
* Called by platform initialisation code to set the supply regulator for this
16341706
* regulator. This ensures that a regulators supply will also be enabled by the
@@ -1800,6 +1872,8 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
18001872
struct regulator *regulator;
18011873
int err = 0;
18021874

1875+
lockdep_assert_held_once(&rdev->mutex.base);
1876+
18031877
if (dev) {
18041878
char buf[REG_STR_SIZE];
18051879
int size;
@@ -1827,9 +1901,7 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
18271901
regulator->rdev = rdev;
18281902
regulator->supply_name = supply_name;
18291903

1830-
regulator_lock(rdev);
18311904
list_add(&regulator->list, &rdev->consumer_list);
1832-
regulator_unlock(rdev);
18331905

18341906
if (dev) {
18351907
regulator->dev = dev;
@@ -1995,6 +2067,7 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
19952067
{
19962068
struct regulator_dev *r;
19972069
struct device *dev = rdev->dev.parent;
2070+
struct ww_acquire_ctx ww_ctx;
19982071
int ret = 0;
19992072

20002073
/* No supply to resolve? */
@@ -2061,23 +2134,23 @@ static int regulator_resolve_supply(struct regulator_dev *rdev)
20612134
* between rdev->supply null check and setting rdev->supply in
20622135
* set_supply() from concurrent tasks.
20632136
*/
2064-
regulator_lock(rdev);
2137+
regulator_lock_two(rdev, r, &ww_ctx);
20652138

20662139
/* Supply just resolved by a concurrent task? */
20672140
if (rdev->supply) {
2068-
regulator_unlock(rdev);
2141+
regulator_unlock_two(rdev, r, &ww_ctx);
20692142
put_device(&r->dev);
20702143
goto out;
20712144
}
20722145

20732146
ret = set_supply(rdev, r);
20742147
if (ret < 0) {
2075-
regulator_unlock(rdev);
2148+
regulator_unlock_two(rdev, r, &ww_ctx);
20762149
put_device(&r->dev);
20772150
goto out;
20782151
}
20792152

2080-
regulator_unlock(rdev);
2153+
regulator_unlock_two(rdev, r, &ww_ctx);
20812154

20822155
/*
20832156
* In set_machine_constraints() we may have turned this regulator on
@@ -2190,7 +2263,9 @@ struct regulator *_regulator_get(struct device *dev, const char *id,
21902263
return regulator;
21912264
}
21922265

2266+
regulator_lock(rdev);
21932267
regulator = create_regulator(rdev, dev, id);
2268+
regulator_unlock(rdev);
21942269
if (regulator == NULL) {
21952270
regulator = ERR_PTR(-ENOMEM);
21962271
module_put(rdev->owner);

0 commit comments

Comments
 (0)