Skip to content

Commit 90d41e7

Browse files
mcgrofgregkh
authored andcommitted
firmware: fix batched requests - send wake up on failure on direct lookups
Fix batched requests from waiting forever on failure. The firmware API batched requests feature has been broken since the API call request_firmware_direct() was introduced on commit bba3a87 ("firmware: Introduce request_firmware_direct()"), added on v3.14 *iff* the firmware being requested was not present in *certain kernel builds* [0]. When no firmware is found the worker which goes on to finish never informs waiters queued up of this, so any batched request will stall in what seems to be forever (MAX_SCHEDULE_TIMEOUT). Sadly, a reboot will also stall, as the reboot notifier was only designed to kill custom fallback workers. The issue seems to the user as a type of soft lockup, what *actually* happens underneath the hood is a wait call which never completes as we failed to issue a completion on error. For device drivers with optional firmware schemes (ie, Intel iwlwifi, or Netronome -- even though it uses request_firmware() and not request_firmware_direct()), this could mean that when you boot a system with multiple cards the firmware will seem to never load on the system, or that the card is just not responsive even the driver initialization. Due to differences in scheduling possible this should not always trigger -- one would need to to ensure that multiple requests are in place at the right time for this to work, also release_firmware() must not be called prior to any other incoming request. The complexity may not be worth supporting batched requests in the future given the wait mechanism is only used also for the fallback mechanism. We'll keep it for now and just fix it. Its reported that at least with the Intel WiFi cards on one system this issue was creeping up 50% of the boots [0]. Before this commit batched requests testing revealed: ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n CONFIG_FW_LOADER_USER_HELPER=y Most common Linux distribution setup. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() FAIL OK request_firmware_direct() FAIL OK request_firmware_nowait(uevent=true) FAIL OK request_firmware_nowait(uevent=false) FAIL OK ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n CONFIG_FW_LOADER_USER_HELPER=n Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() FAIL OK request_firmware_direct() FAIL OK request_firmware_nowait(uevent=true) FAIL OK request_firmware_nowait(uevent=false) FAIL OK ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y CONFIG_FW_LOADER_USER_HELPER=y Google Android setup. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() OK OK request_firmware_direct() FAIL OK request_firmware_nowait(uevent=true) OK OK request_firmware_nowait(uevent=false) OK OK ============================================================================ Ater this commit batched testing results: ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n CONFIG_FW_LOADER_USER_HELPER=y Most common Linux distribution setup. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() OK OK request_firmware_direct() OK OK request_firmware_nowait(uevent=true) OK OK request_firmware_nowait(uevent=false) OK OK ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n CONFIG_FW_LOADER_USER_HELPER=n Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() OK OK request_firmware_direct() OK OK request_firmware_nowait(uevent=true) OK OK request_firmware_nowait(uevent=false) OK OK ============================================================================ CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y CONFIG_FW_LOADER_USER_HELPER=y Google Android setup. API-type no-firmware-found firmware-found ---------------------------------------------------------------------- request_firmware() OK OK request_firmware_direct() OK OK request_firmware_nowait(uevent=true) OK OK request_firmware_nowait(uevent=false) OK OK ============================================================================ [0] https://bugzilla.kernel.org/show_bug.cgi?id=195477 Cc: stable <[email protected]> # v3.14 Fixes: bba3a87 ("firmware: Introduce request_firmware_direct()" Reported-by: Nicolas <[email protected]> Reported-by: John Ewalt <[email protected]> Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent e44565f commit 90d41e7

File tree

1 file changed

+30
-8
lines changed

1 file changed

+30
-8
lines changed

drivers/base/firmware_class.c

Lines changed: 30 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -153,28 +153,27 @@ static void __fw_state_set(struct fw_state *fw_st,
153153
__fw_state_set(fw_st, FW_STATUS_LOADING)
154154
#define fw_state_done(fw_st) \
155155
__fw_state_set(fw_st, FW_STATUS_DONE)
156+
#define fw_state_aborted(fw_st) \
157+
__fw_state_set(fw_st, FW_STATUS_ABORTED)
156158
#define fw_state_wait(fw_st) \
157159
__fw_state_wait_common(fw_st, MAX_SCHEDULE_TIMEOUT)
158160

159-
#ifndef CONFIG_FW_LOADER_USER_HELPER
160-
161-
#define fw_state_is_aborted(fw_st) false
162-
163-
#else /* CONFIG_FW_LOADER_USER_HELPER */
164-
165161
static int __fw_state_check(struct fw_state *fw_st, enum fw_status status)
166162
{
167163
return fw_st->status == status;
168164
}
169165

166+
#define fw_state_is_aborted(fw_st) \
167+
__fw_state_check(fw_st, FW_STATUS_ABORTED)
168+
169+
#ifdef CONFIG_FW_LOADER_USER_HELPER
170+
170171
#define fw_state_aborted(fw_st) \
171172
__fw_state_set(fw_st, FW_STATUS_ABORTED)
172173
#define fw_state_is_done(fw_st) \
173174
__fw_state_check(fw_st, FW_STATUS_DONE)
174175
#define fw_state_is_loading(fw_st) \
175176
__fw_state_check(fw_st, FW_STATUS_LOADING)
176-
#define fw_state_is_aborted(fw_st) \
177-
__fw_state_check(fw_st, FW_STATUS_ABORTED)
178177
#define fw_state_wait_timeout(fw_st, timeout) \
179178
__fw_state_wait_common(fw_st, timeout)
180179

@@ -1198,6 +1197,28 @@ _request_firmware_prepare(struct firmware **firmware_p, const char *name,
11981197
return 1; /* need to load */
11991198
}
12001199

1200+
/*
1201+
* Batched requests need only one wake, we need to do this step last due to the
1202+
* fallback mechanism. The buf is protected with kref_get(), and it won't be
1203+
* released until the last user calls release_firmware().
1204+
*
1205+
* Failed batched requests are possible as well, in such cases we just share
1206+
* the struct firmware_buf and won't release it until all requests are woken
1207+
* and have gone through this same path.
1208+
*/
1209+
static void fw_abort_batch_reqs(struct firmware *fw)
1210+
{
1211+
struct firmware_buf *buf;
1212+
1213+
/* Loaded directly? */
1214+
if (!fw || !fw->priv)
1215+
return;
1216+
1217+
buf = fw->priv;
1218+
if (!fw_state_is_aborted(&buf->fw_st))
1219+
fw_state_aborted(&buf->fw_st);
1220+
}
1221+
12011222
/* called from request_firmware() and request_firmware_work_func() */
12021223
static int
12031224
_request_firmware(const struct firmware **firmware_p, const char *name,
@@ -1241,6 +1262,7 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
12411262

12421263
out:
12431264
if (ret < 0) {
1265+
fw_abort_batch_reqs(fw);
12441266
release_firmware(fw);
12451267
fw = NULL;
12461268
}

0 commit comments

Comments
 (0)