Commit 68b3bca

Kalesh AP authored and rleon committed
RDMA/bnxt_re: Correct the sequence of device suspend
In a fatal error condition, mark the device as detached first and then complete all pending HWRM commands, since firmware is not going to process them and they would eventually time out. Move the device to the error state only if suspend is called while the device is in the fatal state.

Also, remove some outdated comments. Remove the stop_irq call, which is no longer required.

Fixes: cc5b9b4 ("RDMA/bnxt_re: Recover the device when FW error is detected")
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
1 parent bfb27ae commit 68b3bca
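
The reasoning in the commit message (flag the device as detached, then wake the pending commands) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_cmdq`, `model_wait`, and the tick budget are hypothetical names invented for illustration and are not part of bnxt_re. The model assumes a pending command re-checks its wait condition each time it is woken, which is why the flag has to be visible before the wakeup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the command queue state; not the real driver type. */
struct model_cmdq {
	bool detached;     /* models ERR_DEVICE_DETACHED being set */
	bool fw_responded; /* stays false when firmware is in a fatal condition */
};

enum wait_result { WAIT_DONE, WAIT_ABORTED, WAIT_TIMED_OUT };

/*
 * Models one pending command. On every wakeup (one loop iteration here) it
 * re-checks its condition: a firmware response completes it, a detached
 * device aborts it, otherwise it keeps waiting until the budget runs out.
 */
enum wait_result model_wait(const struct model_cmdq *q, int max_ticks,
			    int *ticks_used)
{
	assert(max_ticks > 0);

	for (int t = 0; t < max_ticks; t++) {
		if (q->fw_responded) {
			*ticks_used = t;
			return WAIT_DONE;
		}
		if (q->detached) {
			*ticks_used = t;
			return WAIT_ABORTED;
		}
	}
	*ticks_used = max_ticks;
	return WAIT_TIMED_OUT;
}
```

In this model, a waiter that is woken after `detached` is set returns `WAIT_ABORTED` on its first check, while a waiter that never sees the flag burns its whole budget and returns `WAIT_TIMED_OUT`.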

1 file changed, +5 -23 lines changed
  • drivers/infiniband/hw/bnxt_re


drivers/infiniband/hw/bnxt_re/main.c

Lines changed: 5 additions & 23 deletions
@@ -2347,30 +2347,19 @@ static int bnxt_re_suspend(struct auxiliary_device *adev, pm_message_t state)
 	rdev = en_info->rdev;
 	en_dev = en_info->en_dev;
 	mutex_lock(&bnxt_re_mutex);
-	/* L2 driver may invoke this callback during device error/crash or device
-	 * reset. Current RoCE driver doesn't recover the device in case of
-	 * error. Handle the error by dispatching fatal events to all qps
-	 * ie. by calling bnxt_re_dev_stop and release the MSIx vectors as
-	 * L2 driver want to modify the MSIx table.
-	 */
 
 	ibdev_info(&rdev->ibdev, "Handle device suspend call");
 	/* Check the current device state from bnxt_en_dev and move the
 	 * device to detached state if FW_FATAL_COND is set.
 	 * This prevents more commands to HW during clean-up,
 	 * in case the device is already in error.
 	 */
-	if (test_bit(BNXT_STATE_FW_FATAL_COND, &rdev->en_dev->en_state))
+	if (test_bit(BNXT_STATE_FW_FATAL_COND, &rdev->en_dev->en_state)) {
 		set_bit(ERR_DEVICE_DETACHED, &rdev->rcfw.cmdq.flags);
-
-	bnxt_re_dev_stop(rdev);
-	bnxt_re_stop_irq(adev);
-	/* Move the device states to detached and avoid sending any more
-	 * commands to HW
-	 */
-	set_bit(BNXT_RE_FLAG_ERR_DEVICE_DETACHED, &rdev->flags);
-	set_bit(ERR_DEVICE_DETACHED, &rdev->rcfw.cmdq.flags);
-	wake_up_all(&rdev->rcfw.cmdq.waitq);
+		set_bit(BNXT_RE_FLAG_ERR_DEVICE_DETACHED, &rdev->flags);
+		wake_up_all(&rdev->rcfw.cmdq.waitq);
+		bnxt_re_dev_stop(rdev);
+	}
 
 	if (rdev->pacing.dbr_pacing)
 		bnxt_re_set_pacing_dev_state(rdev);
@@ -2392,13 +2381,6 @@ static int bnxt_re_resume(struct auxiliary_device *adev)
 		return 0;
 
 	mutex_lock(&bnxt_re_mutex);
-	/* L2 driver may invoke this callback during device recovery, resume.
-	 * reset. Current RoCE driver doesn't recover the device in case of
-	 * error. Handle the error by dispatching fatal events to all qps
-	 * ie. by calling bnxt_re_dev_stop and release the MSIx vectors as
-	 * L2 driver want to modify the MSIx table.
-	 */
-
 	bnxt_re_add_device(adev, BNXT_RE_POST_RECOVERY_INIT);
 	rdev = en_info->rdev;
 	ibdev_info(&rdev->ibdev, "Device resume completed");
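
To make the behavioural change in the `bnxt_re_suspend` hunk easier to see, the following userspace sketch models only the statements visible in that hunk, before and after the patch. Everything here is an assumption made for illustration: `model_dev` and its boolean fields are hypothetical stand-ins for the driver's flag bits and calls, and the rest of the suspend path (outside the hunk) is not modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by the hunk; not the driver's types. */
struct model_dev {
	bool fw_fatal;          /* models BNXT_STATE_FW_FATAL_COND */
	bool cmdq_detached;     /* models ERR_DEVICE_DETACHED on the cmdq flags */
	bool dev_detached;      /* models BNXT_RE_FLAG_ERR_DEVICE_DETACHED */
	bool waiters_woken;     /* models wake_up_all() on the cmdq wait queue */
	bool stopped;           /* models bnxt_re_dev_stop() having run */
	bool irq_stopped;       /* models bnxt_re_stop_irq() having run */
	bool woken_before_stop; /* were waiters woken before the stop step? */
};

/* Pre-patch order: the brace-less if guards only the first assignment. */
void model_suspend_old(struct model_dev *d)
{
	if (d->fw_fatal)
		d->cmdq_detached = true;

	d->woken_before_stop = d->waiters_woken;
	d->stopped = true;
	d->irq_stopped = true;
	d->dev_detached = true;
	d->cmdq_detached = true;
	d->waiters_woken = true;
}

/* Post-patch order: everything is conditional, and stop comes last. */
void model_suspend_new(struct model_dev *d)
{
	if (d->fw_fatal) {
		d->cmdq_detached = true;
		d->dev_detached = true;
		d->waiters_woken = true;
		d->woken_before_stop = d->waiters_woken;
		d->stopped = true;
	}
}
```

With `fw_fatal` false, the old model still ends with `dev_detached` set, while the new model leaves every field untouched. With `fw_fatal` true, only the new model reaches the stop step with the waiters already woken.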
