Commit 2ac9b97

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two straightforward fixes.

  One is a concurrency issue only affecting SAS connected SATA drives,
  but which could hang the storage subsystem if it triggers (because the
  outstanding command count on error never goes back to zero) and the
  other is a NO_TAG fallout from the switch to hostwide tags which
  causes the system to crash on module insertion (we've checked
  carefully and only the 53c700 family of drivers is vulnerable to this
  issue)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  53c700: fix BUG on untagged commands
  scsi: fix race between simultaneous decrements of ->host_failed

2 parents da2f6ab + 951d77f commit 2ac9b97

4 files changed: 12 additions & 6 deletions

Documentation/scsi/scsi_eh.txt

Lines changed: 6 additions & 2 deletions
@@ -263,19 +263,23 @@ scmd->allowed.
 
  3. scmd recovered
     ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd
-        - shost->host_failed--
         - clear scmd->eh_eflags
         - scsi_setup_cmd_retry()
         - move from local eh_work_q to local eh_done_q
     LOCKING: none
+    CONCURRENCY: at most one thread per separate eh_work_q to
+                 keep queue manipulation lockless
 
  4. EH completes
     ACTION: scsi_eh_flush_done_q() retries scmds or notifies upper
-            layer of failure.
+            layer of failure. May be called concurrently but must have
+            a no more than one thread per separate eh_work_q to
+            manipulate the queue locklessly
         - scmd is removed from eh_done_q and scmd->eh_entry is cleared
         - if retry is necessary, scmd is requeued using
           scsi_queue_insert()
         - otherwise, scsi_finish_command() is invoked for scmd
+        - zero shost->host_failed
     LOCKING: queue or finish function performs appropriate locking
 

drivers/ata/libata-eh.c

Lines changed: 1 addition & 1 deletion
@@ -606,7 +606,7 @@ void ata_scsi_error(struct Scsi_Host *host)
 	ata_scsi_port_error_handler(host, ap);
 
 	/* finish or retry handled scmd's and clean up */
-	WARN_ON(host->host_failed || !list_empty(&eh_work_q));
+	WARN_ON(!list_empty(&eh_work_q));
 
 	DPRINTK("EXIT\n");
 }
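
The `host->host_failed` test is dropped from this `WARN_ON()` because, per the scsi_error.c hunks in this commit, the counter is no longer decremented per command and is instead cleared once by the error handler thread. The following is a minimal, deterministic sketch of why simultaneous unsynchronized decrements can leave the count above zero. It replays one bad interleaving by hand instead of using real threads, and `struct toy_host` and both function names are hypothetical, not kernel code.

```c
#include <assert.h>

/* Hypothetical stand-in for the shared per-host failure counter. */
struct toy_host {
	int host_failed;
};

/*
 * Replay two unsynchronized "host_failed--" operations with the worst-case
 * interleaving: both threads load the old value before either one stores.
 * Returns the counter value left behind.
 */
int racy_double_decrement(struct toy_host *h)
{
	int seen_by_a, seen_by_b;

	assert(h->host_failed >= 2);	/* two failed commands are outstanding */

	seen_by_a = h->host_failed;	/* thread A loads */
	seen_by_b = h->host_failed;	/* thread B loads the same value */

	h->host_failed = seen_by_a - 1;	/* thread A stores */
	h->host_failed = seen_by_b - 1;	/* thread B overwrites A's store */

	return h->host_failed;
}

/* The approach taken in this merge: a single writer clears the counter. */
int reset_after_all_handled(struct toy_host *h)
{
	h->host_failed = 0;
	return h->host_failed;
}
```

Starting from a count of 2, the replayed interleaving leaves 1 instead of 0, which matches the commit message's description of an outstanding command count that never returns to zero. The single reset does not depend on how many decrements were lost.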

drivers/scsi/53c700.c

Lines changed: 2 additions & 2 deletions
@@ -1122,7 +1122,7 @@ process_script_interrupt(__u32 dsps, __u32 dsp, struct scsi_cmnd *SCp,
 		} else {
 			struct scsi_cmnd *SCp;
 
-			SCp = scsi_host_find_tag(SDp->host, SCSI_NO_TAG);
+			SCp = SDp->current_cmnd;
 			if(unlikely(SCp == NULL)) {
 				sdev_printk(KERN_ERR, SDp,
 					"no saved request for untagged cmd\n");
@@ -1826,7 +1826,7 @@ NCR_700_queuecommand_lck(struct scsi_cmnd *SCp, void (*done)(struct scsi_cmnd *)
 			slot->tag, slot);
 	} else {
 		slot->tag = SCSI_NO_TAG;
-		/* must populate current_cmnd for scsi_host_find_tag to work */
+		/* save current command for reselection */
 		SCp->device->current_cmnd = SCp;
 	}
 	/* sanity check: some of the commands generated by the mid-layer
drivers/scsi/scsi_error.c

Lines changed: 3 additions & 1 deletion
@@ -1128,7 +1128,6 @@ static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
  */
 void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q)
 {
-	scmd->device->host->host_failed--;
 	scmd->eh_eflags = 0;
 	list_move_tail(&scmd->eh_entry, done_q);
 }
@@ -2227,6 +2226,9 @@ int scsi_error_handler(void *data)
 		else
 			scsi_unjam_host(shost);
 
+		/* All scmds have been handled */
+		shost->host_failed = 0;
+
 		/*
 		 * Note - if the above fails completely, the action is to take
 		 * individual devices offline and flush the queue of any