Commit 9771271

author
Linus Torvalds
committed
Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
2 parents 80c0531 + 93b4768 commit 9771271

68 files changed

+954
-442
lines changed

Documentation/filesystems/sysfs-pci.txt

Lines changed: 15 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
Accessing PCI device resources through sysfs
2+
--------------------------------------------
23

34
sysfs, usually mounted at /sys, provides access to PCI resources on platforms
45
that support it. For example, a given bus might look like this:
@@ -47,14 +48,21 @@ files, each with their own function.
4748
binary - file contains binary data
4849
cpumask - file contains a cpumask type
4950

50-
The read only files are informational, writes to them will be ignored.
51-
Writable files can be used to perform actions on the device (e.g. changing
52-
config space, detaching a device). mmapable files are available via an
53-
mmap of the file at offset 0 and can be used to do actual device programming
54-
from userspace. Note that some platforms don't support mmapping of certain
55-
resources, so be sure to check the return value from any attempted mmap.
51+
The read only files are informational, writes to them will be ignored, with
52+
the exception of the 'rom' file. Writable files can be used to perform
53+
actions on the device (e.g. changing config space, detaching a device).
54+
mmapable files are available via an mmap of the file at offset 0 and can be
55+
used to do actual device programming from userspace. Note that some platforms
56+
don't support mmapping of certain resources, so be sure to check the return
57+
value from any attempted mmap.
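
The mmap access described above might look roughly like this from userspace. This is a minimal sketch, not part of the patch: the helper name map_pci_resource is hypothetical, and the caller is assumed to pass the path of an mmapable resource file under /sys together with a suitable length. The point it illustrates is the return-value check, since some platforms refuse the mapping.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of a sysfs PCI resource file at offset 0.
 * Returns the mapping, or NULL if the open or the mmap fails.
 * map_pci_resource is a hypothetical helper name (assumption).
 */
void *map_pci_resource(const char *path, size_t len)
{
	void *p;
	int fd = open(path, O_RDWR | O_SYNC);

	if (fd < 0)
		return NULL;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping stays valid after the descriptor is closed. */
	close(fd);

	/* mmap reports failure with MAP_FAILED, not NULL. */
	return p == MAP_FAILED ? NULL : p;
}
```

A caller would test the result for NULL before dereferencing it, and release the mapping with munmap() when done.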
58+
59+
The 'rom' file is special in that it provides read-only access to the device's
60+
ROM file, if available. It's disabled by default, however, so applications
61+
should write the string "1" to the file to enable it before attempting a read
62+
call, and disable it following the access by writing "0" to the file.
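
The enable/read/disable sequence for the 'rom' file could be sketched as follows. The helper name read_pci_rom is hypothetical, and the sketch assumes that pread() and pwrite() at offset 0 behave on this sysfs file as they do on an ordinary file; treat it as an illustration of the ordering rather than a reference implementation.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `len` bytes of a device ROM through its sysfs 'rom' file.
 * Returns the number of bytes read, or -1 on error.
 * read_pci_rom is a hypothetical helper name (assumption).
 */
ssize_t read_pci_rom(const char *rom_path, void *buf, size_t len)
{
	ssize_t n;
	int fd = open(rom_path, O_RDWR);

	if (fd < 0)
		return -1;

	/* Enable ROM access by writing the string "1". */
	if (pwrite(fd, "1", 1, 0) != 1) {
		close(fd);
		return -1;
	}

	n = pread(fd, buf, len, 0);

	/* Disable access again, whether or not the read succeeded. */
	if (pwrite(fd, "0", 1, 0) != 1)
		n = -1;

	close(fd);
	return n;
}
```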
5663

5764
Accessing legacy resources through sysfs
65+
----------------------------------------
5866

5967
Legacy I/O port and ISA memory resources are also provided in sysfs if the
6068
underlying platform supports them. They're located in the PCI class hierarchy,
@@ -75,6 +83,7 @@ simply dereference the returned pointer (after checking for errors of course)
7583
to access legacy memory space.
7684

7785
Supporting PCI access on new platforms
86+
--------------------------------------
7887

7988
In order to support PCI resource mapping as described above, Linux platform
8089
code must define HAVE_PCI_MMAP and provide a pci_mmap_page_range function.

Documentation/pci-error-recovery.txt

Lines changed: 246 additions & 0 deletions
@@ -0,0 +1,246 @@
1+
2+
PCI Error Recovery
3+
------------------
4+
May 31, 2005
5+
6+
Current document maintainer:
7+
Linas Vepstas <[email protected]>
8+
9+
10+
Some PCI bus controllers are able to detect certain "hard" PCI errors
11+
on the bus, such as parity errors on the data and address busses, as
12+
well as SERR and PERR errors. These chipsets are then able to disable
13+
I/O to/from the affected device, so that, for example, a bad DMA
14+
address doesn't end up corrupting system memory. These same chipsets
15+
are also able to reset the affected PCI device, and return it to
16+
working condition. This document describes a generic API for
17+
performing error recovery.
18+
19+
The core idea is that after a PCI error has been detected, there must
20+
be a way for the kernel to coordinate with all affected device drivers
21+
so that the pci card can be made operational again, possibly after
22+
performing a full electrical #RST of the PCI card. The API below
23+
provides a generic API for device drivers to be notified of PCI
24+
errors, and to be notified of, and respond to, a reset sequence.
25+
26+
Preliminary sketch of API, cut-n-pasted-n-modified email from
27+
Ben Herrenschmidt, circa 5 April 2005
28+
29+
The error recovery API support is exposed to the driver in the form of
30+
a structure of function pointers pointed to by a new field in struct
31+
pci_driver. The absence of this pointer in pci_driver denotes a
32+
"non-aware" driver; behaviour on these is platform dependent.
33+
Platforms like ppc64 can try to simulate pci hotplug remove/add.
34+
35+
The definition of "pci_error_token" is not covered here. It is based on
36+
Seto's work on the synchronous error detection. We still need to define
37+
functions for extracting information out of an opaque error token. This is
38+
separate from this API.
39+
40+
This structure has the form:
41+
42+
struct pci_error_handlers
43+
{
44+
int (*error_detected)(struct pci_dev *dev, pci_error_token error);
45+
int (*mmio_enabled)(struct pci_dev *dev);
46+
int (*resume)(struct pci_dev *dev);
47+
int (*link_reset)(struct pci_dev *dev);
48+
int (*slot_reset)(struct pci_dev *dev);
49+
};
50+
51+
A driver doesn't have to implement all of these callbacks. The
52+
only mandatory one is error_detected(). If a callback is not
53+
implemented, the corresponding feature is considered unsupported.
54+
For example, if mmio_enabled() and resume() aren't there, then the
55+
driver is assumed as not doing any direct recovery and requires
56+
a reset. If link_reset() is not implemented, the card is assumed as
57+
not caring about link resets, in which case, if recovery is supported,
58+
the core can try recovery (but not slot_reset() unless it really did
59+
reset the slot). If slot_reset() is not supported, link_reset() can
60+
be called instead on a slot reset.
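
As an illustration of the optional-callback rule above, a driver that cannot recover without a reset might fill in only three of the five pointers. This is a sketch under stated assumptions: the stand-in definitions of struct pci_dev and pci_error_token, the numeric values of the PCIERR_RESULT_* codes, and all example_* names are hypothetical and exist only so the fragment compiles outside the kernel.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles outside the kernel (assumptions). */
struct pci_dev { int unused; };
typedef int pci_error_token;
enum {
	PCIERR_RESULT_CAN_RECOVER = 1,	/* values are hypothetical */
	PCIERR_RESULT_NEED_RESET,
	PCIERR_RESULT_DISCONNECT,
	PCIERR_RESULT_RECOVERED,
};

/* Structure as given in the document. */
struct pci_error_handlers {
	int (*error_detected)(struct pci_dev *dev, pci_error_token error);
	int (*mmio_enabled)(struct pci_dev *dev);
	int (*resume)(struct pci_dev *dev);
	int (*link_reset)(struct pci_dev *dev);
	int (*slot_reset)(struct pci_dev *dev);
};

static int example_error_detected(struct pci_dev *dev, pci_error_token error)
{
	(void)dev;
	(void)error;
	/* Quiesce: stop issuing new I/O, but do not touch the device. */
	return PCIERR_RESULT_NEED_RESET;
}

static int example_slot_reset(struct pci_dev *dev)
{
	(void)dev;
	/* Re-initialize the hardware; normal I/O must not restart yet. */
	return PCIERR_RESULT_RECOVERED;
}

static int example_resume(struct pci_dev *dev)
{
	(void)dev;
	/* Normal I/O may restart here; the result is not examined. */
	return 0;
}

/* mmio_enabled and link_reset are left NULL, i.e. unsupported. */
const struct pci_error_handlers example_err_handlers = {
	.error_detected	= example_error_detected,
	.slot_reset	= example_slot_reset,
	.resume		= example_resume,
};
```

Leaving mmio_enabled unset matches a driver that does no direct recovery and always asks for a slot reset.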
61+
62+
At first, the call will always be:
63+
64+
1) error_detected()
65+
66+
Error detected. This is sent once after an error has been detected. At
67+
this point, the device might not be accessible anymore depending on the
68+
platform (the slot will be isolated on ppc64). The driver may already
69+
have "noticed" the error because of a failing IO, but this is the proper
70+
"synchronisation point", that is, it gives a chance to the driver to
71+
cleanup, waiting for pending stuff (timers, whatever, etc...) to
72+
complete; it can take semaphores, schedule, etc... everything but touch
73+
the device. Within this function and after it returns, the driver
74+
shouldn't do any new IOs. Called in task context. This is sort of a
75+
"quiesce" point. See note about interrupts at the end of this doc.
76+
77+
Result codes:
78+
- PCIERR_RESULT_CAN_RECOVER:
79+
Driver returns this if it thinks it might be able to recover
80+
the HW by just banging IOs or if it wants to be given
81+
a chance to extract some diagnostic information (see
82+
below).
83+
- PCIERR_RESULT_NEED_RESET:
84+
Driver returns this if it thinks it can't recover unless the
85+
slot is reset.
86+
- PCIERR_RESULT_DISCONNECT:
87+
Return this if driver thinks it won't recover at all,
88+
(this will detach the driver ? or just leave it
89+
dangling ? to be decided)
90+
91+
So at this point, we have called error_detected() for all drivers
92+
on the segment that had the error. On ppc64, the slot is isolated. What
93+
happens now typically depends on the result from the drivers. If all
94+
drivers on the segment/slot return PCIERR_RESULT_CAN_RECOVER, we would
95+
re-enable IOs on the slot (or do nothing special if the platform doesn't
96+
isolate slots) and call 2). If not and we can reset slots, we go to 4),
97+
if neither, we have a dead slot. If it's a hotplug slot, we might
98+
"simulate" reset by triggering HW unplug/replug though.
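
The decision described in the paragraph above can be sketched as a small function. This is only one reading of the text: the numeric values of the result codes, the enum next_step names and the function name after_error_detected are all hypothetical, and the hotplug "simulated reset" fallback is left out.

```c
#include <stddef.h>

/* Hypothetical numeric values; the document does not define them. */
enum pcierr_result {
	PCIERR_RESULT_CAN_RECOVER = 1,
	PCIERR_RESULT_NEED_RESET,
	PCIERR_RESULT_DISCONNECT,
};

/* Hypothetical names for the possible next steps. */
enum next_step {
	STEP_MMIO_ENABLED,	/* re-enable IOs, then call 2) */
	STEP_SLOT_RESET,	/* reset the slot, then call 4) */
	STEP_DEAD_SLOT,		/* neither is possible */
};

/*
 * Pick the next step from the error_detected() results of all
 * drivers on the segment. `can_reset_slot` is nonzero if the
 * platform is able to reset slots.
 */
enum next_step after_error_detected(const int *results, size_t n,
				    int can_reset_slot)
{
	size_t i;
	int all_can_recover = 1;

	for (i = 0; i < n; i++)
		if (results[i] != PCIERR_RESULT_CAN_RECOVER)
			all_can_recover = 0;

	if (all_can_recover)
		return STEP_MMIO_ENABLED;
	if (can_reset_slot)
		return STEP_SLOT_RESET;
	return STEP_DEAD_SLOT;
}
```

A single driver answering anything other than PCIERR_RESULT_CAN_RECOVER is enough to push the whole segment towards a reset.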
99+
100+
>>> Current ppc64 implementation assumes that a device driver will
101+
>>> *not* schedule or semaphore in this routine; the current ppc64
102+
>>> implementation uses one kernel thread to notify all devices;
103+
>>> thus, if one device sleeps/schedules, all devices are affected.
104+
>>> Doing better requires complex multi-threaded logic in the error
105+
>>> recovery implementation (e.g. waiting for all notification threads
106+
>>> to "join" before proceeding with recovery.) This seems excessively
107+
>>> complex and not worth implementing.
108+
109+
>>> The current ppc64 implementation doesn't much care if the device
110+
>>> attempts i/o at this point, or not. I/O's will fail, returning
111+
>>> a value of 0xff on read, and writes will be dropped. If the device
112+
>>> driver attempts more than 10K I/O's to a frozen adapter, it will
113+
>>> assume that the device driver has gone into an infinite loop, and
114+
>>> it will panic the kernel.
115+
116+
2) mmio_enabled()
117+
118+
This is the "early recovery" call. IOs are allowed again, but DMA is
119+
not (hrm... to be discussed, I prefer not), with some restrictions. This
120+
is NOT a callback for the driver to start operations again, only to
121+
peek/poke at the device, extract diagnostic information, if any, and
122+
eventually do things like trigger a device local reset or some such,
123+
but not restart operations. This is sent if all drivers on a segment
124+
agree that they can try to recover and no automatic link reset was
125+
performed by the HW. If the platform can't just re-enable IOs without
126+
a slot reset or a link reset, it doesn't call this callback and goes
127+
directly to 3) or 4). All IOs should be done _synchronously_ from
128+
within this callback, errors triggered by them will be returned via
129+
the normal pci_check_whatever() api, no new error_detected() callback
130+
will be issued due to an error happening here. However, such an error
131+
might cause IOs to be re-blocked for the whole segment, and thus
132+
invalidate the recovery that other devices on the same segment might
133+
have done, forcing the whole segment into one of the next states,
134+
that is link reset or slot reset.
135+
136+
Result codes:
137+
- PCIERR_RESULT_RECOVERED
138+
Driver returns this if it thinks the device is fully
139+
functional and thinks it is ready to start
140+
normal driver operations again. There is no
141+
guarantee that the driver will actually be
142+
allowed to proceed, as another driver on the
143+
same segment might have failed and thus triggered a
144+
slot reset on platforms that support it.
145+
146+
- PCIERR_RESULT_NEED_RESET
147+
Driver returns this if it thinks the device is not
148+
recoverable in its current state and it needs a slot
149+
reset to proceed.
150+
151+
- PCIERR_RESULT_DISCONNECT
152+
Same as above. Total failure, no recovery even after
153+
reset, driver dead. (To be defined more precisely)
154+
155+
>>> The current ppc64 implementation does not implement this callback.
156+
157+
3) link_reset()
158+
159+
This is called after the link has been reset. This is typically
160+
a PCI Express specific state at this point and is done whenever a
161+
non-fatal error has been detected that can be "solved" by resetting
162+
the link. This call informs the driver of the reset and the driver
163+
should check if the device appears to be in working condition.
164+
This function acts a bit like 2) mmio_enabled(), in that the driver
165+
is not supposed to restart normal driver I/O operations right away.
166+
Instead, it should just "probe" the device to check its recoverability
167+
status. If all is right, then the core will call resume() once all
168+
drivers have ack'd link_reset().
169+
170+
Result codes:
171+
(identical to mmio_enabled)
172+
173+
>>> The current ppc64 implementation does not implement this callback.
174+
175+
4) slot_reset()
176+
177+
This is called after the slot has been soft or hard reset by the
178+
platform. A soft reset consists of asserting the adapter #RST line
179+
and then restoring the PCI BARs and PCI configuration header. If the
180+
platform supports PCI hotplug, then it might instead perform a hard
181+
reset by toggling power on the slot off/on. This call gives drivers
182+
the chance to re-initialize the hardware (re-download firmware, etc.),
183+
but drivers shouldn't restart normal I/O processing operations at
184+
this point. (See note about interrupts; interrupts aren't guaranteed
185+
to be delivered until the resume() callback has been called). If all
186+
device drivers report success on this callback, the platform will call
187+
resume() to complete the error handling and let the driver restart
188+
normal I/O processing.
189+
190+
A driver can still return a critical failure for this function if
191+
it can't get the device operational after reset. If the platform
192+
previously tried a soft reset, it might now try a hard reset (power
193+
cycle) and then call slot_reset() again. If the device still can't
194+
be recovered, there is nothing more that can be done; the platform
195+
will typically report a "permanent failure" in such a case. The
196+
device will be considered "dead" in this case.
197+
198+
Result codes:
199+
- PCIERR_RESULT_DISCONNECT
200+
Same as above.
201+
202+
>>> The current ppc64 implementation does not try a power-cycle reset
203+
>>> if the driver returned PCIERR_RESULT_DISCONNECT. However, it should.
204+
205+
5) resume()
206+
207+
This is called if all drivers on the segment have returned
208+
PCIERR_RESULT_RECOVERED from one of the 3 previous callbacks.
209+
That basically tells the driver to restart activity, that everything
210+
is back and running. No result code is taken into account here. If
211+
a new error happens, it will restart a new error handling process.
212+
213+
That's it. I think this covers all the possibilities. The way those
214+
callbacks are called is platform policy. A platform with no slot reset
215+
capability for example may want to just "ignore" drivers that can't
216+
recover (disconnect them) and try to let other cards on the same segment
217+
recover. Keep in mind that in most real life cases, though, there will
218+
be only one driver per segment.
219+
220+
Now, there is a note about interrupts. If you get an interrupt and your
221+
device is dead or has been isolated, there is a problem :)
222+
223+
After much thinking, I decided to leave that to the platform. That is,
224+
the recovery API only specifies that:
225+
226+
- There is no guarantee that interrupt delivery can proceed from any
227+
device on the segment starting from the error detection and until the
228+
restart callback is sent, at which point interrupts are expected to be
229+
fully operational.
230+
231+
- There is no guarantee that interrupt delivery is stopped, that is, a
232+
driver that gets an interrupt after detecting an error, or that detects
233+
an error within the interrupt handler such that it prevents proper
234+
ack'ing of the interrupt (and thus removal of the source) should just
235+
return IRQ_NOTHANDLED. It's up to the platform to deal with that
236+
condition, typically by masking the irq source during the duration of
237+
the error handling. It is expected that the platform "knows" which
238+
interrupts are routed to error-management capable slots and can deal
239+
with temporarily disabling that irq number during error processing (this
240+
isn't terribly complex). That means some IRQ latency for other devices
241+
sharing the interrupt, but there is simply no other way. High end
242+
platforms aren't supposed to share interrupts between many devices
243+
anyway :)
244+
245+
246+
Revised: 31 May 2005 Linas Vepstas <[email protected]>

MAINTAINERS

Lines changed: 7 additions & 0 deletions
@@ -1987,6 +1987,13 @@ M: [email protected]
19871987
19881988
S: Maintained
19891989

1990+
PCI ERROR RECOVERY
1991+
P: Linas Vepstas
1992+
1993+
1994+
1995+
S: Supported
1996+
19901997
PCI SOUND DRIVERS (ES1370, ES1371 and SONICVIBES)
19911998
P: Thomas Sailer
19921999

arch/alpha/kernel/sys_alcor.c

Lines changed: 2 additions & 1 deletion
@@ -254,14 +254,15 @@ alcor_init_pci(void)
254254
* motherboard, by looking for a 21040 TULIP in slot 6, which is
255255
* built into XLT and BRET/MAVERICK, but not available on ALCOR.
256256
*/
257-
dev = pci_find_device(PCI_VENDOR_ID_DEC,
257+
dev = pci_get_device(PCI_VENDOR_ID_DEC,
258258
PCI_DEVICE_ID_DEC_TULIP,
259259
NULL);
260260
if (dev && dev->devfn == PCI_DEVFN(6,0)) {
261261
alpha_mv.sys.cia.gru_int_req_bits = XLT_GRU_INT_REQ_BITS;
262262
printk(KERN_INFO "%s: Detected AS500 or XLT motherboard.\n",
263263
__FUNCTION__);
264264
}
265+
pci_dev_put(dev);
265266
}
266267

267268

arch/alpha/kernel/sys_sio.c

Lines changed: 3 additions & 3 deletions
@@ -105,7 +105,7 @@ sio_collect_irq_levels(void)
105105
struct pci_dev *dev = NULL;
106106

107107
/* Iterate through the devices, collecting IRQ levels. */
108-
while ((dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
108+
for_each_pci_dev(dev) {
109109
if ((dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) &&
110110
(dev->class >> 8 != PCI_CLASS_BRIDGE_PCMCIA))
111111
continue;
@@ -229,8 +229,8 @@ alphabook1_init_pci(void)
229229
*/
230230

231231
dev = NULL;
232-
while ((dev = pci_find_device(PCI_VENDOR_ID_NCR, PCI_ANY_ID, dev))) {
233-
if (dev->device == PCI_DEVICE_ID_NCR_53C810
232+
while ((dev = pci_get_device(PCI_VENDOR_ID_NCR, PCI_ANY_ID, dev))) {
233+
if (dev->device == PCI_DEVICE_ID_NCR_53C810
234234
|| dev->device == PCI_DEVICE_ID_NCR_53C815
235235
|| dev->device == PCI_DEVICE_ID_NCR_53C820
236236
|| dev->device == PCI_DEVICE_ID_NCR_53C825) {

arch/frv/mb93090-mb00/pci-frv.c

Lines changed: 2 additions & 6 deletions
@@ -142,9 +142,7 @@ static void __init pcibios_allocate_resources(int pass)
142142
u16 command;
143143
struct resource *r, *pr;
144144

145-
while (dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev),
146-
dev != NULL
147-
) {
145+
for_each_pci_dev(dev) {
148146
pci_read_config_word(dev, PCI_COMMAND, &command);
149147
for(idx = 0; idx < 6; idx++) {
150148
r = &dev->resource[idx];
@@ -188,9 +186,7 @@ static void __init pcibios_assign_resources(void)
188186
int idx;
189187
struct resource *r;
190188

191-
while (dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev),
192-
dev != NULL
193-
) {
189+
for_each_pci_dev(dev) {
194190
int class = dev->class >> 8;
195191

196192
/* Don't touch classless devices and host bridges */

arch/frv/mb93090-mb00/pci-irq.c

Lines changed: 1 addition & 3 deletions
@@ -48,9 +48,7 @@ void __init pcibios_fixup_irqs(void)
4848
struct pci_dev *dev = NULL;
4949
uint8_t line, pin;
5050

51-
while (dev = pci_find_device(PCI_ANY_ID, PCI_ANY_ID, dev),
52-
dev != NULL
53-
) {
51+
for_each_pci_dev(dev) {
5452
pci_read_config_byte(dev, PCI_INTERRUPT_PIN, &pin);
5553
if (pin) {
5654
dev->irq = pci_bus0_irq_routing[PCI_SLOT(dev->devfn)][pin - 1];

arch/i386/kernel/scx200.c

Lines changed: 1 addition & 1 deletion
@@ -143,7 +143,7 @@ static int __init scx200_init(void)
143143
{
144144
printk(KERN_INFO NAME ": NatSemi SCx200 Driver\n");
145145

146-
return pci_module_init(&scx200_pci_driver);
146+
return pci_register_driver(&scx200_pci_driver);
147147
}
148148

149149
static void __exit scx200_cleanup(void)

arch/i386/pci/acpi.c

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ static int __init pci_acpi_init(void)
5353
* don't use pci_enable_device().
5454
*/
5555
printk(KERN_INFO "PCI: Routing PCI interrupts for all devices because \"pci=routeirq\" specified\n");
56-
while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL)
56+
for_each_pci_dev(dev)
5757
acpi_pci_irq_enable(dev);
5858
} else
5959
printk(KERN_INFO "PCI: If a device doesn't work, try \"pci=routeirq\". If it helps, post a report\n");
