Commit 7c3dc44

Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) updates from Dan Williams:

 "To date Linux has been dependent on platform-firmware to map CXL RAM
  regions and handle events / errors from devices. With this update we
  can now parse / update the CXL memory layout, and report events /
  errors from devices. This is a precursor for the CXL subsystem to
  handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
  for DDR-attached-DRAM is handled by the EDAC driver where it maps
  system physical address events to a field-replaceable-unit (FRU /
  endpoint device). In general, CXL has the potential to standardize
  what has historically been a pile of memory-controller-specific error
  handling logic.

  Another change of note is the default policy for handling RAM-backed
  device-dax instances. Previously the default access mode was "device",
  mmap(2) a device special file to access memory. The new default is
  "kmem" where the address range is assigned to the core-mm via
  add_memory_driver_managed(). This saves typical users from wondering
  why their platform memory is not visible via free(1) and stuck behind
  a device-file. At the same time it allows expert users to deploy
  policy to, for example, get dedicated access to high performance
  memory, or hide low performance memory from general purpose kernel
  allocations. This affects not only CXL, but also systems with
  high-bandwidth-memory that platform-firmware tags with the
  EFI_MEMORY_SP (special purpose) designation.

  Summary:

   - CXL RAM region enumeration: instantiate 'struct cxl_region' objects
     for platform firmware created memory regions

   - CXL RAM region provisioning: complement the existing PMEM region
     creation support with RAM region support

   - "Soft Reservation" policy change: Online (memory hot-add)
     soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
     for setting aside such memory for dedicated access via device-dax.

   - CXL Events and Interrupts: Take over CXL event handling from
     platform-firmware (ACPI calls this CXL Memory Error Reporting) and
     export CXL Events via Linux Trace Events.

   - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
     subsystem interrogate the result of CXL _OSC negotiation.

   - Emulate CXL DVSEC Range Registers as "decoders": Allow for
     first-generation devices that pre-date the definition of the CXL
     HDM Decoder Capability to translate the CXL DVSEC Range Registers
     into 'struct cxl_decoder' objects.

   - Set timestamp: Per spec, set the device timestamp in case of
     hotplug, or if platform-firmware failed to set it.

   - General fixups: linux-next build issues, non-urgent fixes for
     pre-production hardware, unit test fixes, spelling and debug
     message improvements"

* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
  dax/kmem: Fix leak of memory-hotplug resources
  cxl/mem: Add kdoc param for event log driver state
  cxl/trace: Add serial number to trace points
  cxl/trace: Add host output to trace points
  cxl/trace: Standardize device information output
  cxl/pci: Remove locked check for dvsec_range_allowed()
  cxl/hdm: Add emulation when HDM decoders are not committed
  cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
  cxl/hdm: Emulate HDM decoder from DVSEC range registers
  cxl/pci: Refactor cxl_hdm_decode_init()
  cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
  cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
  cxl: add RAS status unmasking for CXL
  cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
  dax/hmem: build hmem device support as module if possible
  dax: cxl: add CXL_REGION dependency
  cxl: avoid returning uninitialized error code
  cxl/pmem: Fix nvdimm registration races
  cxl/mem: Fix UAPI command comment
  cxl/uapi: Tag commands from cxl_query_cmd()
  ...
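The "device" access mode mentioned in the pull message (mmap(2) of a device special file) might look roughly like the user-space sketch below. It is an illustration only and not code from this merge: map_dax_range() is a hypothetical helper, /dev/dax0.0 is an assumed example path, and real device-dax instances impose alignment requirements on the mapping length that are not checked here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map 'len' bytes of a device-dax character
 * device (for example an assumed /dev/dax0.0) into the caller's
 * address space. Returns the mapping, or MAP_FAILED on error.
 */
static void *map_dax_range(const char *path, size_t len)
{
	void *addr;
	int fd;

	assert(path != NULL && len > 0);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;

	/* Shared mapping so that stores land in the backing memory. */
	addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping remains valid after the descriptor is closed. */
	close(fd);
	return addr;
}
```

With the new "kmem" default none of this is needed for ordinary use, because the range is handed to the page allocator; a sketch like this only applies when the memory is deliberately kept behind the device file.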
2 parents d8e4731 + e686c32 commit 7c3dc44

60 files changed, 3788 insertions(+), 740 deletions(-)

Documentation/ABI/testing/sysfs-bus-cxl

Lines changed: 53 additions & 26 deletions
@@ -90,6 +90,21 @@ Description:
 		capability.
 
 
+What: /sys/bus/cxl/devices/{port,endpoint}X/parent_dport
+Date: January, 2023
+KernelVersion: v6.3
+
+Description:
+		(RO) CXL port objects are instantiated for each upstream port in
+		a CXL/PCIe switch, and for each endpoint to map the
+		corresponding memory device into the CXL port hierarchy. When a
+		descendant CXL port (switch or endpoint) is enumerated it is
+		useful to know which 'dport' object in the parent CXL port
+		routes to this descendant. The 'parent_dport' symlink points to
+		the device representing the downstream port of a CXL switch that
+		routes to {port,endpoint}X.
+
+
 What: /sys/bus/cxl/devices/portX/dportY
 Date: June, 2021
 KernelVersion: v5.14
@@ -183,7 +198,7 @@ Description:
 
 What: /sys/bus/cxl/devices/endpointX/CDAT
 Date: July, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RO) If this sysfs entry is not present no DOE mailbox was
@@ -194,7 +209,7 @@ Description:
 
 What: /sys/bus/cxl/devices/decoderX.Y/mode
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@@ -214,7 +229,7 @@ Description:
 
 What: /sys/bus/cxl/devices/decoderX.Y/dpa_resource
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RO) When a CXL decoder is of devtype "cxl_decoder_endpoint",
@@ -225,7 +240,7 @@ Description:
 
 What: /sys/bus/cxl/devices/decoderX.Y/dpa_size
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) When a CXL decoder is of devtype "cxl_decoder_endpoint" it
@@ -245,7 +260,7 @@ Description:
 
 What: /sys/bus/cxl/devices/decoderX.Y/interleave_ways
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RO) The number of targets across which this decoder's host
@@ -260,7 +275,7 @@ Description:
 
 What: /sys/bus/cxl/devices/decoderX.Y/interleave_granularity
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RO) The number of consecutive bytes of host physical address
@@ -270,25 +285,25 @@ Description:
 		interleave_granularity).
 
 
-What: /sys/bus/cxl/devices/decoderX.Y/create_pmem_region
-Date: May, 2022
-KernelVersion: v5.20
+What: /sys/bus/cxl/devices/decoderX.Y/create_{pmem,ram}_region
+Date: May, 2022, January, 2023
+KernelVersion: v6.0 (pmem), v6.3 (ram)
 
 Description:
 		(RW) Write a string in the form 'regionZ' to start the process
-		of defining a new persistent memory region (interleave-set)
-		within the decode range bounded by root decoder 'decoderX.Y'.
-		The value written must match the current value returned from
-		reading this attribute. An atomic compare exchange operation is
-		done on write to assign the requested id to a region and
-		allocate the region-id for the next creation attempt. EBUSY is
-		returned if the region name written does not match the current
-		cached value.
+		of defining a new persistent, or volatile memory region
+		(interleave-set) within the decode range bounded by root decoder
+		'decoderX.Y'. The value written must match the current value
+		returned from reading this attribute. An atomic compare exchange
+		operation is done on write to assign the requested id to a
+		region and allocate the region-id for the next creation attempt.
+		EBUSY is returned if the region name written does not match the
+		current cached value.
 
 
 What: /sys/bus/cxl/devices/decoderX.Y/delete_region
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(WO) Write a string in the form 'regionZ' to delete that region,
@@ -297,17 +312,18 @@ Description:
 
 What: /sys/bus/cxl/devices/regionZ/uuid
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) Write a unique identifier for the region. This field must
 		be set for persistent regions and it must not conflict with the
-		UUID of another region.
+		UUID of another region. For volatile ram regions this
+		attribute is a read-only empty string.
 
 
 What: /sys/bus/cxl/devices/regionZ/interleave_granularity
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) Set the number of consecutive bytes each device in the
@@ -318,7 +334,7 @@ Description:
 
 What: /sys/bus/cxl/devices/regionZ/interleave_ways
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) Configures the number of devices participating in the
@@ -328,7 +344,7 @@ Description:
 
 What: /sys/bus/cxl/devices/regionZ/size
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) System physical address space to be consumed by the region.
@@ -343,9 +359,20 @@ Description:
 		results in the same address being allocated.
 
 
+What: /sys/bus/cxl/devices/regionZ/mode
+Date: January, 2023
+KernelVersion: v6.3
+
+Description:
+		(RO) The mode of a region is established at region creation time
+		and dictates the mode of the endpoint decoder that comprise the
+		region. For more details on the possible modes see
+		/sys/bus/cxl/devices/decoderX.Y/mode
+
+
 What: /sys/bus/cxl/devices/regionZ/resource
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RO) A region is a contiguous partition of a CXL root decoder
@@ -357,7 +384,7 @@ Description:
 
 What: /sys/bus/cxl/devices/regionZ/target[0..N]
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) Write an endpoint decoder object name to 'targetX' where X
@@ -376,7 +403,7 @@ Description:
 
 What: /sys/bus/cxl/devices/regionZ/commit
 Date: May, 2022
-KernelVersion: v5.20
+KernelVersion: v6.0
 
 Description:
 		(RW) Write a boolean 'true' string value to this attribute to

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -5912,6 +5912,7 @@ M: Dan Williams <[email protected]>
 M: Vishal Verma <[email protected]>
 M: Dave Jiang <[email protected]>
 
+
 S: Supported
 F: drivers/dax/
 

drivers/Makefile

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/
 obj-$(CONFIG_PARPORT) += parport/
 obj-y += base/ block/ misc/ mfd/ nfc/
 obj-$(CONFIG_LIBNVDIMM) += nvdimm/
-obj-$(CONFIG_DAX) += dax/
+obj-y += dax/
 obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/
 obj-$(CONFIG_NUBUS) += nubus/
 obj-y += cxl/

drivers/acpi/numa/hmat.c

Lines changed: 2 additions & 2 deletions
@@ -718,7 +718,7 @@ static void hmat_register_target_devices(struct memory_target *target)
 	for (res = target->memregions.child; res; res = res->sibling) {
 		int target_nid = pxm_to_node(target->memory_pxm);
 
-		hmem_register_device(target_nid, res);
+		hmem_register_resource(target_nid, res);
 	}
 }
 
@@ -869,4 +869,4 @@ static __init int hmat_init(void)
 	acpi_put_table(tbl);
 	return 0;
 }
-device_initcall(hmat_init);
+subsys_initcall(hmat_init);

drivers/acpi/pci_root.c

Lines changed: 3 additions & 0 deletions
@@ -1047,6 +1047,9 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_DPC_CONTROL))
 		host_bridge->native_dpc = 0;
 
+	if (!(root->osc_ext_control_set & OSC_CXL_ERROR_REPORTING_CONTROL))
+		host_bridge->native_cxl_error = 0;
+
 	/*
 	 * Evaluate the "PCI Boot Configuration" _DSM Function. If it
 	 * exists and returns 0, we must preserve any PCI resource

drivers/cxl/Kconfig

Lines changed: 12 additions & 2 deletions
@@ -104,19 +104,29 @@ config CXL_SUSPEND
 	depends on SUSPEND && CXL_MEM
 
 config CXL_REGION
-	bool
+	bool "CXL: Region Support"
 	default CXL_BUS
 	# For MAX_PHYSMEM_BITS
 	depends on SPARSEMEM
 	select MEMREGION
 	select GET_FREE_REGION
+	help
+	  Enable the CXL core to enumerate and provision CXL regions. A CXL
+	  region is defined by one or more CXL expanders that decode a given
+	  system-physical address range. For CXL regions established by
+	  platform-firmware this option enables memory error handling to
+	  identify the devices participating in a given interleaved memory
+	  range. Otherwise, platform-firmware managed CXL is enabled by being
+	  placed in the system address map and does not need a driver.
+
+	  If unsure say 'y'
 
 config CXL_REGION_INVALIDATION_TEST
 	bool "CXL: Region Cache Management Bypass (TEST)"
 	depends on CXL_REGION
 	help
 	  CXL Region management and security operations potentially invalidate
-	  the content of CPU caches without notifiying those caches to
+	  the content of CPU caches without notifying those caches to
 	  invalidate the affected cachelines. The CXL Region driver attempts
 	  to invalidate caches when those events occur. If that invalidation
 	  fails the region will fail to enable. Reasons for cache

drivers/cxl/acpi.c

Lines changed: 3 additions & 2 deletions
@@ -19,7 +19,7 @@ struct cxl_cxims_data {
 
 /*
  * Find a targets entry (n) in the host bridge interleave list.
- * CXL Specfication 3.0 Table 9-22
+ * CXL Specification 3.0 Table 9-22
  */
 static int cxl_xor_calc_n(u64 hpa, struct cxl_cxims_data *cximsd, int iw,
 			  int ig)
@@ -731,7 +731,8 @@ static void __exit cxl_acpi_exit(void)
 	cxl_bus_drain();
 }
 
-module_init(cxl_acpi_init);
+/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
+subsys_initcall(cxl_acpi_init);
 module_exit(cxl_acpi_exit);
 MODULE_LICENSE("GPL v2");
 MODULE_IMPORT_NS(CXL);
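The switch from module_init() to subsys_initcall() relies on the kernel running built-in initcalls in ascending level order, with subsys_initcall() at level 4 and device_initcall() (which module_init() maps to for built-in code) at level 6. The user-space model below sketches only that ordering rule; struct initcall, order_initcalls() and the idea of sorting an array are illustrative assumptions, since the kernel orders initcalls through linker sections rather than by sorting at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Two of the kernel's initcall levels; lower levels run first. */
enum initcall_level {
	LEVEL_SUBSYS = 4,	/* subsys_initcall() */
	LEVEL_DEVICE = 6,	/* device_initcall(), built-in module_init() */
};

/* Hypothetical record describing one registered initcall. */
struct initcall {
	const char *name;
	enum initcall_level level;
};

static int by_level(const void *a, const void *b)
{
	const struct initcall *x = a, *y = b;

	return (int)x->level - (int)y->level;
}

/*
 * Model: arrange 'calls' by ascending level. The order of entries that
 * share a level is left unspecified here (qsort() is not stable); in
 * the kernel it follows link order.
 */
static void order_initcalls(struct initcall *calls, size_t n)
{
	assert(calls != NULL || n == 0);
	qsort(calls, n, sizeof(*calls), by_level);
}
```

Applied to the hunks above, an entry for cxl_acpi_init at LEVEL_SUBSYS sorts ahead of any consumer registered at LEVEL_DEVICE, which is the effect the new comment in acpi.c asks for. Whether dax_hmem itself registers at the device level is an assumption of this sketch, and the ordering only applies when the code is built in rather than loaded as a module.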

drivers/cxl/core/Makefile

Lines changed: 3 additions & 0 deletions
@@ -3,11 +3,14 @@ obj-$(CONFIG_CXL_BUS) += cxl_core.o
 obj-$(CONFIG_CXL_SUSPEND) += suspend.o
 
 ccflags-y += -I$(srctree)/drivers/cxl
+CFLAGS_trace.o = -DTRACE_INCLUDE_PATH=. -I$(src)
+
 cxl_core-y := port.o
 cxl_core-y += pmem.o
 cxl_core-y += regs.o
 cxl_core-y += memdev.o
 cxl_core-y += mbox.o
 cxl_core-y += pci.o
 cxl_core-y += hdm.o
+cxl_core-$(CONFIG_TRACING) += trace.o
 cxl_core-$(CONFIG_CXL_REGION) += region.o

drivers/cxl/core/core.h

Lines changed: 4 additions & 3 deletions
@@ -11,15 +11,18 @@ extern struct attribute_group cxl_base_attribute_group;
 
 #ifdef CONFIG_CXL_REGION
 extern struct device_attribute dev_attr_create_pmem_region;
+extern struct device_attribute dev_attr_create_ram_region;
 extern struct device_attribute dev_attr_delete_region;
 extern struct device_attribute dev_attr_region;
 extern const struct device_type cxl_pmem_region_type;
+extern const struct device_type cxl_dax_region_type;
 extern const struct device_type cxl_region_type;
 void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled);
 #define CXL_REGION_ATTR(x) (&dev_attr_##x.attr)
 #define CXL_REGION_TYPE(x) (&cxl_region_type)
 #define SET_CXL_REGION_ATTR(x) (&dev_attr_##x.attr),
 #define CXL_PMEM_REGION_TYPE(x) (&cxl_pmem_region_type)
+#define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
 int cxl_region_init(void);
 void cxl_region_exit(void);
 #else
@@ -37,6 +40,7 @@ static inline void cxl_region_exit(void)
 #define CXL_REGION_TYPE(x) NULL
 #define SET_CXL_REGION_ATTR(x)
 #define CXL_PMEM_REGION_TYPE(x) NULL
+#define CXL_DAX_REGION_TYPE(x) NULL
 #endif
 
 struct cxl_send_command;
@@ -56,9 +60,6 @@ resource_size_t cxl_dpa_size(struct cxl_endpoint_decoder *cxled);
 resource_size_t cxl_dpa_resource_start(struct cxl_endpoint_decoder *cxled);
 extern struct rw_semaphore cxl_dpa_rwsem;
 
-bool is_switch_decoder(struct device *dev);
-struct cxl_switch_decoder *to_cxl_switch_decoder(struct device *dev);
-
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
 void cxl_mbox_init(void);
