Commit ec0e2dc

Merge tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfio
Pull VFIO updates from Alex Williamson:

 - VFIO direct character device (cdev) interface support. This extracts
   the vfio device fd from the container and group model, and is
   intended to be the native uAPI for use with IOMMUFD (Yi Liu)

 - Enhancements to the PCI hot reset interface in support of cdev usage
   (Yi Liu)

 - Fix a potential race between registering and unregistering vfio
   files in the kvm-vfio interface and extend use of a lock to avoid
   extra drop and acquires (Dmitry Torokhov)

 - A new vfio-pci variant driver for the AMD/Pensando Distributed
   Services Card (PDS) Ethernet device, supporting live migration
   (Brett Creeley)

 - Cleanups to remove redundant owner setup in cdx and fsl bus drivers,
   and simplify driver init/exit in fsl code (Li Zetao)

 - Fix uninitialized hole in data structure and pad capability
   structures for alignment (Stefan Hajnoczi)

* tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfio: (53 commits)
  vfio/pds: Send type for SUSPEND_STATUS command
  vfio/pds: fix return value in pds_vfio_get_lm_file()
  pds_core: Fix function header descriptions
  vfio: align capability structures
  vfio/type1: fix cap_migration information leak
  vfio/fsl-mc: Use module_fsl_mc_driver macro to simplify the code
  vfio/cdx: Remove redundant initialization owner in vfio_cdx_driver
  vfio/pds: Add Kconfig and documentation
  vfio/pds: Add support for firmware recovery
  vfio/pds: Add support for dirty page tracking
  vfio/pds: Add VFIO live migration support
  vfio/pds: register with the pds_core PF
  pds_core: Require callers of register/unregister to pass PF drvdata
  vfio/pds: Initial support for pds VFIO driver
  vfio: Commonize combine_ranges for use in other VFIO drivers
  kvm/vfio: avoid bouncing the mutex when adding and deleting groups
  kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add()
  docs: vfio: Add vfio device cdev description
  vfio: Compile vfio_group infrastructure optionally
  vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev()
  ...
2 parents b6f6167 + 642265e commit ec0e2dc

55 files changed: +4350 -458 lines changed

Documentation/driver-api/vfio.rst

Lines changed: 144 additions & 3 deletions
@@ -239,6 +239,137 @@ group and can access them as follows::
 	/* Gratuitous device reset and go... */
 	ioctl(device, VFIO_DEVICE_RESET);
 
+IOMMUFD and vfio_iommu_type1
+----------------------------
+
+IOMMUFD is the new user API to manage I/O page tables from userspace.
+It intends to be the portal of delivering advanced userspace DMA
+features (nested translation [5]_, PASID [6]_, etc.) while also providing
+a backwards compatibility interface for existing VFIO_TYPE1v2_IOMMU use
+cases. Eventually the vfio_iommu_type1 driver, as well as the legacy
+vfio container and group model is intended to be deprecated.
+
+The IOMMUFD backwards compatibility interface can be enabled two ways.
+In the first method, the kernel can be configured with
+CONFIG_IOMMUFD_VFIO_CONTAINER, in which case the IOMMUFD subsystem
+transparently provides the entire infrastructure for the VFIO
+container and IOMMU backend interfaces. The compatibility mode can
+also be accessed if the VFIO container interface, ie. /dev/vfio/vfio is
+simply symlink'd to /dev/iommu. Note that at the time of writing, the
+compatibility mode is not entirely feature complete relative to
+VFIO_TYPE1v2_IOMMU (ex. DMA mapping MMIO) and does not attempt to
+provide compatibility to the VFIO_SPAPR_TCE_IOMMU interface. Therefore
+it is not generally advisable at this time to switch from native VFIO
+implementations to the IOMMUFD compatibility interfaces.
+
+Long term, VFIO users should migrate to device access through the cdev
+interface described below, and native access through the IOMMUFD
+provided interfaces.
+
+VFIO Device cdev
+----------------
+
+Traditionally user acquires a device fd via VFIO_GROUP_GET_DEVICE_FD
+in a VFIO group.
+
+With CONFIG_VFIO_DEVICE_CDEV=y the user can now acquire a device fd
+by directly opening a character device /dev/vfio/devices/vfioX where
+"X" is the number allocated uniquely by VFIO for registered devices.
+cdev interface does not support noiommu devices, so user should use
+the legacy group interface if noiommu is wanted.
+
+The cdev only works with IOMMUFD. Both VFIO drivers and applications
+must adapt to the new cdev security model which requires using
+VFIO_DEVICE_BIND_IOMMUFD to claim DMA ownership before starting to
+actually use the device. Once BIND succeeds then a VFIO device can
+be fully accessed by the user.
+
+VFIO device cdev doesn't rely on VFIO group/container/iommu drivers.
+Hence those modules can be fully compiled out in an environment
+where no legacy VFIO application exists.
+
+So far SPAPR does not support IOMMUFD yet. So it cannot support device
+cdev either.
+
+vfio device cdev access is still bound by IOMMU group semantics, ie. there
+can be only one DMA owner for the group. Devices belonging to the same
+group can not be bound to multiple iommufd_ctx or shared between native
+kernel and vfio bus driver or other driver supporting the driver_managed_dma
+flag. A violation of this ownership requirement will fail at the
+VFIO_DEVICE_BIND_IOMMUFD ioctl, which gates full device access.
+
+Device cdev Example
+-------------------
+
+Assume user wants to access PCI device 0000:6a:01.0::
+
+	$ ls /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/
+	vfio0
+
+This device is therefore represented as vfio0. The user can verify
+its existence::
+
+	$ ls -l /dev/vfio/devices/vfio0
+	crw------- 1 root root 511, 0 Feb 16 01:22 /dev/vfio/devices/vfio0
+	$ cat /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/vfio0/dev
+	511:0
+	$ ls -l /dev/char/511\:0
+	lrwxrwxrwx 1 root root 21 Feb 16 01:22 /dev/char/511:0 -> ../vfio/devices/vfio0
+
+Then provide the user with access to the device if unprivileged
+operation is desired::
+
+	$ chown user:user /dev/vfio/devices/vfio0
+
+Finally the user could get cdev fd by::
+
+	cdev_fd = open("/dev/vfio/devices/vfio0", O_RDWR);
+
+An opened cdev_fd doesn't give the user any permission of accessing
+the device except binding the cdev_fd to an iommufd. After that point
+then the device is fully accessible including attaching it to an
+IOMMUFD IOAS/HWPT to enable userspace DMA::
+
+	struct vfio_device_bind_iommufd bind = {
+		.argsz = sizeof(bind),
+		.flags = 0,
+	};
+	struct iommu_ioas_alloc alloc_data = {
+		.size = sizeof(alloc_data),
+		.flags = 0,
+	};
+	struct vfio_device_attach_iommufd_pt attach_data = {
+		.argsz = sizeof(attach_data),
+		.flags = 0,
+	};
+	struct iommu_ioas_map map = {
+		.size = sizeof(map),
+		.flags = IOMMU_IOAS_MAP_READABLE |
+			 IOMMU_IOAS_MAP_WRITEABLE |
+			 IOMMU_IOAS_MAP_FIXED_IOVA,
+		.__reserved = 0,
+	};
+
+	iommufd = open("/dev/iommu", O_RDWR);
+
+	bind.iommufd = iommufd;
+	ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind);
+
+	ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data);
+	attach_data.pt_id = alloc_data.out_ioas_id;
+	ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data);
+
+	/* Allocate some space and setup a DMA mapping */
+	map.user_va = (int64_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE,
+				    MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+	map.iova = 0; /* 1MB starting at 0x0 from device view */
+	map.length = 1024 * 1024;
+	map.ioas_id = alloc_data.out_ioas_id;;
+
+	ioctl(iommufd, IOMMU_IOAS_MAP, &map);
+
+	/* Other device operations as stated in "VFIO Usage Example" */
+
 VFIO User API
 -------------------------------------------------------------------------------
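Note on the hunk above: the "Device cdev Example" finds the vfioX name by listing the device's `vfio-dev/` directory in sysfs by hand. The same lookup can be sketched in C as below. This is only an illustration under stated assumptions: `vfio_cdev_path()` is a hypothetical helper name, not part of the VFIO uAPI, and it assumes the sysfs layout shown in the example (a `vfioX` directory under `vfio-dev/`, mirrored at `/dev/vfio/devices/vfioX`).

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of the VFIO uAPI): given a device's sysfs
 * directory, e.g. "/sys/bus/pci/devices/0000:6a:01.0", look for a "vfioX"
 * entry under "vfio-dev/" and write "/dev/vfio/devices/vfioX" into out.
 * Returns 0 on success, -1 if nothing is found or out is too small.
 */
static int vfio_cdev_path(const char *sysfs_dev, char *out, size_t outlen)
{
	char path[4096];
	struct dirent *de;
	struct stat st;
	DIR *dir;
	int n, ret = -1;

	assert(sysfs_dev != NULL && out != NULL);

	n = snprintf(path, sizeof(path), "%s/vfio-dev", sysfs_dev);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	dir = opendir(path);
	if (dir == NULL)
		return -1;

	while ((de = readdir(dir)) != NULL) {
		/* Skip ".", ".." and anything not named vfio<N>. */
		if (strncmp(de->d_name, "vfio", 4) != 0)
			continue;

		/* In sysfs the vfioX entry is a directory (it holds "dev"). */
		n = snprintf(path, sizeof(path), "%s/vfio-dev/%s",
			     sysfs_dev, de->d_name);
		if (n < 0 || (size_t)n >= sizeof(path) ||
		    stat(path, &st) != 0 || !S_ISDIR(st.st_mode))
			continue;

		/* Build the character device path used with open(2). */
		n = snprintf(out, outlen, "/dev/vfio/devices/%s", de->d_name);
		if (n > 0 && (size_t)n < outlen)
			ret = 0;
		break;
	}

	closedir(dir);
	return ret;
}
```

A caller would pass the resulting path to `open(path, O_RDWR)` and then continue with VFIO_DEVICE_BIND_IOMMUFD as in the documented example. The helper only builds the path; it does not check that the device node exists or that the caller has permission to open it.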

@@ -279,6 +410,7 @@ similar to a file operations structure::
 				struct iommufd_ctx *ictx, u32 *out_device_id);
 		void (*unbind_iommufd)(struct vfio_device *vdev);
 		int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id);
+		void (*detach_ioas)(struct vfio_device *vdev);
 		int (*open_device)(struct vfio_device *vdev);
 		void (*close_device)(struct vfio_device *vdev);
 		ssize_t (*read)(struct vfio_device *vdev, char __user *buf,
@@ -315,9 +447,10 @@ container_of().
 - The [un]bind_iommufd callbacks are issued when the device is bound to
   and unbound from iommufd.
 
-- The attach_ioas callback is issued when the device is attached to an
-  IOAS managed by the bound iommufd. The attached IOAS is automatically
-  detached when the device is unbound from iommufd.
+- The [de]attach_ioas callback is issued when the device is attached to
+  and detached from an IOAS managed by the bound iommufd. However, the
+  attached IOAS can also be automatically detached when the device is
+  unbound from iommufd.
 
 - The read/write/mmap callbacks implement the device region access defined
   by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl.
@@ -564,3 +697,11 @@ This implementation has some specifics:
 	\-0d.1
 
 	00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90)
+
+.. [5] Nested translation is an IOMMU feature which supports two stage
+   address translations. This improves the address translation efficiency
+   in IOMMU virtualization.
+
+.. [6] PASID stands for Process Address Space ID, introduced by PCI
+   Express. It is a prerequisite for Shared Virtual Addressing (SVA)
+   and Scalable I/O Virtualization (Scalable IOV).

Documentation/networking/device_drivers/ethernet/amd/pds_vfio_pci.rst

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0+
+.. note: can be edited and viewed with /usr/bin/formiko-vim
+
+==========================================================
+PCI VFIO driver for the AMD/Pensando(R) DSC adapter family
+==========================================================
+
+AMD/Pensando Linux VFIO PCI Device Driver
+Copyright(c) 2023 Advanced Micro Devices, Inc.
+
+Overview
+========
+
+The ``pds-vfio-pci`` module is a PCI driver that supports Live Migration
+capable Virtual Function (VF) devices in the DSC hardware.
+
+Using the device
+================
+
+The pds-vfio-pci device is enabled via multiple configuration steps and
+depends on the ``pds_core`` driver to create and enable SR-IOV Virtual
+Function devices.
+
+Shown below are the steps to bind the driver to a VF and also to the
+associated auxiliary device created by the ``pds_core`` driver. This
+example assumes the pds_core and pds-vfio-pci modules are already
+loaded.
+
+.. code-block:: bash
+  :name: example-setup-script
+
+  #!/bin/bash
+
+  PF_BUS="0000:60"
+  PF_BDF="0000:60:00.0"
+  VF_BDF="0000:60:00.1"
+
+  # Prevent non-vfio VF driver from probing the VF device
+  echo 0 > /sys/class/pci_bus/$PF_BUS/device/$PF_BDF/sriov_drivers_autoprobe
+
+  # Create single VF for Live Migration via pds_core
+  echo 1 > /sys/bus/pci/drivers/pds_core/$PF_BDF/sriov_numvfs
+
+  # Allow the VF to be bound to the pds-vfio-pci driver
+  echo "pds-vfio-pci" > /sys/class/pci_bus/$PF_BUS/device/$VF_BDF/driver_override
+
+  # Bind the VF to the pds-vfio-pci driver
+  echo "$VF_BDF" > /sys/bus/pci/drivers/pds-vfio-pci/bind
+
+After performing the steps above, a file in /dev/vfio/<iommu_group>
+should have been created.
+
+
+Enabling the driver
+===================
+
+The driver is enabled via the standard kernel configuration system,
+using the make command::
+
+  make oldconfig/menuconfig/etc.
+
+The driver is located in the menu structure at:
+
+  -> Device Drivers
+    -> VFIO Non-Privileged userspace driver framework
+      -> VFIO support for PDS PCI devices
+
+Support
+=======
+
+For general Linux networking support, please use the netdev mailing
+list, which is monitored by Pensando personnel::
+
+
+
+For more specific support needs, please use the Pensando driver support
+email::
+
+

Documentation/networking/device_drivers/ethernet/index.rst

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ Contents:
    altera/altera_tse
    amd/pds_core
    amd/pds_vdpa
+   amd/pds_vfio_pci
    aquantia/atlantic
    chelsio/cxgb
    cirrus/cs89x0

Documentation/virt/kvm/devices/vfio.rst

Lines changed: 31 additions & 16 deletions
@@ -9,22 +9,34 @@ Device types supported:
 - KVM_DEV_TYPE_VFIO
 
 Only one VFIO instance may be created per VM. The created device
-tracks VFIO groups in use by the VM and features of those groups
-important to the correctness and acceleration of the VM. As groups
-are enabled and disabled for use by the VM, KVM should be updated
-about their presence. When registered with KVM, a reference to the
-VFIO-group is held by KVM.
+tracks VFIO files (group or device) in use by the VM and features
+of those groups/devices important to the correctness and acceleration
+of the VM. As groups/devices are enabled and disabled for use by the
+VM, KVM should be updated about their presence. When registered with
+KVM, a reference to the VFIO file is held by KVM.
 
 Groups:
-  KVM_DEV_VFIO_GROUP
-
-KVM_DEV_VFIO_GROUP attributes:
-  KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking
-	kvm_device_attr.addr points to an int32_t file descriptor
-	for the VFIO group.
-  KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking
-	kvm_device_attr.addr points to an int32_t file descriptor
-	for the VFIO group.
+  KVM_DEV_VFIO_FILE
+	alias: KVM_DEV_VFIO_GROUP
+
+KVM_DEV_VFIO_FILE attributes:
+  KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device
+	tracking
+
+	kvm_device_attr.addr points to an int32_t file descriptor for the
+	VFIO file.
+
+  KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM
+	device tracking
+
+	kvm_device_attr.addr points to an int32_t file descriptor for the
+	VFIO file.
+
+KVM_DEV_VFIO_GROUP (legacy kvm device group restricted to the handling of VFIO group fd):
+  KVM_DEV_VFIO_GROUP_ADD: same as KVM_DEV_VFIO_FILE_ADD for group fd only
+
+  KVM_DEV_VFIO_GROUP_DEL: same as KVM_DEV_VFIO_FILE_DEL for group fd only
+
   KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
 	allocated by sPAPR KVM.
 	kvm_device_attr.addr points to a struct::
@@ -40,7 +52,10 @@ KVM_DEV_VFIO_GROUP attributes:
 	- @tablefd is a file descriptor for a TCE table allocated via
 	  KVM_CREATE_SPAPR_TCE.
 
-The GROUP_ADD operation above should be invoked prior to accessing the
+The FILE/GROUP_ADD operation above should be invoked prior to accessing the
 device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support
 drivers which require a kvm pointer to be set in their .open_device()
-callback.
+callback. It is the same for device file descriptor via character device
+open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file
+descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD
+to support the drivers mentioned in prior sentence as well.
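
Note on the hunk above: the FILE_ADD/FILE_DEL attributes take a pointer to an int32_t file descriptor in `kvm_device_attr.addr`. A minimal userspace sketch of that call follows. `kvm_vfio_file_update()` is a hypothetical helper name, and `kvm_vfio_fd` is assumed to be the fd returned by KVM_CREATE_DEVICE for a KVM_DEV_TYPE_VFIO device. The `#ifndef` fallback is also an assumption, for building against uapi headers that predate this merge; it relies on the alias relationship documented in this hunk.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Assumption: headers older than this merge only define the GROUP names.
 * The documentation describes FILE and GROUP as aliases, so fall back to
 * them. A kernel that predates the FILE attributes is only documented to
 * accept group fds here.
 */
#ifndef KVM_DEV_VFIO_FILE
#define KVM_DEV_VFIO_FILE	KVM_DEV_VFIO_GROUP
#define KVM_DEV_VFIO_FILE_ADD	KVM_DEV_VFIO_GROUP_ADD
#define KVM_DEV_VFIO_FILE_DEL	KVM_DEV_VFIO_GROUP_DEL
#endif

/*
 * Hypothetical helper: add (add != 0) or remove (add == 0) a VFIO group or
 * device cdev fd from the kvm-vfio device's tracking. Returns the ioctl
 * result: 0 on success, -1 with errno set on failure.
 */
static int kvm_vfio_file_update(int kvm_vfio_fd, int vfio_fd, int add)
{
	int32_t fd = vfio_fd;
	struct kvm_device_attr attr;

	assert(vfio_fd >= 0);

	memset(&attr, 0, sizeof(attr));
	attr.group = KVM_DEV_VFIO_FILE;
	attr.attr = add ? KVM_DEV_VFIO_FILE_ADD : KVM_DEV_VFIO_FILE_DEL;
	/* addr carries a userspace pointer to the int32_t fd. */
	attr.addr = (uint64_t)(uintptr_t)&fd;

	return ioctl(kvm_vfio_fd, KVM_SET_DEVICE_ATTR, &attr);
}
```

For a cdev fd, the ordering described above means calling `kvm_vfio_file_update(kvm_vfio_fd, cdev_fd, 1)` before issuing VFIO_DEVICE_BIND_IOMMUFD on `cdev_fd`, so that drivers needing a kvm pointer in `.open_device()` can see it.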

MAINTAINERS

Lines changed: 7 additions & 0 deletions
@@ -22482,6 +22482,13 @@ S: Maintained
 P: Documentation/driver-api/vfio-pci-device-specific-driver-acceptance.rst
 F: drivers/vfio/pci/*/
 
+VFIO PDS PCI DRIVER
+M: Brett Creeley <[email protected]>
+
+S: Maintained
+F: Documentation/networking/device_drivers/ethernet/amd/pds_vfio_pci.rst
+F: drivers/vfio/pci/pds/
+
 VFIO PLATFORM DRIVER
 M: Eric Auger <[email protected]>

drivers/gpu/drm/i915/gvt/kvmgt.c

Lines changed: 1 addition & 0 deletions
@@ -1474,6 +1474,7 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = {
 	.bind_iommufd = vfio_iommufd_emulated_bind,
 	.unbind_iommufd = vfio_iommufd_emulated_unbind,
 	.attach_ioas = vfio_iommufd_emulated_attach_ioas,
+	.detach_ioas = vfio_iommufd_emulated_detach_ioas,
 };
 
 static int intel_vgpu_probe(struct mdev_device *mdev)

drivers/iommu/iommufd/Kconfig

Lines changed: 2 additions & 2 deletions
@@ -14,8 +14,8 @@ config IOMMUFD
 if IOMMUFD
 config IOMMUFD_VFIO_CONTAINER
 	bool "IOMMUFD provides the VFIO container /dev/vfio/vfio"
-	depends on VFIO && !VFIO_CONTAINER
-	default VFIO && !VFIO_CONTAINER
+	depends on VFIO_GROUP && !VFIO_CONTAINER
+	default VFIO_GROUP && !VFIO_CONTAINER
 	help
 	  IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on
 	  IOMMUFD providing compatibility emulation to give the same ioctls.
