Commit d2cf1b6

siwliu-kernel authored and mstsirkin committed
vdpa: introduce .reset_map operation callback
Some device-specific IOMMU parent drivers have long-standing bogus behavior that mistakenly cleans up the maps during .reset. By definition, this is a violation of the on-chip IOMMU ops (i.e. .set_map, or .dma_map & .dma_unmap) in those offending drivers, as the removal of internal maps is completely invisible to the upper layer, causing an inconsistent view between userspace and the kernel. Some userspace apps like QEMU get around this brokenness by proactively removing and adding back all the maps around vdpa device reset, but such a workaround actually penalizes other well-behaved driver setups, where vdpa reset always comes with the associated mapping cost, especially for kernel vDPA devices (use_va=false) that have a high cost on pinning. It's imperative to rectify this behavior and remove the problematic code from all those non-compliant parent drivers.

The reason why a separate .reset_map op is introduced is that this allows a simple on-chip IOMMU model without exposing too much device implementation detail to the upper vdpa layer. The .dma_map/unmap or .set_map driver API is meant to be used to manipulate the IOTLB mappings, and has been abstracted in a way similar to how a real IOMMU device maps or unmaps pages for certain memory ranges. However, apart from this there also exist other mapping needs, in which case 1:1 passthrough mapping has to be used by other users (read virtio-vdpa). To ease parent/vendor driver implementation and to avoid abusing DMA ops in an unexpected way, these on-chip IOMMU devices can start with 1:1 passthrough mapping mode initially at the time of creation. Then the .reset_map op can be used to switch the iotlb back to this initial state without having to expose a complex two-dimensional IOMMU device model.

The .reset_map op is not a MUST for every parent that implements the .dma_map or .set_map API, because a device may work with DMA ops directly by implementing its own to manipulate system memory mappings, so it doesn't have to use .reset_map to achieve a simple IOMMU device model for 1:1 passthrough mapping.

Signed-off-by: Si-Wei Liu <[email protected]>
Acked-by: Eugenio Pérez <[email protected]>
Acked-by: Jason Wang <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Tested-by: Lei Yang <[email protected]>
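
The on-chip IOMMU model described in the message can be illustrated with a small userspace sketch. This is not code from the patch or from any parent driver: every name below (mock_iommu, mock_dma_map, mock_device_reset, mock_reset_map, mock_translate) is hypothetical, and the address space identifier is left out for brevity. It only shows the intended life cycle, assuming a device that starts in 1:1 passthrough, keeps its mappings across a device reset, and returns to passthrough only when the reset_map analogue is called.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
#define MOCK_MAX_MAPS 8

struct mock_map {
	uint64_t iova, size, pa;
};

struct mock_iommu {
	bool passthrough;	/* true: every iova translates to itself */
	size_t nmaps;
	struct mock_map maps[MOCK_MAX_MAPS];
};

/* Device creation: start in 1:1 passthrough mode. */
void mock_iommu_init(struct mock_iommu *mmu)
{
	mmu->passthrough = true;
	mmu->nmaps = 0;
}

/* Analogue of .dma_map: the first explicit mapping leaves passthrough mode. */
int mock_dma_map(struct mock_iommu *mmu, uint64_t iova, uint64_t size,
		 uint64_t pa)
{
	if (size == 0 || mmu->nmaps == MOCK_MAX_MAPS)
		return -1;
	mmu->passthrough = false;
	mmu->maps[mmu->nmaps++] = (struct mock_map){ iova, size, pa };
	return 0;
}

/* Analogue of .reset: resets virtio state only, mappings are left alone. */
void mock_device_reset(struct mock_iommu *mmu)
{
	(void)mmu;
}

/* Analogue of .reset_map: drop all mappings, return to the initial state. */
int mock_reset_map(struct mock_iommu *mmu)
{
	mmu->nmaps = 0;
	mmu->passthrough = true;
	return 0;
}

/* Translate an iova; returns 0 and fills *pa, or -1 if nothing maps it. */
int mock_translate(const struct mock_iommu *mmu, uint64_t iova, uint64_t *pa)
{
	if (mmu->passthrough) {
		*pa = iova;
		return 0;
	}
	for (size_t i = 0; i < mmu->nmaps; i++) {
		const struct mock_map *m = &mmu->maps[i];

		if (iova >= m->iova && iova - m->iova < m->size) {
			*pa = m->pa + (iova - m->iova);
			return 0;
		}
	}
	return -1;
}
```

In this sketch a mapping installed before mock_device_reset() still translates afterwards, which is the decoupling from the virtio life cycle that the message argues for; only mock_reset_map() brings back the 1:1 view.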
1 parent e0592ac commit d2cf1b6

1 file changed, +10 -0 lines changed


include/linux/vdpa.h

Lines changed: 10 additions & 0 deletions
@@ -327,6 +327,15 @@ struct vdpa_map_file {
  *				@iova: iova to be unmapped
  *				@size: size of the area
  *				Returns integer: success (0) or error (< 0)
+ * @reset_map:			Reset device memory mapping to the default
+ *				state (optional)
+ *				Needed for devices that are using device
+ *				specific DMA translation and prefer mapping
+ *				to be decoupled from the virtio life cycle,
+ *				i.e. device .reset op does not reset mapping
+ *				@vdev: vdpa device
+ *				@asid: address space identifier
+ *				Returns integer: success (0) or error (< 0)
  * @get_vq_dma_dev:		Get the dma device for a specific
  *				virtqueue (optional)
  *				@vdev: vdpa device
@@ -405,6 +414,7 @@ struct vdpa_config_ops {
 		       u64 iova, u64 size, u64 pa, u32 perm, void *opaque);
 	int (*dma_unmap)(struct vdpa_device *vdev, unsigned int asid,
 			 u64 iova, u64 size);
+	int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
 	int (*set_group_asid)(struct vdpa_device *vdev, unsigned int group,
 			      unsigned int asid);
 	struct device *(*get_vq_dma_dev)(struct vdpa_device *vdev, u16 idx);
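
Because the new op is documented as optional, a caller has to tolerate parents that leave it unset. The following is a minimal userspace sketch of that pattern, not code from this commit: the two structs are cut-down stand-ins for the kernel types, and example_reset_map, example_parent_reset_map and the reset_map_calls counter are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only what the sketch needs. */
struct vdpa_device;

struct vdpa_config_ops {
	/* Same signature as the op added by this commit. */
	int (*reset_map)(struct vdpa_device *vdev, unsigned int asid);
};

struct vdpa_device {
	const struct vdpa_config_ops *config;
	unsigned int reset_map_calls;	/* hypothetical, for observation only */
};

/*
 * Hypothetical caller-side helper: invoke .reset_map when the parent
 * provides it, and treat a missing op as "nothing to do".
 */
int example_reset_map(struct vdpa_device *vdev, unsigned int asid)
{
	const struct vdpa_config_ops *ops = vdev->config;

	if (!ops->reset_map)
		return 0;
	return ops->reset_map(vdev, asid);
}

/* Hypothetical parent implementation that just records the call. */
int example_parent_reset_map(struct vdpa_device *vdev, unsigned int asid)
{
	(void)asid;
	vdev->reset_map_calls++;
	return 0;
}
```

Returning 0 for a missing op is an assumption of this sketch; it matches the message's point that parents working with DMA ops directly have no default mapping state to restore.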
