diff options
author | Will Deacon <will@kernel.org> | 2024-07-12 16:54:34 +0100 |
---|---|---|
committer | Will Deacon <will@kernel.org> | 2024-07-12 16:54:34 +0100 |
commit | c2b2e5c50330b29804339df4e7adf70e9f19b793 (patch) | |
tree | 19f794db4eab30610bdc6603b8bf9978672551cd | |
parent | 710f1071f1619657ff87d0d7c019fbcba1bcd97b (diff) | |
parent | 228159802bcebd95438b54b0bd7c97798582178b (diff) |
Merge branch 'iommu/core' into iommu/next
* iommu/core:
docs: iommu: Remove outdated Documentation/userspace-api/iommu.rst
iommufd: Use atomic_long_try_cmpxchg() in incr_user_locked_vm()
iommu/iova: Add missing MODULE_DESCRIPTION() macro
iommu/dma: Prune redundant pgprot arguments
iommu: Make iommu_sva_domain_alloc() static
-rw-r--r-- | Documentation/userspace-api/index.rst | 1 | ||||
-rw-r--r-- | Documentation/userspace-api/iommu.rst | 209 | ||||
-rw-r--r-- | MAINTAINERS | 1 | ||||
-rw-r--r-- | drivers/iommu/dma-iommu.c | 16 | ||||
-rw-r--r-- | drivers/iommu/iommu-sva.c | 6 | ||||
-rw-r--r-- | drivers/iommu/iommufd/pages.c | 7 | ||||
-rw-r--r-- | drivers/iommu/iova.c | 1 | ||||
-rw-r--r-- | include/linux/iommu.h | 8 |
8 files changed, 15 insertions, 234 deletions
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst index 5926115ec0ed..2e0bb6068583 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -44,7 +44,6 @@ Devices and I/O accelerators/ocxl dma-buf-alloc-exchange gpio/index - iommu iommufd media/index dcdbas diff --git a/Documentation/userspace-api/iommu.rst b/Documentation/userspace-api/iommu.rst deleted file mode 100644 index d3108c1519d5..000000000000 --- a/Documentation/userspace-api/iommu.rst +++ /dev/null @@ -1,209 +0,0 @@ -.. SPDX-License-Identifier: GPL-2.0 -.. iommu: - -===================================== -IOMMU Userspace API -===================================== - -IOMMU UAPI is used for virtualization cases where communications are -needed between physical and virtual IOMMU drivers. For baremetal -usage, the IOMMU is a system device which does not need to communicate -with userspace directly. - -The primary use cases are guest Shared Virtual Address (SVA) and -guest IO virtual address (IOVA), wherein the vIOMMU implementation -relies on the physical IOMMU and for this reason requires interactions -with the host driver. - -.. contents:: :local: - -Functionalities -=============== -Communications of user and kernel involve both directions. The -supported user-kernel APIs are as follows: - -1. Bind/Unbind guest PASID (e.g. Intel VT-d) -2. Bind/Unbind guest PASID table (e.g. ARM SMMU) -3. Invalidate IOMMU caches upon guest requests -4. Report errors to the guest and serve page requests - -Requirements -============ -The IOMMU UAPIs are generic and extensible to meet the following -requirements: - -1. Emulated and para-virtualised vIOMMUs -2. Multiple vendors (Intel VT-d, ARM SMMU, etc.) -3. Extensions to the UAPI shall not break existing userspace - -Interfaces -========== -Although the data structures defined in IOMMU UAPI are self-contained, -there are no user API functions introduced. Instead, IOMMU UAPI is -designed to work with existing user driver frameworks such as VFIO. - -Extension Rules & Precautions ------------------------------ -When IOMMU UAPI gets extended, the data structures can *only* be -modified in two ways: - -1. Adding new fields by re-purposing the padding[] field. No size change. -2. Adding new union members at the end. May increase the structure sizes. - -No new fields can be added *after* the variable sized union in that it -will break backward compatibility when offset moves. A new flag must -be introduced whenever a change affects the structure using either -method. The IOMMU driver processes the data based on flags which -ensures backward compatibility. - -Version field is only reserved for the unlikely event of UAPI upgrade -at its entirety. - -It's *always* the caller's responsibility to indicate the size of the -structure passed by setting argsz appropriately. -Though at the same time, argsz is user provided data which is not -trusted. The argsz field allows the user app to indicate how much data -it is providing; it's still the kernel's responsibility to validate -whether it's correct and sufficient for the requested operation. - -Compatibility Checking ----------------------- -When IOMMU UAPI extension results in some structure size increase, -IOMMU UAPI code shall handle the following cases: - -1. User and kernel has exact size match -2. An older user with older kernel header (smaller UAPI size) running on a - newer kernel (larger UAPI size) -3. A newer user with newer kernel header (larger UAPI size) running - on an older kernel. -4. A malicious/misbehaving user passing illegal/invalid size but within - range. The data may contain garbage. - -Feature Checking ----------------- -While launching a guest with vIOMMU, it is strongly advised to check -the compatibility upfront, as some subsequent errors happening during -vIOMMU operation, such as cache invalidation failures cannot be nicely -escalated to the guest due to IOMMU specifications. This can lead to -catastrophic failures for the users. - -User applications such as QEMU are expected to import kernel UAPI -headers. Backward compatibility is supported per feature flags. -For example, an older QEMU (with older kernel header) can run on newer -kernel. Newer QEMU (with new kernel header) may refuse to initialize -on an older kernel if new feature flags are not supported by older -kernel. Simply recompiling existing code with newer kernel header should -not be an issue in that only existing flags are used. - -IOMMU vendor driver should report the below features to IOMMU UAPI -consumers (e.g. via VFIO). - -1. IOMMU_NESTING_FEAT_SYSWIDE_PASID -2. IOMMU_NESTING_FEAT_BIND_PGTBL -3. IOMMU_NESTING_FEAT_BIND_PASID_TABLE -4. IOMMU_NESTING_FEAT_CACHE_INVLD -5. IOMMU_NESTING_FEAT_PAGE_REQUEST - -Take VFIO as example, upon request from VFIO userspace (e.g. QEMU), -VFIO kernel code shall query IOMMU vendor driver for the support of -the above features. Query result can then be reported back to the -userspace caller. Details can be found in -Documentation/driver-api/vfio.rst. - - -Data Passing Example with VFIO ------------------------------- -As the ubiquitous userspace driver framework, VFIO is already IOMMU -aware and shares many key concepts such as device model, group, and -protection domain. Other user driver frameworks can also be extended -to support IOMMU UAPI but it is outside the scope of this document. - -In this tight-knit VFIO-IOMMU interface, the ultimate consumer of the -IOMMU UAPI data is the host IOMMU driver. VFIO facilitates user-kernel -transport, capability checking, security, and life cycle management of -process address space ID (PASID). - -VFIO layer conveys the data structures down to the IOMMU driver. It -follows the pattern below:: - - struct { - __u32 argsz; - __u32 flags; - __u8 data[]; - }; - -Here data[] contains the IOMMU UAPI data structures. VFIO has the -freedom to bundle the data as well as parse data size based on its own flags. - -In order to determine the size and feature set of the user data, argsz -and flags (or the equivalent) are also embedded in the IOMMU UAPI data -structures. - -A "__u32 argsz" field is *always* at the beginning of each structure. - -For example: -:: - - struct iommu_cache_invalidate_info { - __u32 argsz; - #define IOMMU_CACHE_INVALIDATE_INFO_VERSION_1 1 - __u32 version; - /* IOMMU paging structure cache */ - #define IOMMU_CACHE_INV_TYPE_IOTLB (1 << 0) /* IOMMU IOTLB */ - #define IOMMU_CACHE_INV_TYPE_DEV_IOTLB (1 << 1) /* Device IOTLB */ - #define IOMMU_CACHE_INV_TYPE_PASID (1 << 2) /* PASID cache */ - #define IOMMU_CACHE_INV_TYPE_NR (3) - __u8 cache; - __u8 granularity; - __u8 padding[6]; - union { - struct iommu_inv_pasid_info pasid_info; - struct iommu_inv_addr_info addr_info; - } granu; - }; - -VFIO is responsible for checking its own argsz and flags. It then -invokes appropriate IOMMU UAPI functions. The user pointers are passed -to the IOMMU layer for further processing. The responsibilities are -divided as follows: - -- Generic IOMMU layer checks argsz range based on UAPI data in the - current kernel version. - -- Generic IOMMU layer checks content of the UAPI data for non-zero - reserved bits in flags, padding fields, and unsupported version. - This is to ensure not breaking userspace in the future when these - fields or flags are used. - -- Vendor IOMMU driver checks argsz based on vendor flags. UAPI data - is consumed based on flags. Vendor driver has access to - unadulterated argsz value in case of vendor specific future - extensions. Currently, it does not perform the copy_from_user() - itself. A __user pointer can be provided in some future scenarios - where there's vendor data outside of the structure definition. - -IOMMU code treats UAPI data in two categories: - -- structure contains vendor data - (Example: iommu_uapi_cache_invalidate()) - -- structure contains only generic data - (Example: iommu_uapi_sva_bind_gpasid()) - - - -Sharing UAPI with in-kernel users ---------------------------------- -For UAPIs that are shared with in-kernel users, a wrapper function is -provided to distinguish the callers. For example, - -Userspace caller :: - - int iommu_uapi_sva_unbind_gpasid(struct iommu_domain *domain, - struct device *dev, - void __user *udata) - -In-kernel caller :: - - int iommu_sva_unbind_gpasid(struct iommu_domain *domain, - struct device *dev, ioasid_t ioasid); diff --git a/MAINTAINERS b/MAINTAINERS index aacccb376c28..59392d80a7e4 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -11543,7 +11543,6 @@ L: iommu@lists.linux.dev S: Maintained T: git git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git F: Documentation/devicetree/bindings/iommu/ -F: Documentation/userspace-api/iommu.rst F: drivers/iommu/ F: include/linux/iommu.h F: include/linux/iova.h diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 43520e7275cc..18603d63ad3f 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -939,8 +939,7 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev, * but an IOMMU which supports smaller pages might not map the whole thing. */ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev, - size_t size, struct sg_table *sgt, gfp_t gfp, pgprot_t prot, - unsigned long attrs) + size_t size, struct sg_table *sgt, gfp_t gfp, unsigned long attrs) { struct iommu_domain *domain = iommu_get_dma_domain(dev); struct iommu_dma_cookie *cookie = domain->iova_cookie; @@ -1014,15 +1013,14 @@ out_free_pages: } static void *iommu_dma_alloc_remap(struct device *dev, size_t size, - dma_addr_t *dma_handle, gfp_t gfp, pgprot_t prot, - unsigned long attrs) + dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs) { struct page **pages; struct sg_table sgt; void *vaddr; + pgprot_t prot = dma_pgprot(dev, PAGE_KERNEL, attrs); - pages = __iommu_dma_alloc_noncontiguous(dev, size, &sgt, gfp, prot, - attrs); + pages = __iommu_dma_alloc_noncontiguous(dev, size, &sgt, gfp, attrs); if (!pages) return NULL; *dma_handle = sgt.sgl->dma_address; @@ -1049,8 +1047,7 @@ static struct sg_table *iommu_dma_alloc_noncontiguous(struct device *dev, if (!sh) return NULL; - sh->pages = __iommu_dma_alloc_noncontiguous(dev, size, &sh->sgt, gfp, - PAGE_KERNEL, attrs); + sh->pages = __iommu_dma_alloc_noncontiguous(dev, size, &sh->sgt, gfp, attrs); if (!sh->pages) { kfree(sh); return NULL; @@ -1619,8 +1616,7 @@ static void *iommu_dma_alloc(struct device *dev, size_t size, if (gfpflags_allow_blocking(gfp) && !(attrs & DMA_ATTR_FORCE_CONTIGUOUS)) { - return iommu_dma_alloc_remap(dev, size, handle, gfp, - dma_pgprot(dev, PAGE_KERNEL, attrs), attrs); + return iommu_dma_alloc_remap(dev, size, handle, gfp, attrs); } if (IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c index 18a35e798b72..25e581299226 100644 --- a/drivers/iommu/iommu-sva.c +++ b/drivers/iommu/iommu-sva.c @@ -10,6 +10,8 @@ #include "iommu-priv.h" static DEFINE_MUTEX(iommu_sva_lock); +static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm); /* Allocate a PASID for the mm within range (inclusive) */ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct device *dev) @@ -277,8 +279,8 @@ static int iommu_sva_iopf_handler(struct iopf_group *group) return 0; } -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm) +static struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, + struct mm_struct *mm) { const struct iommu_ops *ops = dev_iommu_ops(dev); struct iommu_domain *domain; diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c index 528f356238b3..117f644a0c5b 100644 --- a/drivers/iommu/iommufd/pages.c +++ b/drivers/iommu/iommufd/pages.c @@ -809,13 +809,14 @@ static int incr_user_locked_vm(struct iopt_pages *pages, unsigned long npages) lock_limit = task_rlimit(pages->source_task, RLIMIT_MEMLOCK) >> PAGE_SHIFT; + + cur_pages = atomic_long_read(&pages->source_user->locked_vm); do { - cur_pages = atomic_long_read(&pages->source_user->locked_vm); new_pages = cur_pages + npages; if (new_pages > lock_limit) return -ENOMEM; - } while (atomic_long_cmpxchg(&pages->source_user->locked_vm, cur_pages, - new_pages) != cur_pages); + } while (!atomic_long_try_cmpxchg(&pages->source_user->locked_vm, + &cur_pages, new_pages)); return 0; } diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c index d59d0ea2fd21..16c6adff3eb7 100644 --- a/drivers/iommu/iova.c +++ b/drivers/iommu/iova.c @@ -1000,4 +1000,5 @@ void iova_cache_put(void) EXPORT_SYMBOL_GPL(iova_cache_put); MODULE_AUTHOR("Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>"); +MODULE_DESCRIPTION("IOMMU I/O Virtual Address management"); MODULE_LICENSE("GPL"); diff --git a/include/linux/iommu.h b/include/linux/iommu.h index f943f5583df1..8d9b2d990bf8 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -1527,8 +1527,6 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm); void iommu_sva_unbind_device(struct iommu_sva *handle); u32 iommu_sva_get_pasid(struct iommu_sva *handle); -struct iommu_domain *iommu_sva_domain_alloc(struct device *dev, - struct mm_struct *mm); #else static inline struct iommu_sva * iommu_sva_bind_device(struct device *dev, struct mm_struct *mm) @@ -1553,12 +1551,6 @@ static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm) } static inline void mm_pasid_drop(struct mm_struct *mm) {} - -static inline struct iommu_domain * -iommu_sva_domain_alloc(struct device *dev, struct mm_struct *mm) -{ - return NULL; -} #endif /* CONFIG_IOMMU_SVA */ #ifdef CONFIG_IOMMU_IOPF |