summaryrefslogtreecommitdiff
path: root/drivers/iommu/amd_iommu_v2.c
AgeCommit message (Collapse)Author
2020-06-10iommu/amd: Move AMD IOMMU driver into subdirectoryJoerg Roedel
Move all files related to the AMD IOMMU driver into its own subdirectory. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20200609130303.26974-2-joro@8bytes.org
2020-05-29iommu/amd: Merge private header filesJoerg Roedel
Merge amd_iommu_proto.h into amd_iommu.h. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20200527115313.7426-9-joro@8bytes.org
2020-05-29iommu/amd: Unexport get_dev_data()Joerg Roedel
This function is internal to the AMD IOMMU driver and only exported because the amd_iommu_v2 modules calls it. But the reason it is called from there could better be handled by amd_iommu_is_attach_deferred(). So unexport get_dev_data() and use amd_iommu_is_attach_deferred() instead. Signed-off-by: Joerg Roedel <jroedel@suse.de> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20200527115313.7426-3-joro@8bytes.org
2019-06-05treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation inc 59 temple place suite 330 boston ma 02111 1307 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 136 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-30iommu/amd: Remove clear_flush_young notifierPeter Xu
AMD IOMMU driver is using the clear_flush_young() to do cache flushing but that's actually already covered by invalidate_range(). Remove the extra notifier and the chunks. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-11-28iommu/amd: Use pr_fmt()Joerg Roedel
Make use of pr_fmt instead of having the 'AMD-Vi' prefix added manually at every printk() call. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2018-10-26Revert "mm, mmu_notifier: annotate mmu notifiers with blockable invalidate ↵Michal Hocko
callbacks" Revert 5ff7091f5a2ca ("mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacks"). MMU_INVALIDATE_DOES_NOT_BLOCK flags was the only one used and it is no longer needed since 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers"). We now have a full support for per range !blocking behavior so we can drop the stop gap workaround which the per notifier flag was used for. Link: http://lkml.kernel.org/r/20180827112623.8992-4-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17mm: convert return type of handle_mm_fault() caller to vm_fault_tSouptick Joarder
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t") In this patch all the caller of handle_mm_fault() are changed to return vm_fault_t type. Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Tony Luck <tony.luck@intel.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: James Hogan <jhogan@kernel.org> Cc: Ley Foon Tan <lftan@altera.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: David S. Miller <davem@davemloft.net> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-02-06Merge tag 'pci-v4.16-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - skip AER driver error recovery callbacks for correctable errors reported via ACPI APEI, as we already do for errors reported via the native path (Tyler Baicar) - fix DPC shared interrupt handling (Alex Williamson) - print full DPC interrupt number (Keith Busch) - enable DPC only if AER is available (Keith Busch) - simplify DPC code (Bjorn Helgaas) - calculate ASPM L1 substate parameter instead of hardcoding it (Bjorn Helgaas) - enable Latency Tolerance Reporting for ASPM L1 substates (Bjorn Helgaas) - move ASPM internal interfaces out of public header (Bjorn Helgaas) - allow hot-removal of VGA devices (Mika Westerberg) - speed up unplug and shutdown by assuming Thunderbolt controllers don't support Command Completed events (Lukas Wunner) - add AtomicOps support for GPU and Infiniband drivers (Felix Kuehling, Jay Cornwall) - expose "ari_enabled" in sysfs to help NIC naming (Stuart Hayes) - clean up PCI DMA interface usage (Christoph Hellwig) - remove PCI pool API (replaced with DMA pool) (Romain Perier) - deprecate pci_get_bus_and_slot(), which assumed PCI domain 0 (Sinan Kaya) - move DT PCI code from drivers/of/ to drivers/pci/ (Rob Herring) - add PCI-specific wrappers for dev_info(), etc (Frederick Lawler) - remove warnings on sysfs mmap failure (Bjorn Helgaas) - quiet ROM validation messages (Alex Deucher) - remove redundant memory alloc failure messages (Markus Elfring) - fill in types for compile-time VGA and other I/O port resources (Bjorn Helgaas) - make "pci=pcie_scan_all" work for Root Ports as well as Downstream Ports to help AmigaOne X1000 (Bjorn Helgaas) - add SPDX tags to all PCI files (Bjorn Helgaas) - quirk Marvell 9128 DMA aliases (Alex Williamson) - quirk broken INTx disable on Ceton InfiniTV4 (Bjorn Helgaas) - fix CONFIG_PCI=n build by adding dummy pci_irqd_intx_xlate() (Niklas Cassel) - use DMA API to get MSI address for DesignWare IP (Niklas Cassel) - fix endpoint-mode DMA mask configuration (Kishon Vijay Abraham I) - fix ARTPEC-6 incorrect IS_ERR() usage (Wei Yongjun) - add support for ARTPEC-7 SoC (Niklas Cassel) - add endpoint-mode support for ARTPEC (Niklas Cassel) - add Cadence PCIe host and endpoint controller driver (Cyrille Pitchen) - handle multiple INTx status bits being set in dra7xx (Vignesh R) - translate dra7xx hwirq range to fix INTD handling (Vignesh R) - remove deprecated Exynos PHY initialization code (Jaehoon Chung) - fix MSI erratum workaround for HiSilicon Hip06/Hip07 (Dongdong Liu) - fix NULL pointer dereference in iProc BCMA driver (Ray Jui) - fix Keystone interrupt-controller-node lookup (Johan Hovold) - constify qcom driver structures (Julia Lawall) - rework Tegra config space mapping to increase space available for endpoints (Vidya Sagar) - simplify Tegra driver by using bus->sysdata (Manikanta Maddireddy) - remove PCI_REASSIGN_ALL_BUS usage on Tegra (Manikanta Maddireddy) - add support for Global Fabric Manager Server (GFMS) event to Microsemi Switchtec switch driver (Logan Gunthorpe) - add IDs for Switchtec PSX 24xG3 and PSX 48xG3 (Kelvin Cao) * tag 'pci-v4.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits) PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe endpoint controller PCI: endpoint: Fix EPF device name to support multi-function devices PCI: endpoint: Add the function number as argument to EPC ops PCI: cadence: Add host driver for Cadence PCIe controller dt-bindings: PCI: cadence: Add DT bindings for Cadence PCIe host controller PCI: Add vendor ID for Cadence PCI: Add generic function to probe PCI host controllers PCI: generic: fix missing call of pci_free_resource_list() PCI: OF: Add generic function to parse and allocate PCI resources PCI: Regroup all PCI related entries into drivers/pci/Makefile PCI/DPC: Reformat DPC register definitions PCI/DPC: Add and use DPC Status register field definitions PCI/DPC: Squash dpc_rp_pio_get_info() into dpc_process_rp_pio_error() PCI/DPC: Remove unnecessary RP PIO register structs PCI/DPC: Push dpc->rp_pio_status assignment into dpc_rp_pio_get_info() PCI/DPC: Squash dpc_rp_pio_print_error() into dpc_rp_pio_get_info() PCI/DPC: Make RP PIO log size check more generic PCI/DPC: Rename local "status" to "dpc_status" PCI/DPC: Squash dpc_rp_pio_print_tlp_header() into dpc_rp_pio_print_error() ...
2018-01-31mm, mmu_notifier: annotate mmu notifiers with blockable invalidate callbacksDavid Rientjes
Commit 4d4bbd8526a8 ("mm, oom_reaper: skip mm structs with mmu notifiers") prevented the oom reaper from unmapping private anonymous memory with the oom reaper when the oom victim mm had mmu notifiers registered. The rationale is that doing mmu_notifier_invalidate_range_{start,end}() around the unmap_page_range(), which is needed, can block and the oom killer will stall forever waiting for the victim to exit, which may not be possible without reaping. That concern is real, but only true for mmu notifiers that have blockable invalidate_range_{start,end}() callbacks. This patch adds a "flags" field to mmu notifier ops that can set a bit to indicate that these callbacks do not block. The implementation is steered toward an expensive slowpath, such as after the oom reaper has grabbed mm->mmap_sem of a still alive oom victim. [rientjes@google.com: mmu_notifier_invalidate_range_end() can also call the invalidate_range() must not block, fix comment] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1801091339570.240101@chino.kir.corp.google.com [akpm@linux-foundation.org: make mm_has_blockable_invalidate_notifiers() return bool, use rwsem_is_locked()] Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1712141329500.74052@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dimitri Sivanich <sivanich@hpe.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Joerg Roedel <joro@8bytes.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Mike Marciniszyn <mike.marciniszyn@intel.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-11iommu/amd: Deprecate pci_get_bus_and_slot()Sinan Kaya
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as where a PCI device is present. This restricts the device drivers to be reused for other domain numbers. Getting ready to remove pci_get_bus_and_slot() function in favor of pci_get_domain_bus_and_slot(). Hard-code the domain number as 0 for the AMD IOMMU driver. Signed-off-by: Sinan Kaya <okaya@codeaurora.org> Signed-off-by: Bjorn Helgaas <helgaas@kernel.org> Reviewed-by: Gary R Hook <gary.hook@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de>
2017-09-09Merge tag 'iommu-updates-v4.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "Slightly more changes than usual this time: - KDump Kernel IOMMU take-over code for AMD IOMMU. The code now tries to preserve the mappings of the kernel so that master aborts for devices are avoided. Master aborts cause some devices to fail in the kdump kernel, so this code makes the dump more likely to succeed when AMD IOMMU is enabled. - common flush queue implementation for IOVA code users. The code is still optional, but AMD and Intel IOMMU drivers had their own implementation which is now unified. - finish support for iommu-groups. All drivers implement this feature now so that IOMMU core code can rely on it. - finish support for 'struct iommu_device' in iommu drivers. All drivers now use the interface. - new functions in the IOMMU-API for explicit IO/TLB flushing. This will help to reduce the number of IO/TLB flushes when IOMMU drivers support this interface. - support for mt2712 in the Mediatek IOMMU driver - new IOMMU driver for QCOM hardware - system PM support for ARM-SMMU - shutdown method for ARM-SMMU-v3 - some constification patches - various other small improvements and fixes" * tag 'iommu-updates-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits) iommu/vt-d: Don't be too aggressive when clearing one context entry iommu: Introduce Interface for IOMMU TLB Flushing iommu/s390: Constify iommu_ops iommu/vt-d: Avoid calling virt_to_phys() on null pointer iommu/vt-d: IOMMU Page Request needs to check if address is canonical. arm/tegra: Call bus_set_iommu() after iommu_device_register() iommu/exynos: Constify iommu_ops iommu/ipmmu-vmsa: Make ipmmu_gather_ops const iommu/ipmmu-vmsa: Rereserving a free context before setting up a pagetable iommu/amd: Rename a few flush functions iommu/amd: Check if domain is NULL in get_domain() and return -EBUSY iommu/mediatek: Fix a build warning of BIT(32) in ARM iommu/mediatek: Fix a build fail of m4u_type iommu: qcom: annotate PM functions as __maybe_unused iommu/pamu: Fix PAMU boot crash memory: mtk-smi: Degrade SMI init to module_init iommu/mediatek: Enlarge the validate PA range for 4GB mode iommu/mediatek: Disable iommu clock when system suspend iommu/mediatek: Move pgtable allocation into domain_alloc iommu/mediatek: Merge 2 M4U HWs into one iommu domain ...
2017-08-31iommu/amd: update to new mmu_notifier semanticJérôme Glisse
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: iommu@lists.linux-foundation.org Cc: Joerg Roedel <jroedel@suse.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-15iommu/amd: Don't copy GCR3 table root pointerBaoquan He
When iommu is pre_enabled in kdump kernel, if a device is set up with guest translations (DTE.GV=1), then don't copy GCR3 table root pointer but move the device over to an empty guest-cr3 table and handle the faults in the PPR log (which answer them with INVALID). After all these PPR faults are recoverable for the device and we should not allow the device to change old-kernels data when we don't have to. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-04-24iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()Pan Bian
In function amd_iommu_bind_pasid(), the control flow jumps to label out_free when pasid_state->mm and mm is NULL. And mmput(mm) is called. In function mmput(mm), mm is referenced without validation. This will result in a NULL dereference bug. This patch fixes the bug. Signed-off-by: Pan Bian <bianpan2016@163.com> Fixes: f0aac63b873b ('iommu/amd: Don't hold a reference to mm_struct') Signed-off-by: Joerg Roedel <jroedel@suse.de>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<linux/sched/mm.h> We are going to split <linux/sched/mm.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/mm.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. The APIs that are going to be moved first are: mm_alloc() __mmdrop() mmdrop() mmdrop_async_fn() mmdrop_async() mmget_not_zero() mmput() mmput_async() get_task_mm() mm_access() mm_release() Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-29iommu/amd: Missing error code in amd_iommu_init_device()Dan Carpenter
We should set "ret" to -EINVAL if iommu_group_get() fails. Fixes: 55c99a4dc50f ("iommu/amd: Use iommu_attach_group()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-08-01Merge tag 'iommu-updates-v4.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: - big-endian support and preparation for defered probing for the Exynos IOMMU driver - simplifications in iommu-group id handling - support for Mediatek generation one IOMMU hardware - conversion of the AMD IOMMU driver to use the generic IOVA allocator. This driver now also benefits from the recent scalability improvements in the IOVA code. - preparations to use generic DMA mapping code in the Rockchip IOMMU driver - device tree adaption and conversion to use generic page-table code for the MSM IOMMU driver - an iova_to_phys optimization in the ARM-SMMU driver to greatly improve page-table teardown performance with VFIO - various other small fixes and conversions * tag 'iommu-updates-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (59 commits) iommu/amd: Initialize dma-ops domains with 3-level page-table iommu/amd: Update Alias-DTE in update_device_table() iommu/vt-d: Return error code in domain_context_mapping_one() iommu/amd: Use container_of to get dma_ops_domain iommu/amd: Flush iova queue before releasing dma_ops_domain iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back iommu/amd: Use dev_data->domain in get_domain() iommu/amd: Optimize map_sg and unmap_sg iommu/amd: Introduce dir2prot() helper iommu/amd: Implement timeout to flush unmap queues iommu/amd: Implement flush queue iommu/amd: Allow NULL pointer parameter for domain_flush_complete() iommu/amd: Set up data structures for flush queue iommu/amd: Remove align-parameter from __map_single() iommu/amd: Remove other remains of old address allocator iommu/amd: Make use of the generic IOVA allocator iommu/amd: Remove special mapping code for dma_ops path iommu/amd: Pass gfp-flags to iommu_map_page() iommu/amd: Implement apply_dm_region call-back iommu/amd: Create a list of reserved iova addresses ...
2016-07-26mm: do not pass mm_struct into handle_mm_faultKirill A. Shutemov
We always have vma->vm_mm around. Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-21iommu/amd: Remove create_workqueueBhaktipriya Shridhar
alloc_workqueue replaces deprecated create_workqueue(). A dedicated workqueue has been used since the workitem (viz &fault->work), is involved in IO page-fault handling. WQ_MEM_RECLAIM has been set to guarantee forward progress under memory pressure, which is a requirement here. Since there are only a fixed number of work items, explicit concurrency limit is unnecessary. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2016-02-18mm/core: Do not enforce PKEY permissions on remote mm accessDave Hansen
We try to enforce protection keys in software the same way that we do in hardware. (See long example below). But, we only want to do this when accessing our *own* process's memory. If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then tried to PTRACE_POKE a target process which just happened to have some mprotect_pkey(pkey=6) memory, we do *not* want to deny the debugger access to that memory. PKRU is fundamentally a thread-local structure and we do not want to enforce it on access to _another_ thread's data. This gets especially tricky when we have workqueues or other delayed-work mechanisms that might run in a random process's context. We can check that we only enforce pkeys when operating on our *own* mm, but delayed work gets performed when a random user context is active. We might end up with a situation where a delayed-work gup fails when running randomly under its "own" task but succeeds when running under another process. We want to avoid that. To avoid that, we use the new GUP flag: FOLL_REMOTE and add a fault flag: FAULT_FLAG_REMOTE. They indicate that we are walking an mm which is not guranteed to be the same as current->mm and should not be subject to protection key enforcement. Thanks to Jerome Glisse for pointing out this scenario. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boaz Harrosh <boaz@plexistor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <dchinner@redhat.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Dominik Vogt <vogt@linux.vnet.ibm.com> Cc: Eric B Munson <emunson@akamai.com> Cc: Geliang Tang <geliangtang@163.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Low <jason.low2@hp.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Shachar Raindel <raindel@mellanox.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xie XiuQi <xiexiuqi@huawei.com> Cc: iommu@lists.linux-foundation.org Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-12-14iommu/amd: Constify mmu_notifier_ops structuresJulia Lawall
This mmu_notifier_ops structure is never modified, so declare it as const, like the other mmu_notifier_ops structures. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-14iommu/amd: Cleanup error handling in do_fault()Joerg Roedel
Get rid of the three error paths that look the same and move error handling to a single place. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-14iommu/amd: Correctly set flags for handle_mm_fault callJoerg Roedel
Instead of just checking for a write access, calculate the flags that are passed to handle_mm_fault() more precisly and use the pre-defined macros. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-12-14iommu/amd: Do proper access checking before calling handle_mm_fault()Joerg Roedel
The handle_mm_fault function expects the caller to do the access checks. Not doing so and calling the function with wrong permissions is a bug (catched by a BUG_ON). So fix this bug by adding proper access checking to the io page-fault code in the AMD IOMMUv2 driver. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-10-15iommu/amd: Fix BUG when faulting a PROT_NONE VMAJay Cornwall
handle_mm_fault indirectly triggers a BUG in do_numa_page when given a VMA without read/write/execute access. Check this condition in do_fault. do_fault -> handle_mm_fault -> handle_pte_fault -> do_numa_page mm/memory.c 3147 static int do_numa_page(struct mm_struct *mm, struct vm_area_struct *vma, .... 3159 /* A PROT_NONE fault should not end up here */ 3160 BUG_ON(!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE))); Signed-off-by: Jay Cornwall <jay@jcornwall.me> Cc: <stable@vger.kernel.org> # v4.1+ Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-08-13iommu/amd: Use BUG_ON instead of if () BUG()Joerg Roedel
Found by a coccicheck script. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-07-30iommu/amd: Use iommu_attach_group()Joerg Roedel
Since the conversion to default domains the iommu_attach_device function only works for devices with their own group. But this isn't always true for current IOMMUv2 capable devices, so use iommu_attach_group instead. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-05-04iommu/amd: Fix bug in put_pasid_state_waitOded Gabbay
This patch fixes a bug in put_pasid_state_wait that appeared in kernel 4.0 The bug is that pasid_state->count wasn't decremented before entering the wait_event. Thus, the condition in wait_event will never be true. The fix is to decrement (atomically) the pasid_state->count before the wait_event. Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Cc: stable@vger.kernel.org #v4.0 Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-03-04iommu/amd: Small cleanup in mn_release()Dan Carpenter
"pasid_state->device_state" and "dev_state" are the same, but it's nicer to use dev_state consistently. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-02-04Merge branches 'arm/renesas', 'arm/smmu', 'arm/omap', 'ppc/pamu', 'x86/amd' ↵Joerg Roedel
and 'core' into next Conflicts: drivers/iommu/Kconfig drivers/iommu/Makefile
2015-02-04iommu: Update my email addressJoerg Roedel
The AMD address is dead for a long time already, replace it with a working one. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-02-04iommu/amd: Use wait_event in put_pasid_state_waitJoerg Roedel
Now that I learned about possible spurious wakeups this place needs fixing too. Replace the self-coded sleep variant with the generic wait_event() helper. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2015-02-04iommu/amd: Fix amd_iommu_free_device()Peter Zijlstra
put_device_state_wait() doesn't loop on the condition and a spurious wakeup will have it free the device state even though there might still be references out to it. Fix this by using 'normal' wait primitives. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-12-15Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "Highlights: - AMD KFD driver merge This is the AMD HSA interface for exposing a lowlevel interface for GPGPU use. They have an open source userspace built on top of this interface, and the code looks as good as it was going to get out of tree. - Initial atomic modesetting work The need for an atomic modesetting interface to allow userspace to try and send a complete set of modesetting state to the driver has arisen, and been suffering from neglect this past year. No more, the start of the common code and changes for msm driver to use it are in this tree. Ongoing work to get the userspace ioctl finished and the code clean will probably wait until next kernel. - DisplayID 1.3 and tiled monitor exposed to userspace. Tiled monitor property is now exposed for userspace to make use of. - Rockchip drm driver merged. - imx gpu driver moved out of staging Other stuff: - core: panel - MIPI DSI + new panels. expose suggested x/y properties for virtual GPUs - i915: Initial Skylake (SKL) support gen3/4 reset work start of dri1/ums removal infoframe tracking fixes for lots of things. - nouveau: tegra k1 voltage support GM204 modesetting support GT21x memory reclocking work - radeon: CI dpm fixes GPUVM improvements Initial DPM fan control - rcar-du: HDMI support added removed some support for old boards slave encoder driver for Analog Devices adv7511 - exynos: Exynos4415 SoC support - msm: a4xx gpu support atomic helper conversion - tegra: iommu support universal plane support ganged-mode DSI support - sti: HDMI i2c improvements - vmwgfx: some late fixes. - qxl: use suggested x/y properties" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits) drm: sti: fix module compilation issue drm/i915: save/restore GMBUS freq across suspend/resume on gen4 drm: sti: correctly cleanup CRTC and planes drm: sti: add HQVDP plane drm: sti: add cursor plane drm: sti: enable auxiliary CRTC drm: sti: fix delay in VTG programming drm: sti: prepare sti_tvout to support auxiliary crtc drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off} drm: sti: fix hdmi avi infoframe drm: sti: remove event lock while disabling vblank drm: sti: simplify gdp code drm: sti: clear all mixer control drm: sti: remove gpio for HDMI hot plug detection drm: sti: allow to change hdmi ddc i2c adapter drm/doc: Document drm_add_modes_noedid() usage drm/i915: Remove '& 0xffff' from the mask given to WA_REG() drm/i915: Invert the mask and val arguments in wa_add() and WA_REG() drm: Zero out DRM object memory upon cleanup drm/i915/bdw: Fix the write setting up the WIZ hashing mode ...
2014-12-13Merge branch 'akpm' (second patch-bomb from Andrew)Linus Torvalds
Merge second patchbomb from Andrew Morton: - the rest of MM - misc fs fixes - add execveat() syscall - new ratelimit feature for fault-injection - decompressor updates - ipc/ updates - fallocate feature creep - fsnotify cleanups - a few other misc things * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (99 commits) cgroups: Documentation: fix trivial typos and wrong paragraph numberings parisc: percpu: update comments referring to __get_cpu_var percpu: update local_ops.txt to reflect this_cpu operations percpu: remove __get_cpu_var and __raw_get_cpu_var macros fsnotify: remove destroy_list from fsnotify_mark fsnotify: unify inode and mount marks handling fallocate: create FAN_MODIFY and IN_MODIFY events mm/cma: make kmemleak ignore CMA regions slub: fix cpuset check in get_any_partial slab: fix cpuset check in fallback_alloc shmdt: use i_size_read() instead of ->i_size ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments ipc/msg: increase MSGMNI, remove scaling ipc/sem.c: increase SEMMSL, SEMMNI, SEMOPM ipc/sem.c: change memory barrier in sem_lock() to smp_rmb() lib/decompress.c: consistency of compress formats for kernel image decompress_bunzip2: off by one in get_next_block() usr/Kconfig: make initrd compression algorithm selection not expert fault-inject: add ratelimit option ratelimit: add initialization macro ...
2014-12-13iommu/amd: use handle_mm_fault directlyJesse Barnes
This could be useful for debug in the future if we want to track major/minor faults more closely, and also avoids the put_page trick we used with gup. In order to do this, we also track the task struct in the PASID state structure. This lets us update the appropriate task stats after the fault has been handled, and may aid with debug in the future as well. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Oded Gabbay <oded.gabbay@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-12iommu/amd: Fix accounting of device_stateOded Gabbay
This patch fixes a bug in the accounting of the device_state. In the current code, the device_state was put (decremented) too many times, which sometimes lead to the driver getting stuck permanently in put_device_state_wait(). That happen because the device_state->count would go below zero, which is never supposed to happen. The root cause is that the device_state was decremented in put_pasid_state() and put_pasid_state_wait() but also in all the functions that call those functions. Therefore, the device_state was decremented twice in each of these code paths. The fix is to decouple the device_state accounting from the pasid_state accounting - remove the call to put_device_state() from the put_pasid_state() and the put_pasid_state_wait()) Signed-off-by: Oded Gabbay <oded.gabbay@amd.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-09-24kvm: Fix page ageing bugsAndres Lagar-Cavilla
1. We were calling clear_flush_young_notify in unmap_one, but we are within an mmu notifier invalidate range scope. The spte exists no more (due to range_start) and the accessed bit info has already been propagated (due to kvm_pfn_set_accessed). Simply call clear_flush_young. 2. We clear_flush_young on a primary MMU PMD, but this may be mapped as a collection of PTEs by the secondary MMU (e.g. during log-dirty). This required expanding the interface of the clear_flush_young mmu notifier, so a lot of code has been trivially touched. 3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate the access bit by blowing the spte. This requires proper synchronizing with MMU notifier consumers, like every other removal of spte's does. Signed-off-by: Andres Lagar-Cavilla <andreslc@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-07-30iommu/amd: Fix 2 typos in commentsJoerg Roedel
amd_iommu_pasid_bind -> amd_iommu_bind_pasid Signed-off-by: Joerg Roedel <jroedel@suse.de>
2014-07-30iommu/amd: Fix device_state reference countingJoerg Roedel
The references to the device state are not dropped everywhere. This might cause a dead-lock in amd_iommu_free_device(). Fix it. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-30iommu/amd: Remove change_pte mmu_notifier call-backJoerg Roedel
All calls to this call-back are wrapped with mmu_notifer_invalidate_range_start()/end(), making this notifier pretty useless, so remove it. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-30iommu/amd: Don't set pasid_state->mm to NULL in unbind_pasidJoerg Roedel
With calling te mmu_notifier_register function we hold a reference to the mm_struct that needs to be released in mmu_notifier_unregister. This is true even if the notifier was already unregistered from exit_mmap and the .release call-back has already run. So make sure we call mmu_notifier_unregister unconditionally in amd_iommu_unbind_pasid. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-10iommu/amd: Don't call the inv_ctx_cb when pasid is not set upJoerg Roedel
On the error path of amd_iommu_bind_pasid() we call mmu_notifier_unregister() for cleanup. This calls mn_release() which calls the users inv_ctx_cb function if one is available. Since the pasid is not set up yet there is nothing the user can to tear down in this call-back. So don't call inv_ctx_cb on the error path of amd_iommu_unbind_pasid() and make life of the users simpler. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Don't hold a reference to task_structJoerg Roedel
Since we are only caring about the lifetime of the mm_struct and not the task we can't safely keep a reference to it. The reference is also not needed anymore, so remove that code entirely. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Don't hold a reference to mm_structJoerg Roedel
With mmu_notifiers we don't need to hold a reference to the mm_struct during the time the pasid is bound to a device. We can rely on the .mn_release call back to inform us when the mm_struct goes away. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Add pasid_state->invalid flagJoerg Roedel
This is used to signal the ppr_notifer function that no more faults should be processes on this pasid_state. This way we can keep the pasid_state safely in the state-table so that it can be freed in the amd_iommu_unbind_pasid() function. This allows us to not hold a reference to the mm_struct during the whole pasid-binding-time. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Drop pasid_state reference in ppr_notifer error pathJoerg Roedel
In case we are not able to allocate a fault structure a reference to the pasid_state will be leaked. Fix that by dropping the reference in the error path in case we hold one. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Get rid of __unbind_pasidJoerg Roedel
Unbind_pasid is only called from mn_release which already has the pasid_state. Use this to simplify the unbind_pasid path and get rid of __unbind_pasid. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>
2014-07-10iommu/amd: Don't free pasid_state in mn_release pathJoerg Roedel
The mmu_notifier state is part of pasid_state so it can't be freed in the mn_release path. Free the pasid_state after mmu_notifer_unregister has completed. Signed-off-by: Joerg Roedel <jroedel@suse.de> Tested-by: Oded Gabbay <Oded.Gabbay@amd.com>