pm24.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2024-10-06	x86/reboot: emergency callbacks are now registered by common KVM code	Paolo Bonzini
	Guard them with CONFIG_KVM_X86_COMMON rather than the two vendor modules. In practice this has no functional change, because CONFIG_KVM_X86_COMMON is set if and only if at least one vendor-specific module is being built. However, it is cleaner to specify CONFIG_KVM_X86_COMMON for functions that are used in kvm.ko. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled") Fixes: 6d55a94222db ("x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-06	KVM: x86: leave kvm.ko out of the build if no vendor module is requested	Paolo Bonzini
	kvm.ko is nothing but library code shared by kvm-intel.ko and kvm-amd.ko. It provides no functionality on its own and it is unnecessary unless one of the vendor-specific module is compiled. In particular, /dev/kvm is not created until one of kvm-intel.ko or kvm-amd.ko is loaded. Use CONFIG_KVM to decide if it is built-in or a module, but use the vendor-specific modules for the actual decision on whether to build it. This also fixes a build failure when CONFIG_KVM_INTEL and CONFIG_KVM_AMD are both disabled. The cpu_emergency_register_virt_callback() function is called from kvm.ko, but it is only defined if at least one of CONFIG_KVM_INTEL and CONFIG_KVM_AMD is provided. Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-03	KVM: x86/mmu: fix KVM_X86_QUIRK_SLOT_ZAP_ALL for shadow MMU	Paolo Bonzini
	As was tried in commit 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot"), all shadow pages, i.e. non-leaf SPTEs, need to be zapped. All of the accounting for a shadow page is tied to the memslot, i.e. the shadow page holds a reference to the memslot, for all intents and purposes. Deleting the memslot without removing all relevant shadow pages, as is done when KVM_X86_QUIRK_SLOT_ZAP_ALL is disabled, results in NULL pointer derefs when tearing down the VM. Reintroduce from that commit the code that walks the whole memslot when there are active shadow MMU pages. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-10-01	KVM: selftests: Fix build on architectures other than x86_64	Mark Brown
	The recent addition of support for testing with the x86 specific quirk KVM_X86_QUIRK_SLOT_ZAP_ALL disabled in the generic memslot tests broke the build of the KVM selftests for all other architectures: In file included from include/kvm_util.h:8, from include/memstress.h:13, from memslot_modification_stress_test.c:21: memslot_modification_stress_test.c: In function ‘main’: memslot_modification_stress_test.c:176:38: error: ‘KVM_X86_QUIRK_SLOT_ZAP_ALL’ undeclared (first use in this function) 176 \| KVM_X86_QUIRK_SLOT_ZAP_ALL); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~ Add __x86_64__ guard defines to avoid building the relevant code on other architectures. Fixes: 61de4c34b51c ("KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled") Fixes: 218f6415004a ("KVM: selftests: Allow slot modification stress test with quirk disabled") Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Message-ID: <20240930-kvm-build-breakage-v1-1-866fad3cc164@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-27	Documentation: KVM: fix warning in "make htmldocs"	Paolo Bonzini
	The warning Documentation/virt/kvm/locking.rst:31: ERROR: Unexpected indentation. is caused by incorrectly treating a line as the continuation of a paragraph, rather than as the first line in a bullet list. Fixed: 44d174596260 ("KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-17	Merge tag 'kvm-x86-vmx-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM VMX changes for 6.12: - Set FINAL/PAGE in the page fault error code for EPT Violations if and only if the GVA is valid. If the GVA is NOT valid, there is no guest-side page table walk and so stuffing paging related metadata is nonsensical. - Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of emulating posted interrupt delivery to L2. - Add a lockdep assertion to detect unsafe accesses of vmcs12 structures. - Harden eVMCS loading against an impossible NULL pointer deref (really truly should be impossible). - Minor SGX fix and a cleanup.
2024-09-17	Merge tag 'kvm-x86-svm-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM SVM changes for 6.12: - Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled, i.e. when the CPU has already flushed the RSB. - Trace the per-CPU host save area as a VMCB pointer to improve readability and cleanup the retrieval of the SEV-ES host save area. - Remove unnecessary accounting of temporary nested VMCB related allocations.
2024-09-17	Merge tag 'kvm-x86-pat_vmx_msrs-6.12' of https://github.com/kvm-x86/linux ↵	Paolo Bonzini
	into HEAD KVM VMX and x86 PAT MSR macro cleanup for 6.12: - Add common defines for the x86 architectural memory types, i.e. the types that are shared across PAT, MTRRs, VMCSes, and EPTPs. - Clean up the various VMX MSR macros to make the code self-documenting (inasmuch as possible), and to make it less painful to add new macros.
2024-09-17	Merge tag 'kvm-x86-mmu-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM x86 MMU changes for 6.12: - Overhaul the "unprotect and retry" logic to more precisely identify cases where retrying is actually helpful, and to harden all retry paths against putting the guest into an infinite retry loop. - Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in the shadow MMU. - Refactor pieces of the shadow MMU related to aging SPTEs in prepartion for adding MGLRU support in KVM. - Misc cleanups
2024-09-17	Merge tag 'kvm-x86-selftests-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM selftests changes for 6.12: - Fix a goof that caused some Hyper-V tests to be skipped when run on bare metal, i.e. NOT in a VM. - Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest. - Explicitly include one-off assets in .gitignore. Past Sean was completely wrong about not being able to detect missing .gitignore entries. - Verify userspace single-stepping works when KVM happens to handle a VM-Exit in its fastpath. - Misc cleanups
2024-09-17	Merge tag 'kvm-x86-misc-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVM x86 misc changes for 6.12 - Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10 functionality that is on the horizon). - Rework common MSR handling code to suppress errors on userspace accesses to unsupported-but-advertised MSRs. This will allow removing (almost?) all of KVM's exemptions for userspace access to MSRs that shouldn't exist based on the vCPU model (the actual cleanup is non-trivial future work). - Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the 64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv) stores the entire 64-bit value a the ICR offset. - Fix a bug where KVM would fail to exit to userspace if one was triggered by a fastpath exit handler. - Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when there's already a pending wake event at the time of the exit. - Finally fix the RSM vs. nested VM-Enter WARN by forcing the vCPU out of guest mode prior to signalling SHUTDOWN (architecturally, the SHUTDOWN is supposed to hit L1, not L2).
2024-09-17	Merge tag 'kvm-x86-generic-6.12' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini
	KVK generic changes for 6.12: - Fix a bug that results in KVM prematurely exiting to userspace for coalesced MMIO/PIO in many cases, clean up the related code, and add a testcase. - Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_ the gpa+len crosses a page boundary, which thankfully is guaranteed to not happen in the current code base. Add WARNs in more helpers that read/write guest memory to detect similar bugs.
2024-09-17	Merge branch 'kvm-redo-enable-virt' into HEAD	Paolo Bonzini
	Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed. The primary motivation for this series is to simplify dealing with enabling virtualization for Intel's TDX, which needs to enable virtualization when kvm-intel.ko is loaded, i.e. long before the first VM is created. That said, this is a nice cleanup on its own. By registering the callbacks on-demand, the callbacks themselves don't need to check kvm_usage_count, because their very existence implies a non-zero count. Patch 1 (re)adds a dedicated lock for kvm_usage_count. This avoids a lock ordering issue between cpus_read_lock() and kvm_lock. The lock ordering issue still exist in very rare cases, and will be fixed for good by switching vm_list to an (S)RCU-protected list. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-17	Merge branch 'kvm-memslot-zap-quirk' into HEAD	Paolo Bonzini
	Today whenever a memslot is moved or deleted, KVM invalidates the entire page tables and generates fresh ones based on the new memslot layout. This behavior traditionally was kept because of a bug which was never fully investigated and caused VM instability with assigned GeForce GPUs. It generally does not have a huge overhead, because the old MMU is able to reuse cached page tables and the new one is more scalabale and can resolve EPT violations/nested page faults in parallel, but it has worse performance if the guest frequently deletes and adds small memslots, and it's entirely not viable for TDX. This is because TDX requires re-accepting of private pages after page dropping. For non-TDX VMs, this series therefore introduces the KVM_X86_QUIRK_SLOT_ZAP_ALL quirk, enabling users to control the behavior of memslot zapping when a memslot is moved/deleted. The quirk is turned on by default, leading to the zapping of all SPTEs when a memslot is moved/deleted; users however have the option to turn off the quirk, which limits the zapping only to those SPTEs hat lie within the range of memslot being moved/deleted. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-17	Merge tag 'kvm-s390-next-6.12-1' of ↵	Paolo Bonzini
	https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD * New ucontrol selftest * Inline assembly touchups
2024-09-16	s390: Enable KVM_S390_UCONTROL config in debug_defconfig	Christoph Schlameuss
	To simplify testing enable UCONTROL KVM by default in debug kernels. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20240807154512.316936-11-schlameuss@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20240807154512.316936-11-schlameuss@linux.ibm.com>
2024-09-16	selftests: kvm: s390: Add VM run test case	Christoph Schlameuss
	Add test case running code interacting with registers within a ucontrol VM. * Add uc_gprs test case The test uses the same VM setup using the fixture and debug macros introduced in earlier patches in this series. Signed-off-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20240807154512.316936-7-schlameuss@linux.ibm.com [frankja@linux.ibm.com: Removed leftover comment line] Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20240807154512.316936-7-schlameuss@linux.ibm.com>
2024-09-15	Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini
	KVM/riscv changes for 6.12 - Fix sbiret init before forwarding to userspace - Don't zero-out PMU snapshot area before freeing data - Allow legacy PMU access from guest - Fix to allow hpmcounter31 from the guest
2024-09-15	Merge tag 'loongarch-kvm-6.12' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.12 1. Revert qspinlock to test-and-set simple lock on VM. 2. Add Loongson Binary Translation extension support. 3. Add PMU support for guest. 4. Enable paravirt feature control from VMM. 5. Implement function kvm_para_has_feature().
2024-09-15	Merge tag 'kvmarm-6.12' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.12 * New features: - Add a Stage-2 page table dumper, reusing the main ptdump infrastructure, and allowing easier debugging of the our page-table infrastructure - Add FP8 support to the KVM/arm64 floating point handling. - Add NV support for the AT family of instructions, which mostly results in adding a page table walker that deals with most of the complexity of the architecture. * Improvements, fixes and cleanups: - Add selftest checks for a bunch of timer emulation corner cases - Fix the multiple of cases where KVM/arm64 doesn't correctly handle the guest trying to use a GICv3 that isn't advertised - Remove REG_HIDDEN_USER from the sysreg infrastructure, making things little more simple - Prevent MTE tags being restored by userspace if we are actively logging writes, as that's a recipe for disaster - Correct the refcount on a page that is not considered for MTE tag copying (such as a device) - Relax the synchronisation when walking a page table to split block mappings, moving it at the end the walk, as there is no need to perform it on every store. - Fix boundary check when transfering memory using FFA - Fix pKVM TLB invalidation, only affecting currently out of tree code but worth addressing for peace of mind
2024-09-12	LoongArch: KVM: Implement function kvm_para_has_feature()	Bibo Mao
	Implement function kvm_para_has_feature() to detect supported paravirt features. It can be used by device driver to detect and enable paravirt features, such as the EIOINTC irqchip driver is able to detect feature KVM_FEATURE_VIRT_EXTIOI and do some optimization. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-12	LoongArch: KVM: Enable paravirt feature control from VMM	Bibo Mao
	Export kernel paravirt features to user space, so that VMM can control each single paravirt feature. By default paravirt features will be the same with kvm supported features if VMM does not set it. Also a new feature KVM_FEATURE_VIRT_EXTIOI is added which can be set from user space. This feature indicates that the virt EIOINTC can route interrupts to 256 vCPUs, rather than 4 vCPUs like with real HW. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-12	LoongArch: KVM: Add PMU support for guest	Song Gao
	On LoongArch, the host and guest have their own PMU CSRs registers and they share PMU hardware resources. A set of PMU CSRs consists of a CTRL register and a CNTR register. We can set which PMU CSRs are used by the guest by writing to the GCFG register [24:26] bits. On KVM side: - Save the host PMU CSRs into structure kvm_context. - If the host supports the PMU feature. - When entering guest mode, save the host PMU CSRs and restore the guest PMU CSRs. - When exiting guest mode, save the guest PMU CSRs and restore the host PMU CSRs. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-12	Merge branch kvm-arm64/visibility-cleanups into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/visibility-cleanups: : . : Remove REG_HIDDEN_USER from the sysreg infrastructure, making things : a little more simple. From the cover letter: : : "Since 4d4f52052ba8 ("KVM: arm64: nv: Drop EL12 register traps that are : redirected to VNCR") and the admission that KVM would never be supporting : the original FEAT_NV, REG_HIDDEN_USER only had a few users, all of which : could either be replaced by a more ad-hoc mechanism, or removed altogether." : . KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier KVM: arm64: Simplify visibility handling of AArch32 SPSR_* KVM: arm64: Simplify handling of CNTKCTL_EL12 Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-12	Merge branch kvm-arm64/s2-ptdump into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/s2-ptdump: : . : Stage-2 page table dumper, reusing the main ptdump infrastructure, : courtesy of Sebastian Ene. From the cover letter: : : "This series extends the ptdump support to allow dumping the guest : stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump : registers the new following files under debugfs: : - /sys/debug/kvm/<guest_id>/stage2_page_tables : - /sys/debug/kvm/<guest_id>/stage2_levels : - /sys/debug/kvm/<guest_id>/ipa_range : : This allows userspace tools (eg. cat) to dump the stage-2 pagetables by : reading the 'stage2_page_tables' file. : [...]" : . KVM: arm64: Register ptdump with debugfs on guest creation arm64: ptdump: Don't override the level when operating on the stage-2 tables arm64: ptdump: Use the ptdump description from a local context arm64: ptdump: Expose the attribute parsing functionality KVM: arm64: Move pagetable definitions to common header Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-12	Merge branch kvm-arm64/nv-at-pan into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/nv-at-pan: : . : Add NV support for the AT family of instructions, which mostly results : in adding a page table walker that deals with most of the complexity : of the architecture. : : From the cover letter: : : "Another task that a hypervisor supporting NV on arm64 has to deal with : is to emulate the AT instruction, because we multiplex all the S1 : translations on a single set of registers, and the guest S2 is never : truly resident on the CPU. : : So given that we lie about page tables, we also have to lie about : translation instructions, hence the emulation. Things are made : complicated by the fact that guest S1 page tables can be swapped out, : and that our shadow S2 is likely to be incomplete. So while using AT : to emulate AT is tempting (and useful), it is not going to always : work, and we thus need a fallback in the shape of a SW S1 walker." : . KVM: arm64: nv: Add support for FEAT_ATS1A KVM: arm64: nv: Plumb handling of AT S1* traps from EL2 KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3 KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration KVM: arm64: nv: Add SW walker for AT S1 emulation KVM: arm64: nv: Make ps_to_output_size() generally available KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W} KVM: arm64: nv: Add basic emulation of AT S1E2{R,W} KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}P KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W} KVM: arm64: nv: Honor absence of FEAT_PAN2 KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set arm64: Add ESR_ELx_FSC_ADDRSZ_L() helper arm64: Add system register encoding for PSTATE.PAN arm64: Add PAR_EL1 field description arm64: Add missing APTable and TCR_ELx.HPD masks KVM: arm64: Make kvm_at() take an OP_AT_* Signed-off-by: Marc Zyngier <maz@kernel.org> # Conflicts: # arch/arm64/kvm/nested.c
2024-09-12	Merge branch kvm-arm64/selftests-6.12 into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/selftests-6.12: : . : KVM/arm64 selftest updates for 6.12 : : - Check for a bunch of timer emulation corner cases (COlton Lewis) : . KVM: arm64: selftests: Add arch_timer_edge_cases selftest KVM: arm64: selftests: Ensure pending interrupts are handled in arch_timer test Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-12	Merge branch kvm-arm64/vgic-sre-traps into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/vgic-sre-traps: : . : Fix the multiple of cases where KVM/arm64 doesn't correctly : handle the guest trying to use a GICv3 that isn't advertised. : : From the cover letter: : : "It recently appeared that, when running on a GICv3-equipped platform : (which is what non-ancient arm64 HW has), not configuring a GICv3 : for the guest could result in less than desirable outcomes. : : We have multiple issues to fix: : : - for registers that always trap (the SGI registers) or that may : trap (the SRE register), we need to check whether a GICv3 has been : instantiated before acting upon the trap. : : - for registers that only conditionally trap, we must actively trap : them even in the absence of a GICv3 being instantiated, and handle : those traps accordingly. : : - finally, ID registers must reflect the absence of a GICv3, so that : we are consistent. : : This series goes through all these requirements. The main complexity : here is to apply a GICv3 configuration on the host in the absence of a : GICv3 in the guest. This is pretty hackish, but I don't have a much : better solution so far. : : As part of making wider use of of the trap bits, we fully define the : trap routing as per the architecture, something that we eventually : need for NV anyway." : . KVM: arm64: selftests: Cope with lack of GICv3 in set_id_regs KVM: arm64: Add selftest checking how the absence of GICv3 is handled KVM: arm64: Unify UNDEF injection helpers KVM: arm64: Make most GICv3 accesses UNDEF if they trap KVM: arm64: Honor guest requested traps in GICv3 emulation KVM: arm64: Add trap routing information for ICH_HCR_EL2 KVM: arm64: Add ICH_HCR_EL2 to the vcpu state KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest KVM: arm64: Add helper for last ditch idreg adjustments KVM: arm64: Force GICv3 trap activation when no irqchip is configured on VHE KVM: arm64: Force SRE traps when SRE access is not enabled KVM: arm64: Move GICv3 trap configuration to kvm_calculate_traps() Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-12	Merge branch kvm-arm64/fpmr into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/fpmr: : . : Add FP8 support to the KVM/arm64 floating point handling. : : This includes new ID registers (ID_AA64PFR2_EL1 ID_AA64FPFR0_EL1) : being made visible to guests, as well as a new confrol register : (FPMR) which gets context-switched. : . KVM: arm64: Expose ID_AA64PFR2_EL1 to userspace and guests KVM: arm64: Enable FP8 support when available and configured KVM: arm64: Expose ID_AA64FPFR0_EL1 as a writable ID reg KVM: arm64: Honor trap routing for FPMR KVM: arm64: Add save/restore support for FPMR KVM: arm64: Move FPMR into the sysreg array KVM: arm64: Add predicate for FPMR support in a VM KVM: arm64: Move SVCR into the sysreg array Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-12	Merge branch kvm-arm64/mmu-misc-6.12 into kvmarm-master/next	Marc Zyngier
	* kvm-arm64/mmu-misc-6.12: : . : Various minor MMU improvements and bug-fixes: : : - Prevent MTE tags being restored by userspace if we are actively : logging writes, as that's a recipe for disaster : : - Correct the refcount on a page that is not considered for MTE : tag copying (such as a device) : : - When walking a page table to split blocks, keep the DSB at the end : the walk, as there is no need to perform it on every store. : : - Fix boundary check when transfering memory using FFA : . KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE KVM: arm64: Move data barrier to end of split walk Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-11	KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier	Marc Zyngier
	Now that REG_HIDDEN_USER has no direct user anymore, remove it entirely and update all users of sysreg_hidden_user() to call sysreg_hidden() instead. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240904082419.1982402-4-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-11	KVM: arm64: Simplify visibility handling of AArch32 SPSR_*	Marc Zyngier
	Since SPSR_* are not associated with any register in the sysreg array, nor do they have .get_user()/.set_user() helpers, they are invisible to userspace with that encoding. Therefore hidden_user_visibility() serves no purpose here, and can be safely removed. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240904082419.1982402-3-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-11	KVM: arm64: Simplify handling of CNTKCTL_EL12	Marc Zyngier
	We go trough a great deal of effort to map CNTKCTL_EL12 to CNTKCTL_EL1 while hidding this mapping from userspace via a special visibility helper. However, it would be far simpler to just provide an accessor doing the mapping job, removing the need for a visibility helper. With that done, we can also remove the EL12_REG() macro which serves no purpose. Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240904082419.1982402-2-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-11	LoongArch: KVM: Add vm migration support for LBT registers	Bibo Mao
	Every vcpu has separate LBT registers. And there are four scr registers, one flags and ftop register for LBT extension. When VM migrates, VMM needs to get LBT registers for every vcpu. Here macro KVM_REG_LOONGARCH_LBT is added for new vcpu lbt register type, the following macro is added to get/put LBT registers. KVM_REG_LOONGARCH_LBT_SCR0 KVM_REG_LOONGARCH_LBT_SCR1 KVM_REG_LOONGARCH_LBT_SCR2 KVM_REG_LOONGARCH_LBT_SCR3 KVM_REG_LOONGARCH_LBT_EFLAGS KVM_REG_LOONGARCH_LBT_FTOP Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11	LoongArch: KVM: Add Binary Translation extension support	Bibo Mao
	Loongson Binary Translation (LBT) is used to accelerate binary translation, which contains 4 scratch registers (scr0 to scr3), x86/ARM eflags (eflags) and x87 fpu stack pointer (ftop). Like FPU extension, here a lazy enabling method is used for LBT. the LBT context is saved/restored on the vcpu context switch path. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11	LoongArch: KVM: Add VM feature detection function	Bibo Mao
	Loongson SIMD Extension (LSX), Loongson Advanced SIMD Extension (LASX) and Loongson Binary Translation (LBT) features are defined in register CPUCFG2. Two kinds of LSX/LASX/LBT feature detection are added here, one is VCPU feature, and the other is VM feature. VCPU feature dection can only work with VCPU thread itself, and requires VCPU thread is created already. So LSX/LASX/LBT feature detection for VM is added also, it can be done even if VM is not created, and also can be done by any threads besides VCPU threads. Here ioctl command KVM_HAS_DEVICE_ATTR is added for VM, and macro KVM_LOONGARCH_VM_FEAT_CTRL is added to check supported feature. And five sub-features relative with LSX/LASX/LBT are added as following: KVM_LOONGARCH_VM_FEAT_LSX KVM_LOONGARCH_VM_FEAT_LASX KVM_LOONGARCH_VM_FEAT_X86BT KVM_LOONGARCH_VM_FEAT_ARMBT KVM_LOONGARCH_VM_FEAT_MIPSBT Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-11	LoongArch: Revert qspinlock to test-and-set simple lock on VM	Bibo Mao
	Similar with x86, when VM is detected, revert to a simple test-and-set lock to avoid the horrors of queue preemption. Tested on 3C5000 Dual-way machine with 32 cores and 2 numa nodes, test case is kcbench on kernel mainline 6.10, the detailed command is "kcbench --src /root/src/linux" Performance on host machine kernel compile time performance impact Original 150.29 seconds With patch 150.19 seconds almost no impact Performance on virtual machine: 1. 1 VM with 32 vCPUs and 2 numa node, numa node pinned kernel compile time performance impact Original 170.87 seconds With patch 171.73 seconds almost no impact 2. 2 VMs, each VM with 32 vCPUs and 2 numa node, numa node pinned kernel compile time performance impact Original 2362.04 seconds With patch 354.73 seconds +565% Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-09-10	KVM: arm64: Register ptdump with debugfs on guest creation	Sebastian Ene
	While arch/*/mem/ptdump handles the kernel pagetable dumping code, introduce KVM/ptdump to show the guest stage-2 pagetables. The separation is necessary because most of the definitions from the stage-2 pagetable reside in the KVM path and we will be invoking functionality specific to KVM. Introduce the PTDUMP_STAGE2_DEBUGFS config. When a guest is created, register a new file entry under the guest debugfs dir which allows userspace to show the contents of the guest stage-2 pagetables when accessed. [maz: moved function prototypes from kvm_host.h to kvm_mmu.h] Signed-off-by: Sebastian Ene <sebastianene@google.com> Reviewed-by: Vincent Donnefort <vdonnefort@google.com> Link: https://lore.kernel.org/r/20240909124721.1672199-6-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-10	arm64: ptdump: Don't override the level when operating on the stage-2 tables	Sebastian Ene
	Ptdump uses the init_mm structure directly to dump the kernel pagetables. When ptdump is called on the stage-2 pagetables, this mm argument is not used. Prevent the level from being overwritten by checking the argument against NULL. Signed-off-by: Sebastian Ene <sebastianene@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240909124721.1672199-5-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-10	arm64: ptdump: Use the ptdump description from a local context	Sebastian Ene
	Rename the attributes description array to allow the parsing method to use the description from a local context. To be able to do this, store a pointer to the description array in the state structure. This will allow for the later introduced callers (stage_2 ptdump) to specify their own page table description format to the ptdump parser. Signed-off-by: Sebastian Ene <sebastianene@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240909124721.1672199-4-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-10	arm64: ptdump: Expose the attribute parsing functionality	Sebastian Ene
	Reuse the descriptor parsing functionality to keep the same output format as the original ptdump code. In order for this to happen, move the state tracking objects into a common header. [maz: Fixed note_page() stub as suggested by Will] Signed-off-by: Sebastian Ene <sebastianene@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240909124721.1672199-3-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-10	KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer	Snehal Koukuntla
	When we share memory through FF-A and the description of the buffers exceeds the size of the mapped buffer, the fragmentation API is used. The fragmentation API allows specifying chunks of descriptors in subsequent FF-A fragment calls and no upper limit has been established for this. The entire memory region transferred is identified by a handle which can be used to reclaim the transferred memory. To be able to reclaim the memory, the description of the buffers has to fit in the ffa_desc_buf. Add a bounds check on the FF-A sharing path to prevent the memory reclaim from failing. Also do_ffa_mem_xfer() does not need __always_inline, except for the BUILD_BUG_ON() aspect, which gets moved to a macro. [maz: fixed the BUILD_BUG_ON() breakage with LLVM, thanks to Wei-Lin Chang for the timely report] Fixes: 634d90cf0ac65 ("KVM: arm64: Handle FFA_MEM_LEND calls from the host") Cc: stable@vger.kernel.org Reviewed-by: Sebastian Ene <sebastianene@google.com> Signed-off-by: Snehal Koukuntla <snehalreddy@google.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20240909180154.3267939-1-snehalreddy@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-10	KVM: SVM: let alternatives handle the cases when RSB filling is required	Amit Shah
	Remove superfluous RSB filling after a VMEXIT when the CPU already has flushed the RSB after a VMEXIT when AutoIBRS is enabled. The initial implementation for adding RETPOLINES added an ALTERNATIVES implementation for filling the RSB after a VMEXIT in commit 117cc7a908c8 ("x86/retpoline: Fill return stack buffer on vmexit"). Later, X86_FEATURE_RSB_VMEXIT was added in commit 9756bba28470 ("x86/speculation: Fill RSB on vmexit for IBRS") to handle stuffing the RSB if RETPOLINE=y or KERNEL_IBRS=y, i.e. to also stuff the RSB if the kernel is configured to do IBRS mitigations on entry/exit. The AutoIBRS (on AMD) feature implementation added in commit e7862eda309e ("x86/cpu: Support AMD Automatic IBRS") used the already-implemented logic for EIBRS in spectre_v2_determine_rsb_fill_type_on_vmexit() -- but did not update the code at VMEXIT to act on the mode selected in that function -- resulting in VMEXITs continuing to clear the RSB when RETPOLINES are enabled, despite the presence of AutoIBRS. Signed-off-by: Amit Shah <amit.shah@amd.com> Link: https://lore.kernel.org/r/20240807123531.69677-1-amit@kernel.org [sean: massage changeloge, drop comment about AMD not needing RSB_VMEXIT_LITE] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-10	KVM: arm64: Move pagetable definitions to common header	Sebastian Ene
	In preparation for using the stage-2 definitions in ptdump, move some of these macros in the common header. Signed-off-by: Sebastian Ene <sebastianene@google.com> Link: https://lore.kernel.org/r/20240909124721.1672199-2-sebastianene@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-09-09	KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid	Sean Christopherson
	Set PFERR_GUEST_{FINAL,PAGE}_MASK based on EPT_VIOLATION_GVA_TRANSLATED if and only if EPT_VIOLATION_GVA_IS_VALID is also set in exit qualification. Per the SDM, bit 8 (EPT_VIOLATION_GVA_TRANSLATED) is valid if and only if bit 7 (EPT_VIOLATION_GVA_IS_VALID) is set, and is '0' if bit 7 is '0'. Bit 7 (a.k.a. EPT_VIOLATION_GVA_IS_VALID) Set if the guest linear-address field is valid. The guest linear-address field is valid for all EPT violations except those resulting from an attempt to load the guest PDPTEs as part of the execution of the MOV CR instruction and those due to trace-address pre-translation Bit 8 (a.k.a. EPT_VIOLATION_GVA_TRANSLATED) If bit 7 is 1: • Set if the access causing the EPT violation is to a guest-physical address that is the translation of a linear address. • Clear if the access causing the EPT violation is to a paging-structure entry as part of a page walk or the update of an accessed or dirty bit. Reserved if bit 7 is 0 (cleared to 0). Failure to guard the logic on GVA_IS_VALID results in KVM marking the page fault as PFERR_GUEST_PAGE_MASK when there is no known GVA, which can put the vCPU into an infinite loop due to kvm_mmu_page_fault() getting false positive on its PFERR_NESTED_GUEST_PAGE logic (though only because that logic is also buggy/flawed). In practice, this is largely a non-issue because so GVA_IS_VALID is almost always set. However, when TDX comes along, GVA_IS_VALID will never be set, as the TDX Module deliberately clears bits 12:7 in exit qualification, e.g. so that the faulting virtual address and other metadata that aren't practically useful for the hypervisor aren't leaked to the untrusted host. When exit is due to EPT violation, bits 12-7 of the exit qualification are cleared to 0. Fixes: eebed2438923 ("kvm: nVMX: Add support for fast unprotection of nested guest page tables") Reviewed-by: Yuan Yao <yuan.yao@intel.com> Link: https://lore.kernel.org/r/20240831001538.336683-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-09	KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent	Sean Christopherson
	Use KVM_PAGES_PER_HPAGE() instead of open coding equivalent logic that is anything but obvious. No functional change intended, and verified by compiling with the below assertions: BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_4K)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_4K)); BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_2M)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_2M)); BUILD_BUG_ON((1UL << KVM_HPAGE_GFN_SHIFT(PG_LEVEL_1G)) != KVM_PAGES_PER_HPAGE(PG_LEVEL_1G)); Link: https://lore.kernel.org/r/20240809194335.1726916-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-09	KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals	Sean Christopherson
	Replace all of the open coded '1' literals used to mark a PTE list as having many/multiple entries with a proper define. It's hard enough to read the code with one magic bit, and a future patch to support "locking" a single rmap will add another. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-09	KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()	Sean Christopherson
	Fold mmu_spte_age() into its sole caller now that aging and testing for young SPTEs is handled in a common location, i.e. doesn't require more helpers. Opportunistically remove the use of mmu_spte_get_lockless(), as mmu_lock is held (for write!), and marking SPTEs for access tracking outside of mmu_lock is unsafe (at least, as written). I.e. using the lockless accessor is quite misleading. No functional change intended. Link: https://lore.kernel.org/r/20240809194335.1726916-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-09	KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper	Sean Christopherson
	Rework kvm_handle_gfn_range() into an aging-specic helper, kvm_rmap_age_gfn_range(). In addition to purging a bunch of unnecessary boilerplate code, this sets the stage for aging rmap SPTEs outside of mmu_lock. Note, there's a small functional change, as kvm_test_age_gfn() will now return immediately if a young SPTE is found, whereas previously KVM would continue iterating over other levels. Link: https://lore.kernel.org/r/20240809194335.1726916-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-09-09	KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed	Sean Christopherson
	Convert kvm_unmap_gfn_range(), which is the helper that zaps rmap SPTEs in response to an mmu_notifier invalidation, to use __kvm_rmap_zap_gfn_range() and feed in range->may_block. In other words, honor NEED_RESCHED by way of cond_resched() when zapping rmaps. This fixes a long-standing issue where KVM could process an absurd number of rmap entries without ever yielding, e.g. if an mmu_notifier fired on a PUD (or larger) range. Opportunistically rename __kvm_zap_rmap() to kvm_zap_rmap(), and drop the old kvm_zap_rmap(). Ideally, the shuffling would be done in a different patch, but that just makes the compiler unhappy, e.g. arch/x86/kvm/mmu/mmu.c:1462:13: error: ‘kvm_zap_rmap’ defined but not used Reported-by: Peter Xu <peterx@redhat.com> Link: https://lore.kernel.org/r/20240809194335.1726916-14-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>