summaryrefslogtreecommitdiff
path: root/tools/testing/selftests/kvm
AgeCommit message (Collapse)Author
2022-12-12Merge remote-tracking branch 'kvm/queue' into HEADPaolo Bonzini
x86 Xen-for-KVM: * Allow the Xen runstate information to cross a page boundary * Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured * add support for 32-bit guests in SCHEDOP_poll x86 fixes: * One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). * Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. * Clean up the MSR filter docs. * Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. * Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. * Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency. * Advertise (on AMD) that the SMM_CTL MSR is not supported * Remove unnecessary exports Selftests: * Fix an inverted check in the access tracking perf test, and restore support for asserting that there aren't too many idle pages when running on bare metal. * Fix an ordering issue in the AMX test introduced by recent conversions to use kvm_cpu_has(), and harden the code to guard against similar bugs in the future. Anything that tiggers caching of KVM's supported CPUID, kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if the caching occurs before the test opts in via prctl(). * Fix build errors that occur in certain setups (unsure exactly what is unique about the problematic setup) due to glibc overriding static_assert() to a variant that requires a custom message. * Introduce actual atomics for clear/set_bit() in selftests Documentation: * Remove deleted ioctls from documentation * Various fixes
2022-12-09KVM: selftests: Allocate ucall pool from MEM_REGION_DATAOliver Upton
MEM_REGION_TEST_DATA is meant to hold data explicitly used by a selftest, not implicit allocations due to the selftests infrastructure. Allocate the ucall pool from MEM_REGION_DATA much like the rest of the selftests library allocations. Fixes: 426729b2cf2e ("KVM: selftests: Add ucall pool based implementation") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Message-Id: <20221207214809.489070-5-oliver.upton@linux.dev> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-09KVM: arm64: selftests: Align VA space allocator with TTBR0Oliver Upton
An interesting feature of the Arm architecture is that the stage-1 MMU supports two distinct VA regions, controlled by TTBR{0,1}_EL1. As KVM selftests on arm64 only uses TTBR0_EL1, the VA space is constrained to [0, 2^(va_bits-1)). This is different from other architectures that allow for addressing low and high regions of the VA space from a single page table. KVM selftests' VA space allocator presumes the valid address range is split between low and high memory based the MSB, which of course is a poor match for arm64's TTBR0 region. Allow architectures to override the default VA space layout. Make use of the override to align vpages_valid with the behavior of TTBR0 on arm64. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Message-Id: <20221207214809.489070-4-oliver.upton@linux.dev> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-09Merge tag 'kvmarm-6.2' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.2 - Enable the per-vcpu dirty-ring tracking mechanism, together with an option to keep the good old dirty log around for pages that are dirtied by something other than a vcpu. - Switch to the relaxed parallel fault handling, using RCU to delay page table reclaim and giving better performance under load. - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option, which multi-process VMMs such as crosvm rely on. - Merge the pKVM shadow vcpu state tracking that allows the hypervisor to have its own view of a vcpu, keeping that state private. - Add support for the PMUv3p5 architecture revision, bringing support for 64bit counters on systems that support it, and fix the no-quite-compliant CHAIN-ed counter support for the machines that actually exist out there. - Fix a handful of minor issues around 52bit VA/PA support (64kB pages only) as a prefix of the oncoming support for 4kB and 16kB pages. - Add/Enable/Fix a bunch of selftests covering memslots, breakpoints, stage-2 faults and access tracking. You name it, we got it, we probably broke it. - Pick a small set of documentation and spelling fixes, because no good merge window would be complete without those. As a side effect, this tag also drags: - The 'kvmarm-fixes-6.1-3' tag as a dependency to the dirty-ring series - A shared branch with the arm64 tree that repaints all the system registers to match the ARM ARM's naming, and resulting in interesting conflicts
2022-12-05Merge branch kvm-arm64/dirty-ring into kvmarm-master/nextMarc Zyngier
* kvm-arm64/dirty-ring: : . : Add support for the "per-vcpu dirty-ring tracking with a bitmap : and sprinkles on top", courtesy of Gavin Shan. : : This branch drags the kvmarm-fixes-6.1-3 tag which was already : merged in 6.1-rc4 so that the branch is in a working state. : . KVM: Push dirty information unconditionally to backup bitmap KVM: selftests: Automate choosing dirty ring size in dirty_log_test KVM: selftests: Clear dirty ring states between two modes in dirty_log_test KVM: selftests: Use host page size to map ring buffer in dirty_log_test KVM: arm64: Enable ring-based dirty memory tracking KVM: Support dirty ring in conjunction with bitmap KVM: Move declaration of kvm_cpu_dirty_log_size() to kvm_dirty_ring.h KVM: x86: Introduce KVM_REQ_DIRTY_RING_SOFT_FULL Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05Merge branch kvm-arm64/selftest/access-tracking into kvmarm-master/nextMarc Zyngier
* kvm-arm64/selftest/access-tracking: : . : Small series to add support for arm64 to access_tracking_perf_test and : correct a couple bugs along the way. : : Patches courtesy of Oliver Upton. : . KVM: selftests: Build access_tracking_perf_test for arm64 KVM: selftests: Have perf_test_util signal when to stop vCPUs Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05Merge branch kvm-arm64/selftest/s2-faults into kvmarm-master/nextMarc Zyngier
* kvm-arm64/selftest/s2-faults: : . : New KVM/arm64 selftests exercising various sorts of S2 faults, courtesy : of Ricardo Koller. From the cover letter: : : "This series adds a new aarch64 selftest for testing stage 2 fault handling : for various combinations of guest accesses (e.g., write, S1PTW), backing : sources (e.g., anon), and types of faults (e.g., read on hugetlbfs with a : hole, write on a readonly memslot). Each test tries a different combination : and then checks that the access results in the right behavior (e.g., uffd : faults with the right address and write/read flag). [...]" : . KVM: selftests: aarch64: Add mix of tests into page_fault_test KVM: selftests: aarch64: Add readonly memslot tests into page_fault_test KVM: selftests: aarch64: Add dirty logging tests into page_fault_test KVM: selftests: aarch64: Add userfaultfd tests into page_fault_test KVM: selftests: aarch64: Add aarch64/page_fault_test KVM: selftests: Use the right memslot for code, page-tables, and data allocations KVM: selftests: Fix alignment in virt_arch_pgd_alloc() and vm_vaddr_alloc() KVM: selftests: Add vm->memslots[] and enum kvm_mem_region_type KVM: selftests: Stash backing_src_type in struct userspace_mem_region tools: Copy bitfield.h from the kernel sources KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macros KVM: selftests: Add missing close and munmap in __vm_mem_region_delete() KVM: selftests: aarch64: Add virt_get_pte_hva() library function KVM: selftests: Add a userfaultfd library Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05Merge branch kvm-arm64/selftest/linked-bps into kvmarm-master/nextMarc Zyngier
* kvm-arm64/selftest/linked-bps: : . : Additional selftests for the arm64 breakpoints/watchpoints, : courtesy of Reiji Watanabe. From the cover letter: : : "This series adds test cases for linked {break,watch}points to the : debug-exceptions test, and expands {break,watch}point tests to : use non-zero {break,watch}points (the current test always uses : {break,watch}point#0)." : . KVM: arm64: selftests: Test with every breakpoint/watchpoint KVM: arm64: selftests: Add a test case for a linked watchpoint KVM: arm64: selftests: Add a test case for a linked breakpoint KVM: arm64: selftests: Change debug_version() to take ID_AA64DFR0_EL1 KVM: arm64: selftests: Stop unnecessary test stage tracking of debug-exceptions KVM: arm64: selftests: Add helpers to enable debug exceptions KVM: arm64: selftests: Remove the hard-coded {b,w}pn#0 from debug-exceptions KVM: arm64: selftests: Add write_dbg{b,w}{c,v}r helpers in debug-exceptions KVM: arm64: selftests: Use FIELD_GET() to extract ID register fields Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-05Merge branch kvm-arm64/selftest/memslot-fixes into kvmarm-master/nextMarc Zyngier
* kvm-arm64/selftest/memslot-fixes: : . : KVM memslot selftest fixes for non-4kB page sizes, courtesy : of Gavin Shan. From the cover letter: : : "kvm/selftests/memslots_perf_test doesn't work with 64KB-page-size-host : and 4KB-page-size-guest on aarch64. In the implementation, the host and : guest page size have been hardcoded to 4KB. It's ovbiously not working : on aarch64 which supports 4KB, 16KB, 64KB individually on host and guest. : : This series tries to fix it. After the series is applied, the test runs : successfully with 64KB-page-size-host and 4KB-page-size-guest." : . KVM: selftests: memslot_perf_test: Report optimal memory slots KVM: selftests: memslot_perf_test: Consolidate memory KVM: selftests: memslot_perf_test: Support variable guest page size KVM: selftests: memslot_perf_test: Probe memory slots for once KVM: selftests: memslot_perf_test: Consolidate loop conditions in prepare_vm() KVM: selftests: memslot_perf_test: Use data->nslots in prepare_vm() Signed-off-by: Marc Zyngier <maz@kernel.org>
2022-12-02KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"Colin Ian King
There is a spelling mistake in some help text. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Message-Id: <20221201091354.1613652-1-colin.i.king@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02tools: Drop "atomic_" prefix from atomic test_and_set_bit()Sean Christopherson
Drop the "atomic_" prefix from tools' atomic_test_and_set_bit() to match the kernel nomenclature where test_and_set_bit() is atomic, and __test_and_set_bit() provides the non-atomic variant. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221119013450.2643007-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02KVM: selftests: Use non-atomic clear/set bit helpers in KVM testsSean Christopherson
Use the dedicated non-atomic helpers for {clear,set}_bit() and their test variants, i.e. the double-underscore versions. Depsite being defined in atomic.h, and despite the kernel versions being atomic in the kernel, tools' {clear,set}_bit() helpers aren't actually atomic. Move to the double-underscore versions so that the versions that are expected to be atomic (for kernel developers) can be made atomic without affecting users that don't want atomic operations. Leave the usage in ucall_free() as-is, it's the one place in tools/ that actually wants/needs atomic behavior. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221119013450.2643007-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02KVM: arm64: selftests: Enable single-step without a "full" ucall()Sean Christopherson
Add a new ucall hook, GUEST_UCALL_NONE(), to allow tests to make ucalls without allocating a ucall struct, and use it to enable single-step in ARM's debug-exceptions test. Like the disable single-step path, the enabling path also needs to ensure that no exclusive access sequences are attempted after enabling single-step, as the exclusive monitor is cleared on ERET from the debug exception taken to EL2. The test currently "works" because clear_bit() isn't actually an atomic operation... yet. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20221119013450.2643007-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-02Merge tag 'kvm-x86-fixes-6.2-1' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
Misc KVM x86 fixes and cleanups for 6.2: - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0). - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few years back when eliminating unnecessary barriers when switching between vmcs01 and vmcs02. - Clean up the MSR filter docs. - Clean up vmread_error_trampoline() to make it more obvious that params must be passed on the stack, even for x86-64. - Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective of the current guest CPUID. - Fudge around a race with TSC refinement that results in KVM incorrectly thinking a guest needs TSC scaling when running on a CPU with a constant TSC, but no hardware-enumerated TSC frequency.
2022-12-01KVM: selftests: Define and use a custom static assert in lib headersSean Christopherson
Define and use kvm_static_assert() in the common KVM selftests headers to provide deterministic behavior, and to allow creating static asserts without dummy messages. The kernel's static_assert() makes the message param optional, and on the surface, tools/include/linux/build_bug.h appears to follow suit. However, glibc may override static_assert() and redefine it as a direct alias of _Static_assert(), which makes the message parameter mandatory. This leads to non-deterministic behavior as KVM selftests code that utilizes static_assert() without a custom message may or not compile depending on the order of includes. E.g. recently added asserts in x86_64/processor.h fail on some systems with errors like In file included from lib/memstress.c:11:0: include/x86_64/processor.h: In function ‘this_cpu_has_p’: include/x86_64/processor.h:193:34: error: expected ‘,’ before ‘)’ token static_assert(low_bit < high_bit); \ ^ due to _Static_assert() expecting a comma before a message. The "message optional" version of static_assert() uses macro magic to strip away the comma when presented with empty an __VA_ARGS__ #ifndef static_assert #define static_assert(expr, ...) __static_assert(expr, ##__VA_ARGS__, #expr) #define __static_assert(expr, msg, ...) _Static_assert(expr, msg) #endif // static_assert and effectively generates "_Static_assert(expr, #expr)". The incompatible version of static_assert() gets defined by this snippet in /usr/include/assert.h: #if defined __USE_ISOC11 && !defined __cplusplus # undef static_assert # define static_assert _Static_assert #endif which yields "_Static_assert(expr)" and thus fails as above. KVM selftests don't actually care about using C11, but __USE_ISOC11 gets defined because of _GNU_SOURCE, which many tests do #define. _GNU_SOURCE triggers a massive pile of defines in /usr/include/features.h, including _ISOC11_SOURCE: /* If _GNU_SOURCE was defined by the user, turn on all the other features. */ #ifdef _GNU_SOURCE # undef _ISOC95_SOURCE # define _ISOC95_SOURCE 1 # undef _ISOC99_SOURCE # define _ISOC99_SOURCE 1 # undef _ISOC11_SOURCE # define _ISOC11_SOURCE 1 # undef _POSIX_SOURCE # define _POSIX_SOURCE 1 # undef _POSIX_C_SOURCE # define _POSIX_C_SOURCE 200809L # undef _XOPEN_SOURCE # define _XOPEN_SOURCE 700 # undef _XOPEN_SOURCE_EXTENDED # define _XOPEN_SOURCE_EXTENDED 1 # undef _LARGEFILE64_SOURCE # define _LARGEFILE64_SOURCE 1 # undef _DEFAULT_SOURCE # define _DEFAULT_SOURCE 1 # undef _ATFILE_SOURCE # define _ATFILE_SOURCE 1 #endif which further down in /usr/include/features.h leads to: /* This is to enable the ISO C11 extension. */ #if (defined _ISOC11_SOURCE \ || (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)) # define __USE_ISOC11 1 #endif To make matters worse, /usr/include/assert.h doesn't guard against multiple inclusion by turning itself into a nop, but instead #undefs a few macros and continues on. As a result, it's all but impossible to ensure the "message optional" version of static_assert() will actually be used, e.g. explicitly including assert.h and #undef'ing static_assert() doesn't work as a later inclusion of assert.h will again redefine its version. #ifdef _ASSERT_H # undef _ASSERT_H # undef assert # undef __ASSERT_VOID_CAST # ifdef __USE_GNU # undef assert_perror # endif #endif /* assert.h */ #define _ASSERT_H 1 #include <features.h> Fixes: fcba483e8246 ("KVM: selftests: Sanity check input to ioctls() at build time") Fixes: ee3795536664 ("KVM: selftests: Refactor X86_FEATURE_* framework to prep for X86_PROPERTY_*") Fixes: 53a7dc0f215e ("KVM: selftests: Add X86_PROPERTY_* framework to retrieve CPUID values") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221122013309.1872347-1-seanjc@google.com
2022-12-01KVM: selftests: Do kvm_cpu_has() checks before creating VM+vCPUSean Christopherson
Move the AMX test's kvm_cpu_has() checks before creating the VM+vCPU, there are no dependencies between the two operations. Opportunistically add a comment to call out that enabling off-by-default XSAVE-managed features must be done before KVM_GET_SUPPORTED_CPUID is cached. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221128225735.3291648-5-seanjc@google.com
2022-12-01KVM: selftests: Disallow "get supported CPUID" before REQ_XCOMP_GUEST_PERMSean Christopherson
Disallow using kvm_get_supported_cpuid() and thus caching KVM's supported CPUID info before enabling XSAVE-managed features that are off-by-default and must be enabled by ARCH_REQ_XCOMP_GUEST_PERM. Caching the supported CPUID before all XSAVE features are enabled can result in false negatives due to testing features that were cached before they were enabled. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221128225735.3291648-4-seanjc@google.com
2022-12-01KVM: selftests: Move __vm_xsave_require_permission() below CPUID helpersSean Christopherson
Move __vm_xsave_require_permission() below the CPUID helpers so that a future change can reference the cached result of KVM_GET_SUPPORTED_CPUID while keeping the definition of the variable close to its intended user, kvm_get_supported_cpuid(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221128225735.3291648-3-seanjc@google.com
2022-12-01KVM: selftests: Move XFD CPUID checking out of __vm_xsave_require_permission()Lei Wang
Move the kvm_cpu_has() check on X86_FEATURE_XFD out of the helper to enable off-by-default XSAVE-managed features and into the one test that currenty requires XFD (XFeature Disable) support. kvm_cpu_has() uses kvm_get_supported_cpuid() and thus caches KVM_GET_SUPPORTED_CPUID, and so using kvm_cpu_has() before ARCH_REQ_XCOMP_GUEST_PERM effectively results in the test caching stale values, e.g. subsequent checks on AMX_TILE will get false negatives. Although off-by-default features are nonsensical without XFD, checking for XFD virtualization prior to enabling such features isn't strictly required. Signed-off-by: Lei Wang <lei4.wang@intel.com> Fixes: 7fbb653e01fd ("KVM: selftests: Check KVM's supported CPUID, not host CPUID, for XFD") Link: https://lore.kernel.org/r/20221125023839.315207-1-lei4.wang@intel.com [sean: add Fixes, reword changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221128225735.3291648-2-seanjc@google.com
2022-12-01KVM: selftests: Restore assert for non-nested VMs in access tracking testSean Christopherson
Restore the assert (on x86-64) that <10% of pages are still idle when NOT running as a nested VM in the access tracking test. The original assert was converted to a "warning" to avoid false failures when running the test in a VM, but the non-nested case does not suffer from the same "infinite TLB size" issue. Using the HYPERVISOR flag isn't infallible as VMMs aren't strictly required to enumerate the "feature" in CPUID, but practically speaking anyone that is running KVM selftests in VMs is going to be using a VMM and hypervisor that sets the HYPERVISOR flag. Cc: David Matlack <dmatlack@google.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221129175300.4052283-3-seanjc@google.com
2022-12-01KVM: selftests: Fix inverted "warning" in access tracking perf testSean Christopherson
Warn if the number of idle pages is greater than or equal to 10% of the total number of pages, not if the percentage of idle pages is less than 10%. The original code asserted that less than 10% of pages were still idle, but the check got inverted when the assert was converted to a warning. Opportunistically clean up the warning; selftests are 64-bit only, there is no need to use "%PRIu64" instead of "%lu". Fixes: 6336a810db5c ("KVM: selftests: replace assertion with warning in access_tracking_perf_test") Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221129175300.4052283-2-seanjc@google.com
2022-11-30KVM: selftests: Verify userspace can stuff IA32_FEATURE_CONTROL at willSean Christopherson
Verify the KVM allows userspace to set all supported bits in the IA32_FEATURE_CONTROL MSR irrespective of the current guest CPUID, and that all unsupported bits are rejected. Throw the testcase into vmx_msrs_test even though it's not technically a VMX MSR; it's close enough, and the most frequently feature controlled by the MSR is VMX. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220607232353.3375324-4-seanjc@google.com
2022-11-30KVM: x86/xen: Add runstate tests for 32-bit mode and crossing page boundaryDavid Woodhouse
Torture test the cases where the runstate crosses a page boundary, and and especially the case where it's configured in 32-bit mode and doesn't, but then switching to 64-bit mode makes it go onto the second page. To simplify this, make the KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST ioctl also update the guest runstate area. It already did so if the actual runstate changed, as a side-effect of kvm_xen_update_runstate(). So doing it in the plain adjustment case is making it more consistent, as well as giving us a nice way to trigger the update without actually running the vCPU again and changing the values. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Paul Durrant <paul@xen.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: x86/xen: Allow XEN_RUNSTATE_UPDATE flag behaviour to be configuredDavid Woodhouse
Closer inspection of the Xen code shows that we aren't supposed to be using the XEN_RUNSTATE_UPDATE flag unconditionally. It should be explicitly enabled by guests through the HYPERVISOR_vm_assist hypercall. If we randomly set the top bit of ->state_entry_time for a guest that hasn't asked for it and doesn't expect it, that could make the runtimes fail to add up and confuse the guest. Without the flag it's perfectly safe for a vCPU to read its own vcpu_runstate_info; just not for one vCPU to read *another's*. I briefly pondered adding a word for the whole set of VMASST_TYPE_* flags but the only one we care about for HVM guests is this, so it seemed a bit pointless. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20221127122210.248427-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-30KVM: x86/xen: Compatibility fixes for shared runstate areaDavid Woodhouse
The guest runstate area can be arbitrarily byte-aligned. In fact, even when a sane 32-bit guest aligns the overall structure nicely, the 64-bit fields in the structure end up being unaligned due to the fact that the 32-bit ABI only aligns them to 32 bits. So setting the ->state_entry_time field to something|XEN_RUNSTATE_UPDATE is buggy, because if it's unaligned then we can't update the whole field atomically; the low bytes might be observable before the _UPDATE bit is. Xen actually updates the *byte* containing that top bit, on its own. KVM should do the same. In addition, we cannot assume that the runstate area fits within a single page. One option might be to make the gfn_to_pfn cache cope with regions that cross a page — but getting a contiguous virtual kernel mapping of a discontiguous set of IOMEM pages is a distinctly non-trivial exercise, and it seems this is the *only* current use case for the GPC which would benefit from it. An earlier version of the runstate code did use a gfn_to_hva cache for this purpose, but it still had the single-page restriction because it used the uhva directly — because it needs to be able to do so atomically when the vCPU is being scheduled out, so it used pagefault_disable() around the accesses and didn't just use kvm_write_guest_cached() which has a fallback path. So... use a pair of GPCs for the first and potential second page covering the runstate area. We can get away with locking both at once because nothing else takes more than one GPC lock at a time so we can invent a trivial ordering rule. The common case where it's all in the same page is kept as a fast path, but in both cases, the actual guest structure (compat or not) is built up from the fields in @vx, following preset pointers to the state and times fields. The only difference is whether those pointers point to the kernel stack (in the split case) or to guest memory directly via the GPC. The fast path is also fixed to use a byte access for the XEN_RUNSTATE_UPDATE bit, then the only real difference is the dual memcpy. Finally, Xen also does write the runstate area immediately when it's configured. Flip the kvm_xen_update_runstate() and …_guest() functions and call the latter directly when the runstate area is set. This means that other ioctls which modify the runstate also write it immediately to the guest when they do so, which is also intended. Update the xen_shinfo_test to exercise the pathological case where the XEN_RUNSTATE_UPDATE flag in the top byte of the state_entry_time is actually in a different page to the rest of the 64-bit word. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-29KVM: selftests: Build access_tracking_perf_test for arm64Oliver Upton
Does exactly what it says on the tin. Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221118211503.4049023-3-oliver.upton@linux.dev
2022-11-29KVM: selftests: Have perf_test_util signal when to stop vCPUsOliver Upton
Signal that a test run is complete through perf_test_args instead of having tests open code a similar solution. Ensure that the field resets to false at the beginning of a test run as the structure is reused between test runs, eliminating a couple of bugs: access_tracking_perf_test hangs indefinitely on a subsequent test run, as 'done' remains true. The bug doesn't amount to much right now, as x86 supports a single guest mode. However, this is a precondition of enabling the test for other architectures with >1 guest mode, like arm64. memslot_modification_stress_test has the exact opposite problem, where subsequent test runs complete immediately as 'run_vcpus' remains false. Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> [oliver: added commit message, preserve spin_wait_for_next_iteration()] Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221118211503.4049023-2-oliver.upton@linux.dev
2022-11-21KVM: selftests: Rename 'evmcs_test' to 'hyperv_evmcs'Vitaly Kuznetsov
Conform to the rest of Hyper-V emulation selftests which have 'hyperv' prefix. Get rid of '_test' suffix as well as the purpose of this code is fairly obvious. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-49-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: hyperv_svm_test: Introduce L2 TLB flush testVitaly Kuznetsov
Enable Hyper-V L2 TLB flush and check that Hyper-V TLB flush hypercalls from L2 don't exit to L1 unless 'TlbLockCount' is set in the Partition assist page. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-48-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: evmcs_test: Introduce L2 TLB flush testVitaly Kuznetsov
Enable Hyper-V L2 TLB flush and check that Hyper-V TLB flush hypercalls from L2 don't exit to L1 unless 'TlbLockCount' is set in the Partition assist page. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-47-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Introduce rdmsr_from_l2() and use it for MSR-Bitmap testsVitaly Kuznetsov
Hyper-V MSR-Bitmap tests do RDMSR from L2 to exit to L1. While 'evmcs_test' correctly clobbers all GPRs (which are not preserved), 'hyperv_svm_test' does not. Introduce a more generic rdmsr_from_l2() to avoid code duplication and remove hardcoding of MSRs. Do not put it in common code because it is really just a selftests bug rather than a processor feature that requires it. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-46-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Stuff RAX/RCX with 'safe' values in vmmcall()/vmcall()Vitaly Kuznetsov
vmmcall()/vmcall() are used to exit from L2 to L1 and no concrete hypercall ABI is currenty followed. With the introduction of Hyper-V L2 TLB flush it becomes (theoretically) possible that L0 will take responsibility for handling the call and no L1 exit will happen. Prevent this by stuffing RAX (KVM ABI) and RCX (Hyper-V ABI) with 'safe' values. While on it, convert vmmcall() to 'static inline', make it setup stack frame and move to include/x86_64/svm_util.h. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-45-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Allocate Hyper-V partition assist pageVitaly Kuznetsov
In preparation to testing Hyper-V L2 TLB flush hypercalls, allocate so-called Partition assist page. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-44-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Create a vendor independent helper to allocate Hyper-V ↵Vitaly Kuznetsov
specific test pages There's no need to pollute VMX and SVM code with Hyper-V specific stuff and allocate Hyper-V specific test pages for all test as only few really need them. Create a dedicated struct and an allocation helper. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-43-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Split off load_evmcs() from load_vmcs()Vitaly Kuznetsov
In preparation to putting Hyper-V specific test pages to a dedicated struct, move eVMCS load logic from load_vmcs(). Tests call load_vmcs() directly and the only one which needs 'enlightened' version is evmcs_test so there's not much gain in having this merged. Temporary pass both GPA and HVA to load_evmcs(). Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-42-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Move Hyper-V VP assist page enablement out of evmcs.hVitaly Kuznetsov
Hyper-V VP assist page is not eVMCS specific, it is also used for enlightened nSVM. Move the code to vendor neutral place. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-41-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Sync 'struct hv_vp_assist_page' definition with hyperv-tlfs.hVitaly Kuznetsov
'struct hv_vp_assist_page' definition doesn't match TLFS. Also, define 'struct hv_nested_enlightenments_control' and use it instead of opaque '__u64'. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-40-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Sync 'struct hv_enlightened_vmcs' definition with hyperv-tlfs.hVitaly Kuznetsov
'struct hv_enlightened_vmcs' definition in selftests is not '__packed' and so we rely on the compiler doing the right padding. This is not obvious so it seems beneficial to use the same definition as in kernel. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-39-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-21KVM: selftests: Hyper-V PV TLB flush selftestVitaly Kuznetsov
Introduce a selftest for Hyper-V PV TLB flush hypercalls (HvFlushVirtualAddressSpace/HvFlushVirtualAddressSpaceEx, HvFlushVirtualAddressList/HvFlushVirtualAddressListEx). The test creates one 'sender' vCPU and two 'worker' vCPU which do busy loop reading from a certain GVA checking the observed value. Sender vCPU swaos the data page with another page filled with a different value. The expectation for workers is also altered. Without TLB flush on worker vCPUs, they may continue to observe old value. To guard against accidental TLB flushes for worker vCPUs the test is repeated 100 times. Hyper-V TLB flush hypercalls are tested in both 'normal' and 'XMM fast' modes. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-38-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Export vm_vaddr_unused_gap() to make it possible to request ↵Vitaly Kuznetsov
unmapped ranges Currently, tests can only request a new vaddr range by using vm_vaddr_alloc()/vm_vaddr_alloc_page()/vm_vaddr_alloc_pages() but these functions allocate and map physical pages too. Make it possible to request unmapped range too. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-36-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Fill in vm->vpages_mapped bitmap in virt_map() tooVitaly Kuznetsov
Similar to vm_vaddr_alloc(), virt_map() needs to reflect the mapping in vm->vpages_mapped. While on it, remove unneeded code wrapping in vm_vaddr_alloc(). Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-35-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Hyper-V PV IPI selftestVitaly Kuznetsov
Introduce a selftest for Hyper-V PV IPI hypercalls (HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx). The test creates one 'sender' vCPU and two 'receiver' vCPU and then issues various combinations of send IPI hypercalls in both 'normal' and 'fast' (with XMM input where necessary) mode. Later, the test checks whether IPIs were delivered to the expected destination vCPU[s]. Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-34-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Move the function doing Hyper-V hypercall to a common headerVitaly Kuznetsov
All Hyper-V specific tests issuing hypercalls need this. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-33-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Move HYPERV_LINUX_OS_ID definition to a common headerVitaly Kuznetsov
HYPERV_LINUX_OS_ID needs to be written to HV_X64_MSR_GUEST_OS_ID by each Hyper-V specific selftest. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-32-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Better XMM read/write helpersVitaly Kuznetsov
set_xmm()/get_xmm() helpers are fairly useless as they only read 64 bits from 128-bit registers. Moreover, these helpers are not used. Borrow _kvm_read_sse_reg()/_kvm_write_sse_reg() from KVM limiting them to XMM0-XMM8 for now. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-31-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18x86/hyperv: KVM: Rename "hv_enlightenments" to "hv_vmcb_enlightenments"Sean Christopherson
Now that KVM isn't littered with "struct hv_enlightenments" casts, rename the struct to "hv_vmcb_enlightenments" to highlight the fact that the struct is specifically for SVM's VMCB. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: SVM: Add a proper field for Hyper-V VMCB enlightenmentsSean Christopherson
Add a union to provide hv_enlightenments side-by-side with the sw_reserved bytes that Hyper-V's enlightenments overlay. Casting sw_reserved everywhere is messy, confusing, and unnecessarily unsafe. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18KVM: selftests: Move "struct hv_enlightenments" to x86_64/svm.hSean Christopherson
Move Hyper-V's VMCB "struct hv_enlightenments" to the svm.h header so that the struct can be referenced in "struct vmcb_control_area". Alternatively, a dedicated header for SVM+Hyper-V could be added, a la x86_64/evmcs.h, but it doesn't appear that Hyper-V will end up needing a wholesale replacement for the VMCB. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-18x86/hyperv: Move VMCB enlightenment definitions to hyperv-tlfs.hSean Christopherson
Move Hyper-V's VMCB enlightenment definitions to the TLFS header; the definitions come directly from the TLFS[*], not from KVM. No functional change intended. [*] https://learn.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/datatypes/hv_svm_enlightened_vmcb_fields [vitaly: rename VMCB_HV_ -> HV_VMCB_ to match the rest of hyperv-tlfs.h, keep svm/hyperv.h] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20221101145426.251680-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-17Merge branch 'kvm-svm-harden' into HEADPaolo Bonzini
This fixes three issues in nested SVM: 1) in the shutdown_interception() vmexit handler we call kvm_vcpu_reset(). However, if running nested and L1 doesn't intercept shutdown, the function resets vcpu->arch.hflags without properly leaving the nested state. This leaves the vCPU in inconsistent state and later triggers a kernel panic in SVM code. The same bug can likely be triggered by sending INIT via local apic to a vCPU which runs a nested guest. On VMX we are lucky that the issue can't happen because VMX always intercepts triple faults, thus triple fault in L2 will always be redirected to L1. Plus, handle_triple_fault() doesn't reset the vCPU. INIT IPI can't happen on VMX either because INIT events are masked while in VMX mode. Secondarily, KVM doesn't honour SHUTDOWN intercept bit of L1 on SVM. A normal hypervisor should always intercept SHUTDOWN, a unit test on the other hand might want to not do so. Finally, the guest can trigger a kernel non rate limited printk on SVM from the guest, which is fixed as well. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>