summaryrefslogtreecommitdiff
path: root/arch/riscv
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2024-07-20 12:41:03 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2024-07-20 12:41:03 -0700
commit2c9b3512402ed192d1f43f4531fb5da947e72bd0 (patch)
treed63534a1e9cf5b12a1362a348e2237c9c592a493 /arch/riscv
parentc43a20e4a520b37c2ef6d4f422de989992c9129f (diff)
parent332d2c1d713e232e163386c35a3ba0c1b90df83f (diff)
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
Diffstat (limited to 'arch/riscv')
-rw-r--r--arch/riscv/include/asm/kvm_aia_aplic.h58
-rw-r--r--arch/riscv/include/asm/kvm_aia_imsic.h38
-rw-r--r--arch/riscv/include/asm/kvm_host.h1
-rw-r--r--arch/riscv/kvm/aia.c35
-rw-r--r--arch/riscv/kvm/aia_aplic.c2
-rw-r--r--arch/riscv/kvm/aia_device.c2
-rw-r--r--arch/riscv/kvm/aia_imsic.c2
-rw-r--r--arch/riscv/kvm/trace.h67
-rw-r--r--arch/riscv/kvm/vcpu.c9
-rw-r--r--arch/riscv/kvm/vcpu_exit.c2
10 files changed, 101 insertions, 115 deletions
diff --git a/arch/riscv/include/asm/kvm_aia_aplic.h b/arch/riscv/include/asm/kvm_aia_aplic.h
deleted file mode 100644
index 6dd1a4809ec1..000000000000
--- a/arch/riscv/include/asm/kvm_aia_aplic.h
+++ /dev/null
@@ -1,58 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2021 Western Digital Corporation or its affiliates.
- * Copyright (C) 2022 Ventana Micro Systems Inc.
- */
-#ifndef __KVM_RISCV_AIA_IMSIC_H
-#define __KVM_RISCV_AIA_IMSIC_H
-
-#include <linux/bitops.h>
-
-#define APLIC_MAX_IDC BIT(14)
-#define APLIC_MAX_SOURCE 1024
-
-#define APLIC_DOMAINCFG 0x0000
-#define APLIC_DOMAINCFG_RDONLY 0x80000000
-#define APLIC_DOMAINCFG_IE BIT(8)
-#define APLIC_DOMAINCFG_DM BIT(2)
-#define APLIC_DOMAINCFG_BE BIT(0)
-
-#define APLIC_SOURCECFG_BASE 0x0004
-#define APLIC_SOURCECFG_D BIT(10)
-#define APLIC_SOURCECFG_CHILDIDX_MASK 0x000003ff
-#define APLIC_SOURCECFG_SM_MASK 0x00000007
-#define APLIC_SOURCECFG_SM_INACTIVE 0x0
-#define APLIC_SOURCECFG_SM_DETACH 0x1
-#define APLIC_SOURCECFG_SM_EDGE_RISE 0x4
-#define APLIC_SOURCECFG_SM_EDGE_FALL 0x5
-#define APLIC_SOURCECFG_SM_LEVEL_HIGH 0x6
-#define APLIC_SOURCECFG_SM_LEVEL_LOW 0x7
-
-#define APLIC_IRQBITS_PER_REG 32
-
-#define APLIC_SETIP_BASE 0x1c00
-#define APLIC_SETIPNUM 0x1cdc
-
-#define APLIC_CLRIP_BASE 0x1d00
-#define APLIC_CLRIPNUM 0x1ddc
-
-#define APLIC_SETIE_BASE 0x1e00
-#define APLIC_SETIENUM 0x1edc
-
-#define APLIC_CLRIE_BASE 0x1f00
-#define APLIC_CLRIENUM 0x1fdc
-
-#define APLIC_SETIPNUM_LE 0x2000
-#define APLIC_SETIPNUM_BE 0x2004
-
-#define APLIC_GENMSI 0x3000
-
-#define APLIC_TARGET_BASE 0x3004
-#define APLIC_TARGET_HART_IDX_SHIFT 18
-#define APLIC_TARGET_HART_IDX_MASK 0x3fff
-#define APLIC_TARGET_GUEST_IDX_SHIFT 12
-#define APLIC_TARGET_GUEST_IDX_MASK 0x3f
-#define APLIC_TARGET_IPRIO_MASK 0xff
-#define APLIC_TARGET_EIID_MASK 0x7ff
-
-#endif
diff --git a/arch/riscv/include/asm/kvm_aia_imsic.h b/arch/riscv/include/asm/kvm_aia_imsic.h
deleted file mode 100644
index da5881d2bde0..000000000000
--- a/arch/riscv/include/asm/kvm_aia_imsic.h
+++ /dev/null
@@ -1,38 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2021 Western Digital Corporation or its affiliates.
- * Copyright (C) 2022 Ventana Micro Systems Inc.
- */
-#ifndef __KVM_RISCV_AIA_IMSIC_H
-#define __KVM_RISCV_AIA_IMSIC_H
-
-#include <linux/types.h>
-#include <asm/csr.h>
-
-#define IMSIC_MMIO_PAGE_SHIFT 12
-#define IMSIC_MMIO_PAGE_SZ (1UL << IMSIC_MMIO_PAGE_SHIFT)
-#define IMSIC_MMIO_PAGE_LE 0x00
-#define IMSIC_MMIO_PAGE_BE 0x04
-
-#define IMSIC_MIN_ID 63
-#define IMSIC_MAX_ID 2048
-
-#define IMSIC_EIDELIVERY 0x70
-
-#define IMSIC_EITHRESHOLD 0x72
-
-#define IMSIC_EIP0 0x80
-#define IMSIC_EIP63 0xbf
-#define IMSIC_EIPx_BITS 32
-
-#define IMSIC_EIE0 0xc0
-#define IMSIC_EIE63 0xff
-#define IMSIC_EIEx_BITS 32
-
-#define IMSIC_FIRST IMSIC_EIDELIVERY
-#define IMSIC_LAST IMSIC_EIE63
-
-#define IMSIC_MMIO_SETIPNUM_LE 0x00
-#define IMSIC_MMIO_SETIPNUM_BE 0x04
-
-#endif
diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index e65d1584d48e..2e2254fd2a2a 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -287,7 +287,6 @@ struct kvm_vcpu_arch {
};
static inline void kvm_arch_sync_events(struct kvm *kvm) {}
-static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
#define KVM_RISCV_GSTAGE_TLB_MIN_ORDER 12
diff --git a/arch/riscv/kvm/aia.c b/arch/riscv/kvm/aia.c
index 0f0a9d11bb5f..2967d305c442 100644
--- a/arch/riscv/kvm/aia.c
+++ b/arch/riscv/kvm/aia.c
@@ -10,12 +10,12 @@
#include <linux/kernel.h>
#include <linux/bitops.h>
#include <linux/irq.h>
+#include <linux/irqchip/riscv-imsic.h>
#include <linux/irqdomain.h>
#include <linux/kvm_host.h>
#include <linux/percpu.h>
#include <linux/spinlock.h>
#include <asm/cpufeature.h>
-#include <asm/kvm_aia_imsic.h>
struct aia_hgei_control {
raw_spinlock_t lock;
@@ -394,6 +394,8 @@ int kvm_riscv_aia_alloc_hgei(int cpu, struct kvm_vcpu *owner,
{
int ret = -ENOENT;
unsigned long flags;
+ const struct imsic_global_config *gc;
+ const struct imsic_local_config *lc;
struct aia_hgei_control *hgctrl = per_cpu_ptr(&aia_hgei, cpu);
if (!kvm_riscv_aia_available() || !hgctrl)
@@ -409,11 +411,14 @@ int kvm_riscv_aia_alloc_hgei(int cpu, struct kvm_vcpu *owner,
raw_spin_unlock_irqrestore(&hgctrl->lock, flags);
- /* TODO: To be updated later by AIA IMSIC HW guest file support */
- if (hgei_va)
- *hgei_va = NULL;
- if (hgei_pa)
- *hgei_pa = 0;
+ gc = imsic_get_global_config();
+ lc = (gc) ? per_cpu_ptr(gc->local, cpu) : NULL;
+ if (lc && ret > 0) {
+ if (hgei_va)
+ *hgei_va = lc->msi_va + (ret * IMSIC_MMIO_PAGE_SZ);
+ if (hgei_pa)
+ *hgei_pa = lc->msi_pa + (ret * IMSIC_MMIO_PAGE_SZ);
+ }
return ret;
}
@@ -605,9 +610,11 @@ void kvm_riscv_aia_disable(void)
int kvm_riscv_aia_init(void)
{
int rc;
+ const struct imsic_global_config *gc;
if (!riscv_isa_extension_available(NULL, SxAIA))
return -ENODEV;
+ gc = imsic_get_global_config();
/* Figure-out number of bits in HGEIE */
csr_write(CSR_HGEIE, -1UL);
@@ -619,17 +626,17 @@ int kvm_riscv_aia_init(void)
/*
* Number of usable HGEI lines should be minimum of per-HART
* IMSIC guest files and number of bits in HGEIE
- *
- * TODO: To be updated later by AIA IMSIC HW guest file support
*/
- kvm_riscv_aia_nr_hgei = 0;
+ if (gc)
+ kvm_riscv_aia_nr_hgei = min((ulong)kvm_riscv_aia_nr_hgei,
+ BIT(gc->guest_index_bits) - 1);
+ else
+ kvm_riscv_aia_nr_hgei = 0;
- /*
- * Find number of guest MSI IDs
- *
- * TODO: To be updated later by AIA IMSIC HW guest file support
- */
+ /* Find number of guest MSI IDs */
kvm_riscv_aia_max_ids = IMSIC_MAX_ID;
+ if (gc && kvm_riscv_aia_nr_hgei)
+ kvm_riscv_aia_max_ids = gc->nr_guest_ids + 1;
/* Initialize guest external interrupt line management */
rc = aia_hgei_init();
diff --git a/arch/riscv/kvm/aia_aplic.c b/arch/riscv/kvm/aia_aplic.c
index b467ba5ed910..da6ff1bade0d 100644
--- a/arch/riscv/kvm/aia_aplic.c
+++ b/arch/riscv/kvm/aia_aplic.c
@@ -7,12 +7,12 @@
* Anup Patel <apatel@ventanamicro.com>
*/
+#include <linux/irqchip/riscv-aplic.h>
#include <linux/kvm_host.h>
#include <linux/math.h>
#include <linux/spinlock.h>
#include <linux/swab.h>
#include <kvm/iodev.h>
-#include <asm/kvm_aia_aplic.h>
struct aplic_irq {
raw_spinlock_t lock;
diff --git a/arch/riscv/kvm/aia_device.c b/arch/riscv/kvm/aia_device.c
index 5cd407c6a8e4..39cd26af5a69 100644
--- a/arch/riscv/kvm/aia_device.c
+++ b/arch/riscv/kvm/aia_device.c
@@ -8,9 +8,9 @@
*/
#include <linux/bits.h>
+#include <linux/irqchip/riscv-imsic.h>
#include <linux/kvm_host.h>
#include <linux/uaccess.h>
-#include <asm/kvm_aia_imsic.h>
static void unlock_vcpus(struct kvm *kvm, int vcpu_lock_idx)
{
diff --git a/arch/riscv/kvm/aia_imsic.c b/arch/riscv/kvm/aia_imsic.c
index e808723a85f1..0a1e859323b4 100644
--- a/arch/riscv/kvm/aia_imsic.c
+++ b/arch/riscv/kvm/aia_imsic.c
@@ -9,13 +9,13 @@
#include <linux/atomic.h>
#include <linux/bitmap.h>
+#include <linux/irqchip/riscv-imsic.h>
#include <linux/kvm_host.h>
#include <linux/math.h>
#include <linux/spinlock.h>
#include <linux/swab.h>
#include <kvm/iodev.h>
#include <asm/csr.h>
-#include <asm/kvm_aia_imsic.h>
#define IMSIC_MAX_EIX (IMSIC_MAX_ID / BITS_PER_TYPE(u64))
diff --git a/arch/riscv/kvm/trace.h b/arch/riscv/kvm/trace.h
new file mode 100644
index 000000000000..3d54175d805c
--- /dev/null
+++ b/arch/riscv/kvm/trace.h
@@ -0,0 +1,67 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Tracepoints for RISC-V KVM
+ *
+ * Copyright 2024 Beijing ESWIN Computing Technology Co., Ltd.
+ *
+ */
+#if !defined(_TRACE_KVM_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_KVM_H
+
+#include <linux/tracepoint.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM kvm
+
+TRACE_EVENT(kvm_entry,
+ TP_PROTO(struct kvm_vcpu *vcpu),
+ TP_ARGS(vcpu),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, pc)
+ ),
+
+ TP_fast_assign(
+ __entry->pc = vcpu->arch.guest_context.sepc;
+ ),
+
+ TP_printk("PC: 0x016%lx", __entry->pc)
+);
+
+TRACE_EVENT(kvm_exit,
+ TP_PROTO(struct kvm_cpu_trap *trap),
+ TP_ARGS(trap),
+
+ TP_STRUCT__entry(
+ __field(unsigned long, sepc)
+ __field(unsigned long, scause)
+ __field(unsigned long, stval)
+ __field(unsigned long, htval)
+ __field(unsigned long, htinst)
+ ),
+
+ TP_fast_assign(
+ __entry->sepc = trap->sepc;
+ __entry->scause = trap->scause;
+ __entry->stval = trap->stval;
+ __entry->htval = trap->htval;
+ __entry->htinst = trap->htinst;
+ ),
+
+ TP_printk("SEPC:0x%lx, SCAUSE:0x%lx, STVAL:0x%lx, HTVAL:0x%lx, HTINST:0x%lx",
+ __entry->sepc,
+ __entry->scause,
+ __entry->stval,
+ __entry->htval,
+ __entry->htinst)
+);
+
+#endif /* _TRACE_RSICV_KVM_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
index c58a0a7f5e5f..8d7d381737ee 100644
--- a/arch/riscv/kvm/vcpu.c
+++ b/arch/riscv/kvm/vcpu.c
@@ -21,6 +21,9 @@
#include <asm/cacheflush.h>
#include <asm/kvm_vcpu_vector.h>
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
KVM_GENERIC_VCPU_STATS(),
STATS_DESC_COUNTER(VCPU, ecall_exit_stat),
@@ -761,7 +764,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
return ret;
}
- if (run->immediate_exit) {
+ if (!vcpu->wants_to_run) {
kvm_vcpu_srcu_read_unlock(vcpu);
return -EINTR;
}
@@ -832,6 +835,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
*/
kvm_riscv_local_tlb_sanitize(vcpu);
+ trace_kvm_entry(vcpu);
+
guest_timing_enter_irqoff();
kvm_riscv_vcpu_enter_exit(vcpu);
@@ -870,6 +875,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
local_irq_enable();
+ trace_kvm_exit(&trap);
+
preempt_enable();
kvm_vcpu_srcu_read_lock(vcpu);
diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c
index 5761f95abb60..fa98e5c024b2 100644
--- a/arch/riscv/kvm/vcpu_exit.c
+++ b/arch/riscv/kvm/vcpu_exit.c
@@ -185,6 +185,8 @@ int kvm_riscv_vcpu_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
case EXC_INST_ILLEGAL:
case EXC_LOAD_MISALIGNED:
case EXC_STORE_MISALIGNED:
+ case EXC_LOAD_ACCESS:
+ case EXC_STORE_ACCESS:
if (vcpu->arch.guest_context.hstatus & HSTATUS_SPV) {
kvm_riscv_vcpu_trap_redirect(vcpu, trap);
ret = 1;