Age | Commit message (Collapse) | Author |
|
Pull more kvm updates from Paolo Bonzini:
- ARM fixes
- RISC-V Svade and Svadu (accessed and dirty bit) extension support for
host and guest
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: riscv: selftests: Add Svade and Svadu Extension to get-reg-list test
RISC-V: KVM: Add Svade and Svadu Extensions Support for Guest/VM
dt-bindings: riscv: Add Svade and Svadu Entries
RISC-V: Add Svade and Svadu Extensions Support
KVM: arm64: Use MDCR_EL2.HPME to evaluate overflow of hyp counters
KVM: arm64: Ignore PMCNTENSET_EL0 while checking for overflow status
KVM: arm64: Mark set_sysreg_masks() as inline to avoid build failure
KVM: arm64: vgic-its: Add stronger type-checking to the ITS entry sizes
KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition
KVM: arm64: vgic: Make vgic_get_irq() more robust
KVM: arm64: vgic-v3: Sanitise guest writes to GICR_INVLPIR
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-v updates from Palmer Dabbelt:
- Support for pointer masking in userspace
- Support for probing vector misaligned access performance
- Support for qspinlock on systems with Zacas and Zabha
* tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits)
RISC-V: Remove unnecessary include from compat.h
riscv: Fix default misaligned access trap
riscv: Add qspinlock support
dt-bindings: riscv: Add Ziccrse ISA extension description
riscv: Add ISA extension parsing for Ziccrse
asm-generic: ticket-lock: Add separate ticket-lock.h
asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
riscv: Implement xchg8/16() using Zabha
riscv: Implement arch_cmpxchg128() using Zacas
riscv: Improve zacas fully-ordered cmpxchg()
riscv: Implement cmpxchg8/16() using Zabha
dt-bindings: riscv: Add Zabha ISA extension description
riscv: Implement cmpxchg32/64() using Zacas
riscv: Do not fail to build on byte/halfword operations with Zawrs
riscv: Move cpufeature.h macros into their own header
KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
riscv: hwprobe: Export the Supm ISA extension
riscv: selftests: Add a pointer masking test
riscv: Allow ptrace control of the tagged address ABI
...
|
|
KVM/riscv changes for 6.13 part #2
- Svade and Svadu extension support for Host and Guest/VM
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into HEAD
RISC-V Paches for the 6.13 Merge Window, Part 1
* Support for pointer masking in userspace,
* Support for probing vector misaligned access performance.
* Support for qspinlock on systems with Zacas and Zabha.
|
|
Pull kvm updates from Paolo Bonzini:
"The biggest change here is eliminating the awful idea that KVM had of
essentially guessing which pfns are refcounted pages.
The reason to do so was that KVM needs to map both non-refcounted
pages (for example BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP
VMAs that contain refcounted pages.
However, the result was security issues in the past, and more recently
the inability to map VM_IO and VM_PFNMAP memory that _is_ backed by
struct page but is not refcounted. In particular this broke virtio-gpu
blob resources (which directly map host graphics buffers into the
guest as "vram" for the virtio-gpu device) with the amdgpu driver,
because amdgpu allocates non-compound higher order pages and the tail
pages could not be mapped into KVM.
This requires adjusting all uses of struct page in the
per-architecture code, to always work on the pfn whenever possible.
The large series that did this, from David Stevens and Sean
Christopherson, also cleaned up substantially the set of functions
that provided arch code with the pfn for a host virtual addresses.
The previous maze of twisty little passages, all different, is
replaced by five functions (__gfn_to_page, __kvm_faultin_pfn, the
non-__ versions of these two, and kvm_prefetch_pages) saving almost
200 lines of code.
ARM:
- Support for stage-1 permission indirection (FEAT_S1PIE) and
permission overlays (FEAT_S1POE), including nested virt + the
emulated page table walker
- Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This
call was introduced in PSCIv1.3 as a mechanism to request
hibernation, similar to the S4 state in ACPI
- Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
part of it, introduce trivial initialization of the host's MPAM
context so KVM can use the corresponding traps
- PMU support under nested virtualization, honoring the guest
hypervisor's trap configuration and event filtering when running a
nested guest
- Fixes to vgic ITS serialization where stale device/interrupt table
entries are not zeroed when the mapping is invalidated by the VM
- Avoid emulated MMIO completion if userspace has requested
synchronous external abort injection
- Various fixes and cleanups affecting pKVM, vCPU initialization, and
selftests
LoongArch:
- Add iocsr and mmio bus simulation in kernel.
- Add in-kernel interrupt controller emulation.
- Add support for virtualization extensions to the eiointc irqchip.
PPC:
- Drop lingering and utterly obsolete references to PPC970 KVM, which
was removed 10 years ago.
- Fix incorrect documentation references to non-existing ioctls
RISC-V:
- Accelerate KVM RISC-V when running as a guest
- Perf support to collect KVM guest statistics from host side
s390:
- New selftests: more ucontrol selftests and CPU model sanity checks
- Support for the gen17 CPU model
- List registers supported by KVM_GET/SET_ONE_REG in the
documentation
x86:
- Cleanup KVM's handling of Accessed and Dirty bits to dedup code,
improve documentation, harden against unexpected changes.
Even if the hardware A/D tracking is disabled, it is possible to
use the hardware-defined A/D bits to track if a PFN is Accessed
and/or Dirty, and that removes a lot of special cases.
- Elide TLB flushes when aging secondary PTEs, as has been done in
x86's primary MMU for over 10 years.
- Recover huge pages in-place in the TDP MMU when dirty page logging
is toggled off, instead of zapping them and waiting until the page
is re-accessed to create a huge mapping. This reduces vCPU jitter.
- Batch TLB flushes when dirty page logging is toggled off. This
reduces the time it takes to disable dirty logging by ~3x.
- Remove the shrinker that was (poorly) attempting to reclaim shadow
page tables in low-memory situations.
- Clean up and optimize KVM's handling of writes to
MSR_IA32_APICBASE.
- Advertise CPUIDs for new instructions in Clearwater Forest
- Quirk KVM's misguided behavior of initialized certain feature MSRs
to their maximum supported feature set, which can result in KVM
creating invalid vCPU state. E.g. initializing PERF_CAPABILITIES to
a non-zero value results in the vCPU having invalid state if
userspace hides PDCM from the guest, which in turn can lead to
save/restore failures.
- Fix KVM's handling of non-canonical checks for vCPUs that support
LA57 to better follow the "architecture", in quotes because the
actual behavior is poorly documented. E.g. most MSR writes and
descriptor table loads ignore CR4.LA57 and operate purely on
whether the CPU supports LA57.
- Bypass the register cache when querying CPL from kvm_sched_out(),
as filling the cache from IRQ context is generally unsafe; harden
the cache accessors to try to prevent similar issues from occuring
in the future. The issue that triggered this change was already
fixed in 6.12, but was still kinda latent.
- Advertise AMD_IBPB_RET to userspace, and fix a related bug where
KVM over-advertises SPEC_CTRL when trying to support cross-vendor
VMs.
- Minor cleanups
- Switch hugepage recovery thread to use vhost_task.
These kthreads can consume significant amounts of CPU time on
behalf of a VM or in response to how the VM behaves (for example
how it accesses its memory); therefore KVM tried to place the
thread in the VM's cgroups and charge the CPU time consumed by that
work to the VM's container.
However the kthreads did not process SIGSTOP/SIGCONT, and therefore
cgroups which had KVM instances inside could not complete freezing.
Fix this by replacing the kthread with a PF_USER_WORKER thread, via
the vhost_task abstraction. Another 100+ lines removed, with
generally better behavior too like having these threads properly
parented in the process tree.
- Revert a workaround for an old CPU erratum (Nehalem/Westmere) that
didn't really work; there was really nothing to work around anyway:
the broken patch was meant to fix nested virtualization, but the
PERF_GLOBAL_CTRL MSR is virtualized and therefore unaffected by the
erratum.
- Fix 6.12 regression where CONFIG_KVM will be built as a module even
if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is
'y'.
x86 selftests:
- x86 selftests can now use AVX.
Documentation:
- Use rST internal links
- Reorganize the introduction to the API document
Generic:
- Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock
instead of RCU, so that running a vCPU on a different task doesn't
encounter long due to having to wait for all CPUs become quiescent.
In general both reads and writes are rare, but userspace that
supports confidential computing is introducing the use of "helper"
vCPUs that may jump from one host processor to another. Those will
be very happy to trigger a synchronize_rcu(), and the effect on
performance is quite the disaster"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (298 commits)
KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD
KVM: x86: add back X86_LOCAL_APIC dependency
Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()"
KVM: x86: switch hugepage recovery thread to vhost_task
KVM: x86: expose MSR_PLATFORM_INFO as a feature MSR
x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest
Documentation: KVM: fix malformed table
irqchip/loongson-eiointc: Add virt extension support
LoongArch: KVM: Add irqfd support
LoongArch: KVM: Add PCHPIC user mode read and write functions
LoongArch: KVM: Add PCHPIC read and write functions
LoongArch: KVM: Add PCHPIC device support
LoongArch: KVM: Add EIOINTC user mode read and write functions
LoongArch: KVM: Add EIOINTC read and write functions
LoongArch: KVM: Add EIOINTC device support
LoongArch: KVM: Add IPI user mode read and write function
LoongArch: KVM: Add IPI read and write function
LoongArch: KVM: Add IPI device support
LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
KVM: arm64: Pass on SVE mapping failures
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- The series "zram: optimal post-processing target selection" from
Sergey Senozhatsky improves zram's post-processing selection
algorithm. This leads to improved memory savings.
- Wei Yang has gone to town on the mapletree code, contributing several
series which clean up the implementation:
- "refine mas_mab_cp()"
- "Reduce the space to be cleared for maple_big_node"
- "maple_tree: simplify mas_push_node()"
- "Following cleanup after introduce mas_wr_store_type()"
- "refine storing null"
- The series "selftests/mm: hugetlb_fault_after_madv improvements" from
David Hildenbrand fixes this selftest for s390.
- The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
implements some rationaizations and cleanups in the page mapping
code.
- The series "mm: optimize shadow entries removal" from Shakeel Butt
optimizes the file truncation code by speeding up the handling of
shadow entries.
- The series "Remove PageKsm()" from Matthew Wilcox completes the
migration of this flag over to being a folio-based flag.
- The series "Unify hugetlb into arch_get_unmapped_area functions" from
Oscar Salvador implements a bunch of consolidations and cleanups in
the hugetlb code.
- The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
takes away the wp-fault time practice of turning a huge zero page
into small pages. Instead we replace the whole thing with a THP. More
consistent cleaner and potentiall saves a large number of pagefaults.
- The series "percpu: Add a test case and fix for clang" from Andy
Shevchenko enhances and fixes the kernel's built in percpu test code.
- The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
optimizes mremap() by avoiding doing things which we didn't need to
do.
- The series "Improve the tmpfs large folio read performance" from
Baolin Wang teaches tmpfs to copy data into userspace at the folio
size rather than as individual pages. A 20% speedup was observed.
- The series "mm/damon/vaddr: Fix issue in
damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON
splitting.
- The series "memcg-v1: fully deprecate charge moving" from Shakeel
Butt removes the long-deprecated memcgv2 charge moving feature.
- The series "fix error handling in mmap_region() and refactor" from
Lorenzo Stoakes cleanup up some of the mmap() error handling and
addresses some potential performance issues.
- The series "x86/module: use large ROX pages for text allocations"
from Mike Rapoport teaches x86 to use large pages for
read-only-execute module text.
- The series "page allocation tag compression" from Suren Baghdasaryan
is followon maintenance work for the new page allocation profiling
feature.
- The series "page->index removals in mm" from Matthew Wilcox remove
most references to page->index in mm/. A slow march towards shrinking
struct page.
- The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
interface tests" from Andrew Paniakin performs maintenance work for
DAMON's self testing code.
- The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
improves zswap's batching of compression and decompression. It is a
step along the way towards using Intel IAA hardware acceleration for
this zswap operation.
- The series "kasan: migrate the last module test to kunit" from
Sabyrzhan Tasbolatov completes the migration of the KASAN built-in
tests over to the KUnit framework.
- The series "implement lightweight guard pages" from Lorenzo Stoakes
permits userapace to place fault-generating guard pages within a
single VMA, rather than requiring that multiple VMAs be created for
this. Improved efficiencies for userspace memory allocators are
expected.
- The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
tracepoints to provide increased visibility into memcg stats flushing
activity.
- The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
fixes a zram buglet which potentially affected performance.
- The series "mm: add more kernel parameters to control mTHP" from
Maíra Canal enhances our ability to control/configuremultisize THP
from the kernel boot command line.
- The series "kasan: few improvements on kunit tests" from Sabyrzhan
Tasbolatov has a couple of fixups for the KASAN KUnit tests.
- The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
from Kairui Song optimizes list_lru memory utilization when lockdep
is enabled.
* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
mm/kfence: add a new kunit test test_use_after_free_read_nofault()
zram: fix NULL pointer in comp_algorithm_show()
memcg/hugetlb: add hugeTLB counters to memcg
vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
zram: ZRAM_DEF_COMP should depend on ZRAM
MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
Docs/mm/damon: recommend academic papers to read and/or cite
mm: define general function pXd_init()
kmemleak: iommu/iova: fix transient kmemleak false positive
mm/list_lru: simplify the list_lru walk callback function
mm/list_lru: split the lock to per-cgroup scope
mm/list_lru: simplify reparenting and initial allocation
mm/list_lru: code clean up for reparenting
mm/list_lru: don't export list_lru_add
mm/list_lru: don't pass unnecessary key parameters
kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
...
|
|
Svade and Svadu extensions represent two schemes for managing the PTE A/D
bits. When the PTE A/D bits need to be set, Svade extension intdicates
that a related page fault will be raised. In contrast, the Svadu extension
supports hardware updating of PTE A/D bits. Since the Svade extension is
mandatory and the Svadu extension is optional in RVA23 profile, by default
the M-mode firmware will enable the Svadu extension in the menvcfg CSR
when only Svadu is present in DT.
This patch detects Svade and Svadu extensions from DT and adds
arch_has_hw_pte_young() to enable optimization in MGLRU and
__wp_page_copy_user() when we have the PTE A/D bits hardware updating
support.
Co-developed-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Jinyu Tang <tjytimi@163.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-2-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"Bindings:
- Enable dtc "interrupt_provider" warnings for binding examples. Fix
the warnings in fsl,mu-msi and ti,sci-inta due to this.
- Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
altr,fpga-passive-serial to DT schema format
- Add some documentation on the different forms of YAML text blocks
which are a constant source of review comments
- Fix some schema errors in constraints for arrays
- Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
DT core:
- Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
- Add some warnings on deprecated address handling
- Rework early_init_dt_scan() so the arch can pass in the phys
address of the DTB as __pa() is not always valid to use. This fixes
a warning for arm64 with kexec.
- Add and use some new DT graph iterators for iterating over ports
and endpoints
- Rework reserved-memory handling to be sized dynamically for fixed
regions
- Optimize of_modalias() to avoid a strlen() call
- Constify struct device_node and property pointers where ever
possible"
* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
of/address: Rework bus matching to avoid warnings
of: WARN on deprecated #address-cells/#size-cells handling
of/fdt: Don't use default address cell sizes for address translation
dt-bindings: Enable dtc "interrupt_provider" warnings
of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
media: xilinx-tpg: use new of_graph functions
fbdev: omapfb: use new of_graph functions
gpu: drm: omapdrm: use new of_graph functions
ASoC: audio-graph-card2: use new of_graph functions
ASoC: audio-graph-card: use new of_graph functions
ASoC: test-component: use new of_graph functions
of: property: use new of_graph functions
of: property: add of_graph_get_next_port_endpoint()
of: property: add of_graph_get_next_port()
of: module: remove strlen() call in of_modalias()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace updates from Steven Rostedt:
- Restructure the function graph shadow stack to prepare it for use
with kretprobes
With the goal of merging the shadow stack logic of function graph and
kretprobes, some more restructuring of the function shadow stack is
required.
Move out function graph specific fields from the fgraph
infrastructure and store it on the new stack variables that can pass
data from the entry callback to the exit callback.
Hopefully, with this change, the merge of kretprobes to use fgraph
shadow stacks will be ready by the next merge window.
- Make shadow stack 4k instead of using PAGE_SIZE.
Some architectures have very large PAGE_SIZE values which make its
use for shadow stacks waste a lot of memory.
- Give shadow stacks its own kmem cache.
When function graph is started, every task on the system gets a
shadow stack. In the future, shadow stacks may not be 4K in size.
Have it have its own kmem cache so that whatever size it becomes will
still be efficient in allocations.
- Initialize profiler graph ops as it will be needed for new updates to
fgraph
- Convert to use guard(mutex) for several ftrace and fgraph functions
- Add more comments and documentation
- Show function return address in function graph tracer
Add an option to show the caller of a function at each entry of the
function graph tracer, similar to what the function tracer does.
- Abstract out ftrace_regs from being used directly like pt_regs
ftrace_regs was created to store a partial pt_regs. It holds only the
registers and stack information to get to the function arguments and
return values. On several archs, it is simply a wrapper around
pt_regs. But some users would access ftrace_regs directly to get the
pt_regs which will not work on all archs. Make ftrace_regs an
abstract structure that requires all access to its fields be through
accessor functions.
- Show how long it takes to do function code modifications
When code modification for function hooks happen, it always had the
time recorded in how long it took to do the conversion. But this
value was never exported. Recently the code was touched due to new
ROX modification handling that caused a large slow down in doing the
modifications and had a significant impact on boot times.
Expose the timings in the dyn_ftrace_total_info file. This file was
created a while ago to show information about memory usage and such
to implement dynamic function tracing. It's also an appropriate file
to store the timings of this modification as well. This will make it
easier to see the impact of changes to code modification on boot up
timings.
- Other clean ups and small fixes
* tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
ftrace: Show timings of how long nop patching took
ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash()
ftrace: Use guard to take the ftrace_lock in release_probe()
ftrace: Use guard to lock ftrace_lock in cache_mod()
ftrace: Use guard for match_records()
fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph()
fgraph: Give ret_stack its own kmem cache
fgraph: Separate size of ret_stack from PAGE_SIZE
ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
selftests/ftrace: Fix check of return value in fgraph-retval.tc test
ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros
ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs
ftrace: Make ftrace_regs abstract from direct use
fgragh: No need to invoke the function call_filter_check_discard()
fgraph: Simplify return address printing in function graph tracer
function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr()
function_graph: Support recording and printing the function return address
ftrace: Have calltime be saved in the fgraph storage
ftrace: Use a running sleeptime instead of saving on shadow stack
fgraph: Use fgraph data to store subtime for profiler
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull vdso data page handling updates from Thomas Gleixner:
"First steps of consolidating the VDSO data page handling.
The VDSO data page handling is architecture specific for historical
reasons, but there is no real technical reason to do so.
Aside of that VDSO data has become a dump ground for various
mechanisms and fail to provide a clear separation of the
functionalities.
Clean this up by:
- consolidating the VDSO page data by getting rid of architecture
specific warts especially in x86 and PowerPC.
- removing the last includes of header files which are pulling in
other headers outside of the VDSO namespace.
- seperating timekeeping and other VDSO data accordingly.
Further consolidation of the VDSO page handling is done in subsequent
changes scheduled for the next merge window.
This also lays the ground for expanding the VDSO time getters for
independent PTP clocks in a generic way without making every
architecture add support seperately"
* tag 'timers-vdso-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
x86/vdso: Add missing brackets in switch case
vdso: Rename struct arch_vdso_data to arch_vdso_time_data
powerpc: Split systemcfg struct definitions out from vdso
powerpc: Split systemcfg data out of vdso data page
powerpc: Add kconfig option for the systemcfg page
powerpc/pseries/lparcfg: Use num_possible_cpus() for potential processors
powerpc/pseries/lparcfg: Fix printing of system_active_processors
powerpc/procfs: Propagate error of remap_pfn_range()
powerpc/vdso: Remove offset comment from 32bit vdso_arch_data
x86/vdso: Split virtual clock pages into dedicated mapping
x86/vdso: Delete vvar.h
x86/vdso: Access vdso data without vvar.h
x86/vdso: Move the rng offset to vsyscall.h
x86/vdso: Access rng vdso data without vvar.h
x86/vdso: Access timens vdso data without vvar.h
x86/vdso: Allocate vvar page from C code
x86/vdso: Access rng data from kernel without vvar
x86/vdso: Place vdso_data at beginning of vvar page
x86/vdso: Use __arch_get_vdso_data() to access vdso data
x86/mm/mmap: Remove arch_vma_name()
...
|
|
- Drop obsolete references to PPC970 KVM, which was removed 10 years ago.
- Fix incorrect references to non-existing ioctls
- List registers supported by KVM_GET/SET_ONE_REG on s390
- Use rST internal links
- Reorganize the introduction to the API document
|
|
Alexandre Ghiti <alexghiti@rivosinc.com> says:
This implements [cmp]xchgXX() macros using Zacas and Zabha extensions
and finally uses those newly introduced macros to add support for
qspinlocks: note that this implementation of qspinlocks satisfies the
forward progress guarantee.
It also uses Ziccrse to provide the qspinlock implementation.
Thanks to Guo and Leonardo for their work!
* b4-shazam-merge: (1314 commits)
riscv: Add qspinlock support
dt-bindings: riscv: Add Ziccrse ISA extension description
riscv: Add ISA extension parsing for Ziccrse
asm-generic: ticket-lock: Add separate ticket-lock.h
asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
riscv: Implement xchg8/16() using Zabha
riscv: Implement arch_cmpxchg128() using Zacas
riscv: Improve zacas fully-ordered cmpxchg()
riscv: Implement cmpxchg8/16() using Zabha
dt-bindings: riscv: Add Zabha ISA extension description
riscv: Implement cmpxchg32/64() using Zacas
riscv: Do not fail to build on byte/halfword operations with Zawrs
riscv: Move cpufeature.h macros into their own header
Link: https://lore.kernel.org/r/20241103145153.105097-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
In order to produce a generic kernel, a user can select
CONFIG_COMBO_SPINLOCKS which will fallback at runtime to the ticket
spinlock implementation if Zabha or Ziccrse are not present.
Note that we can't use alternatives here because the discovery of
extensions is done too late and we need to start with the qspinlock
implementation because the ticket spinlock implementation would pollute
the spinlock value, so let's use static keys.
This is largely based on Guo's work and Leonardo reviews at [1].
Link: https://lore.kernel.org/linux-riscv/20231225125847.2778638-1-guoren@kernel.org/ [1]
Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-14-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Add support to parse the Ziccrse string in the riscv,isa string.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-12-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This adds runtime support for Zabha in cmpxchg8/16() operations.
Note that in the absence of Zacas support in the toolchain, CAS
instructions from Zabha won't be used.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20241103145153.105097-6-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Several architectures support text patching, but they name the header
files that declare patching functions differently.
Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.
Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The struct arch_vdso_data is only about vdso time data. So rename it to
arch_vdso_time_data to make it obvious.
Non time-related data will be migrated out of these structs soon.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-28-b64f0842d512@linutronix.de
|
|
The vvar mapping is the same for all processes. Use a single mapping to
simplify the logic and align it with the other architectures.
In addition this will enable the move of the vvar handling into generic code.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-6-b64f0842d512@linutronix.de
|
|
__pa() is only intended to be used for linear map addresses and using
it for initial_boot_params which is in fixmap for arm64 will give an
incorrect value. Hence save the physical address when it is known at
boot time when calling early_init_dt_scan for arm64 and use it at kexec
time instead of converting the virtual address using __pa().
Note that arm64 doesn't need the FDT region reserved in the DT as the
kernel explicitly reserves the passed in FDT. Therefore, only a debug
warning is fixed with this change.
Reported-by: Breno Leitao <leitao@debian.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Fixes: ac10be5cdbfa ("arm64: Use common of_kexec_alloc_and_setup_fdt()")
Link: https://lore.kernel.org/r/20241023171426.452688-1-usamaarif642@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Introduce basic guest support in perf, enabling it to distinguish
between PMU interrupts in the host or guest, and collect
fundamental information.
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/a67d527dc1b11493fe11f7f53584772fdd983744.1728957131.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
|
|
This fix is part of a series on for-next, but it fixes broken builds so
I'm picking it up as a fix.
* commit 'bf40167d54d5':
riscv: vdso: Prevent the compiler from inserting calls to memset()
|
|
The macro GET_RM defined twice in this file, one can be removed.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241008094141.549248-3-zhangchunyan@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The macro is not used in the current version of kernel, it looks like
can be removed to avoid a build warning:
../arch/riscv/kernel/asm-offsets.c: At top level:
../arch/riscv/kernel/asm-offsets.c:7: warning: macro "GENERATING_ASM_OFFSETS" is not used [-Wunused-macros]
7 | #define GENERATING_ASM_OFFSETS
Fixes: 9639a44394b9 ("RISC-V: Provide a cleaner raw_smp_processor_id()")
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
Link: https://lore.kernel.org/r/20241008094141.549248-2-zhangchunyan@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
'cpu' is an unsigned integer, so its conversion specifier should
be %u, not %d.
Suggested-by: Wentao Guan <guanwentao@uniontech.com>
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/all/alpine.DEB.2.21.2409122309090.40372@angie.orcam.me.uk/
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: f1e58583b9c7 ("RISC-V: Support cpu hotplug")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/4C127DEECDA287C8+20241017032010.96772-1-wangyuli@uniontech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
When populating cache leaves we previously fetched the CPU device node
at the very beginning. But when ACPI is enabled we go through a
specific branch which returns early and does not call 'of_node_put' for
the node that was acquired.
Since we are not using a CPU device node for the ACPI code anyways, we
can simply move the initialization of it just passed the ACPI block, and
we are guaranteed to have an 'of_node_put' call for the acquired node.
This prevents a bad reference count of the CPU device node.
Moreover, the previous function did not check for errors when acquiring
the device node, so a return -ENOENT has been added for that case.
Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Fixes: 604f32ea6909 ("riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240913080053.36636-1-mikisabate@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The IMAGE_DLLCHARACTERISTICS_NX_COMPAT informs the firmware that the
EFI binary does not rely on pages that are both executable and
writable.
The flag is used by some distro versions of GRUB to decide if the EFI
binary may be executed.
As the Linux kernel neither has RWX sections nor needs RWX pages for
relocation we should set the flag.
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Fixes: cb7d2dd5612a ("RISC-V: Add PE/COFF header for EFI stub")
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240929140233.211800-1-heinrich.schuchardt@canonical.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Early code designates the code executed when the MMU is not yet enabled,
and this comes with some limitations (see
Documentation/arch/riscv/boot.rst, section "Pre-MMU execution").
FORTIFY_SOURCE must be disabled then since it can trigger kernel panics
as reported in [1].
Reported-by: Jason Montleon <jmontleo@redhat.com>
Closes: https://lore.kernel.org/linux-riscv/CAJD_bPJes4QhmXY5f63GHV9B9HFkSCoaZjk-qCT2NGS7Q9HODg@mail.gmail.com/ [1]
Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head")
Fixes: 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20241009072749.45006-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
When SVPBMT is enabled, __acpi_map_table() will directly access the
data in DDR through the IO attribute, rather than through hardware
cache consistency, resulting in incorrect data in the obtained ACPI
table.
The log: ACPI: [ACPI:0x18] Invalid zero length.
We do not assume whether the bootloader flushes or not. We should
access in a cacheable way instead of maintaining cache consistency
by software.
Fixes: 3b426d4b5b14 ("RISC-V: ACPI : Fix for usage of pointers in different address space")
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Link: https://lore.kernel.org/r/20241014130141.86426-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Samuel Holland <samuel.holland@sifive.com> says:
RISC-V defines three extensions for pointer masking[1]:
- Smmpm: configured in M-mode, affects M-mode
- Smnpm: configured in M-mode, affects the next lower mode (S or U-mode)
- Ssnpm: configured in S-mode, affects the next lower mode (VS, VU, or U-mode)
This series adds support for configuring Smnpm or Ssnpm (depending on
which privilege mode the kernel is running in) to allow pointer masking
in userspace (VU or U-mode), extending the PR_SET_TAGGED_ADDR_CTRL API
from arm64. Unlike arm64 TBI, userspace pointer masking is not enabled
by default on RISC-V. Additionally, the tag width (referred to as PMLEN)
is variable, so userspace needs to ask the kernel for a specific tag
width, which is interpreted as a lower bound on the number of tag bits.
This series also adds support for a tagged address ABI similar to arm64
and x86. Since accesses from the kernel to user memory use the kernel's
pointer masking configuration, not the user's, the kernel must untag
user pointers in software before dereferencing them. And since the tag
width is variable, as with LAM on x86, it must be kept the same across
all threads in a process so untagged_addr_remote() can work.
[1]: https://github.com/riscv/riscv-j-extension/raw/d70011dde6c2/zjpm-spec.pdf
* b4-shazam-merge:
KVM: riscv: selftests: Add Smnpm and Ssnpm to get-reg-list test
RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests
riscv: hwprobe: Export the Supm ISA extension
riscv: selftests: Add a pointer masking test
riscv: Allow ptrace control of the tagged address ABI
riscv: Add support for the tagged address ABI
riscv: Add support for userspace pointer masking
riscv: Add CSR definitions for pointer masking
riscv: Add ISA extension parsing for pointer masking
dt-bindings: riscv: Add pointer masking ISA extensions
Link: https://lore.kernel.org/r/20241016202814.4061541-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Supm is a virtual ISA extension defined in the RISC-V Pointer Masking
specification, which indicates that pointer masking is available in
U-mode. It can be provided by either Smnpm or Ssnpm, depending on which
mode the kernel runs in. Userspace should not care about this
distinction, so export Supm instead of either underlying extension.
Hide the extension if the kernel was compiled without support for the
pointer masking prctl() interface.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-9-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This allows a tracer to control the ABI of the tracee, as on arm64.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
When pointer masking is enabled for userspace, the kernel can accept
tagged pointers as arguments to some system calls. Allow this by
untagging the pointers in access_ok() and the uaccess routines. The
uaccess routines must peform untagging in software because U-mode and
S-mode have entirely separate pointer masking configurations. In fact,
hardware may not even implement pointer masking for S-mode.
Since the number of tag bits is variable, untagged_addr_remote() needs
to know what PMLEN to use for the remote mm. Therefore, the pointer
masking mode must be the same for all threads sharing an mm. Enforce
this with a lock flag in the mm context, as x86 does for LAM. The flag
gets reset in init_new_context() during fork(), as the new mm is no
longer multithreaded.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-6-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
RISC-V supports pointer masking with a variable number of tag bits
(which is called "PMLEN" in the specification) and which is configured
at the next higher privilege level.
Wire up the PR_SET_TAGGED_ADDR_CTRL and PR_GET_TAGGED_ADDR_CTRL prctls
so userspace can request a lower bound on the number of tag bits and
determine the actual number of tag bits. As with arm64's
PR_TAGGED_ADDR_ENABLE, the pointer masking configuration is
thread-scoped, inherited on clone() and fork() and cleared on execve().
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-5-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The RISC-V Pointer Masking specification defines three extensions:
Smmpm, Smnpm, and Ssnpm. Add support for parsing each of them. The
specific extension which provides pointer masking support to userspace
(Supm) depends on the kernel's privilege mode, so provide a macro to
abstract this selection.
Smmpm implies the existence of the mseccfg CSR. As it is the only user
of this CSR so far, there is no need for an Xlinuxmseccfg extension.
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20241016202814.4061541-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The first is a fix and the second a check to make sure we don't
regress on the relocations, so I'm picking this up as a series to get
the fix into fixes.
* b4-shazam-merge:
riscv: Check that vdso does not contain any dynamic relocations
riscv: vdso: Prevent the compiler from inserting calls to memset()
Link: https://lore.kernel.org/r/20241016083625.136311-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Like other architectures, use the common cmd_vdso_check to make sure of
that.
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Vladimir Isaev <vladimir.isaev@syntacore.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20241016083625.136311-3-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The compiler is smart enough to insert a call to memset() in
riscv_vdso_get_cpus(), which generates a dynamic relocation.
So prevent this by using -fno-builtin option.
Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20241016083625.136311-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
accesses"
Charlie Jenkins <charlie@rivosinc.com> says:
Adds support for detecting and reporting the speed of unaligned vector
accesses on RISC-V CPUs. Adds vec_misaligned_speed key to the hwprobe
adds Zicclsm to cpufeature and fixes the check for scalar unaligned
emulated all CPUs. The vec_misaligned_speed key keeps the same format
as the scalar unaligned access speed key.
This set does not emulate unaligned vector accesses on CPUs that do not
support them. Only reports if userspace can run them and speed of
unaligned vector accesses if supported.
* b4-shazam-merge:
RISC-V: hwprobe: Document unaligned vector perf key
RISC-V: Report vector unaligned access speed hwprobe
RISC-V: Detect unaligned vector accesses supported
RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
RISC-V: Scalar unaligned access emulated on hotplug CPUs
RISC-V: Check scalar unaligned access on all CPUs
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-0-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Detect if vector misaligned accesses are faster or slower than
equivalent vector byte accesses. This is useful for usermode to know
whether vector byte accesses or vector misaligned accesses have a better
bandwidth for operations like memcpy.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-5-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Run an unaligned vector access to test if the system supports
vector unaligned access. Add the result to a new key in hwprobe.
This is useful for usermode to know if vector misaligned accesses are
supported and if they are faster or slower than equivalent byte accesses.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-4-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED to allow
for the addition of RISCV_VECTOR_MISALIGNED in a later patch.
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-3-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
The check_unaligned_access_emulated() function should have been called
during CPU hotplug to ensure that if all CPUs had emulated unaligned
accesses, the new CPU also does.
This patch adds the call to check_unaligned_access_emulated() in
the hotplug path.
Fixes: 55e0bf49a0d0 ("RISC-V: Probe misaligned access speed in parallel")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-2-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Originally, the check_unaligned_access_emulated_all_cpus function
only checked the boot hart. This fixes the function to check all
harts.
Fixes: 71c54b3d169d ("riscv: report misaligned accesses emulation to hwprobe")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-1-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
ftrace_regs was created to hold registers that store information to save
function parameters, return value and stack. Since it is a subset of
pt_regs, it should only be used by its accessor functions. But because
pt_regs can easily be taken from ftrace_regs (on most archs), it is
tempting to use it directly. But when running on other architectures, it
may fail to build or worse, build but crash the kernel!
Instead, make struct ftrace_regs an empty structure and have the
architectures define __arch_ftrace_regs and all the accessor functions
will typecast to it to get to the actual fields. This will help avoid
usage of ftrace_regs directly.
Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Now that the [ms]envcfg CSR value is maintained per thread, not per
hart, riscv_user_isa_enable() only needs to be called once during boot,
to set the value for the init task. This also allows it to be marked as
__init.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Some bits in the [ms]envcfg CSR, such as the CFI state and pointer
masking mode, need to be controlled on a per-thread basis. Support this
by keeping a copy of the CSR value in struct thread_struct and writing
it during context switches. It is safe to discard the old CSR value
during the context switch because the CSR is modified only by software,
so the CSR will remain in sync with the copy in thread_struct.
Use ALTERNATIVE directly instead of riscv_has_extension_unlikely() to
minimize branchiness in the context switching code.
Since thread_struct is copied during fork(), setting the value for the
init task sets the default value for all other threads.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Currently, we enable cbo.zero for usermode on each hart that supports
the Zicboz extension. This means that the [ms]envcfg CSR value may
differ between harts. Other features, such as pointer masking and CFI,
require setting [ms]envcfg bits on a per-thread basis. The combination
of these two adds quite some complexity and overhead to context
switching, as we would need to maintain two separate masks for the
per-hart and per-thread bits. Andrew Jones, who originally added Zicboz
support, writes[1][2]:
I've approached Zicboz the same way I would approach all
extensions, which is to be per-hart. I'm not currently aware of
a platform that is / will be composed of harts where some have
Zicboz and others don't, but there's nothing stopping a platform
like that from being built.
So, how about we add code that confirms Zicboz is on all harts.
If any hart does not have it, then we complain loudly and disable
it on all the other harts. If it was just a hardware description
bug, then it'll get fixed. If there's actually a platform which
doesn't have Zicboz on all harts, then, when the issue is reported,
we can decide to not support it, support it with defconfig, or
support it under a Kconfig guard which must be enabled by the user.
Let's follow his suggested solution and require the extension to be
available on all harts, so the envcfg CSR value does not need to change
when a thread migrates between harts. Since we are doing this for all
extensions with fields in envcfg, the CSR itself only needs to be saved/
restored when it is present on all harts.
This should not be a regression as no known hardware has asymmetric
Zicboz support, but if anyone reports seeing the warning, we will
re-evaluate our solution.
Link: https://lore.kernel.org/linux-riscv/20240322-168f191eeb8479b2ea169a5e@orel/ [1]
Link: https://lore.kernel.org/linux-riscv/20240323-28943722feb57a41fb0ff488@orel/ [2]
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240814081126.956287-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support using Zkr to seed KASLR
- Support IPI-triggered CPU backtracing
- Support for generic CPU vulnerabilities reporting to userspace
- A few cleanups for missing licenses
- The size limit on the XIP kernel has been removed
- Support for tracing userspace stacks
- Support for the Svvptc extension
- Various cleanups and fixes throughout the tree
* tag 'riscv-for-linus-6.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (47 commits)
crash: Fix riscv64 crash memory reserve dead loop
perf/riscv-sbi: Add platform specific firmware event handling
tools: Optimize ring buffer for riscv
tools: Add riscv barrier implementation
RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
riscv: Enable bitops instrumentation
riscv: Omit optimized string routines when using KASAN
ACPI: RISCV: Make acpi_numa_get_nid() to be static
riscv: Randomize lower bits of stack address
selftests: riscv: Allow mmap test to compile on 32-bit
riscv: Make riscv_isa_vendor_ext_andes array static
riscv: Use LIST_HEAD() to simplify code
riscv: defconfig: Disable RZ/Five peripheral support
RISC-V: Implement kgdb_roundup_cpus() to enable future NMI Roundup
riscv: avoid Imbalance in RAS
riscv: cacheinfo: Add back init_cache_level() function
riscv: Remove unused _TIF_WORK_MASK
drivers/perf: riscv: Remove redundant macro check
riscv: define ILLEGAL_POINTER_VALUE for 64bit
...
|
|
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> (arm64 platform)
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240729035958.1957185-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Samuel Holland <samuel.holland@sifive.com> says:
This series fixes two areas where uninstrumented assembly routines
caused gaps in KASAN coverage on RISC-V, which were caught by KUnit
tests. The KASAN KUnit test suite passes after applying this series.
This series fixes the following test failures:
# kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1520
KASAN failure expected in "kasan_int_result = strcmp(ptr, "2")", but none occurred
# kasan_strings: EXPECTATION FAILED at mm/kasan/kasan_test.c:1524
KASAN failure expected in "kasan_int_result = strlen(ptr)", but none occurred
not ok 60 kasan_strings
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1531
KASAN failure expected in "set_bit(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1533
KASAN failure expected in "clear_bit(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1535
KASAN failure expected in "clear_bit_unlock(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1536
KASAN failure expected in "__clear_bit_unlock(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1537
KASAN failure expected in "change_bit(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1543
KASAN failure expected in "test_and_set_bit(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1545
KASAN failure expected in "test_and_set_bit_lock(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1546
KASAN failure expected in "test_and_clear_bit(nr, addr)", but none occurred
# kasan_bitops_generic: EXPECTATION FAILED at mm/kasan/kasan_test.c:1548
KASAN failure expected in "test_and_change_bit(nr, addr)", but none occurred
not ok 61 kasan_bitops_generic
Samuel Holland (2):
riscv: Omit optimized string routines when using KASAN
riscv: Enable bitops instrumentation
arch/riscv/include/asm/bitops.h | 43 ++++++++++++++++++---------------
arch/riscv/include/asm/string.h | 2 ++
arch/riscv/kernel/riscv_ksyms.c | 3 ---
arch/riscv/lib/Makefile | 2 ++
arch/riscv/lib/strcmp.S | 1 +
arch/riscv/lib/strlen.S | 1 +
arch/riscv/lib/strncmp.S | 1 +
arch/riscv/purgatory/Makefile | 2 ++
8 files changed, 32 insertions(+), 23 deletions(-)
* b4-shazam-merge:
riscv: Enable bitops instrumentation
riscv: Omit optimized string routines when using KASAN
Link: https://lore.kernel.org/r/20240801033725.28816-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|