summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2024-01-09use linux/export.h rather than asm-generic/export.hAl Viro
asm-generic/export.h is a wrapper for linux/export.h, with explicit request to use linux/export.h directly. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Link: https://lore.kernel.org/r/20231214191922.GQ1674809@ZenIV Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge patch series "Fix XIP boot and make XIP testable in QEMU"Palmer Dabbelt
Frederik Haxel <haxel@fzi.de> says: XIP boot seems to be broken for some time now. A likely reason why no one seems to have noticed this is that XIP is more difficult to test, as it is currently not easily testable with QEMU. These patches fix the XIP boot and allow an XIP build without BUILTIN_DTB, which in turn makes it easier to test an image with the QEMU virt machine. * b4-shazam-merge: riscv: Allow disabling of BUILTIN_DTB for XIP riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro riscv: Make XIP bootable again Link: https://lore.kernel.org/r/20231212130116.848530-1-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macroSong Shuai
The commit be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") got rid of `shadow_stack`, so SHADOW_OVERFLOW_STACK_SIZE should be removed too. Fixes: be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20231211110331.359534-1-songshuaishuai@tinylab.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge remote-tracking branch 'palmer/fixes' into for-nextPalmer Dabbelt
I don't usually merge these in, but I missed sending a PR due to the holidays. * palmer/fixes: riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC riscv: Fix module_alloc() that did not reset the linear mapping permissions riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping riscv: Check if the code to patch lies in the exit section riscv: errata: andes: Probe for IOCP only once in boot stage riscv: Fix SMP when shadow call stacks are enabled dt-bindings: perf: riscv,pmu: drop unneeded quotes riscv: fix misaligned access handling of C.SWSP and C.SDSP RISC-V: hwprobe: Always use u64 for extension bits Support rv32 ULEB128 test riscv: Correct type casting in module loading riscv: Safely remove entries from relocation list Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv; fix __user annotation in save_v_state()Ben Dooks
The save_v_state() is technically sending a __user pointer through __put_user() and thus is generating a sparse warning so force the value to be "void *" to fix: arch/riscv/kernel/signal.c:94:16: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/signal.c:94:16: expected void *__val arch/riscv/kernel/signal.c:94:16: got void [noderef] __user *[assigned] datap Fixes: 8ee0b41898fa26f66e32 ("riscv: signal: Add sigcontext save/restore for vector") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Link: https://lore.kernel.org/r/20231123142708.261733-1-ben.dooks@codethink.co.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: fix __user annotation in traps_misaligned.cBen Dooks
The instruction reading code can read from either user or kernel addresses and thus the use of __user on pointers to instructions depends on which context. Fix a few sparse warnings by using __user for user-accesses and remove it when not. Fixes: arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:322:24: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/traps_misaligned.c:322:24: expected unsigned char const [noderef] __user *__gu_ptr arch/riscv/kernel/traps_misaligned.c:322:24: got unsigned char const [usertype] *addr arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:332:24: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/traps_misaligned.c:332:24: expected unsigned char [noderef] __user *__gu_ptr arch/riscv/kernel/traps_misaligned.c:332:24: got unsigned char [usertype] *addr Fixes: 7c83232161f60 ("riscv: add support for misaligned trap handling in S-mode") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Link: https://lore.kernel.org/r/20231123141617.259591-1-ben.dooks@codethink.co.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Select ARCH_WANTS_NO_INSTRJisheng Zhang
As said in the help of ARCH_WANTS_NO_INSTR entry in arch/Kconfig: "An architecture should select this if the noinstr macro is being used on functions to denote that the toolchain should avoid instrumenting such functions and is required for correctness." Select ARCH_WANTS_NO_INSTR for correctness. PS: The reason we didn't find any issue so far is that the CC_HAS_NO_PROFILE_FN_ATTR is true. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20231123142223.1787-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge patch series "riscv: CPU operations cleanup"Palmer Dabbelt
Samuel Holland <samuel.holland@sifive.com> says: This series cleans up some duplicated and dead code around the RISC-V CPU operations, that was copied from arm64 but is not needed here. The result is a bit of memory savings and removal of a few SBI calls during boot, with no functional change. * b4-shazam-merge: riscv: Use the same CPU operations for all CPUs riscv: Remove unused members from struct cpu_operations riscv: Deduplicate code in setup_smp() Link: https://lore.kernel.org/r/20231121234736.3489608-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Remove obsolete rv32_defconfig fileSamuel Holland
This file is not used since commit 72f045d19f25 ("riscv: Fixup difference with defconfig"), where it was replaced by the 32-bit.config fragment. Delete the old file to avoid any confusion. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121225320.3430550-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge patch series "RISC-V: hwprobe: Introduce which-cpus"Palmer Dabbelt
Andrew Jones <ajones@ventanamicro.com> says: This series introduces a flag for the hwprobe syscall which effectively reverses its behavior from getting the values of keys for a set of cpus to getting the cpus for a set of key-value pairs. * b4-shazam-merge: RISC-V: selftests: Add which-cpus hwprobe test RISC-V: hwprobe: Introduce which-cpus flag RISC-V: Move the hwprobe syscall to its own file RISC-V: hwprobe: Clarify cpus size parameter Link: https://lore.kernel.org/r/20231122164700.127954-6-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Allow disabling of BUILTIN_DTB for XIPFrederik Haxel
This enables, among other things, testing with the QEMU virt machine. To build an XIP kernel for the QEMU virt machine, configure the the kernel as desired and apply the following configuration ``` CONFIG_NONPORTABLE=y CONFIG_XIP_KERNEL=y CONFIG_XIP_PHYS_ADDR=0x20000000 CONFIG_PHYS_RAM_BASE=0x80200000 CONFIG_BUILTIN_DTB=n ``` Since the QEMU virt flash memory expects a 32 MB file, the built image must be padded. For example, with `truncate -s 32M arch/riscv/boot/xipImage` The kernel can be started using the following command in QEMU (v8+) ``` qemu-system-riscv64 -M virt,pflash0=pflash0 \ -blockdev node-name=pflash0,driver=file,read-only=on,\ filename=arch/riscv/boot/xipImage <optional parameters> ``` Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-4-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macroFrederik Haxel
During the refactoring, a bug was introduced in the rarly used XIP_FIXUP_FLASH_OFFSET macro. Fixes: bee7fbc38579 ("RISC-V CPU Idle Support") Fixes: e7681beba992 ("RISC-V: Split out the XIP fixups into their own file") Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-3-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Make XIP bootable againFrederik Haxel
Currently, the XIP kernel seems to fail to boot due to missing XIP_FIXUP and a wrong page_offset value. A superfluous XIP_FIXUP has also been removed. Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-2-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge tag 'efi-next-for-v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Fix a syzbot reported issue in efivarfs where concurrent accesses to the file system resulted in list corruption - Add support for accessing EFI variables via the TEE subsystem (and a trusted application in the secure world) instead of via EFI runtime firmware running in the OS's execution context - Avoid linker tricks to discover the image base on LoongArch * tag 'efi-next-for-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: memmap: fix kernel-doc warnings efi/loongarch: Directly position the loaded image file efivarfs: automatically update super block flag efi: Add tee-based EFI variable driver efi: Add EFI_ACCESS_DENIED status code efi: expose efivar generic ops register function efivarfs: Move efivarfs list into superblock s_fs_info efivarfs: Free s_fs_info on unmount efivarfs: Move efivar availability check into FS context init efivarfs: force RO when remounting if SetVariable is not supported
2024-01-09Merge tag 'lsm-pr-20240105' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull security module updates from Paul Moore: - Add three new syscalls: lsm_list_modules(), lsm_get_self_attr(), and lsm_set_self_attr(). The first syscall simply lists the LSMs enabled, while the second and third get and set the current process' LSM attributes. Yes, these syscalls may provide similar functionality to what can be found under /proc or /sys, but they were designed to support multiple, simultaneaous (stacked) LSMs from the start as opposed to the current /proc based solutions which were created at a time when only one LSM was allowed to be active at a given time. We have spent considerable time discussing ways to extend the existing /proc interfaces to support multiple, simultaneaous LSMs and even our best ideas have been far too ugly to support as a kernel API; after +20 years in the kernel, I felt the LSM layer had established itself enough to justify a handful of syscalls. Support amongst the individual LSM developers has been nearly unanimous, with a single objection coming from Tetsuo (TOMOYO) as he is worried that the LSM_ID_XXX token concept will make it more difficult for out-of-tree LSMs to survive. Several members of the LSM community have demonstrated the ability for out-of-tree LSMs to continue to exist by picking high/unused LSM_ID values as well as pointing out that many kernel APIs rely on integer identifiers, e.g. syscalls (!), but unfortunately Tetsuo's objections remain. My personal opinion is that while I have no interest in penalizing out-of-tree LSMs, I'm not going to penalize in-tree development to support out-of-tree development, and I view this as a necessary step forward to support the push for expanded LSM stacking and reduce our reliance on /proc and /sys which has occassionally been problematic for some container users. Finally, we have included the linux-api folks on (all?) recent revisions of the patchset and addressed all of their concerns. - Add a new security_file_ioctl_compat() LSM hook to handle the 32-bit ioctls on 64-bit systems problem. This patch includes support for all of the existing LSMs which provide ioctl hooks, although it turns out only SELinux actually cares about the individual ioctls. It is worth noting that while Casey (Smack) and Tetsuo (TOMOYO) did not give explicit ACKs to this patch, they did both indicate they are okay with the changes. - Fix a potential memory leak in the CALIPSO code when IPv6 is disabled at boot. While it's good that we are fixing this, I doubt this is something users are seeing in the wild as you need to both disable IPv6 and then attempt to configure IPv6 labeled networking via NetLabel/CALIPSO; that just doesn't make much sense. Normally this would go through netdev, but Jakub asked me to take this patch and of all the trees I maintain, the LSM tree seemed like the best fit. - Update the LSM MAINTAINERS entry with additional information about our process docs, patchwork, bug reporting, etc. I also noticed that the Lockdown LSM is missing a dedicated MAINTAINERS entry so I've added that to the pull request. I've been working with one of the major Lockdown authors/contributors to see if they are willing to step up and assume a Lockdown maintainer role; hopefully that will happen soon, but in the meantime I'll continue to look after it. - Add a handful of mailmap entries for Serge Hallyn and myself. * tag 'lsm-pr-20240105' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (27 commits) lsm: new security_file_ioctl_compat() hook lsm: Add a __counted_by() annotation to lsm_ctx.ctx calipso: fix memory leak in netlbl_calipso_add_pass() selftests: remove the LSM_ID_IMA check in lsm/lsm_list_modules_test MAINTAINERS: add an entry for the lockdown LSM MAINTAINERS: update the LSM entry mailmap: add entries for Serge Hallyn's dead accounts mailmap: update/replace my old email addresses lsm: mark the lsm_id variables are marked as static lsm: convert security_setselfattr() to use memdup_user() lsm: align based on pointer length in lsm_fill_user_ctx() lsm: consolidate buffer size handling into lsm_fill_user_ctx() lsm: correct error codes in security_getselfattr() lsm: cleanup the size counters in security_getselfattr() lsm: don't yet account for IMA in LSM_CONFIG_COUNT calculation lsm: drop LSM_ID_IMA LSM: selftests for Linux Security Module syscalls SELinux: Add selfattr hooks AppArmor: Add selfattr hooks Smack: implement setselfattr and getselfattr hooks ...
2024-01-09Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio conversions for file paths'. - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2: Folio conversions for directory paths'. - IA64 remnant removal in Heiko Carstens's 'Remove unused code after IA-64 removal'. - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in 'Treewide: enable -Wmissing-prototypes'. This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series 'hexagon: Fix up instances of -Wmissing-prototypes'. - Nathan also addressed some s390 warnings in 's390: A couple of fixes for -Wmissing-prototypes'. - Arnd Bergmann addresses the same warnings for MIPS in his series 'mips: address -Wmissing-prototypes warnings'. - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series 'kexec_file: Load kernel at top of system RAM if required' - Baoquan He has also added the self-explanatory 'kexec_file: print out debugging message if required'. - Some checkstack maintenance work from Tiezhu Yang in the series 'Modify some code about checkstack'. - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is 'watchdog: Better handling of concurrent lockups'. - Yuntao Wang has contributed some maintenance work on the crash code in 'crash: Some cleanups and fixes'" * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits) crash_core: fix and simplify the logic of crash_exclude_mem_range() x86/crash: use SZ_1M macro instead of hardcoded value x86/crash: remove the unused image parameter from prepare_elf_headers() kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE scripts/decode_stacktrace.sh: strip unexpected CR from lines watchdog: if panicking and we dumped everything, don't re-enable dumping watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/hardlockup: adopt softlockup logic avoiding double-dumps kexec_core: fix the assignment to kimage->control_page x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init() lib/trace_readwrite.c:: replace asm-generic/io with linux/io nilfs2: cpfile: fix some kernel-doc warnings stacktrace: fix kernel-doc typo scripts/checkstack.pl: fix no space expression between sp and offset x86/kexec: fix incorrect argument passed to kexec_dprintk() x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs nilfs2: add missing set_freezable() for freezable kthread kernel: relay: remove relay_file_splice_read dead code, doesn't work docs: submit-checklist: remove all of "make namespacecheck" ...
2024-01-09Merge tag 'mm-stable-2024-01-08-15-31' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...
2024-01-09riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXECAlexandre Ghiti
When resetting the linear mapping permissions, we must make sure that we clear the X bit so that do not end up with WX mappings (since we set PAGE_KERNEL). Fixes: 395a21ff859c ("riscv: add ARCH_HAS_SET_DIRECT_MAP support") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231213134027.155327-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Fix module_alloc() that did not reset the linear mapping permissionsAlexandre Ghiti
After unloading a module, we must reset the linear mapping permissions, see the example below: Before unloading a module: 0xffffaf809d65d000-0xffffaf809d6dc000 0x000000011d65d000 508K PTE . .. .. D A G . . W R V 0xffffaf809d6dc000-0xffffaf809d6dd000 0x000000011d6dc000 4K PTE . .. .. D A G . . . R V 0xffffaf809d6dd000-0xffffaf809d6e1000 0x000000011d6dd000 16K PTE . .. .. D A G . . W R V 0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X . R V After unloading a module: 0xffffaf809d65d000-0xffffaf809d6e1000 0x000000011d65d000 528K PTE . .. .. D A G . . W R V 0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X W R V The last mapping is not reset and we end up with WX mappings in the linear mapping. So add VM_FLUSH_RESET_PERMS to our module_alloc() definition. Fixes: 0cff8bff7af8 ("riscv: avoid the PIC offset of static percpu data in module beyond 2G limits") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231213134027.155327-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Fix wrong usage of lm_alias() when splitting a huge linear mappingAlexandre Ghiti
lm_alias() can only be used on kernel mappings since it explicitly uses __pa_symbol(), so simply fix this by checking where the address belongs to before. Fixes: 311cd2f6e253 ("riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings") Reported-by: syzbot+afb726d49f84c8d95ee1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-riscv/000000000000620dd0060c02c5e1@google.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231212195400.128457-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09riscv: Check if the code to patch lies in the exit sectionAlexandre Ghiti
Otherwise we fall through to vmalloc_to_page() which panics since the address does not lie in the vmalloc region. Fixes: 043cb41a85de ("riscv: introduce interfaces to patch kernel code") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231214091926.203439-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-01-09Merge tag 'slab-for-6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - SLUB: delayed freezing of CPU partial slabs (Chengming Zhou) Freezing is an operation involving double_cmpxchg() that makes a slab exclusive for a particular CPU. Chengming noticed that we use it also in situations where we are not yet installing the slab as the CPU slab, because freezing also indicates that the slab is not on the shared list. This results in redundant freeze/unfreeze operation and can be avoided by marking separately the shared list presence by reusing the PG_workingset flag. This approach neatly avoids the issues described in 9b1ea29bc0d7 ("Revert "mm, slub: consider rest of partial list if acquire_slab() fails"") as we can now grab a slab from the shared list in a quick and guaranteed way without the cmpxchg_double() operation that amplifies the lock contention and can fail. As a result, lkp has reported 34.2% improvement of stress-ng.rawudp.ops_per_sec - SLAB removal and SLUB cleanups (Vlastimil Babka) The SLAB allocator has been deprecated since 6.5 and nobody has objected so far. We agreed at LSF/MM to wait until the next LTS, which is 6.6, so we should be good to go now. This doesn't yet erase all traces of SLAB outside of mm/ so some dead code, comments or documentation remain, and will be cleaned up gradually (some series are already in the works). Removing the choice of allocators has already allowed to simplify and optimize the code wiring up the kmalloc APIs to the SLUB implementation. * tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits) mm/slub: free KFENCE objects in slab_free_hook() mm/slub: handle bulk and single object freeing separately mm/slub: introduce __kmem_cache_free_bulk() without free hooks mm/slub: fix bulk alloc and free stats mm/slub: optimize free fast path code layout mm/slub: optimize alloc fastpath code layout mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers mm/slab: move kmalloc() functions from slab_common.c to slub.c mm/slab: move kmalloc_slab() to mm/slab.h mm/slab: move kfree() from slab_common.c to slub.c mm/slab: move struct kmem_cache_node from slab.h to slub.c mm/slab: move memcg related functions from slab.h to slub.c mm/slab: move pre/post-alloc hooks from slab.h to slub.c mm/slab: consolidate includes in the internal mm/slab.h mm/slab: move the rest of slub_def.h to mm/slab.h mm/slab: move struct kmem_cache_cpu declaration to slub.c mm/slab: remove mm/slab.c and slab_def.h mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs mm/slab: remove CONFIG_SLAB code from slab common code cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks ...
2024-01-09ARM: multi_v7_defconfig: Enable STM32 IPCC mailbox driverMark Brown
The STM32 systems have a mailbox used for communication with non-Linux processors like the M4 on the STM32MP157A. Enable the driver for the mailbox so it is available for testing. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240107-arm-defconfig-stm32-ipcc-v1-1-924721db5661@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-01-09ARM: davinci: always select CONFIG_CPU_ARM926TArnd Bergmann
The select was lost by accident during the multiplatform conversion. Any davinci-only arm-linux-gnueabi-ld: arch/arm/mach-davinci/sleep.o: in function `CACHE_FLUSH': (.text+0x168): undefined reference to `arm926_flush_kern_cache_all' Fixes: f962396ce292 ("ARM: davinci: support multiplatform build for ARM v5") Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Link: https://lore.kernel.org/r/20240108110055.1531153-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-01-08Merge tag 'sched-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Energy scheduling: - Consolidate how the max compute capacity is used in the scheduler and how we calculate the frequency for a level of utilization. - Rework interface between the scheduler and the schedutil governor - Simplify the util_est logic Deadline scheduler: - Work more towards reducing SCHED_DEADLINE starvation of low priority tasks (e.g., SCHED_OTHER) tasks when higher priority tasks monopolize CPU cycles, via the introduction of 'deadline servers' (nested/2-level scheduling). "Fair servers" to make use of this facility are not introduced yet. EEVDF: - Introduce O(1) fastpath for EEVDF task selection NUMA balancing: - Tune the NUMA-balancing vma scanning logic some more, to better distribute the probability of a particular vma getting scanned. Plus misc fixes, cleanups and updates" * tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) sched/fair: Fix tg->load when offlining a CPU sched/fair: Remove unused 'next_buddy_marked' local variable in check_preempt_wakeup_fair() sched/fair: Use all little CPUs for CPU-bound workloads sched/fair: Simplify util_est sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) arm64/amu: Use capacity_ref_freq() to set AMU ratio cpufreq/cppc: Set the frequency used for computing the capacity cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz|khz_to_perf}() energy_model: Use a fixed reference frequency cpufreq/schedutil: Use a fixed reference frequency cpufreq: Use the fixed and coherent frequency for scaling capacity sched/topology: Add a new arch_scale_freq_ref() method freezer,sched: Clean saved_state when restoring it during thaw sched/fair: Update min_vruntime for reweight_entity() correctly sched/doc: Update documentation after renames and synchronize Chinese version sched/cpufreq: Rework iowait boost sched/cpufreq: Rework schedutil governor performance estimation sched/pelt: Avoid underestimation of task utilization sched/timers: Explain why idle task schedules out on remote timer enqueue sched/cpuidle: Comment about timers requirements VS idle handler ...
2024-01-08Merge tag 'perf-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: - Add branch stack counters ABI extension to better capture the growing amount of information the PMU exposes via branch stack sampling. There's matching tooling support. - Fix race when creating the nr_addr_filters sysfs file - Add Intel Sierra Forest and Grand Ridge intel/cstate PMU support - Add Intel Granite Rapids, Sierra Forest and Grand Ridge uncore PMU support - Misc cleanups & fixes * tag 'perf-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Factor out topology_gidnid_map() perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology() perf/x86/amd: Reject branch stack for IBS events perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge perf/x86/intel/uncore: Support IIO free-running counters on GNR perf/x86/intel/uncore: Support Granite Rapids perf/x86/uncore: Use u64 to replace unsigned for the uncore offsets array perf/x86/intel/uncore: Generic uncore_get_uncores and MMIO format of SPR perf: Fix the nr_addr_filters fix perf/x86/intel/cstate: Add Grand Ridge support perf/x86/intel/cstate: Add Sierra Forest support x86/smp: Export symbol cpu_clustergroup_mask() perf/x86/intel/cstate: Cleanup duplicate attr_groups perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file perf/x86/intel: Support branch counters logging perf/x86/intel: Reorganize attrs and is_visible perf: Add branch_sample_call_stack perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag perf: Add branch stack counters
2024-01-08arm64: Update __NR_compat_syscalls for statmount/listmountFlorian Fainelli
Commit d8b0f5465012 ("wire up syscalls for statmount/listmount") added two new system calls to arch/arm64/include/asm/unistd32.h but forgot to update the __NR_compat_syscalls number, thus causing the following build failures: arch/arm64/include/asm/unistd32.h:922:24: error: array index in initializer exceeds array bounds 922 | #define __NR_statmount 457 | ^~~ arch/arm64/kernel/sys32.c:130:34: note: in definition of macro '__SYSCALL' 130 | #define __SYSCALL(nr, sym) [nr] = __arm64_##sym, | ^~ Bump up the number by two to accomodate for the new system calls added. Fixes: d8b0f5465012 ("wire up syscalls for statmount/listmount") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-01-08Merge tag 'x86-entry-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry updates from Ingo Molnar: - Optimize common_interrupt_return() - Harden the return-to-user code by making a CONFIG_DEBUG_ENTRY=y check unconditional & moving it closer to the IRET. * tag 'x86-entry-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Harden return-to-user x86/entry: Optimize common_interrupt_return()
2024-01-08Merge tag 'x86-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Ingo Molnar: - Add comments about the magic behind the shadow STI before MWAIT in __sti_mwait(). - Fix possible unintended timer delays caused by a race in mwait_idle_with_hints(). * tag 'x86-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram x86: Add a comment about the "magic" behind shadow sti before mwait
2024-01-08Merge tag 'x86-cleanups-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Ingo Molnar: - Change global variables to local - Add missing kernel-doc function parameter descriptions - Remove unused parameter from a macro - Remove obsolete Kconfig entry - Fix comments - Fix typos, mostly scripted, manually reviewed and a micro-optimization got misplaced as a cleanup: - Micro-optimize the asm code in secondary_startup_64_no_verify() * tag 'x86-cleanups-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arch/x86: Fix typos x86/head_64: Use TESTB instead of TESTL in secondary_startup_64_no_verify() x86/docs: Remove reference to syscall trampoline in PTI x86/Kconfig: Remove obsolete config X86_32_SMP x86/io: Remove the unused 'bw' parameter from the BUILDIO() macro x86/mtrr: Document missing function parameters in kernel-doc x86/setup: Make relocated_ramdisk a local variable of relocate_initrd()
2024-01-08Merge tag 'x86-build-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build updates from Ingo Molnar: - Update the objdump & instruction decoder self-test code for better LLVM toolchain compatibility - Rework CONFIG_X86_PAE dependencies, for better readability and higher robustness. - Misc cleanups * tag 'x86-build-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tools: objdump_reformat.awk: Skip bad instructions from llvm-objdump x86/Kconfig: Rework CONFIG_X86_PAE dependency x86/tools: Remove chkobjdump.awk x86/tools: objdump_reformat.awk: Allow for spaces x86/tools: objdump_reformat.awk: Ensure regex matches fwait
2024-01-08Merge tag 'x86-boot-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Ingo Molnar: - Ignore NMIs during very early boot, to address kexec crashes - Remove redundant initialization in boot/string.c's strcmp() * tag 'x86-boot-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove redundant initialization of the 'delta' variable in strcmp() x86/boot: Ignore NMIs during very early boot
2024-01-08Merge tag 'x86-asm-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Ingo Molnar: "Replace magic numbers in GDT descriptor definitions & handling: - Introduce symbolic names via macros for descriptor types/fields/flags, and then use these symbolic names. - Clean up definitions a bit, such as GDT_ENTRY_INIT() - Fix/clean up details that became visibly inconsistent after the symbol-based code was introduced: - Unify accessed flag handling - Set the D/B size flag consistently & according to the HW specification" * tag 'x86-asm-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Add DB flag to 32-bit percpu GDT entry x86/asm: Always set A (accessed) flag in GDT descriptors x86/asm: Replace magic numbers in GDT descriptors, script-generated change x86/asm: Replace magic numbers in GDT descriptors, preparations x86/asm: Provide new infrastructure for GDT descriptors
2024-01-08Merge tag 'x86-apic-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Ingo Molnar: - Clean up 'struct apic': - Drop ::delivery_mode - Drop 'enum apic_delivery_modes' - Drop 'struct local_apic' - Fix comments * tag 'x86-apic-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ioapic: Remove unfinished sentence from comment x86/apic: Drop struct local_apic x86/apic: Drop enum apic_delivery_modes x86/apic: Drop apic::delivery_mode
2024-01-08Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "CPU features: - Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye olde Thunder-X machines - Avoid mapping KPTI trampoline when it is not required - Make CPU capability API more robust during early initialisation Early idreg overrides: - Remove dependencies on core kernel helpers from the early command-line parsing logic in preparation for moving this code before the kernel is mapped FPsimd: - Restore kernel-mode fpsimd context lazily, allowing us to run fpsimd code sequences in the kernel with pre-emption enabled KBuild: - Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y - Makefile cleanups LPA2 prep: - Preparatory work for enabling the 'LPA2' extension, which will introduce 52-bit virtual and physical addressing even with 4KiB pages (including for KVM guests). Misc: - Remove dead code and fix a typo MM: - Pass NUMA node information for IRQ stack allocations Perf: - Add perf support for the Synopsys DesignWare PCIe PMU - Add support for event counting thresholds (FEAT_PMUv3_TH) introduced in Armv8.8 - Add support for i.MX8DXL SoCs to the IMX DDR PMU driver. - Minor PMU driver fixes and optimisations RIP VPIPT: - Remove what support we had for the obsolete VPIPT I-cache policy Selftests: - Improvements to the SVE and SME selftests Stacktrace: - Refactor kernel unwind logic so that it can used by BPF unwinding and, eventually, reliable backtracing Sysregs: - Update a bunch of register definitions based on the latest XML drop from Arm" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits) kselftest/arm64: Don't probe the current VL for unsupported vector types efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad arm64: properly install vmlinuz.efi arm64/sysreg: Add missing system instruction definitions for FGT arm64/sysreg: Add missing system register definitions for FGT arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1 arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1 arm64: memory: remove duplicated include arm: perf: Fix ARCH=arm build with GCC arm64: Align boot cpucap handling with system cpucap handling arm64: Cleanup system cpucap handling MAINTAINERS: add maintainers for DesignWare PCIe PMU driver drivers/perf: add DesignWare PCIe PMU driver PCI: Move pci_clear_and_set_dword() helper to PCI header PCI: Add Alibaba Vendor ID to linux/pci_ids.h docs: perf: Add description for Synopsys DesignWare PCIe PMU driver arm64: irq: set the correct node for shadow call stack Revert "perf/arm_dmc620: Remove duplicate format attribute #defines" arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD arm64: fpsimd: Preserve/restore kernel mode NEON at context switch ...
2024-01-08Merge tag 'm68k-for-v6.8-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - make the NuBus bus type static and constant - defconfig updates * tag 'm68k-for-v6.8-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: defconfig: Update defconfigs for v6.7-rc1 nubus: Make nubus_bus_type static and constant
2024-01-08Merge tag 'powerpc-6.8-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add initial support to recognise the HeXin C2000 processor. - Add papr-vpd and papr-sysparm character device drivers for VPD & sysparm retrieval, so userspace tools can be adapted to avoid doing raw firmware calls from userspace. - Sched domains optimisations for shared processor partitions on P9/P10. - A series of optimisations for KVM running as a nested HV under PowerVM. - Other small features and fixes. Thanks to Aditya Gupta, Aneesh Kumar K.V, Arnd Bergmann, Christophe Leroy, Colin Ian King, Dario Binacchi, David Heidelberg, Geoff Levand, Gustavo A. R. Silva, Haoran Liu, Jordan Niethe, Kajol Jain, Kevin Hao, Kunwu Chan, Li kunyu, Li zeming, Masahiro Yamada, Michal Suchánek, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Randy Dunlap, Sathvika Vasireddy, Srikar Dronamraju, Stephen Rothwell, Vaibhav Jain, and Zhao Ke. * tag 'powerpc-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (96 commits) powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2 powerpc/86xx: Drop unused CONFIG_MPC8610 powerpc/powernv: Add error handling to opal_prd_range_is_valid selftests/powerpc: Fix spelling mistake "EACCESS" -> "EACCES" powerpc/hvcall: Reorder Nestedv2 hcall opcodes powerpc/ps3: Add missing set_freezable() for ps3_probe_thread() powerpc/mpc83xx: Use wait_event_freezable() for freezable kthread powerpc/mpc83xx: Add the missing set_freezable() for agent_thread_fn() powerpc/fsl: Fix fsl,tmu-calibration to match the schema powerpc/smp: Dynamically build Powerpc topology powerpc/smp: Avoid asym packing within thread_group of a core powerpc/smp: Add __ro_after_init attribute powerpc/smp: Disable MC domain for shared processor powerpc/smp: Enable Asym packing for cores on shared processor powerpc/sched: Cleanup vcpu_is_preempted() powerpc: add cpu_spec.cpu_features to vmcoreinfo powerpc/imc-pmu: Add a null pointer check in update_events_in_group() powerpc/powernv: Add a null pointer check in opal_powercap_init() powerpc/powernv: Add a null pointer check in opal_event_init() powerpc/powernv: Add a null pointer check to scom_debug_init_one() ...
2024-01-08Merge tag 'ras_core_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 RAS updates from Borislav Petkov: - Convert the hw error storm handling into a finer-grained, per-bank solution which allows for more timely detection and reporting of errors - Start a documentation section which will hold down relevant RAS features description and how they should be used - Add new AMD error bank types - Slim down and remove error type descriptions from the kernel side of error decoding to rasdaemon which can be used from now on to decode hw errors on AMD - Mark pages containing uncorrectable errors as poison so that kdump can avoid them and thus not cause another panic - The usual cleanups and fixlets * tag 'ras_core_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Handle Intel threshold interrupt storms x86/mce: Add per-bank CMCI storm mitigation x86/mce: Remove old CMCI storm mitigation code Documentation: Begin a RAS section x86/MCE/AMD: Add new MA_LLC, USR_DP, and USR_CP bank types EDAC/mce_amd: Remove SMCA Extended Error code descriptions x86/mce/amd, EDAC/mce_amd: Move long names to decoder module x86/mce/inject: Clear test status value x86/mce: Remove redundant check from mce_device_create() x86/mce: Mark fatal MCE's page as poison to avoid panic in the kdump kernel
2024-01-08Merge tag 'x86_cpu_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu feature updates from Borislav Petkov: - Add synthetic X86_FEATURE flags for the different AMD Zen generations and use them everywhere instead of ad-hoc family/model checks. Drop an ancient AMD errata checking facility as a result - Fix a fragile initcall ordering in intel_epb - Do not issue the MFENCE+LFENCE barrier for the TSC deadline and X2APIC MSRs on AMD as it is not needed there * tag 'x86_cpu_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Add X86_FEATURE_ZEN1 x86/CPU/AMD: Drop now unused CPU erratum checking function x86/CPU/AMD: Get rid of amd_erratum_1485[] x86/CPU/AMD: Get rid of amd_erratum_400[] x86/CPU/AMD: Get rid of amd_erratum_383[] x86/CPU/AMD: Get rid of amd_erratum_1054[] x86/CPU/AMD: Move the DIV0 bug detection to the Zen1 init function x86/CPU/AMD: Move Zenbleed check to the Zen2 init function x86/CPU/AMD: Rename init_amd_zn() to init_amd_zen_common() x86/CPU/AMD: Call the spectral chicken in the Zen2 init function x86/CPU/AMD: Move erratum 1076 fix into the Zen1 init function x86/CPU/AMD: Move the Zen3 BTC_NO detection to the Zen3 init function x86/CPU/AMD: Carve out the erratum 1386 fix x86/CPU/AMD: Add ZenX generations flags x86/cpu/intel_epb: Don't rely on link order x86/barrier: Do not serialize MSR accesses on AMD
2024-01-08Merge tag 'x86_sev_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Convert the sev-guest plaform ->remove callback to return void - Move the SEV C-bit verification to the BSP as it needs to happen only once and not on every AP * tag 'x86_sev_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: virt: sev-guest: Convert to platform remove callback returning void x86/sev: Do the C-bit verification only on the BSP
2024-01-08mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDERKirill A. Shutemov
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08mm, treewide: introduce NR_PAGE_ORDERSKirill A. Shutemov
NR_PAGE_ORDERS defines the number of page orders supported by the page allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total. NR_PAGE_ORDERS assists in defining arrays of page orders and allows for more natural iteration over them. [kirill.shutemov@linux.intel.com: fixup for kerneldoc warning] Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box Link: https://lkml.kernel.org/r/20231228144704.14033-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-08Merge tag 'x86_paravirt_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - Replace the paravirt patching functionality using the alternatives infrastructure and remove the former - Misc other improvements * tag 'x86_paravirt_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/alternative: Correct feature bit debug output x86/paravirt: Remove no longer needed paravirt patching code x86/paravirt: Switch mixed paravirt/alternative calls to alternatives x86/alternative: Add indirect call patching x86/paravirt: Move some functions and defines to alternative.c x86/paravirt: Introduce ALT_NOT_XEN x86/paravirt: Make the struct paravirt_patch_site packed x86/paravirt: Use relative reference for the original instruction offset
2024-01-08Merge tag 'x86_misc_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - Add an informational message which gets issued when IA32 emulation has been disabled on the cmdline - Clarify in detail how /proc/cpuinfo is used on x86 - Fix a theoretical overflow in num_digits() * tag 'x86_misc_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ia32: State that IA32 emulation is disabled Documentation/x86: Document what /proc/cpuinfo is for x86/lib: Fix overflow when counting digits
2024-01-08Merge tag 'x86_microcode_for_v6.8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode updates from Borislav Petkov: - Correct minor issues after the microcode revision reporting sanitization * tag 'x86_microcode_for_v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/intel: Set new revision only after a successful update x86/microcode/intel: Remove redundant microcode late updated message
2024-01-08Merge tag 'vfs-6.8.mount' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: "This contains the work to retrieve detailed information about mounts via two new system calls. This is hopefully the beginning of the end of the saga that started with fsinfo() years ago. The LWN articles in [1] and [2] can serve as a summary so we can avoid rehashing everything here. At LSFMM in May 2022 we got into a room and agreed on what we want to do about fsinfo(). Basically, split it into pieces. This is the first part of that agreement. Specifically, it is concerned with retrieving information about mounts. So this only concerns the mount information retrieval, not the mount table change notification, or the extended filesystem specific mount option work. That is separate work. Currently mounts have a 32bit id. Mount ids are already in heavy use by libmount and other low-level userspace but they can't be relied upon because they're recycled very quickly. We agreed that mounts should carry a unique 64bit id by which they can be referenced directly. This is now implemented as part of this work. The new 64bit mount id is exposed in statx() through the new STATX_MNT_ID_UNIQUE flag. If the flag isn't raised the old mount id is returned. If it is raised and the kernel supports the new 64bit mount id the flag is raised in the result mask and the new 64bit mount id is returned. New and old mount ids do not overlap so they cannot be conflated. Two new system calls are introduced that operate on the 64bit mount id: statmount() and listmount(). A summary of the api and usage can be found on LWN as well (cf. [3]) but of course, I'll provide a summary here as well. Both system calls rely on struct mnt_id_req. Which is the request struct used to pass the 64bit mount id identifying the mount to operate on. It is extensible to allow for the addition of new parameters and for future use in other apis that make use of mount ids. statmount() mimicks the semantics of statx() and exposes a set flags that userspace may raise in mnt_id_req to request specific information to be retrieved. A statmount() call returns a struct statmount filled in with information about the requested mount. Supported requests are indicated by raising the request flag passed in struct mnt_id_req in the @mask argument in struct statmount. Currently we do support: - STATMOUNT_SB_BASIC: Basic filesystem info - STATMOUNT_MNT_BASIC Mount information (mount id, parent mount id, mount attributes etc) - STATMOUNT_PROPAGATE_FROM Propagation from what mount in current namespace - STATMOUNT_MNT_ROOT Path of the root of the mount (e.g., mount --bind /bla /mnt returns /bla) - STATMOUNT_MNT_POINT Path of the mount point (e.g., mount --bind /bla /mnt returns /mnt) - STATMOUNT_FS_TYPE Name of the filesystem type as the magic number isn't enough due to submounts The string options STATMOUNT_MNT_{ROOT,POINT} and STATMOUNT_FS_TYPE are appended to the end of the struct. Userspace can use the offsets in @fs_type, @mnt_root, and @mnt_point to reference those strings easily. The struct statmount reserves quite a bit of space currently for future extensibility. This isn't really a problem and if this bothers us we can just send a follow-up pull request during this cycle. listmount() is given a 64bit mount id via mnt_id_req just as statmount(). It takes a buffer and a size to return an array of the 64bit ids of the child mounts of the requested mount. Userspace can thus choose to either retrieve child mounts for a mount in batches or iterate through the child mounts. For most use-cases it will be sufficient to just leave space for a few child mounts. But for big mount tables having an iterator is really helpful. Iterating through a mount table works by setting @param in mnt_id_req to the mount id of the last child mount retrieved in the previous listmount() call" Link: https://lwn.net/Articles/934469 [1] Link: https://lwn.net/Articles/829212 [2] Link: https://lwn.net/Articles/950569 [3] * tag 'vfs-6.8.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: add selftest for statmount/listmount fs: keep struct mnt_id_req extensible wire up syscalls for statmount/listmount add listmount(2) syscall statmount: simplify string option retrieval statmount: simplify numeric option retrieval add statmount(2) syscall namespace: extract show_path() helper mounts: keep list of mounts in an rbtree add unique mount ID
2024-01-08Merge tag 'vfs-6.8.misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Add Jan Kara as VFS reviewer - Show correct device and inode numbers in proc/<pid>/maps for vma files on stacked filesystems. This is now easily doable thanks to the backing file work from the last cycles. This comes with selftests Cleanups: - Remove a redundant might_sleep() from wait_on_inode() - Initialize pointer with NULL, not 0 - Clarify comment on access_override_creds() - Rework and simplify eventfd_signal() and eventfd_signal_mask() helpers - Process aio completions in batches to avoid needless wakeups - Completely decouple struct mnt_idmap from namespaces. We now only keep the actual idmapping around and don't stash references to namespaces - Reformat maintainer entries to indicate that a given subsystem belongs to fs/ - Simplify fput() for files that were never opened - Get rid of various pointless file helpers - Rename various file helpers - Rename struct file members after SLAB_TYPESAFE_BY_RCU switch from last cycle - Make relatime_need_update() return bool - Use GFP_KERNEL instead of GFP_USER when allocating superblocks - Replace deprecated ida_simple_*() calls with their current ida_*() counterparts Fixes: - Fix comments on user namespace id mapping helpers. They aren't kernel doc comments so they shouldn't be using /** - s/Retuns/Returns/g in various places - Add missing parameter documentation on can_move_mount_beneath() - Rename i_mapping->private_data to i_mapping->i_private_data - Fix a false-positive lockdep warning in pipe_write() for watch queues - Improve __fget_files_rcu() code generation to improve performance - Only notify writer that pipe resizing has finished after setting pipe->max_usage otherwise writers are never notified that the pipe has been resized and hang - Fix some kernel docs in hfsplus - s/passs/pass/g in various places - Fix kernel docs in ntfs - Fix kcalloc() arguments order reported by gcc 14 - Fix uninitialized value in reiserfs" * tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits) reiserfs: fix uninit-value in comp_keys watch_queue: fix kcalloc() arguments order ntfs: dir.c: fix kernel-doc function parameter warnings fs: fix doc comment typo fs tree wide selftests/overlayfs: verify device and inode numbers in /proc/pid/maps fs/proc: show correct device and inode numbers in /proc/pid/maps eventfd: Remove usage of the deprecated ida_simple_xx() API fs: super: use GFP_KERNEL instead of GFP_USER for super block allocation fs/hfsplus: wrapper.c: fix kernel-doc warnings fs: add Jan Kara as reviewer fs/inode: Make relatime_need_update return bool pipe: wakeup wr_wait after setting max_usage file: remove __receive_fd() file: stop exposing receive_fd_user() fs: replace f_rcuhead with f_task_work file: remove pointless wrapper file: s/close_fd_get_file()/file_close_fd()/g Improve __fget_files_rcu() code generation (and thus __fget_light()) file: massage cleanup of files that failed to open fs/pipe: Fix lockdep false-positive in watchqueue pipe_write() ...
2024-01-08x86/kvm: Do not try to disable kvmclock if it was not enabledKirill A. Shutemov
kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is present in the VM. It leads to write to a MSR that doesn't exist on some configurations, namely in TDX guest: unchecked MSR access error: WRMSR to 0x12 (tried to write 0x0000000000000000) at rIP: 0xffffffff8110687c (kvmclock_disable+0x1c/0x30) kvmclock enabling is gated by CLOCKSOURCE and CLOCKSOURCE2 KVM paravirt features. Do not disable kvmclock if it was not enabled. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Fixes: c02027b5742b ("x86/kvm: Disable kvmclock on all CPUs on shutdown") Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: stable@vger.kernel.org Message-Id: <20231205004510.27164-6-kirill.shutemov@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-08Merge tag 'kvm-x86-mmu-6.8' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM x86 MMU changes for 6.8: - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Relax the TDP MMU's lockdep assertions related to holding mmu_lock for read versus write so that KVM doesn't pass "bool shared" all over the place just to have precise assertions in paths that don't actually care about whether the caller is a reader or a writer.
2024-01-08Merge tag 'kvm-x86-xen-6.8' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini
KVM Xen change for 6.8: To workaround Xen guests that don't expect Xen PV clocks to be marked as being based on a stable TSC, add a Xen config knob to allow userspace to opt out of KVM setting the "TSC stable" bit in Xen PV clocks. Note, the "TSC stable" bit was added to the PVCLOCK ABI by KVM without an ack from Xen, i.e. KVM isn't entirely blameless for the buggy guest behavior.