summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-05Merge branches 'acpi-ec', 'acpi-apei' and 'pnp'Rafael J. Wysocki
Merge ACPI EC driver fixes, an ACPI APEI fix and PNP fixes for 6.10-rc3: - Fix error handling during EC operation region accesses in the ACPI EC driver (Armin Wolf). - Fix a memory leak in the APEI error injection driver introduced during its converion to a platform driver (Dan Williams). - Fix build failures related to the dev_is_pnp() macro by redefining it as a proper function and exporting it to modules as appropriate and unexport pnp_bus_type which need not be exported any more (Andy Shevchenko). * acpi-ec: ACPI: EC: Avoid returning AE_OK on errors in address space handler ACPI: EC: Abort address space access upon error * acpi-apei: ACPI: APEI: EINJ: Fix einj_dev release leak * pnp: PNP: Hide pnp_bus_type from the non-PNP code PNP: Make dev_is_pnp() to be a function and export it for modules
2024-06-05bcachefs: Fix trans->locked assertKent Overstreet
in bch2_move_data_btree, we might start with the trans unlocked from a previous loop iteration - we need a trans_begin() before iter_init(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05bcachefs: Rereplicate now moves data off of durability=0 devicesKent Overstreet
This fixes an issue where setting a device to durability=0 after it's been used makes it impossible to remove. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05bcachefs: Fix GFP_KERNEL allocation in break_cycle()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-05perf bpf: Fix handling of minimal vmlinux.h file when interrupting the buildNamhyung Kim
Ingo reported that he was seeing these when hitting Control+C during a perf tools build: Makefile.perf:1149: *** Missing bpftool input for generating vmlinux.h. Stop. The failure happens when you don't have vmlinux.h or vmlinux with BTF. ifeq ($(VMLINUX_H),) ifeq ($(VMLINUX_BTF),) $(error Missing bpftool input for generating vmlinux.h) endif endif VMLINUX_BTF can be empty if you didn't build a kernel or it doesn't have a BTF section and the current kernel also has no BTF. This is totally ok. But VMLINUX_H should be set to the minimal version in the source tree (unless you overwrite it manually) when you don't pass GEN_VMLINUX_H=1 (which requires VMLINUX_BTF should not be empty). The problem is that it's defined in Makefile.config which is not included for `make clean`. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/CAM9d7ch5HTr+k+_GpbMrX0HUo5BZ11byh1xq0Two7B7RQACuNw@mail.gmail.com Link: http://lore.kernel.org/lkml/ZjssGrj+abyC6mYP@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-05Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"Arnaldo Carvalho de Melo
This reverts commit 7d1405c71df21f6c394b8a885aa8a133f749fa22. This causes segfaults in some cases, as reported by Milian: ``` sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e raw_syscalls:sys_enter ls ... [ perf record: Woken up 3 times to write data ] malloc(): invalid next size (unsorted) Aborted ``` Backtrace with GDB + debuginfod: ``` malloc(): invalid next size (unsorted) Thread 1 "perf" received signal SIGABRT, Aborted. __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/ raise.c:26 #3 0x00007ffff6e384c3 in __GI_abort () at abort.c:79 #4 0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850 "malloc(): invalid next size (unsorted)") at malloc.c:5772 #6 0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0 <main_arena>, bytes=bytes@entry=368) at malloc.c:4081 #7 0x00007ffff6eb877e in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3754 #8 0x000055555569bdb6 in perf_session.do_write_header () #9 0x00005555555a373a in __cmd_record.constprop.0 () #10 0x00005555555a6846 in cmd_record () #11 0x000055555564db7f in run_builtin () #12 0x000055555558ed77 in main () ``` Valgrind memcheck: ``` ==45136== Invalid write of size 8 ==45136== at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf) ==45136== by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ==45136== Syscall param write(buf) points to unaddressable byte(s) ==45136== at 0x575953D: __libc_write (write.c:26) ==45136== by 0x575953D: write (write.c:24) ==45136== by 0x35761F: ion (in /usr/bin/perf) ==45136== by 0x357778: writen (in /usr/bin/perf) ==45136== by 0x1548F7: record__write (in /usr/bin/perf) ==45136== by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ----- Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/ Reported-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@kernel.org # 6.8+ Link: https://lore.kernel.org/lkml/Zl9ksOlHJHnKM70p@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-05locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldocCarlos Llamas
For ${atomic}_sub_and_test() the @i parameter is the value to subtract, not add. Fix the typo in the kerneldoc template and generate the headers with this update. Fixes: ad8110706f38 ("locking/atomic: scripts: generate kerneldoc comments") Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com
2024-06-05perf/core: Fix missing wakeup when waiting for context referenceHaifeng Xu
In our production environment, we found many hung tasks which are blocked for more than 18 hours. Their call traces are like this: [346278.191038] __schedule+0x2d8/0x890 [346278.191046] schedule+0x4e/0xb0 [346278.191049] perf_event_free_task+0x220/0x270 [346278.191056] ? init_wait_var_entry+0x50/0x50 [346278.191060] copy_process+0x663/0x18d0 [346278.191068] kernel_clone+0x9d/0x3d0 [346278.191072] __do_sys_clone+0x5d/0x80 [346278.191076] __x64_sys_clone+0x25/0x30 [346278.191079] do_syscall_64+0x5c/0xc0 [346278.191083] ? syscall_exit_to_user_mode+0x27/0x50 [346278.191086] ? do_syscall_64+0x69/0xc0 [346278.191088] ? irqentry_exit_to_user_mode+0x9/0x20 [346278.191092] ? irqentry_exit+0x19/0x30 [346278.191095] ? exc_page_fault+0x89/0x160 [346278.191097] ? asm_exc_page_fault+0x8/0x30 [346278.191102] entry_SYSCALL_64_after_hwframe+0x44/0xae The task was waiting for the refcount become to 1, but from the vmcore, we found the refcount has already been 1. It seems that the task didn't get woken up by perf_event_release_kernel() and got stuck forever. The below scenario may cause the problem. Thread A Thread B ... ... perf_event_free_task perf_event_release_kernel ... acquire event->child_mutex ... get_ctx ... release event->child_mutex acquire ctx->mutex ... perf_free_event (acquire/release event->child_mutex) ... release ctx->mutex wait_var_event acquire ctx->mutex acquire event->child_mutex # move existing events to free_list release event->child_mutex release ctx->mutex put_ctx ... ... In this case, all events of the ctx have been freed, so we couldn't find the ctx in free_list and Thread A will miss the wakeup. It's thus necessary to add a wakeup after dropping the reference. Fixes: 1cf8dfe8a661 ("perf/core: Fix race between close() and fork()") Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240513103948.33570-1-haifeng.xu@shopee.com
2024-06-05KVM: Fix a data race on last_boosted_vcpu in kvm_vcpu_on_spin()Breno Leitao
Use {READ,WRITE}_ONCE() to access kvm->last_boosted_vcpu to ensure the loads and stores are atomic. In the extremely unlikely scenario the compiler tears the stores, it's theoretically possible for KVM to attempt to get a vCPU using an out-of-bounds index, e.g. if the write is split into multiple 8-bit stores, and is paired with a 32-bit load on a VM with 257 vCPUs: CPU0 CPU1 last_boosted_vcpu = 0xff; (last_boosted_vcpu = 0x100) last_boosted_vcpu[15:8] = 0x01; i = (last_boosted_vcpu = 0x1ff) last_boosted_vcpu[7:0] = 0x00; vcpu = kvm->vcpu_array[0x1ff]; As detected by KCSAN: BUG: KCSAN: data-race in kvm_vcpu_on_spin [kvm] / kvm_vcpu_on_spin [kvm] write to 0xffffc90025a92344 of 4 bytes by task 4340 on cpu 16: kvm_vcpu_on_spin (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4112) kvm handle_pause (arch/x86/kvm/vmx/vmx.c:5929) kvm_intel vmx_handle_exit (arch/x86/kvm/vmx/vmx.c:? arch/x86/kvm/vmx/vmx.c:6606) kvm_intel vcpu_run (arch/x86/kvm/x86.c:11107 arch/x86/kvm/x86.c:11211) kvm kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:?) kvm kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:?) kvm __se_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:904 fs/ioctl.c:890) __x64_sys_ioctl (fs/ioctl.c:890) x64_sys_call (arch/x86/entry/syscall_64.c:33) do_syscall_64 (arch/x86/entry/common.c:?) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) read to 0xffffc90025a92344 of 4 bytes by task 4342 on cpu 4: kvm_vcpu_on_spin (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4069) kvm handle_pause (arch/x86/kvm/vmx/vmx.c:5929) kvm_intel vmx_handle_exit (arch/x86/kvm/vmx/vmx.c:? arch/x86/kvm/vmx/vmx.c:6606) kvm_intel vcpu_run (arch/x86/kvm/x86.c:11107 arch/x86/kvm/x86.c:11211) kvm kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:?) kvm kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:?) kvm __se_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:904 fs/ioctl.c:890) __x64_sys_ioctl (fs/ioctl.c:890) x64_sys_call (arch/x86/entry/syscall_64.c:33) do_syscall_64 (arch/x86/entry/common.c:?) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) value changed: 0x00000012 -> 0x00000000 Fixes: 217ece6129f2 ("KVM: use yield_to instead of sleep in kvm_vcpu_on_spin") Cc: stable@vger.kernel.org Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20240510092353.2261824-1-leitao@debian.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-05KVM: selftests: x86: Prioritize getting max_gfn from GuestPhysBitsTao Su
Use the max mappable GPA via GuestPhysBits advertised by KVM to calculate max_gfn. Currently some selftests (e.g. access_tracking_perf_test, dirty_log_test...) add RAM regions close to max_gfn, so guest may access GPA beyond its mappable range and cause infinite loop. Adjust max_gfn in vm_compute_max_gfn() since x86 selftests already overrides vm_compute_max_gfn() specifically to deal with goofy edge cases. Reported-by: Yi Lai <yi1.lai@intel.com> Signed-off-by: Tao Su <tao1.su@linux.intel.com> Tested-by: Yi Lai <yi1.lai@intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Link: https://lore.kernel.org/r/20240513014003.104593-1-tao1.su@linux.intel.com [sean: tweak name, add comment and sanity check] Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-05KVM: selftests: Fix shift of 32 bit unsigned int more than 32 bitsColin Ian King
Currrentl a 32 bit 1u value is being shifted more than 32 bits causing overflow and incorrect checking of bits 32-63. Fix this by using the BIT_ULL macro for shifting bits. Detected by cppcheck: sev_init2_tests.c:108:34: error: Shifting 32-bit value by 63 bits is undefined behaviour [shiftTooManyBits] Fixes: dfc083a181ba ("selftests: kvm: add tests for KVM_SEV_INIT2") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20240523154102.2236133-1-colin.i.king@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-05Merge branch 'mlx5-fixes'David S. Miller
Tariq Toukan says: ==================== mlx5 core fixes 20240603 This small patchset provides two bug fixes from the team to the mlx5 core driver. Series generated against: commit 33700a0c9b56 ("net/tcp: Don't consider TCP_CLOSE in TCP_AO_ESTABLISHED") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net/mlx5: Always stop health timer during driver removalShay Drory
Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continue with driver teardown. This may lead to a UAF bug, which results in page fault Oops[1], since the health timer invokes after resources were freed. Hence, stop the health monitor even if teardown_hca fails. [1] mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0) mlx5_core 0000:18:00.0: E-Switch: cleanup mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup BUG: unable to handle page fault for address: ffffa26487064230 PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1 Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 RIP: 0010:ioread32be+0x34/0x60 RSP: 0018:ffffa26480003e58 EFLAGS: 00010292 RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0 RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230 RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8 R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0 R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0 FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? __die+0x23/0x70 ? page_fault_oops+0x171/0x4e0 ? exc_page_fault+0x175/0x180 ? asm_exc_page_fault+0x26/0x30 ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] ? ioread32be+0x34/0x60 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core] ? __pfx_poll_health+0x10/0x10 [mlx5_core] poll_health+0x42/0x230 [mlx5_core] ? __next_timer_interrupt+0xbc/0x110 ? __pfx_poll_health+0x10/0x10 [mlx5_core] call_timer_fn+0x21/0x130 ? __pfx_poll_health+0x10/0x10 [mlx5_core] __run_timers+0x222/0x2c0 run_timer_softirq+0x1d/0x40 __do_softirq+0xc9/0x2c8 __irq_exit_rcu+0xa6/0xc0 sysvec_apic_timer_interrupt+0x72/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:cpuidle_enter_state+0xcc/0x440 ? cpuidle_enter_state+0xbd/0x440 cpuidle_enter+0x2d/0x40 do_idle+0x20d/0x270 cpu_startup_entry+0x2a/0x30 rest_init+0xd0/0xd0 arch_call_rest_init+0xe/0x30 start_kernel+0x709/0xa90 x86_64_start_reservations+0x18/0x30 x86_64_start_kernel+0x96/0xa0 secondary_startup_64_no_verify+0x18f/0x19b ---[ end trace 0000000000000000 ]--- Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net/mlx5: Stop waiting for PCI if pci channel is offlineMoshe Shemesh
In case pci channel becomes offline the driver should not wait for PCI reads during health dump and recovery flow. The driver has timeout for each of these loops trying to read PCI, so it would fail anyway. However, in case of recovery waiting till timeout may cause the pci error_detected() callback fail to meet pci_dpc_recovered() wait timeout. Fixes: b3bd076f7501 ("net/mlx5: Report devlink health on FW fatal issues") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: ethernet: mtk_eth_soc: handle dma buffer size soc specificFrank Wunderlich
The mainline MTK ethernet driver suffers long time from rarly but annoying tx queue timeouts. We think that this is caused by fixed dma sizes hardcoded for all SoCs. We suspect this problem arises from a low level of free TX DMADs, the TX Ring alomost full. The transmit timeout is caused by the Tx queue not waking up. The Tx queue stops when the free counter is less than ring->thres, and it will wake up once the free counter is greater than ring->thres. If the CPU is too late to wake up the Tx queues, it may cause a transmit timeout. Therefore, we increased the TX and RX DMADs to improve this error situation. Use the dma-size implementation from SDK in a per SoC manner. In difference to SDK we have no RSS feature yet, so all RX/TX sizes should be raised from 512 to 2048 byte except fqdma on mt7988 to avoid the tx timeout issue. Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet") Suggested-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05arm64/io: add constant-argument checkArnd Bergmann
In some configurations __const_iowrite32_copy() does not get inlined and gcc runs into the BUILD_BUG(): In file included from <command-line>: In function '__const_memcpy_toio_aligned32', inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:203:3, inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:199:20: include/linux/compiler_types.h:487:45: error: call to '__compiletime_assert_538' declared with attribute error: BUILD_BUG failed 487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^ include/linux/compiler_types.h:468:25: note: in definition of macro '__compiletime_assert' 468 | prefix ## suffix(); \ | ^~~~~~ include/linux/compiler_types.h:487:9: note: in expansion of macro '_compiletime_assert' 487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | ^~~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) | ^~~~~~~~~~~~~~~~~~ include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG' 59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed") | ^~~~~~~~~~~~~~~~ arch/arm64/include/asm/io.h:193:17: note: in expansion of macro 'BUILD_BUG' 193 | BUILD_BUG(); | ^~~~~~~~~ Move the check for constant arguments into the inline function to ensure it is still constant if the compiler decides against inlining it, and mark them as __always_inline to override the logic that sometimes leads to the compiler not producing the simplified output. Note that either the __always_inline annotation or the check for a constant value are sufficient here, but combining the two looks cleaner as it also avoids the macro. With clang-8 and older, the macro was still needed, but all versions of gcc and clang can reliably perform constant folding here. Fixes: ead79118dae6 ("arm64/io: Provide a WC friendly __iowriteXX_copy()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240604210006.668912-1-arnd@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-06-05rtnetlink: make the "split" NLM_DONE handling genericJakub Kicinski
Jaroslav reports Dell's OMSA Systems Management Data Engine expects NLM_DONE in a separate recvmsg(), both for rtnl_dump_ifinfo() and inet_dump_ifaddr(). We already added a similar fix previously in commit 460b0d33cf10 ("inet: bring NLM_DONE out to a separate recv() again") Instead of modifying all the dump handlers, and making them look different than modern for_each_netdev_dump()-based dump handlers - put the workaround in rtnetlink code. This will also help us move the custom rtnl-locking from af_netlink in the future (in net-next). Note that this change is not touching rtnl_dump_all(). rtnl_dump_all() is different kettle of fish and a potential problem. We now mix families in a single recvmsg(), but NLM_DONE is not coalesced. Tested: ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_addr.yaml \ --dump getaddr --json '{"ifa-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_route.yaml \ --dump getroute --json '{"rtm-family": 2}' ./cli.py --dbg-small-recv 4096 --spec netlink/specs/rt_link.yaml \ --dump getlink Fixes: 3e41af90767d ("rtnetlink: use xarray iterator to implement rtnl_dump_ifinfo()") Fixes: cdb2f80f1c10 ("inet: use xa_array iterator to implement inet_dump_ifaddr()") Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Link: https://lore.kernel.org/all/CAK8fFZ7MKoFSEzMBDAOjoUt+vTZRRQgLDNXEOfdCCXSoXXKE0g@mail.gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05Merge branch 'tcp-mptcp-close-wait'David S. Miller
Jason Xing says: ==================== tcp/mptcp: count CLOSE-WAIT for CurrEstab Taking CLOSE-WAIT sockets into CurrEstab counters is in accordance with RFC 1213, as suggested by Eric and Neal. v5 Link: https://lore.kernel.org/all/20240531091753.75930-1-kerneljasonxing@gmail.com/ 1. add more detailed comment (Matthieu) v4 Link: https://lore.kernel.org/all/20240530131308.59737-1-kerneljasonxing@gmail.com/ 1. correct the Fixes: tag in patch [2/2]. (Eric) Previous discussion Link: https://lore.kernel.org/all/20240529033104.33882-1-kerneljasonxing@gmail.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTABJason Xing
Like previous patch does in TCP, we need to adhere to RFC 1213: "tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT." So let's consider CLOSE-WAIT sockets. The logic of counting When we increment the counter? a) Only if we change the state to ESTABLISHED. When we decrement the counter? a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT, say, on the client side, changing from ESTABLISHED to FIN-WAIT-1. b) if the socket leaves CLOSE-WAIT, say, on the server side, changing from CLOSE-WAIT to LAST-ACK. Fixes: d9cd27b8cd19 ("mptcp: add CurrEstab MIB counter support") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05tcp: count CLOSE-WAIT sockets for TCP_MIB_CURRESTABJason Xing
According to RFC 1213, we should also take CLOSE-WAIT sockets into consideration: "tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT." After this, CurrEstab counter will display the total number of ESTABLISHED and CLOSE-WAIT sockets. The logic of counting When we increment the counter? a) if we change the state to ESTABLISHED. b) if we change the state from SYN-RECEIVED to CLOSE-WAIT. When we decrement the counter? a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT, say, on the client side, changing from ESTABLISHED to FIN-WAIT-1. b) if the socket leaves CLOSE-WAIT, say, on the server side, changing from CLOSE-WAIT to LAST-ACK. Please note: there are two chances that old state of socket can be changed to CLOSE-WAIT in tcp_fin(). One is SYN-RECV, the other is ESTABLISHED. So we have to take care of the former case. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05KVM: x86/mmu: Don't save mmu_invalidate_seq after checking private attrTao Su
Drop the second snapshot of mmu_invalidate_seq in kvm_faultin_pfn(). Before checking the mismatch of private vs. shared, mmu_invalidate_seq is saved to fault->mmu_seq, which can be used to detect an invalidation related to the gfn occurred, i.e. KVM will not install a mapping in page table if fault->mmu_seq != mmu_invalidate_seq. Currently there is a second snapshot of mmu_invalidate_seq, which may not be same as the first snapshot in kvm_faultin_pfn(), i.e. the gfn attribute may be changed between the two snapshots, but the gfn may be mapped in page table without hindrance. Therefore, drop the second snapshot as it has no obvious benefits. Fixes: f6adeae81f35 ("KVM: x86/mmu: Handle no-slot faults at the beginning of kvm_faultin_pfn()") Signed-off-by: Tao Su <tao1.su@linux.intel.com> Message-ID: <20240528102234.2162763-1-tao1.su@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-06-05Merge tag 'kvmarm-fixes-6.10-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.10, take #1 - Large set of FP/SVE fixes for pKVM, addressing the fallout from the per-CPU data rework and making sure that the host is not involved in the FP/SVE switching any more - Allow FEAT_BTI to be enabled with NV now that FEAT_PAUTH is copletely supported - Fix for the respective priorities of Failed PAC, Illegal Execution state and Instruction Abort exceptions - Fix the handling of AArch32 instruction traps failing their condition code, which was broken by the introduction of ESR_EL2.ISS2 - Allow vpcus running in AArch32 state to be restored in System mode - Fix AArch32 GPR restore that would lose the 64 bit state under some conditions
2024-06-05arm64: armv8_deprecated: Fix warning in isndep cpuhp starting processWei Li
The function run_all_insn_set_hw_mode() is registered as startup callback of 'CPUHP_AP_ARM64_ISNDEP_STARTING', it invokes set_hw_mode() methods of all emulated instructions. As the STARTING callbacks are not expected to fail, if one of the set_hw_mode() fails, e.g. due to el0 mixed-endian is not supported for 'setend', it will report a warning: ``` CPU[2] cannot support the emulation of setend CPU 2 UP state arm64/isndep:starting (136) failed (-22) CPU2: Booted secondary processor 0x0000000002 [0x414fd0c1] ``` To fix it, add a check for INSN_UNAVAILABLE status and skip the process. Signed-off-by: Wei Li <liwei391@huawei.com> Tested-by: Huisong Li <lihuisong@huawei.com> Link: https://lore.kernel.org/r/20240423093501.3460764-1-liwei391@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2024-06-05selftests: hsr: add missing config for CONFIG_BRIDGEHangbin Liu
hsr_redbox.sh test need to create bridge for testing. Add the missing config CONFIG_BRIDGE in config file. Fixes: eafbf0574e05 ("test: hsr: Extend the hsr_redbox.sh to have more SAN devices connected") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Simon Horman <horms@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05vxlan: Fix regression when dropping packets due to invalid src addressesDaniel Borkmann
Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: sched: sch_multiq: fix possible OOB write in multiq_tune()Hangyu Hua
q->bands will be assigned to qopt->bands to execute subsequent code logic after kmalloc. So the old q->bands should not be used in kmalloc. Otherwise, an out-of-bounds write will occur. Fixes: c2999f7fb05b ("net: sched: multiq: don't call qdisc_put() while holding tree lock") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05ionic: fix kernel panic in XDP_TX actionTaehee Yoo
In the XDP_TX path, ionic driver sends a packet to the TX path with rx page and corresponding dma address. After tx is done, ionic_tx_clean() frees that page. But RX ring buffer isn't reset to NULL. So, it uses a freed page, which causes kernel panic. BUG: unable to handle page fault for address: ffff8881576c110c PGD 773801067 P4D 773801067 PUD 87f086067 PMD 87efca067 PTE 800ffffea893e060 Oops: Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN NOPTI CPU: 1 PID: 25 Comm: ksoftirqd/1 Not tainted 6.9.0+ #11 Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021 RIP: 0010:bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f Code: 00 53 41 55 41 56 41 57 b8 01 00 00 00 48 8b 5f 08 4c 8b 77 00 4c 89 f7 48 83 c7 0e 48 39 d8 RSP: 0018:ffff888104e6fa28 EFLAGS: 00010283 RAX: 0000000000000002 RBX: ffff8881576c1140 RCX: 0000000000000002 RDX: ffffffffc0051f64 RSI: ffffc90002d33048 RDI: ffff8881576c110e RBP: ffff888104e6fa88 R08: 0000000000000000 R09: ffffed1027a04a23 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881b03a21a8 R13: ffff8881589f800f R14: ffff8881576c1100 R15: 00000001576c1100 FS: 0000000000000000(0000) GS:ffff88881ae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff8881576c110c CR3: 0000000767a90000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x254/0x790 ? __pfx_page_fault_oops+0x10/0x10 ? __pfx_is_prefetch.constprop.0+0x10/0x10 ? search_bpf_extables+0x165/0x260 ? fixup_exception+0x4a/0x970 ? exc_page_fault+0xcb/0xe0 ? asm_exc_page_fault+0x22/0x30 ? 0xffffffffc0051f64 ? bpf_prog_f0b8caeac1068a55_balancer_ingress+0x3b/0x44f ? do_raw_spin_unlock+0x54/0x220 ionic_rx_service+0x11ab/0x3010 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_clean+0x29b/0xc60 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_tx_clean+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? ionic_tx_cq_service+0x25d/0xa00 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ? __pfx_ionic_rx_service+0x10/0x10 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_cq_service+0x69/0x150 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] ionic_txrx_napi+0x11a/0x540 [ionic 9180c3001ab627d82bbc5f3ebe8a0decaf6bb864] __napi_poll.constprop.0+0xa0/0x440 net_rx_action+0x7e7/0xc30 ? __pfx_net_rx_action+0x10/0x10 Fixes: 8eeed8373e1c ("ionic: Add XDP_TX support") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net: phy: Micrel KSZ8061: fix errata solution not taking effect problemTristram Ha
KSZ8061 needs to write to a MMD register at driver initialization to fix an errata. This worked in 5.0 kernel but not in newer kernels. The issue is the main phylib code no longer resets PHY at the very beginning. Calling phy resuming code later will reset the chip if it is already powered down at the beginning. This wipes out the MMD register write. Solution is to implement a phy resume function for KSZ8061 to take care of this problem. Fixes: 232ba3a51cc2 ("net: phy: Micrel KSZ8061: link failure after cable connect") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05net/smc: avoid overwriting when adjusting sock bufsizesWen Gu
When copying smc settings to clcsock, avoid setting clcsock's sk_sndbuf to sysctl_tcp_wmem[1], since this may overwrite the value set by tcp_sndbuf_expand() in TCP connection establishment. And the other setting sk_{snd|rcv}buf to sysctl value in smc_adjust_sock_bufsizes() can also be omitted since the initialization of smc sock and clcsock has set sk_{snd|rcv}buf to smc.sysctl_{w|r}mem or ipv4_sysctl_tcp_{w|r}mem[1]. Fixes: 30c3c4a4497c ("net/smc: Use correct buffer sizes when switching between TCP and SMC") Link: https://lore.kernel.org/r/5eaf3858-e7fd-4db8-83e8-3d7a3e0e9ae2@linux.alibaba.com Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>, too. Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05octeontx2-af: Always allocate PF entries from low prioriy zoneSubbaraya Sundeep
PF mcam entries has to be at low priority always so that VF can install longest prefix match rules at higher priority. This was taken care currently but when priority allocation wrt reference entry is requested then entries are allocated from mid-zone instead of low priority zone. Fix this and always allocate entries from low priority zone for PFs. Fixes: 7df5b4b260dd ("octeontx2-af: Allocate low priority entries for PF") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05efi: Add missing __nocfi annotations to runtime wrappersArd Biesheuvel
The EFI runtime wrappers are a sandbox for calling into EFI runtime services, which are invoked using indirect calls. When running with kCFI enabled, the compiler will require the target of any indirect call to be type annotated. Given that the EFI runtime services prototypes and calling convention are governed by the EFI spec, not the Linux kernel, adding such type annotations for firmware routines is infeasible, and so the compiler must be informed that prototype validation should be omitted. Add the __nocfi annotation at the appropriate places in the EFI runtime wrapper code to achieve this. Note that this currently only affects 32-bit ARM, given that other architectures that support both kCFI and EFI use an asm wrapper to call EFI runtime services, and this hides the indirect call from the compiler. Fixes: 1a4fec49efe5 ("ARM: 9392/2: Support CLANG CFI") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-06-05Revert "xsk: Document ability to redirect to any socket bound to the same umem"Magnus Karlsson
This reverts commit 968595a93669b6b4f6d1fcf80cf2d97956b6868f. Reported-by: Yuval El-Hanany <YuvalE@radware.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com Link: https://lore.kernel.org/bpf/20240604122927.29080-3-magnus.karlsson@gmail.com
2024-06-05Revert "xsk: Support redirect to any socket bound to the same umem"Magnus Karlsson
This reverts commit 2863d665ea41282379f108e4da6c8a2366ba66db. This patch introduced a potential kernel crash when multiple napi instances redirect to the same AF_XDP socket. By removing the queue_index check, it is possible for multiple napi instances to access the Rx ring at the same time, which will result in a corrupted ring state which can lead to a crash when flushing the rings in __xsk_flush(). This can happen when the linked list of sockets to flush gets corrupted by concurrent accesses. A quick and small fix is not possible, so let us revert this for now. Reported-by: Yuval El-Hanany <YuvalE@radware.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/xdp-newbies/8100DBDC-0B7C-49DB-9995-6027F6E63147@radware.com Link: https://lore.kernel.org/bpf/20240604122927.29080-2-magnus.karlsson@gmail.com
2024-06-05bpf: Set run context for rawtp test_run callbackJiri Olsa
syzbot reported crash when rawtp program executed through the test_run interface calls bpf_get_attach_cookie helper or any other helper that touches task->bpf_ctx pointer. Setting the run context (task->bpf_ctx pointer) for test_run callback. Fixes: 7adfc6c9b315 ("bpf: Add bpf_get_attach_cookie() BPF helper to access bpf_cookie value") Reported-by: syzbot+3ab78ff125b7979e45f9@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Closes: https://syzkaller.appspot.com/bug?extid=3ab78ff125b7979e45f9 Link: https://lore.kernel.org/bpf/20240604150024.359247-1-jolsa@kernel.org
2024-06-05tpm: Switch to new Intel CPU model definesTony Luck
New CPU #defines encode vendor and family as well as model. Link: https://lore.kernel.org/all/20240520224620.9480-4-tony.luck@intel.com/ Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-05tpm_tis: Do *not* flush uninitialized workJan Beulich
tpm_tis_core_init() may fail before tpm_tis_probe_irq_single() is called, in which case tpm_tis_remove() unconditionally calling flush_work() is triggering a warning for .func still being NULL. Cc: stable@vger.kernel.org # v6.5+ Fixes: 481c2d14627d ("tpm,tpm_tis: Disable interrupts after 1000 unhandled IRQs") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-06-04Merge tag 'devicetree-fixes-for-6.10-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix regression in 'interrupt-map' handling affecting Apple M1 mini (at least) - Fix binding example warning in stm32 st,mlahb binding - Fix schema error in Allwinner platform binding causing lots of spurious warnings - Add missing MODULE_DESCRIPTION() to DT kunit tests * tag 'devicetree-fixes-for-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: property: Fix fw_devlink handling of interrupt-map of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw() dt-bindings: arm: stm32: st,mlahb: Drop spurious "reg" property from example dt-bindings: arm: sunxi: Fix incorrect '-' usage of: of_test: add MODULE_DESCRIPTION()
2024-06-04tools headers arm64: Sync arm64's cputype.h with the kernel sourcesArnaldo Carvalho de Melo
To get the changes in: 0ce85db6c2141b7f ("arm64: cputype: Add Neoverse-V3 definitions") 02a0a04676fa7796 ("arm64: cputype: Add Cortex-X4 definitions") f4d9d9dcc70b96b5 ("arm64: Add Neoverse-V2 part") That makes this perf source code to be rebuilt: CC /tmp/build/perf-tools/util/arm-spe.o The changes in the above patch add MIDR_NEOVERSE_V[23] and MIDR_NEOVERSE_V1 is used in arm-spe.c, so probably we need to add those and perhaps MIDR_CORTEX_X4 to that array? Or maybe we need to leave this for later when this is all tested on those machines? static const struct midr_range neoverse_spe[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, }; Mark Rutland recommended about arm-spe.c: "I would not touch this for now -- someone would have to go audit the TRMs to check that those other cores have the same encoding, and I think it'd be better to do that as a follow-up." That addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Besar Wicaksono <bwicaksono@nvidia.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/lkml/Zl8cYk0Tai2fs7aM@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-04spi: cs42l43: Correct SPI root clock speedCharles Keepax
The root clock is actually 49.152MHz not 40MHz, as it is derived from the primary audio clock, update the driver to match. This error can cause the actual clock rate to be higher than the requested clock rate on the SPI bus. Fixes: ef75e767167a ("spi: cs42l43: Add SPI controller support") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://msgid.link/r/20240604131704.3227500-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-04thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188Julien Panis
Filtered mode is not supported on mt8188 SoC and is the source of bad results. Move to immediate mode which provides good temperatures. Fixes: f4745f546e60 ("thermal/drivers/mediatek/lvts_thermal: Add MT8188 support") Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Julien Panis <jpanis@baylibre.com> Link: https://lore.kernel.org/r/20240516-mtk-thermal-mt8188-mode-fix-v2-1-40a317442c62@baylibre.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-04Merge tag 'linux_kselftest-fixes-6.10-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "Fixes to build warnings in several tests and fixes to ftrace tests" * tag 'linux_kselftest-fixes-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/futex: don't pass a const char* to asprintf(3) selftests/futex: don't redefine .PHONY targets (all, clean) selftests/tracing: Fix event filter test to retry up to 10 times selftests/futex: pass _GNU_SOURCE without a value to the compiler selftests/overlayfs: Fix build error on ppc64 selftests/openat2: Fix build warnings on ppc64 selftests: cachestat: Fix build warnings on ppc64 tracing/selftests: Fix kprobe event name test for .isra. functions selftests/ftrace: Update required config selftests/ftrace: Fix to check required event file kselftest/alsa: Ensure _GNU_SOURCE is defined
2024-06-04Merge branch 'efi/next' into efi/urgentArd Biesheuvel
2024-06-04PCI: Revert the cfg_access_lock lockdep mechanismDan Williams
While the experiment did reveal that there are additional places that are missing the lock during secondary bus reset, one of the places that needs to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep annotation. Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is currently dependent on the fact that the device_lock() is marked lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that annotation, pci_bus_lock() would need to use something like a new pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the topology, and a hope that the depth of a PCI tree never exceeds the max value for a lockdep subclass. The alternative to ripping out the lockdep coverage would be to deploy a dynamic lock key for every PCI device. Unfortunately, there is evidence that increasing the number of keys that lockdep needs to track to be per-PCI-device is prohibitively expensive for something like the cfg_access_lock. The main motivation for adding the annotation in the first place was to catch unlocked secondary bus resets, not necessarily catch lock ordering problems between cfg_access_lock and other locks. Solve that narrower problem with follow-on patches, and just due to targeted revert for now. Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()") Reported-by: Imre Deak <imre.deak@intel.com> Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Kalle Valo <kvalo@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Cc: Jani Saarinen <jani.saarinen@intel.com>
2024-06-04drivers: core: synchronize really_probe() and dev_uevent()Dirk Behme
Synchronize the dev->driver usage in really_probe() and dev_uevent(). These can run in different threads, what can result in the following race condition for dev->driver uninitialization: Thread #1: ========== really_probe() { ... probe_failed: ... device_unbind_cleanup(dev) { ... dev->driver = NULL; // <= Failed probe sets dev->driver to NULL ... } ... } Thread #2: ========== dev_uevent() { ... if (dev->driver) // If dev->driver is NULLed from really_probe() from here on, // after above check, the system crashes add_uevent_var(env, "DRIVER=%s", dev->driver->name); ... } really_probe() holds the lock, already. So nothing needs to be done there. dev_uevent() is called with lock held, often, too. But not always. What implies that we can't add any locking in dev_uevent() itself. So fix this race by adding the lock to the non-protected path. This is the path where above race is observed: dev_uevent+0x235/0x380 uevent_show+0x10c/0x1f0 <= Add lock here dev_attr_show+0x3a/0xa0 sysfs_kf_seq_show+0x17c/0x250 kernfs_seq_show+0x7c/0x90 seq_read_iter+0x2d7/0x940 kernfs_fop_read_iter+0xc6/0x310 vfs_read+0x5bc/0x6b0 ksys_read+0xeb/0x1b0 __x64_sys_read+0x42/0x50 x64_sys_call+0x27ad/0x2d30 do_syscall_64+0xcd/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Similar cases are reported by syzkaller in https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a But these are regarding the *initialization* of dev->driver dev->driver = drv; As this switches dev->driver to non-NULL these reports can be considered to be false-positives (which should be "fixed" by this commit, as well, though). The same issue was reported and tried to be fixed back in 2015 in https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/ already. Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class") Cc: stable <stable@kernel.org> Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com Cc: Ashish Sangwan <a.sangwan@samsung.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com> Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04clkdev: don't fail clkdev_alloc() if over-sizedRussell King (Oracle)
Don't fail clkdev_alloc() if the strings are over-sized. In this case, the entry will not match during lookup, so its useless. However, since code fails if we return NULL leading to boot failure, return a dummy entry with the connection and device IDs set to "bad". Leave the warning so these problems can be found, and the useless wasteful clkdev registrations removed. Reported-by: Ron Economos <re@w6rz.net> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 8d532528ff6a ("clkdev: report over-sized strings when creating clkdev entries") Closes: https://lore.kernel.org/linux-clk/7eda7621-0dde-4153-89e4-172e4c095d01@roeck-us.net. Link: https://lore.kernel.org/r/28114882-f8d7-21bf-4536-a186e8d7a22a@w6rz.net Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2024-06-04jfs: xattr: fix buffer overflow for invalid xattrGreg Kroah-Hartman
When an xattr size is not what is expected, it is printed out to the kernel log in hex format as a form of debugging. But when that xattr size is bigger than the expected size, printing it out can cause an access off the end of the buffer. Fix this all up by properly restricting the size of the debug hex dump in the kernel log. Reported-by: syzbot+9dfe490c8176301c1d06@syzkaller.appspotmail.com Cc: Dave Kleikamp <shaggy@kernel.org> Link: https://lore.kernel.org/r/2024051433-slider-cloning-98f9@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04misc: microchip: pci1xxxx: Fix a memory leak in the error handling of ↵Yongzhi Liu
gp_aux_bus_probe() There is a memory leak (forget to free allocated buffers) in a memory allocation failure path. Fix it to jump to the correct error handling code. Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com> Reviewed-by: Kumaravel Thiagarajan <kumaravel.thiagarajan@microchip.com> Link: https://lore.kernel.org/r/20240523121434.21855-4-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04misc: microchip: pci1xxxx: fix double free in the error handling of ↵Yongzhi Liu
gp_aux_bus_probe() When auxiliary_device_add() returns error and then calls auxiliary_device_uninit(), callback function gp_auxiliary_device_release() calls ida_free() and kfree(aux_device_wrapper) to free memory. We should't call them again in the error handling path. Fix this by skipping the redundant cleanup functions. Fixes: 393fc2f5948f ("misc: microchip: pci1xxxx: load auxiliary bus driver for the PIO function in the multi-function endpoint of pci1xxxx device.") Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com> Link: https://lore.kernel.org/r/20240523121434.21855-3-hyperlyzcs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04parport: amiga: Mark driver struct with __refdata to prevent section mismatchUwe Kleine-König
As described in the added code comment, a reference to .exit.text is ok for drivers registered via module_platform_driver_probe(). Make this explicit to prevent the following section mismatch warning WARNING: modpost: drivers/parport/parport_amiga: section mismatch in reference: amiga_parallel_driver+0x8 (section: .data) -> amiga_parallel_remove (section: .exit.text) that triggers on an allmodconfig W=1 build. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20240513075206.2337310-2-u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-04mei: vsc: Fix wrong invocation of ACPI SID methodHans de Goede
When using an initializer for a union only one of the union members must be initialized. The initializer for the acpi_object union variable passed as argument to the SID ACPI method was initializing both the type and the integer members of the union. Unfortunately rather then complaining about this gcc simply ignores the first initializer and only used the second integer.value = 1 initializer. Leaving type set to 0 which leads to the argument being skipped by acpi acpi_ns_evaluate() resulting in: ACPI Warning: \_SB.PC00.SPI1.SPFD.CVFD.SID: Insufficient arguments - Caller passed 0, method requires 1 (20240322/nsarguments-232) Fix this by initializing only the integer struct part of the union and initializing both members of the integer struct. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Fixes: 566f5ca97680 ("mei: Add transport driver for IVSC device") Reviewed-by: Wentong Wu <wentong.wu@intel.com> Link: https://lore.kernel.org/r/20240603205050.505389-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>