summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2022-03-05selftests/vm: cleanup hugetlb file after mremap testMike Kravetz
The hugepage-mremap test will create a file in a hugetlb filesystem. In a default 'run_vmtests' run, the file will contain all the hugetlb pages. After the test, the file remains and there are no free hugetlb pages for subsequent tests. This causes those hugetlb tests to fail. Change hugepage-mremap to take the name of the hugetlb file as an argument. Unlink the file within the test, and just to be sure remove the file in the run_vmtests script. Link: https://lkml.kernel.org/r/20220201033459.156944-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Yosry Ahmed <yosryahmed@google.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-03Merge tag 'net-5.17-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from can, xfrm, wifi, bluetooth, and netfilter. Lots of various size fixes, the length of the tag speaks for itself. Most of the 5.17-relevant stuff comes from xfrm, wifi and bt trees which had been lagging as you pointed out previously. But there's also a larger than we'd like portion of fixes for bugs from previous releases. Three more fixes still under discussion, including and xfrm revert for uAPI error. Current release - regressions: - iwlwifi: don't advertise TWT support, prevent FW crash - xfrm: fix the if_id check in changelink - xen/netfront: destroy queues before real_num_tx_queues is zeroed - bluetooth: fix not checking MGMT cmd pending queue, make scanning work again Current release - new code bugs: - mptcp: make SIOCOUTQ accurate for fallback socket - bluetooth: access skb->len after null check - bluetooth: hci_sync: fix not using conn_timeout - smc: fix cleanup when register ULP fails - dsa: restore error path of dsa_tree_change_tag_proto - iwlwifi: fix build error for IWLMEI - iwlwifi: mvm: propagate error from request_ownership to the user Previous releases - regressions: - xfrm: fix pMTU regression when reported pMTU is too small - xfrm: fix TCP MSS calculation when pMTU is close to 1280 - bluetooth: fix bt_skb_sendmmsg not allocating partial chunks - ipv6: ensure we call ipv6_mc_down() at most once, prevent leaks - ipv6: prevent leaks in igmp6 when input queues get full - fix up skbs delta_truesize in UDP GRO frag_list - eth: e1000e: fix possible HW unit hang after an s0ix exit - eth: e1000e: correct NVM checksum verification flow - ptp: ocp: fix large time adjustments Previous releases - always broken: - tcp: make tcp_read_sock() more robust in presence of urgent data - xfrm: distinguishing SAs and SPs by if_id in xfrm_migrate - xfrm: fix xfrm_migrate issues when address family changes - dcb: flush lingering app table entries for unregistered devices - smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error - mac80211: fix EAPoL rekey fail in 802.3 rx path - mac80211: fix forwarded mesh frames AC & queue selection - netfilter: nf_queue: fix socket access races and bugs - batman-adv: fix ToCToU iflink problems and check the result belongs to the expected net namespace - can: gs_usb, etas_es58x: fix opened_channel_cnt's accounting - can: rcar_canfd: register the CAN device when fully ready - eth: igb, igc: phy: drop premature return leaking HW semaphore - eth: ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc(), prevent live lock when link goes down - eth: stmmac: only enable DMA interrupts when ready - eth: sparx5: move vlan checks before any changes are made - eth: iavf: fix races around init, removal, resets and vlan ops - ibmvnic: more reset flow fixes Misc: - eth: fix return value of __setup handlers" * tag 'net-5.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits) ipv6: fix skb drops in igmp6_event_query() and igmp6_event_report() net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change ixgbe: xsk: change !netif_carrier_ok() handling in ixgbe_xmit_zc() selftests: mlxsw: resource_scale: Fix return value selftests: mlxsw: tc_police_scale: Make test more robust net: dcb: disable softirqs in dcbnl_flush_dev() bnx2: Fix an error message sfc: extend the locking on mcdi->seqno net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error cause by server net/smc: fix unexpected SMC_CLC_DECL_ERR_REGRMB error generated by client net: arcnet: com20020: Fix null-ptr-deref in com20020pci_probe() tcp: make tcp_read_sock() more robust bpf, sockmap: Do not ignore orig_len parameter net: ipa: add an interconnect dependency net: fix up skbs delta_truesize in UDP GRO frag_list iwlwifi: mvm: return value for request_ownership nl80211: Update bss channel on channel switch for P2P_CLIENT iwlwifi: fix build error for IWLMEI ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments batman-adv: Don't expect inter-netns unique iflink indices ...
2022-03-03selftests: mlxsw: resource_scale: Fix return valueAmit Cohen
The test runs several test cases and is supposed to return an error in case at least one of them failed. Currently, the check of the return value of each test case is in the wrong place, which can result in the wrong return value. For example: # TESTS='tc_police' ./resource_scale.sh TEST: 'tc_police' [default] 968 [FAIL] tc police offload count failed Error: mlxsw_spectrum: Failed to allocate policer index. We have an error talking to the kernel Command failed /tmp/tmp.i7Oc5HwmXY:969 TEST: 'tc_police' [default] overflow 969 [ OK ] ... TEST: 'tc_police' [ipv4_max] overflow 969 [ OK ] $ echo $? 0 Fix this by moving the check to be done after each test case. Fixes: 059b18e21c63 ("selftests: mlxsw: Return correct error code in resource scale test") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03selftests: mlxsw: tc_police_scale: Make test more robustAmit Cohen
The test adds tc filters and checks how many of them were offloaded by grepping for 'in_hw'. iproute2 commit f4cd4f127047 ("tc: add skip_hw and skip_sw to control action offload") added offload indication to tc actions, producing the following output: $ tc filter show dev swp2 ingress ... filter protocol ipv6 pref 1000 flower chain 0 handle 0x7c0 eth_type ipv6 dst_ip 2001:db8:1::7bf skip_sw in_hw in_hw_count 1 action order 1: police 0x7c0 rate 10Mbit burst 100Kb mtu 2Kb action drop overhead 0b ref 1 bind 1 not_in_hw used_hw_stats immediate The current grep expression matches on both 'in_hw' and 'not_in_hw', resulting in incorrect results. Fix that by using JSON output instead. Fixes: 5061e773264b ("selftests: mlxsw: Add scale test for tc-police") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01selftests/exec: Test for empty string on NULL argvKees Cook
Test for the NULL argv argument producing a single empty string on exec. Cc: Eric Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Yang Yingliang <yangyingliang@huawei.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-kselftest@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20220201011637.2457646-1-keescook@chromium.org
2022-03-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Use kfree_rcu(ptr, rcu) variant, using kfree_rcu(ptr) was not intentional. From Eric Dumazet. 2) Use-after-free in netfilter hook core, from Eric Dumazet. 3) Missing rcu read lock side for netfilter egress hook, from Florian Westphal. 4) nf_queue assume state->sk is full socket while it might not be. Invoke sock_gen_put(), from Florian Westphal. 5) Add selftest to exercise the reported KASAN splat in 4) 6) Fix possible use-after-free in nf_queue in case sk_refcnt is 0. Also from Florian. 7) Use input interface index only for hardware offload, not for the software plane. This breaks tc ct action. Patch from Paul Blakey. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: net/sched: act_ct: Fix flow table lookup failure with no originating ifindex netfilter: nf_queue: handle socket prefetch netfilter: nf_queue: fix possible use-after-free selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race test netfilter: nf_queue: don't assume sk is full socket netfilter: egress: silence egress hook lockdep splats netfilter: fix use-after-free in __nf_register_net_hook() netfilter: nf_tables: prefer kfree_rcu(ptr, rcu) variant ==================== Link: https://lore.kernel.org/r/20220301215337.378405-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "The bigger part of the change is a revert for x86 hosts. Here the second patch was supposed to fix the first, but in reality it was just as broken, so both have to go. x86 host: - Revert incorrect assumption that cr3 changes come with preempt notifier callbacks (they don't when static branches are changed, for example) ARM host: - Correctly synchronise PMR and co on PSCI CPU_SUSPEND - Skip tests that depend on GICv3 when the HW isn't available" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: selftests: aarch64: Skip tests if we can't create a vgic-v3 Revert "KVM: VMX: Save HOST_CR3 in vmx_prepare_switch_to_guest()" Revert "KVM: VMX: Save HOST_CR3 in vmx_set_host_fs_gs()" KVM: arm64: Don't miss pending interrupts for suspended vCPU
2022-03-01Merge tag 'linux-cpupower-5.18-rc1' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for 5.18-rc1 from Shuah Khan: "This cpupower update for Linux 5.18-rc1 adds AMD P-State Support to cpupower tool. AMD P-State kernel support went into 5.17-rc1." * tag 'linux-cpupower-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: Add "perf" option to print AMD P-State information cpupower: Add function to print AMD P-State performance capabilities cpupower: Move print_speed function into misc helper cpupower: Enable boost state support for AMD P-State module cpupower: Add AMD P-State sysfs definition and access helper cpupower: Introduce ACPI CPPC library cpupower: Add the function to get the sysfs value from specific table cpupower: Initial AMD P-State capability cpupower: Add the function to check AMD P-State enabled cpupower: Add AMD P-State capability flag
2022-03-01perf: Add irq and exception return branch typesAnshuman Khandual
This expands generic branch type classification by adding two more entries there in i.e irq and exception return. Also updates the x86 implementation to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1645681014-3346-1-git-send-email-anshuman.khandual@arm.com
2022-03-01selftests: netfilter: add nfqueue TCP_NEW_SYN_RECV socket race testFlorian Westphal
causes: BUG: KASAN: slab-out-of-bounds in sk_free+0x25/0x80 Write of size 4 at addr ffff888106df0284 by task nf-queue/1459 sk_free+0x25/0x80 nf_queue_entry_release_refs+0x143/0x1a0 nf_reinject+0x233/0x770 ... without 'netfilter: nf_queue: don't assume sk is full socket'. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-02-26Merge tag 'trace-v5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - rtla (Real-Time Linux Analysis tool): - fix typo in man page - Update API -e to -E before it is released - Error message fix and memory leak fix - Partially uninline trace event soft disable to shrink text - Fix function graph start up test - Have triggers affect the trace instance they are in and not top level - Have osnoise sleep in the units it says it uses - Remove unused ftrace stub function - Remove event probe redundant info from event in the buffer - Fix group ownership setting in tracefs - Ensure trace buffer is minimum size to prevent crashes * tag 'trace-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: rtla/osnoise: Fix error message when failing to enable trace instance rtla/osnoise: Free params at the exit rtla/hist: Make -E the short version of --entries tracing: Fix selftest config check for function graph start up test tracefs: Set the group ownership in apply_options() not parse_options() tracing/osnoise: Make osnoise_main to sleep for microseconds ftrace: Remove unused ftrace_startup_enable() stub tracing: Ensure trace buffer is at least 4096 bytes large tracing: Uninline trace_trigger_soft_disabled() partly eprobes: Remove redundant event type information tracing: Have traceon and traceoff trigger honor the instance tracing: Dump stacktrace trigger to the corresponding instance rtla: Fix systme -> system typo on man page
2022-02-26selftests/memfd: clean up mapping in mfd_fail_writeMike Kravetz
Running the memfd script ./run_hugetlbfs_test.sh will often end in error as follows: memfd-hugetlb: CREATE memfd-hugetlb: BASIC memfd-hugetlb: SEAL-WRITE memfd-hugetlb: SEAL-FUTURE-WRITE memfd-hugetlb: SEAL-SHRINK fallocate(ALLOC) failed: No space left on device ./run_hugetlbfs_test.sh: line 60: 166855 Aborted (core dumped) ./memfd_test hugetlbfs opening: ./mnt/memfd fuse: DONE If no hugetlb pages have been preallocated, run_hugetlbfs_test.sh will allocate 'just enough' pages to run the test. In the SEAL-FUTURE-WRITE test the mfd_fail_write routine maps the file, but does not unmap. As a result, two hugetlb pages remain reserved for the mapping. When the fallocate call in the SEAL-SHRINK test attempts allocate all hugetlb pages, it is short by the two reserved pages. Fix by making sure to unmap in mfd_fail_write. Link: https://lkml.kernel.org/r/20220219004340.56478-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26selftest/vm: fix map_fixed_noreplace test failureAneesh Kumar K.V
On the latest RHEL the test fails due to executable mapped at 256MB address # ./map_fixed_noreplace mmap() @ 0x10000000-0x10050000 p=0xffffffffffffffff result=File exists 10000000-10010000 r-xp 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10010000-10020000 r--p 00000000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10020000-10030000 rw-p 00010000 fd:04 34905657 /root/rpmbuild/BUILD/kernel-5.14.0-56.el9/linux-5.14.0-56.el9.ppc64le/tools/testing/selftests/vm/map_fixed_noreplace 10029b90000-10029bc0000 rw-p 00000000 00:00 0 [heap] 7fffbb510000-7fffbb750000 r-xp 00000000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb750000-7fffbb760000 r--p 00230000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb760000-7fffbb770000 rw-p 00240000 fd:04 24534 /usr/lib64/libc.so.6 7fffbb780000-7fffbb7a0000 r--p 00000000 00:00 0 [vvar] 7fffbb7a0000-7fffbb7b0000 r-xp 00000000 00:00 0 [vdso] 7fffbb7b0000-7fffbb800000 r-xp 00000000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb800000-7fffbb810000 r--p 00040000 fd:04 24514 /usr/lib64/ld64.so.2 7fffbb810000-7fffbb820000 rw-p 00050000 fd:04 24514 /usr/lib64/ld64.so.2 7fffd93f0000-7fffd9420000 rw-p 00000000 00:00 0 [stack] Error: couldn't map the space we need for the test Fix this by finding a free address using mmap instead of hardcoding BASE_ADDRESS. Link: https://lkml.kernel.org/r/20220217083417.373823-1-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Jann Horn <jannh@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-25rtla/osnoise: Fix error message when failing to enable trace instanceDaniel Bristot de Oliveira
When a trace instance creation fails, tools are printing: Could not enable -> osnoiser <- tracer for tracing Print the actual (and correct) name of the tracer it fails to enable. Link: https://lkml.kernel.org/r/53ef0582605af91eca14b19dba9fc9febb95d4f9.1645206561.git.bristot@kernel.org Fixes: b1696371d865 ("rtla: Helper functions for rtla") Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25rtla/osnoise: Free params at the exitDaniel Bristot de Oliveira
The variable that stores the parsed command line arguments are not being free()d at the rtla osnoise top exit path. Free params variable before exiting. Link: https://lkml.kernel.org/r/0be31d8259c7c53b98a39769d60cfeecd8421785.1645206561.git.bristot@kernel.org Fixes: 1eceb2fc2ca5 ("rtla/osnoise: Add osnoise top mode") Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25rtla/hist: Make -E the short version of --entriesDaniel Bristot de Oliveira
Currently, --entries uses -e as the short version in the hist mode of timerlat and osnoise tools. But as -e is already used to enable events on trace sessions by other tools, thus let's keep it available for the same usage for all rtla tools. Make -E the short version of --entries for hist mode on all tools. Note: rtla was merged in this merge window, so rtla was not released yet. Link: https://lkml.kernel.org/r/5dbf0cbe7364d3a05e708926b41a097c59a02b1e.1645206561.git.bristot@kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-02-25Merge tag 'kvmarm-fixes-5.17-4' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.17, take #4 - Correctly synchronise PMR and co on PSCI CPU_SUSPEND - Skip tests that depend on GICv3 when the HW isn't available
2022-02-25kselftest/arm64: signal: Allow tests to be incompatible with featuresMark Brown
Some features may invalidate some tests, for example by supporting an operation which would trap otherwise. Allow tests to list features that they are incompatible with so we can cover the case where a signal will be generated without disruption on systems where that won't happen. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220207152109.197566-6-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-02-25KVM: selftests: aarch64: Skip tests if we can't create a vgic-v3Mark Brown
The arch_timer and vgic_irq kselftests assume that they can create a vgic-v3, using the library function vgic_v3_setup() which aborts with a test failure if it is not possible to do so. Since vgic-v3 can only be instantiated on systems where the host has GICv3 this leads to false positives on older systems where that is not the case. Fix this by changing vgic_v3_setup() to return an error if the vgic can't be instantiated and have the callers skip if this happens. We could also exit flagging a skip in vgic_v3_setup() but this would prevent future test cases conditionally deciding which GIC to use or generally doing more complex output. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Andrew Jones <drjones@redhat.com> Tested-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220223131624.1830351-1-broonie@kernel.org
2022-02-24selftests: mptcp: do complete cleanup at exitPaolo Abeni
After commit 05be5e273c84 ("selftests: mptcp: add disconnect tests") the mptcp selftests leave behind a couple of tmp files after each run. run_tests_disconnect() misnames a few variables used to track them. Address the issue setting the appropriate global variables Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24Merge tag 'perf-tools-fixes-for-v5.17-2022-02-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix double free in in the error path when opening perf.data from multiple files in a directory instead of from a single file - Sync the msr-index.h copy with the kernel sources - Fix error when printing 'weight' field in 'perf script' - Skip failing sigtrap test for arm+aarch64 in 'perf test' - Fix failure to use a cpu list for uncore events in hybrid systems, e.g. Intel Alder Lake * tag 'perf-tools-fixes-for-v5.17-2022-02-24' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf script: Fix error when printing 'weight' field tools arch x86: Sync the msr-index.h copy with the kernel sources perf data: Fix double free in perf_session__delete() perf evlist: Fix failed to use cpu list for uncore events perf test: Skip failing sigtrap test for arm+aarch64
2022-02-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86 host: - Expose KVM_CAP_ENABLE_CAP since it is supported - Disable KVM_HC_CLOCK_PAIRING in TSC catchup mode - Ensure async page fault token is nonzero - Fix lockdep false negative - Fix FPU migration regression from the AMX changes x86 guest: - Don't use PV TLB/IPI/yield on uniprocessor guests PPC: - reserve capability id (topic branch for ppc/kvm)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled KVM: x86/mmu: make apf token non-zero to fix bug KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3 x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU x86/kvm: Fix compilation warning in non-x86_64 builds x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0 x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0 kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode KVM: Fix lockdep false negative during host resume KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
2022-02-24Merge tag 'net-5.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. Current release - regressions: - bpf: fix crash due to out of bounds access into reg2btf_ids - mvpp2: always set port pcs ops, avoid null-deref - eth: marvell: fix driver load from initrd - eth: intel: revert "Fix reset bw limit when DCB enabled with 1 TC" Current release - new code bugs: - mptcp: fix race in overlapping signal events Previous releases - regressions: - xen-netback: revert hotplug-status changes causing devices to not be configured - dsa: - avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held - fix panic when removing unoffloaded port from bridge - dsa: microchip: fix bridging with more than two member ports Previous releases - always broken: - bpf: - fix crash due to incorrect copy_map_value when both spin lock and timer are present in a single value - fix a bpf_timer initialization issue with clang - do not try bpf_msg_push_data with len 0 - add schedule points in batch ops - nf_tables: - unregister flowtable hooks on netns exit - correct flow offload action array size - fix a couple of memory leaks - vsock: don't check owner in vhost_vsock_stop() while releasing - gso: do not skip outer ip header in case of ipip and net_failover - smc: use a mutex for locking "struct smc_pnettable" - openvswitch: fix setting ipv6 fields causing hw csum failure - mptcp: fix race in incoming ADD_ADDR option processing - sysfs: add check for netdevice being present to speed_show - sched: act_ct: fix flow table lookup after ct clear or switching zones - eth: intel: fixes for SR-IOV forwarding offloads - eth: broadcom: fixes for selftests and error recovery - eth: mellanox: flow steering and SR-IOV forwarding fixes Misc: - make __pskb_pull_tail() & pskb_carve_frag_list() drop_monitor friends not report freed skbs as drops - force inlining of checksum functions in net/checksum.h" * tag 'net-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (85 commits) net: mv643xx_eth: process retval from of_get_mac_address ping: remove pr_err from ping_lookup Revert "i40e: Fix reset bw limit when DCB enabled with 1 TC" openvswitch: Fix setting ipv6 fields causing hw csum failure ipv6: prevent a possible race condition with lifetimes net/smc: Use a mutex for locking "struct smc_pnettable" bnx2x: fix driver load from initrd Revert "xen-netback: Check for hotplug-status existence before watching" Revert "xen-netback: remove 'hotplug-status' once it has served its purpose" net/mlx5e: Fix VF min/max rate parameters interchange mistake net/mlx5e: Add missing increment of count net/mlx5e: MPLSoUDP decap, fix check for unsupported matches net/mlx5e: Fix MPLSoUDP encap to use MPLS action information net/mlx5e: Add feature check for set fec counters net/mlx5e: TC, Skip redundant ct clear actions net/mlx5e: TC, Reject rules with forward and drop actions net/mlx5e: TC, Reject rules with drop and modify hdr action net/mlx5e: kTLS, Use CHECKSUM_UNNECESSARY for device-offloaded packets net/mlx5e: Fix wrong return value on ioctl EEPROM query failure net/mlx5: Fix possible deadlock on rule deletion ...
2022-02-23cpupower: Add "perf" option to print AMD P-State informationHuang Rui
Add "-c --perf" option in cpupower-frequency-info to get the performance and frequency values for AMD P-State. Commit message amended: Shuah Khan <skhan@linuxfoundation.org> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-23Merge tag 'slab-for-5.17-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fixes from Vlastimil Babka: - Build fix (workaround) for clang. - Fix a /proc/kcore based slabinfo script broken by struct slab changes in 5.17-rc1. * tag 'slab-for-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: tools/cgroup/slabinfo: update to work with struct slab slab: remove __alloc_size attribute from __kmalloc_track_caller
2022-02-22cpupower: Add function to print AMD P-State performance capabilitiesHuang Rui
AMD P-State kernel module is using the fine grain frequency instead of acpi hardware pstate. So add a function to print performance and frequency values. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Move print_speed function into misc helperHuang Rui
The print_speed can be as a common function, and expose it into misc helper header. Then it can be used on other helper files as well. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Enable boost state support for AMD P-State moduleHuang Rui
The legacy ACPI hardware P-States function has 3 P-States on ACPI table, the CPU frequency only can be switched between the 3 P-States. While the processor supports the boost state, it will have another boost state that the frequency can be higher than P0 state, and the state can be decoded by the function of decode_pstates() and read by amd_pci_get_num_boost_states(). However, the new AMD P-State function is different than legacy ACPI hardware P-State on AMD processors. That has a finer grain frequency range between the highest and lowest frequency. And boost frequency is actually the frequency which is mapped on highest performance ratio. The similar previous P0 frequency is mapped on nominal performance ratio. If the highest performance on the processor is higher than nominal performance, then we think the current processor supports the boost state. And it uses amd_pstate_boost_init() to initialize boost for AMD P-State function. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Add AMD P-State sysfs definition and access helperHuang Rui
Introduce the marco definitions and access helper function for AMD P-State sysfs interfaces such as each performance goals and frequency levels in amd helper file. They will be used to read the sysfs attribute from AMD P-State cpufreq driver for cpupower utilities. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Introduce ACPI CPPC libraryHuang Rui
Kernel ACPI subsytem introduced the sysfs attributes for acpi cppc library in below path: /sys/devices/system/cpu/cpuX/acpi_cppc/ And these attributes will be used for AMD P-State driver to provide some performance and frequency values. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Add the function to get the sysfs value from specific tableHuang Rui
Expose the helper into cpufreq header, then cpufreq driver can use this function to get the sysfs value if it has any specific sysfs interfaces. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Initial AMD P-State capabilityHuang Rui
If kernel starts the AMD P-State module, the cpupower will initial the capability flag as CPUPOWER_CAP_AMD_PSTATE. And once AMD P-State capability is set, it won't need to set legacy ACPI relative capabilities anymore. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Add the function to check AMD P-State enabledHuang Rui
The processor with AMD P-State function also supports legacy ACPI hardware P-States feature as well. Once driver sets AMD P-State eanbled, the processor will respond the finer grain AMD P-State feature instead of legacy ACPI P-States. So it introduces the cpupower_amd_pstate_enabled() to check whether the current kernel enables AMD P-State or AMD CPUFreq module. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22cpupower: Add AMD P-State capability flagHuang Rui
Add AMD P-State capability flag in cpupower to indicate AMD new P-State kernel module support on Ryzen processors. Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-22perf script: Fix error when printing 'weight' fieldGerman Gomez
In SPE traces the 'weight' field can't be printed in 'perf script' because the 'dummy:u' event doesn't have the WEIGHT attribute set. Use evsel__do_check_stype(..) to check this field, as it's done with other fields such as "phys_addr". Before: $ perf record -e arm_spe_0// -- sleep 1 $ perf script -F event,ip,weight Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field. After: $ perf script -F event,ip,weight l1d-access: 12 ffffaf629d4cb320 tlb-access: 12 ffffaf629d4cb320 memory: 12 ffffaf629d4cb320 Fixes: b0fde9c6e291e528 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT") Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220221171707.62960-1-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: 3915035282573c5e ("KVM: x86: SVM: move avic definitions from AMD's spec to svm.h") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-02-22 17:35:36.996271430 -0300 +++ after 2022-02-22 17:35:46.258503347 -0300 @@ -287,6 +287,7 @@ [0xc0010114 - x86_AMD_V_KVM_MSRs_offset] = "VM_CR", [0xc0010115 - x86_AMD_V_KVM_MSRs_offset] = "VM_IGNNE", [0xc0010117 - x86_AMD_V_KVM_MSRs_offset] = "VM_HSAVE_PA", + [0xc001011b - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SVM_AVIC_DOORBELL", [0xc001011e - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VM_PAGE_FLUSH", [0xc001011f - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VIRT_SPEC_CTRL", [0xc0010130 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SEV_ES_GHCB", $ And this gets rebuilt: CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o LD /tmp/build/perf/trace/beauty/perf-in.o CC /tmp/build/perf/util/amd-sample-raw.o LD /tmp/build/perf/util/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf Now one can trace systemwide asking to see backtraces to where those MSRs are being read/written with: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB" Using CPUID AuthenticAMD-25-21-0 0xc001011b 0xc0010130 New filter for msr:read_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629) 0xc001011b 0xc0010130 New filter for msr:write_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YhVKxaft+z8rpOfy@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22perf data: Fix double free in perf_session__delete()Alexey Bayduraev
When perf_data__create_dir() fails, it calls close_dir(), but perf_session__delete() also calls close_dir() and since dir.version and dir.nr were initialized by perf_data__create_dir(), a double free occurs. This patch moves the initialization of dir.version and dir.nr after successful initialization of dir.files, that prevents double freeing. This behavior is already implemented in perf_data__open_dir(). Fixes: 145520631130bd64 ("perf data: Add perf_data__(create_dir|close_dir) functions") Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220218152341.5197-2-alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22linkage: remove SYM_FUNC_{START,END}_ALIAS()Mark Rutland
Now that all aliases are defined using SYM_FUNC_ALIAS(), remove the old SYM_FUNC_{START,END}_ALIAS() macros. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22x86: clean up symbol aliasingMark Rutland
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those to simplify the definition of function aliases across arch/x86. For clarity, where there are multiple annotations such as EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For example, where a function has a name and an alias which are both exported, this is organised as: SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) EXPORT_SYMBOL(func) SYM_FUNC_ALIAS(alias, func) EXPORT_SYMBOL(alias) Where there are only aliases and no exports or other annotations, I have not bothered with line spacing, e.g. SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) SYM_FUNC_ALIAS(alias, func) The tools/perf/ copies of memset_64.S and memset_64.S are updated likewise to avoid the build system complaining these are mismatched: | Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' | diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S | Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' | diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()Mark Rutland
Currently aliasing an asm function requires adding START and END annotations for each name, as per Documentation/asm-annotations.rst: SYM_FUNC_START_ALIAS(__memset) SYM_FUNC_START(memset) ... asm insns ... SYM_FUNC_END(memset) SYM_FUNC_END_ALIAS(__memset) This is more painful than necessary to maintain, especially where a function has many aliases, some of which we may wish to define conditionally. For example, arm64's memcpy/memmove implementation (which uses some arch-specific SYM_*() helpers) has: SYM_FUNC_START_ALIAS(__memmove) SYM_FUNC_START_ALIAS_WEAK_PI(memmove) SYM_FUNC_START_ALIAS(__memcpy) SYM_FUNC_START_WEAK_PI(memcpy) ... asm insns ... SYM_FUNC_END_PI(memcpy) EXPORT_SYMBOL(memcpy) SYM_FUNC_END_ALIAS(__memcpy) EXPORT_SYMBOL(__memcpy) SYM_FUNC_END_ALIAS_PI(memmove) EXPORT_SYMBOL(memmove) SYM_FUNC_END_ALIAS(__memmove) EXPORT_SYMBOL(__memmove) SYM_FUNC_START(name) It would be much nicer if we could define the aliases *after* the standard function definition. This would avoid the need to specify each symbol name twice, and would make it easier to spot the canonical function definition. This patch adds new macros to allow us to do so, which allows the above example to be rewritten more succinctly as: SYM_FUNC_START(__pi_memcpy) ... asm insns ... SYM_FUNC_END(__pi_memcpy) SYM_FUNC_ALIAS(__memcpy, __pi_memcpy) EXPORT_SYMBOL(__memcpy) SYM_FUNC_ALIAS_WEAK(memcpy, __memcpy) EXPORT_SYMBOL(memcpy) SYM_FUNC_ALIAS(__pi_memmove, __pi_memcpy) SYM_FUNC_ALIAS(__memmove, __pi_memmove) EXPORT_SYMBOL(__memmove) SYM_FUNC_ALIAS_WEAK(memmove, __memmove) EXPORT_SYMBOL(memmove) The reduction in duplication will also make it possible to replace some uses of WEAK with more accurate Kconfig guards, e.g. #ifndef CONFIG_KASAN SYM_FUNC_ALIAS(memmove, __memmove) EXPORT_SYMBOL(memmove) #endif ... which should make it easier to ensure that symbols are neither used nor overidden unexpectedly. The existing SYM_FUNC_START_ALIAS() and SYM_FUNC_START_LOCAL_ALIAS() are marked as deprecated, and will be removed once existing users are moved over to the new scheme. The tools/perf/ copy of linkage.h is updated to match. A subsequent patch will depend upon this when updating the x86 asm annotations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-02-22KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3Nicholas Piggin
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2]. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-21Merge tag 'v5.17-rc5' into sched/core, to resolve conflictsIngo Molnar
New conflicts in sched/core due to the following upstream fixes: 44585f7bc0cb ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n") a06247c6804f ("psi: Fix uaf issue when psi trigger is destroyed while being polled") Conflicts: include/linux/psi_types.h kernel/sched/psi.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-02-21tools/cgroup/slabinfo: update to work with struct slabRoman Gushchin
After the introduction of the dedicated struct slab to describe slab pages by commit d122019bf061 ("mm: Split slab into its own type") and the following removal of the corresponding struct page's fields by commit 07f910f9b729 ("mm: Remove slab from struct page") the memcg_slabinfo tool broke. An attempt to run it produces a trace like this: Traceback (most recent call last): File "/usr/bin/drgn", line 33, in <module> sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')()) File "/usr/lib64/python3.9/site-packages/drgn/internal/cli.py", line 133, in main runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__") File "/usr/lib64/python3.9/runpy.py", line 268, in run_path return _run_module_code(code, init_globals, run_name, File "/usr/lib64/python3.9/runpy.py", line 97, in _run_module_code _run_code(code, mod_globals, init_globals, File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code exec(code, run_globals) File "memcg_slabinfo.py", line 226, in <module> main() File "memcg_slabinfo.py", line 199, in main cache = page.slab_cache AttributeError: 'struct page' has no member 'slab_cache' The problem can be fixed by explicitly casting struct page * to struct slab * for slab pages. The tools works as expected with this fix, e.g.: cred_jar 776 776 192 21 1 : tunables 0 0 0 : slabdata 547 547 0 kmalloc-cg-32 6 6 32 128 1 : tunables 0 0 0 : slabdata 9 9 0 files_cache 3 3 832 39 8 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-512 1 1 512 32 4 : tunables 0 0 0 : slabdata 10 10 0 task_struct 10 10 6720 4 8 : tunables 0 0 0 : slabdata 63 63 0 mm_struct 3 3 1664 19 8 : tunables 0 0 0 : slabdata 9 9 0 kmalloc-cg-16 1 1 16 256 1 : tunables 0 0 0 : slabdata 8 8 0 pde_opener 1 1 40 102 1 : tunables 0 0 0 : slabdata 8 8 0 anon_vma_chain 375 375 64 64 1 : tunables 0 0 0 : slabdata 81 81 0 radix_tree_node 3 3 584 28 4 : tunables 0 0 0 : slabdata 419 419 0 dentry 98 98 312 26 2 : tunables 0 0 0 : slabdata 1420 1420 0 btrfs_inode 3 3 2368 13 8 : tunables 0 0 0 : slabdata 730 730 0 signal_cache 3 3 1600 20 8 : tunables 0 0 0 : slabdata 17 17 0 sighand_cache 3 3 2240 14 8 : tunables 0 0 0 : slabdata 20 20 0 filp 90 90 512 32 4 : tunables 0 0 0 : slabdata 95 95 0 anon_vma 214 214 200 20 1 : tunables 0 0 0 : slabdata 162 162 0 kmalloc-cg-1k 1 1 1024 32 8 : tunables 0 0 0 : slabdata 22 22 0 pid 10 10 256 32 2 : tunables 0 0 0 : slabdata 14 14 0 kmalloc-cg-64 2 2 64 64 1 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-96 3 3 96 42 1 : tunables 0 0 0 : slabdata 8 8 0 sock_inode_cache 5 5 1408 23 8 : tunables 0 0 0 : slabdata 29 29 0 UNIX 7 7 1920 17 8 : tunables 0 0 0 : slabdata 21 21 0 inode_cache 36 36 1152 28 8 : tunables 0 0 0 : slabdata 680 680 0 proc_inode_cache 26 26 1224 26 8 : tunables 0 0 0 : slabdata 64 64 0 kmalloc-cg-2k 2 2 2048 16 8 : tunables 0 0 0 : slabdata 9 9 0 v2: change naming and count_partial()/count_free()/for_each_slab() signatures to work with slabs, suggested by Matthew Wilcox Fixes: 07f910f9b729 ("mm: Remove slab from struct page") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Roman Gushchin <guro@fb.com> Tested-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/linux-patches/Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com/
2022-02-21x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCEPeter Zijlstra (Intel)
The RETPOLINE_AMD name is unfortunate since it isn't necessarily AMD only, in fact Hygon also uses it. Furthermore it will likely be sufficient for some Intel processors. Therefore rename the thing to RETPOLINE_LFENCE to better describe what it is. Add the spectre_v2=retpoline,lfence option as an alias to spectre_v2=retpoline,amd to preserve existing setups. However, the output of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed. [ bp: Fix typos, massage. ] Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-20Merge tag 'fs.mount_setattr.v5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull mount_setattr test/doc fixes from Christian Brauner: "This contains a fix for one of the selftests for the mount_setattr syscall to create idmapped mounts, an entry for idmapped mounts for maintainers, and missing kernel documentation for the helper we split out some time ago to get and yield write access to a mount when changing mount properties" * tag 'fs.mount_setattr.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: fs: add kernel doc for mnt_{hold,unhold}_writers() MAINTAINERS: add entry for idmapped mounts tests: fix idmapped mount_setattr test
2022-02-19selftests: mptcp: be more conservative with cookie MPJ limitsPaolo Abeni
Since commit 2843ff6f36db ("mptcp: remote addresses fullmesh"), an MPTCP client can attempt creating multiple MPJ subflow simultaneusly. In such scenario the server, when syncookies are enabled, could end-up accepting incoming MPJ syn even above the configured subflow limit, as the such limit can be enforced in a reliable way only after the subflow creation. In case of syncookie, only after the 3rd ack reception. As a consequence the related self-tests case sporadically fails, as it verify that the server always accept the expected number of MPJ syn. Address the issues relaxing the MPJ syn number constrain. Note that the check on the accepted number of MPJ 3rd ack still remains intact. Fixes: 2843ff6f36db ("mptcp: remote addresses fullmesh") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: more robust signal race testPaolo Abeni
The in kernel MPTCP PM implementation can process a single incoming add address option at any given time. In the mentioned test the server can surpass such limit. Let the setup cope with that allowing a faster add_addr retransmission. Fixes: a88c9e496937 ("mptcp: do not block subflows creation on errors") Fixes: f7efc7771eac ("mptcp: drop argument port from mptcp_pm_announce_addr") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/254 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: improve 'fair usage on close' stabilityPaolo Abeni
The mentioned test has to wait for a subflow creation failure. The current code looks for TCP sockets in TW state and sometimes misses the relevant event. Switch to a more stable check, looking for the associated mib counter. Fixes: 46e967d187ed ("selftests: mptcp: add tests for subflow creation failure") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/257 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: fix diag instabilityPaolo Abeni
Instead of waiting for an arbitrary amount of time for the MPTCP MP_CAPABLE handshake to complete, explicitly wait for the relevant socket to enter into the established status. Additionally let the data transfer application use the slowest transfer mode available (-r), to cope with very slow host, or high jitter caused by hosting VMs. Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/258 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18perf evlist: Fix failed to use cpu list for uncore eventsZhengjun Xing
The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. Commit 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") add it to be supported for hybrid. For hybrid support, it checks the cpu list are available on hybrid PMU. But when we test only uncore events(or events not in cpu_core and cpu_atom), there is a bug: Before: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 failed to use cpu list 0 In this case, for uncore event, its pmu_name is not cpu_core or cpu_atom, so in evlist__fix_hybrid_cpus, perf_pmu__find_hybrid_pmu should return NULL,both events_nr and unmatched_count should be 0 ,then the cpu list check function evlist__fix_hybrid_cpus return -1 and the error "failed to use cpu list 0" will happen. Bypass "events_nr=0" case then the issue is fixed. After: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 195,476,873 uncore_clock/clockticks/ 1.004518677 seconds time elapsed When testing with at least one core event and uncore events, it has no issue. # perf stat -C0 -e cpu_core/cpu-cycles/,uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 5,993,774 cpu_core/cpu-cycles/ 301,025,912 uncore_clock/clockticks/ 1.003964934 seconds time elapsed Fixes: 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: alexander.shishkin@intel.com Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220218093127.1844241-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>