summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-07gve: fix for null pointer dereference.Ameer Hamza
Avoid passing NULL skb to __skb_put() function call if napi_alloc_skb() returns NULL. Fixes: 37149e9374bf ("gve: Implement packet continuation for RX.") Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com> Link: https://lore.kernel.org/r/20211205183810.8299-1-amhamza.mgc@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07MAINTAINERS: net: mlxsw: Remove Jiri as a maintainer, add myselfPetr Machata
Jiri has moved on and will not carry out the mlxsw maintainership duty any longer. Add myself as a co-maintainer instead. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/45b54312cdebaf65c5d110b15a5dd2df795bf2be.1638807297.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch 'net-tls-cover-all-ciphers-with-tests'Jakub Kicinski
Vadim Fedorenko says: ==================== net: tls: cover all ciphers with tests Recent patches to Kernel TLS showed that some ciphers are not covered with tests. Let's cover missed. ==================== Link: https://lore.kernel.org/r/20211206213932.7508-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES256-GCM cipherVadim Fedorenko
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07selftests: tls: add missing AES-CCM cipher testsVadim Fedorenko
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher. Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08netfilter: conntrack: annotate data-races around ct->timeoutEric Dumazet
(struct nf_conn)->timeout can be read/written locklessly, add READ_ONCE()/WRITE_ONCE() to prevent load/store tearing. BUG: KCSAN: data-race in __nf_conntrack_alloc / __nf_conntrack_find_get write to 0xffff888132e78c08 of 4 bytes by task 6029 on cpu 0: __nf_conntrack_alloc+0x158/0x280 net/netfilter/nf_conntrack_core.c:1563 init_conntrack+0x1da/0xb30 net/netfilter/nf_conntrack_core.c:1635 resolve_normal_ct+0x502/0x610 net/netfilter/nf_conntrack_core.c:1746 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 tcp_transmit_skb net/ipv4/tcp_output.c:1420 [inline] tcp_write_xmit+0x1450/0x4460 net/ipv4/tcp_output.c:2680 __tcp_push_pending_frames+0x68/0x1c0 net/ipv4/tcp_output.c:2864 tcp_push_pending_frames include/net/tcp.h:1897 [inline] tcp_data_snd_check+0x62/0x2e0 net/ipv4/tcp_input.c:5452 tcp_rcv_established+0x880/0x10e0 net/ipv4/tcp_input.c:5947 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 sk_stream_wait_memory+0x435/0x700 net/core/stream.c:145 tcp_sendmsg_locked+0xb85/0x25a0 net/ipv4/tcp.c:1402 tcp_sendmsg+0x2c/0x40 net/ipv4/tcp.c:1440 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:644 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] __sys_sendto+0x21e/0x2c0 net/socket.c:2036 __do_sys_sendto net/socket.c:2048 [inline] __se_sys_sendto net/socket.c:2044 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2044 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888132e78c08 of 4 bytes by task 17446 on cpu 1: nf_ct_is_expired include/net/netfilter/nf_conntrack.h:286 [inline] ____nf_conntrack_find net/netfilter/nf_conntrack_core.c:776 [inline] __nf_conntrack_find_get+0x1c7/0xac0 net/netfilter/nf_conntrack_core.c:807 resolve_normal_ct+0x273/0x610 net/netfilter/nf_conntrack_core.c:1734 nf_conntrack_in+0x1c5/0x88f net/netfilter/nf_conntrack_core.c:1901 ipv6_conntrack_local+0x19/0x20 net/netfilter/nf_conntrack_proto.c:414 nf_hook_entry_hookfn include/linux/netfilter.h:142 [inline] nf_hook_slow+0x72/0x170 net/netfilter/core.c:619 nf_hook include/linux/netfilter.h:262 [inline] NF_HOOK include/linux/netfilter.h:305 [inline] ip6_xmit+0xa3a/0xa60 net/ipv6/ip6_output.c:324 inet6_csk_xmit+0x1a2/0x1e0 net/ipv6/inet6_connection_sock.c:135 __tcp_transmit_skb+0x132a/0x1840 net/ipv4/tcp_output.c:1402 __tcp_send_ack+0x1fd/0x300 net/ipv4/tcp_output.c:3956 tcp_send_ack+0x23/0x30 net/ipv4/tcp_output.c:3962 __tcp_ack_snd_check+0x2d8/0x510 net/ipv4/tcp_input.c:5478 tcp_ack_snd_check net/ipv4/tcp_input.c:5523 [inline] tcp_rcv_established+0x8c2/0x10e0 net/ipv4/tcp_input.c:5948 tcp_v6_do_rcv+0x36e/0xa50 net/ipv6/tcp_ipv6.c:1521 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0xf2/0x270 net/core/sock.c:2768 release_sock+0x40/0x110 net/core/sock.c:3300 tcp_sendpage+0x94/0xb0 net/ipv4/tcp.c:1114 inet_sendpage+0x7f/0xc0 net/ipv4/af_inet.c:833 rds_tcp_xmit+0x376/0x5f0 net/rds/tcp_send.c:118 rds_send_xmit+0xbed/0x1500 net/rds/send.c:367 rds_send_worker+0x43/0x200 net/rds/threads.c:200 process_one_work+0x3fc/0x980 kernel/workqueue.c:2298 worker_thread+0x616/0xa70 kernel/workqueue.c:2445 kthread+0x2c7/0x2e0 kernel/kthread.c:327 ret_from_fork+0x1f/0x30 value changed: 0x00027cc2 -> 0x00000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G W 5.16.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: krdsd rds_send_worker Note: I chose an arbitrary commit for the Fixes: tag, because I do not think we need to backport this fix to very old kernels. Fixes: e37542ba111f ("netfilter: conntrack: avoid possible false sharing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: switch zone stress to socatFlorian Westphal
centos9 has nmap-ncat which doesn't like the '-q' option, use socat. While at it, mark test skipped if needed tools are missing. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08netfilter: nft_exthdr: break evaluation if setting TCP option failsPablo Neira Ayuso
Break rule evaluation on malformed TCP options. Fixes: 99d1712bc41c ("netfilter: exthdr: tcp option set support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08selftests: netfilter: Add correctness test for mac,net set typeStefano Brivio
The existing net,mac test didn't cover the issue recently reported by Nikita Yushchenko, where MAC addresses wouldn't match if given as first field of a concatenated set with AVX2 and 8-bit groups, because there's a different code path covering the lookup of six 8-bit groups (MAC addresses) if that's the first field. Add a similar mac,net test, with MAC address and IPv4 address swapped in the set specification. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groupsStefano Brivio
The sixth byte of packet data has to be looked up in the sixth group, not in the seventh one, even if we load the bucket data into ymm6 (and not ymm5, for convenience of tracking stalls). Without this fix, matching on a MAC address as first field of a set, if 8-bit groups are selected (due to a small set size) would fail, that is, the given MAC address would never match. Reported-by: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Cc: <stable@vger.kernel.org> # 5.6.x Fixes: 7400b063969b ("nft_set_pipapo: Introduce AVX2-based lookup implementation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-By: Nikita Yushchenko <nikita.yushchenko@virtuozzo.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-08vrf: don't run conntrack on vrf with !dflt qdiscNicolas Dichtel
After the below patch, the conntrack attached to skb is set to "notrack" in the context of vrf device, for locally generated packets. But this is true only when the default qdisc is set to the vrf device. When changing the qdisc, notrack is not set anymore. In fact, there is a shortcut in the vrf driver, when the default qdisc is set, see commit dcdd43c41e60 ("net: vrf: performance improvements for IPv4") for more details. This patch ensures that the behavior is always the same, whatever the qdisc is. To demonstrate the difference, a new test is added in conntrack_vrf.sh. Fixes: 8c9c296adfae ("vrf: run conntrack only in context of lower/physdev for locally generated packets") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-12-07Merge tag 'perf-tools-fixes-for-v5.16-2021-12-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix SMT detection fast read path on sysfs. - Fix memory leaks when processing feature headers in perf.data files. - Fix 'Simple expression parser' 'perf test' on arch without CPU die topology info, such as s/390. - Fix building perf with BUILD_BPF_SKEL=1. - Fix 'perf bench' by reverting "perf bench: Fix two memory leaks detected with ASan". - Fix itrace space allowed for new attributes in 'perf script'. - Fix the build feature detection fast path, that was always failing on systems with python3 development packages, speeding up the build. - Reset shadow counts before loading, fixing metrics using duration_time. - Sync more kernel headers changed by the new futex_waitv syscall: s390 and powerpc. * tag 'perf-tools-fixes-for-v5.16-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf_skel: Do not use typedef to avoid error on old clang perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros perf header: Fix memory leaks when processing feature headers perf test: Reset shadow counts before loading perf test: Fix 'Simple expression parser' test on arch without CPU die topology info tools build: Remove needless libpython-version feature check that breaks test-all fast path perf tools: Fix SMT detection fast read path tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall perf inject: Fix itrace space allowed for new attributes tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall Revert "perf bench: Fix two memory leaks detected with ASan"
2021-12-07ice: fix adding different tunnelsMichal Swiatkowski
Adding filters with the same values inside for VXLAN and Geneve causes HW error, because it looks exactly the same. To choose between different type of tunnels new recipe is needed. Add storing tunnel types in creating recipes function and start checking it in finding function. Change getting open tunnels function to return port on correct tunnel type. This is needed to copy correct port to dummy packet. Block user from adding enc_dst_port via tc flower, because VXLAN and Geneve filters can be created only with destination port which was previously opened. Fixes: 8b032a55c1bd5 ("ice: low level support for tunnels") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: fix choosing UDP header typeMichal Swiatkowski
In tunnels packet there can be two UDP headers: - outer which for hw should be mark as ICE_UDP_OF - inner which for hw should be mark as ICE_UDP_ILOS or as ICE_TCP_IL if inner header is of TCP type In none tunnels packet header can be: - UDP, which for hw should be mark as ICE_UDP_ILOS - TCP, which for hw should be mark as ICE_TCP_IL Change incorrect ICE_UDP_OF for none tunnel packets to ICE_UDP_ILOS. ICE_UDP_OF is incorrect for none tunnel packets and setting it leads to error from hw while adding this kind of recipe. In summary, for tunnel outer port type should always be set to ICE_UDP_OF, for none tunnel outer and tunnel inner it should always be set to ICE_UDP_ILOS. Fixes: 9e300987d4a8 ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: ignore dropped packets during initJesse Brandeburg
If the hardware is constantly receiving unicast or broadcast packets during driver load, the device previously counted many GLV_RDPC (VSI dropped packets) events during init. This causes confusing dropped packet statistics during driver load. The dropped packets counter incrementing does stop once the driver finishes loading. Avoid this problem by baselining our statistics at the end of driver open instead of the end of probe. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: Fix problems with DSCP QoS implementationDave Ertman
The patch that implemented DSCP QoS implementation removed a bandwidth check that was used to check for a specific condition caused by some corner cases. This check should not of been removed. The same patch also added a check for when the DCBx state could be changed in relation to DSCP, but the check was erroneously added nested in a check for CEE mode, which made the check useless. Fix these problems by re-adding the bandwidth check and relocating the DSCP mode check earlier in the function that changes DCBx state in the driver. Fixes: 2a87bd73e50d ("ice: Add DSCP support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: rearm other interrupt cause register after enabling VFsPaul Greenwalt
The other interrupt cause register (OICR), global interrupt 0, is disabled when enabling VFs to prevent handling VFLR. If the OICR is not rearmed then the VF cannot communicate with the PF. Rearm the OICR after enabling VFs. Fixes: 916c7fdf5e93 ("ice: Separate VF VSI initialization/creation from reset flow") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07ice: fix FDIR init missing when reset VFYahui Cao
When VF is being reset, ice_reset_vf() will be called and FDIR resource should be released and initialized again. Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF") Signed-off-by: Yahui Cao <yahui.cao@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-07net/qla3xxx: fix an error code in ql_adapter_up()Dan Carpenter
The ql_wait_for_drvr_lock() fails and returns false, then this function should return an error code instead of returning success. The other problem is that the success path prints an error message netdev_err(ndev, "Releasing driver lock\n"); Delete that and re-order the code a little to make it more clear. Fixes: 5a4faa873782 ("[PATCH] qla3xxx NIC driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20211207082416.GA16110@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge tag 'linux-can-fixes-for-5.16-20211207' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2021-12-07 The 1st patch is by Vincent Mailhol and fixes a use after free in the pch_can driver. Dan Carpenter fixes a use after free in the ems_pcmcia sja1000 driver. The remaining 7 patches target the m_can driver. Brian Silverman contributes a patch to disable and ignore the ELO interrupt, which is currently not handled in the driver and may lead to an interrupt storm. Vincent Mailhol's patch fixes a memory leak in the error path of the m_can_read_fifo() function. The remaining patches are contributed by Matthias Schiffer, first a iomap_read_fifo() and iomap_write_fifo() functions are fixed in the PCI glue driver, then the clock rate for the Intel Ekhart Lake platform is fixed, the last 3 patches add support for the custom bit timings on the Elkhart Lake platform. * tag 'linux-can-fixes-for-5.16-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: m_can: pci: use custom bit timings for Elkhart Lake can: m_can: make custom bittiming fields const Revert "can: m_can: remove support for custom bit timing" can: m_can: pci: fix incorrect reference clock rate can: m_can: pci: fix iomap_read_fifo() and iomap_write_fifo() can: m_can: m_can_read_fifo: fix memory leak in error branch can: m_can: Disable and ignore ELO interrupt can: sja1000: fix use after free in ems_pcmcia_add_card() can: pch_can: pch_can_rx_normal: fix use after free ==================== Link: https://lore.kernel.org/r/20211207102420.120131-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge tag 'platform-drivers-x86-v5.16-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Various bug-fixes and hardware-id additions" * tag 'platform-drivers-x86-v5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel: hid: add quirk to support Surface Go 3 platform/x86: amd-pmc: Fix s2idle failures on certain AMD laptops platform/x86: touchscreen_dmi: Add TrekStor SurfTab duo W1 touchscreen info platform/x86: lg-laptop: Recognize more models platform/x86: thinkpad_acpi: Add lid_logo_dot to the list of safe LEDs platform/x86: thinkpad_acpi: Restore missing hotkey_tablet_mode and hotkey_radio_sw sysfs-attr
2021-12-07RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQTatyana Nikolova
Completion events (CEs) are lost if the application is allowed to arm the CQ more than two times when no new CE for this CQ has been generated by the HW. Check if arming has been done for the CQ and if not, arm the CQ for any event otherwise promote to arm the CQ for any event only when the last arm event was solicited. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20211201231509.1930-2-shiraz.saleem@intel.com Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07RDMA/irdma: Report correct WC errorsShiraz Saleem
Return IBV_WC_REM_OP_ERR for responder QP errors instead of IBV_WC_REM_ACCESS_ERR. Return IBV_WC_LOC_QP_OP_ERR for errors detected on the SQ with bad opcodes Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/20211201231509.1930-1-shiraz.saleem@intel.com Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07RDMA/irdma: Fix a potential memory allocation issue in ↵Christophe JAILLET
'irdma_prm_add_pble_mem()' 'pchunk->bitmapbuf' is a bitmap. Its size (in number of bits) is stored in 'pchunk->sizeofbitmap'. When it is allocated, the size (in bytes) is computed by: size_in_bits >> 3 There are 2 issues (numbers bellow assume that longs are 64 bits): - there is no guarantee here that 'pchunk->bitmapmem.size' is modulo BITS_PER_LONG but bitmaps are stored as longs (sizeofbitmap=8 bits will only allocate 1 byte, instead of 8 (1 long)) - the number of bytes is computed with a shift, not a round up, so we may allocate less memory than needed (sizeofbitmap=65 bits will only allocate 8 bytes (i.e. 1 long), when 2 longs are needed = 16 bytes) Fix both issues by using 'bitmap_zalloc()' and remove the useless 'bitmapmem' from 'struct irdma_chunk'. While at it, remove some useless NULL test before calling kfree/bitmap_free. Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Link: https://lore.kernel.org/r/5e670b640508e14b1869c3e8e4fb970d78cbe997.1638692171.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07RDMA/irdma: Fix a user-after-free in add_pble_prmShiraz Saleem
When irdma_hmc_sd_one fails, 'chunk' is freed while its still on the PBLE info list. Add the chunk entry to the PBLE info list only after successful setting of the SD in irdma_hmc_sd_one. Fixes: e8c4dbc2fcac ("RDMA/irdma: Add PBLE resource manager") Link: https://lore.kernel.org/r/20211207152135.2192-1-shiraz.saleem@intel.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddrMike Marciniszyn
This buffer is currently allocated in hfi1_init(): if (reinit) ret = init_after_reset(dd); else ret = loadtime_init(dd); if (ret) goto done; /* allocate dummy tail memory for all receive contexts */ dd->rcvhdrtail_dummy_kvaddr = dma_alloc_coherent(&dd->pcidev->dev, sizeof(u64), &dd->rcvhdrtail_dummy_dma, GFP_KERNEL); if (!dd->rcvhdrtail_dummy_kvaddr) { dd_dev_err(dd, "cannot allocate dummy tail memory\n"); ret = -ENOMEM; goto done; } The reinit triggered path will overwrite the old allocation and leak it. Fix by moving the allocation to hfi1_alloc_devdata() and the deallocation to hfi1_free_devdata(). Link: https://lore.kernel.org/r/20211129192008.101968.91302.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: 46b010d3eeb8 ("staging/rdma/hfi1: Workaround to prevent corruption during packet delivery") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Fix early init panicMike Marciniszyn
The following trace can be observed with an init failure such as firmware load failures: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 0 P4D 0 Oops: 0010 [#1] SMP PTI CPU: 0 PID: 537 Comm: kworker/0:3 Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1 Workqueue: events work_for_cpu_fn RIP: 0010:0x0 Code: Bad RIP value. RSP: 0000:ffffae5f878a3c98 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff95e48e025c00 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff95e48e025c00 RBP: ffff95e4bf3660a4 R08: 0000000000000000 R09: ffffffff86d5e100 R10: ffff95e49e1de600 R11: 0000000000000001 R12: ffff95e4bf366180 R13: ffff95e48e025c00 R14: ffff95e4bf366028 R15: ffff95e4bf366000 FS: 0000000000000000(0000) GS:ffff95e4df200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000f86a0a003 CR4: 00000000001606f0 Call Trace: receive_context_interrupt+0x1f/0x40 [hfi1] __free_irq+0x201/0x300 free_irq+0x2e/0x60 pci_free_irq+0x18/0x30 msix_free_irq.part.2+0x46/0x80 [hfi1] msix_clean_up_interrupts+0x2b/0x70 [hfi1] hfi1_init_dd+0x640/0x1a90 [hfi1] do_init_one.isra.19+0x34d/0x680 [hfi1] local_pci_probe+0x41/0x90 work_for_cpu_fn+0x16/0x20 process_one_work+0x1a7/0x360 worker_thread+0x1cf/0x390 ? create_worker+0x1a0/0x1a0 kthread+0x112/0x130 ? kthread_flush_work_fn+0x10/0x10 ret_from_fork+0x35/0x40 The free_irq() results in a callback to the registered interrupt handler, and rcd->do_interrupt is NULL because the receive context data structures are not fully initialized. Fix by ensuring that the do_interrupt is always assigned and adding a guards in the slow path handler to detect and handle a partially initialized receive context and noop the receive. Link: https://lore.kernel.org/r/20211129192003.101968.33612.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: b0ba3c18d6bf ("IB/hfi1: Move normal functions from hfi1_devdata to const array") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Insure use of smp_processor_id() is preempt disabledMike Marciniszyn
The following BUG has just surfaced with our 5.16 testing: BUG: using smp_processor_id() in preemptible [00000000] code: mpicheck/1581081 caller is sdma_select_user_engine+0x72/0x210 [hfi1] CPU: 0 PID: 1581081 Comm: mpicheck Tainted: G S 5.16.0-rc1+ #1 Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0016.033120161139 03/31/2016 Call Trace: <TASK> dump_stack_lvl+0x33/0x42 check_preemption_disabled+0xbf/0xe0 sdma_select_user_engine+0x72/0x210 [hfi1] ? _raw_spin_unlock_irqrestore+0x1f/0x31 ? hfi1_mmu_rb_insert+0x6b/0x200 [hfi1] hfi1_user_sdma_process_request+0xa02/0x1120 [hfi1] ? hfi1_write_iter+0xb8/0x200 [hfi1] hfi1_write_iter+0xb8/0x200 [hfi1] do_iter_readv_writev+0x163/0x1c0 do_iter_write+0x80/0x1c0 vfs_writev+0x88/0x1a0 ? recalibrate_cpu_khz+0x10/0x10 ? ktime_get+0x3e/0xa0 ? __fget_files+0x66/0xa0 do_writev+0x65/0x100 do_syscall_64+0x3a/0x80 Fix this long standing bug by moving the smp_processor_id() to after the rcu_read_lock(). The rcu_read_lock() implicitly disables preemption. Link: https://lore.kernel.org/r/20211129191958.101968.87329.stgit@awfm-01.cornelisnetworks.com Cc: stable@vger.kernel.org Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07IB/hfi1: Correct guard on eager buffer deallocationMike Marciniszyn
The code tests the dma address which legitimately can be 0. The code should test the kernel logical address to avoid leaking eager buffer allocations that happen to map to a dma address of 0. Fixes: 60368186fd85 ("IB/hfi1: Fix user-space buffers mapping with IOMMU enabled") Link: https://lore.kernel.org/r/20211129191952.101968.17137.stgit@awfm-01.cornelisnetworks.com Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07netfs: fix parameter of cleanup()Jeffle Xu
The order of these two parameters is just reversed. gcc didn't warn on that, probably because 'void *' can be converted from or to other pointer types without warning. Cc: stable@vger.kernel.org Fixes: 3d3c95046742 ("netfs: Provide readahead and readpage netfs helpers") Fixes: e1b1240c1ff5 ("netfs: Add write_begin helper") Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Link: https://lore.kernel.org/r/20211207031449.100510-1-jefflexu@linux.alibaba.com/ # v1
2021-12-07netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lockDavid Howells
Taking sb_writers whilst holding mmap_lock isn't allowed and will result in a lockdep warning like that below. The problem comes from cachefiles needing to take the sb_writers lock in order to do a write to the cache, but being asked to do this by netfslib called from readpage, readahead or write_begin[1]. Fix this by always offloading the write to the cache off to a worker thread. The main thread doesn't need to wait for it, so deadlock can be avoided. This can be tested by running the quick xfstests on something like afs or ceph with lockdep enabled. WARNING: possible circular locking dependency detected 5.15.0-rc1-build2+ #292 Not tainted ------------------------------------------------------ holetest/65517 is trying to acquire lock: ffff88810c81d730 (mapping.invalidate_lock#3){.+.+}-{3:3}, at: filemap_fault+0x276/0x7a5 but task is already holding lock: ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&mm->mmap_lock#2){++++}-{3:3}: validate_chain+0x3c4/0x4a8 __lock_acquire+0x89d/0x949 lock_acquire+0x2dc/0x34b __might_fault+0x87/0xb1 strncpy_from_user+0x25/0x18c removexattr+0x7c/0xe5 __do_sys_fremovexattr+0x73/0x96 do_syscall_64+0x67/0x7a entry_SYSCALL_64_after_hwframe+0x44/0xae -> #1 (sb_writers#10){.+.+}-{0:0}: validate_chain+0x3c4/0x4a8 __lock_acquire+0x89d/0x949 lock_acquire+0x2dc/0x34b cachefiles_write+0x2b3/0x4bb netfs_rreq_do_write_to_cache+0x3b5/0x432 netfs_readpage+0x2de/0x39d filemap_read_page+0x51/0x94 filemap_get_pages+0x26f/0x413 filemap_read+0x182/0x427 new_sync_read+0xf0/0x161 vfs_read+0x118/0x16e ksys_read+0xb8/0x12e do_syscall_64+0x67/0x7a entry_SYSCALL_64_after_hwframe+0x44/0xae -> #0 (mapping.invalidate_lock#3){.+.+}-{3:3}: check_noncircular+0xe4/0x129 check_prev_add+0x16b/0x3a4 validate_chain+0x3c4/0x4a8 __lock_acquire+0x89d/0x949 lock_acquire+0x2dc/0x34b down_read+0x40/0x4a filemap_fault+0x276/0x7a5 __do_fault+0x96/0xbf do_fault+0x262/0x35a __handle_mm_fault+0x171/0x1b5 handle_mm_fault+0x12a/0x233 do_user_addr_fault+0x3d2/0x59c exc_page_fault+0x85/0xa5 asm_exc_page_fault+0x1e/0x30 other info that might help us debug this: Chain exists of: mapping.invalidate_lock#3 --> sb_writers#10 --> &mm->mmap_lock#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_lock#2); lock(sb_writers#10); lock(&mm->mmap_lock#2); lock(mapping.invalidate_lock#3); *** DEADLOCK *** 1 lock held by holetest/65517: #0: ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c stack backtrace: CPU: 0 PID: 65517 Comm: holetest Not tainted 5.15.0-rc1-build2+ #292 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Call Trace: dump_stack_lvl+0x45/0x59 check_noncircular+0xe4/0x129 ? print_circular_bug+0x207/0x207 ? validate_chain+0x461/0x4a8 ? add_chain_block+0x88/0xd9 ? hlist_add_head_rcu+0x49/0x53 check_prev_add+0x16b/0x3a4 validate_chain+0x3c4/0x4a8 ? check_prev_add+0x3a4/0x3a4 ? mark_lock+0xa5/0x1c6 __lock_acquire+0x89d/0x949 lock_acquire+0x2dc/0x34b ? filemap_fault+0x276/0x7a5 ? rcu_read_unlock+0x59/0x59 ? add_to_page_cache_lru+0x13c/0x13c ? lock_is_held_type+0x7b/0xd3 down_read+0x40/0x4a ? filemap_fault+0x276/0x7a5 filemap_fault+0x276/0x7a5 ? pagecache_get_page+0x2dd/0x2dd ? __lock_acquire+0x8bc/0x949 ? pte_offset_kernel.isra.0+0x6d/0xc3 __do_fault+0x96/0xbf ? do_fault+0x124/0x35a do_fault+0x262/0x35a ? handle_pte_fault+0x1c1/0x20d __handle_mm_fault+0x171/0x1b5 ? handle_pte_fault+0x20d/0x20d ? __lock_release+0x151/0x254 ? mark_held_locks+0x1f/0x78 ? rcu_read_unlock+0x3a/0x59 handle_mm_fault+0x12a/0x233 do_user_addr_fault+0x3d2/0x59c ? pgtable_bad+0x70/0x70 ? rcu_read_lock_bh_held+0xab/0xab exc_page_fault+0x85/0xa5 ? asm_exc_page_fault+0x8/0x30 asm_exc_page_fault+0x1e/0x30 RIP: 0033:0x40192f Code: ff 48 89 c3 48 8b 05 50 28 00 00 48 85 ed 7e 23 31 d2 4b 8d 0c 2f eb 0a 0f 1f 00 48 8b 05 39 28 00 00 48 0f af c2 48 83 c2 01 <48> 89 1c 01 48 39 d5 7f e8 8b 0d f2 27 00 00 31 c0 85 c9 74 0e 8b RSP: 002b:00007f9931867eb0 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00007f9931868700 RCX: 00007f993206ac00 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00007ffc13e06ee0 RBP: 0000000000000100 R08: 0000000000000000 R09: 00007f9931868700 R10: 00007f99318689d0 R11: 0000000000000202 R12: 00007ffc13e06ee0 R13: 0000000000000c00 R14: 00007ffc13e06e00 R15: 00007f993206a000 Fixes: 726218fdc22c ("netfs: Define an interface to talk to a cache") Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Jeff Layton <jlayton@kernel.org> cc: Jan Kara <jack@suse.cz> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/ [1] Link: https://lore.kernel.org/r/163887597541.1596626.2668163316598972956.stgit@warthog.procyon.org.uk/ # v1
2021-12-07can: m_can: pci: use custom bit timings for Elkhart LakeMatthias Schiffer
The relevant datasheet [1] specifies nonstandard limits for the bit timing parameters. While it is unclear what the exact effect of violating these limits is, it seems like a good idea to adhere to the documentation. [1] Intel Atom® x6000E Series, and Intel® Pentium® and Celeron® N and J Series Processors for IoT Applications Datasheet, Volume 2 (Book 3 of 3), July 2021, Revision 001 Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Link: https://lore.kernel.org/all/9eba5d7c05a48ead4024ffa6e5926f191d8c6b38.1636967198.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: m_can: make custom bittiming fields constMatthias Schiffer
The assigned timing structs will be defined a const anyway, so we can avoid a few casts by declaring the struct fields as const as well. Link: https://lore.kernel.org/all/4508fa4e639164b2584c49a065d90c78a91fa568.1636967198.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07Revert "can: m_can: remove support for custom bit timing"Matthias Schiffer
The timing limits specified by the Elkhart Lake CPU datasheets do not match the defaults. Let's reintroduce the support for custom bit timings. This reverts commit 0ddd83fbebbc5537f9d180d31f659db3564be708. Link: https://lore.kernel.org/all/00c9e2596b1a548906921a574d4ef7a03c0dace0.1636967198.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: m_can: pci: fix incorrect reference clock rateMatthias Schiffer
When testing the CAN controller on our Ekhart Lake hardware, we determined that all communication was running with twice the configured bitrate. Changing the reference clock rate from 100MHz to 200MHz fixed this. Intel's support has confirmed to us that 200MHz is indeed the correct clock rate. Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Link: https://lore.kernel.org/all/c9cf3995f45c363e432b3ae8eb1275e54f009fc8.1636967198.git.matthias.schiffer@ew.tq-group.com Cc: stable@vger.kernel.org Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: m_can: pci: fix iomap_read_fifo() and iomap_write_fifo()Matthias Schiffer
The same fix that was previously done in m_can_platform in commit 99d173fbe894 ("can: m_can: fix iomap_read_fifo() and iomap_write_fifo()") is required in m_can_pci as well to make iomap_read_fifo() and iomap_write_fifo() work for val_count > 1. Fixes: 812270e5445b ("can: m_can: Batch FIFO writes during CAN transmit") Fixes: 1aa6772f64b4 ("can: m_can: Batch FIFO reads during CAN receive") Link: https://lore.kernel.org/all/20211118144011.10921-1-matthias.schiffer@ew.tq-group.com Cc: stable@vger.kernel.org Cc: Matt Kline <matt@bitbashing.io> Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: m_can: m_can_read_fifo: fix memory leak in error branchVincent Mailhol
In m_can_read_fifo(), if the second call to m_can_fifo_read() fails, the function jump to the out_fail label and returns without calling m_can_receive_skb(). This means that the skb previously allocated by alloc_can_skb() is not freed. In other terms, this is a memory leak. This patch adds a goto label to destroy the skb if an error occurs. Issue was found with GCC -fanalyzer, please follow the link below for details. Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors") Link: https://lore.kernel.org/all/20211107050755.70655-1-mailhol.vincent@wanadoo.fr Cc: stable@vger.kernel.org Cc: Matt Kline <matt@bitbashing.io> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: m_can: Disable and ignore ELO interruptBrian Silverman
With the design of this driver, this condition is often triggered. However, the counter that this interrupt indicates an overflow is never read either, so overflowing is harmless. On my system, when a CAN bus starts flapping up and down, this locks up the whole system with lots of interrupts and printks. Specifically, this interrupt indicates the CEL field of ECR has overflowed. All reads of ECR mask out CEL. Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support") Link: https://lore.kernel.org/all/20211129222628.7490-1-brian.silverman@bluerivertech.com Cc: stable@vger.kernel.org Signed-off-by: Brian Silverman <brian.silverman@bluerivertech.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: sja1000: fix use after free in ems_pcmcia_add_card()Dan Carpenter
If the last channel is not available then "dev" is freed. Fortunately, we can just use "pdev->irq" instead. Also we should check if at least one channel was set up. Fixes: fd734c6f25ae ("can/sja1000: add driver for EMS PCMCIA card") Link: https://lore.kernel.org/all/20211124145041.GB13656@kili Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07can: pch_can: pch_can_rx_normal: fix use after freeVincent Mailhol
After calling netif_receive_skb(skb), dereferencing skb is unsafe. Especially, the can_frame cf which aliases skb memory is dereferenced just after the call netif_receive_skb(skb). Reordering the lines solves the issue. Fixes: b21d18b51b31 ("can: Topcliff: Add PCH_CAN driver.") Link: https://lore.kernel.org/all/20211123111654.621610-1-mailhol.vincent@wanadoo.fr Cc: stable@vger.kernel.org Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-06perf bpf_skel: Do not use typedef to avoid error on old clangSong Liu
When building bpf_skel with clang-10, typedef causes confusions like: libbpf: map 'prev_readings': unexpected def kind var. Fix this by removing the typedef. Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/BEF5C312-4331-4A60-AEC0-AD7617CB2BC4@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distrosSong Liu
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1 to then make this the default he found problems in some distros where the system linux/bpf.h file was being used and lacked this: util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS' __uint(map_flags, BPF_F_PRESERVE_ELEMS); So use instead the vmlinux.h file generated by bpftool from BTF info. This fixed these as well, getting the build back working on debian:11, debian:experimental and ubuntu:21.10: In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33: : In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111: : /usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error: : : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found: /usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found #include <asm/types.h>#include <asm/types.h> ^~~~~~~~~~~~~ ^~~~~~~~~~~~~ #include <asm/types.h> ^~~~~~~~~~~~~ 1 error generated. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Song Liu <song@kernel.org> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/CF175681-8101-43D1-ABDB-449E644BE986@fb.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf header: Fix memory leaks when processing feature headersIan Rogers
These leaks were found with leak sanitizer running "perf pipe recording and injection test". In pipe mode feat_fd may hold onto an events struct that needs freeing. When string features are processed they may overwrite an already created string, so free this before the overwrite. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211118201730.2302927-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf test: Reset shadow counts before loadingIan Rogers
Otherwise load counting is an average. Without this change duration_time in test_memory_bandwidth will alter its value if an earlier test contains duration_time. This patch fixes an issue that's introduced in the proposed patch: https://lore.kernel.org/lkml/20211124015226.3317994-1-irogers@google.com/ in perf test "Parse and process metrics". Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211128085810.4027314-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf test: Fix 'Simple expression parser' test on arch without CPU die ↵Thomas Richter
topology info Some platforms do not have CPU die support, for example s390. Commit Cc: Ian Rogers <irogers@google.com> Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") fails on s390: # perf test -Fv 7 ... # FAILED tests/expr.c:173 #num_dies >= #num_packages ---- end ---- Simple expression parser: FAILED! # Investigating this issue leads to these functions: build_cpu_topology() +--> has_die_topology(void) { struct utsname uts; if (uname(&uts) < 0) return false; if (strncmp(uts.machine, "x86_64", 6)) return false; .... } which always returns false on s390. The caller build_cpu_topology() checks has_die_topology() return value. On false the the struct cpu_topology::die_cpu_list is not contructed and has zero entries. This leads to the failing comparison: #num_dies >= #num_packages. s390 of course has a positive number of packages. Fix this and check if the function build_cpu_topology() did build up a die_cpus_list. The number of entries in this list should be larger than 0. If the number of list element is zero, the die_cpus_list has not been created and the check in function test__expr(): TEST_ASSERT_VAL("#num_dies >= #num_packages", \ num_dies >= num_packages) always fails. Output after: # perf test -Fv 7 7: Simple expression parser : --- start --- division by zero syntax error ---- end ---- Simple expression parser: Ok # Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/20211129112339.3003036-1-tmricht@linux.ibm.com [ Added comment in the added 'if (num_dies)' line about architectures not having die topology ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06tools build: Remove needless libpython-version feature check that breaks ↵Arnaldo Carvalho de Melo
test-all fast path Since 66dfdff03d196e51 ("perf tools: Add Python 3 support") we don't use the tools/build/feature/test-libpython-version.c version in any Makefile feature check: $ find tools/ -type f | xargs grep feature-libpython-version $ The only place where this was used was removed in 66dfdff03d196e51: - ifneq ($(feature-libpython-version), 1) - $(warning Python 3 is not yet supported; please set) - $(warning PYTHON and/or PYTHON_CONFIG appropriately.) - $(warning If you also have Python 2 installed, then) - $(warning try something like:) - $(warning $(and ,)) - $(warning $(and ,) make PYTHON=python2) - $(warning $(and ,)) - $(warning Otherwise, disable Python support entirely:) - $(warning $(and ,)) - $(warning $(and ,) make NO_LIBPYTHON=1) - $(warning $(and ,)) - $(error $(and ,)) - else - LDFLAGS += $(PYTHON_EMBED_LDFLAGS) - EXTLIBS += $(PYTHON_EMBED_LIBADD) - LANG_BINDINGS += $(obj-perf)python/perf.so - $(call detected,CONFIG_LIBPYTHON) - endif And nowadays we either build with PYTHON=python3 or just install the python3 devel packages and perf will build against it. But the leftover feature-libpython-version check made the fast path feature detection to break in all cases except when python2 devel files were installed: $ rpm -qa | grep python.*devel python3-devel-3.9.7-1.fc34.x86_64 $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; $ make -C tools/perf O=/tmp/build/perf install-bin make: Entering directory '/var/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j32' parallel build HOSTCC /tmp/build/perf/fixdep.o <SNIP> $ cat /tmp/build/perf/feature/test-all.make.output In file included from test-all.c:18: test-libpython-version.c:5:10: error: #error 5 | #error | ^~~~~ $ ldd ~/bin/perf | grep python libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000) $ As python3 is the norm these days, fix this by just removing the unused feature-libpython-version feature check, making the test-all fast path to work with the common case. With this: $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ; $ make -C tools/perf O=/tmp/build/perf install-bin |& head make: Entering directory '/var/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j32' parallel build HOSTCC /tmp/build/perf/fixdep.o HOSTLD /tmp/build/perf/fixdep-in.o LINK /tmp/build/perf/fixdep Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] $ ldd ~/bin/perf | grep python libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000) $ cat /tmp/build/perf/feature/test-all.make.output $ Reviewed-by: James Clark <james.clark@arm.com> Fixes: 66dfdff03d196e51 ("perf tools: Add Python 3 support") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jaroslav Škarvada <jskarvad@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/YaYmeeC6CS2b8OSz@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf tools: Fix SMT detection fast read pathIan Rogers
sysfs__read_int() returns 0 on success, and so the fast read path was always failing. Fixes: bb629484d924118e ("perf tools: Simplify checking if SMT is active.") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20211124001231.3277836-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06tools headers UAPI: Sync powerpc syscall table file changed by new ↵Arnaldo Carvalho de Melo
futex_waitv syscall To pick the changes in this cset: a0eb2da92b715d0c ("futex: Wireup futex_waitv syscall") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible (adapted from the x86_64 test output): # perf trace -e futex_waitv ^C# # perf trace -v -e futex_waitv event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449) ^C# # perf trace -v -e futex* --max-events 10 event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449) mmap size 528384B ? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out) 0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0 0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... 0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0 0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... # That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl 221 32 futex sys_futex_time32 221 64 futex sys_futex 221 spu futex sys_futex 422 32 futex_time64 sys_futex sys_futex 449 common futex_waitv sys_futex_waitv $ This addresses this perf build warnings: Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com>, Cc: André Almeida <andrealmeid@collabora.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/YZ%2F1OU9mJuyS2HMa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06perf inject: Fix itrace space allowed for new attributesAdrian Hunter
The space allowed for new attributes can be too small if existing header information is large. That can happen, for example, if there are very many CPUs, due to having an event ID per CPU per event being stored in the header information. Fix by adding the existing header.data_offset. Also increase the extra space allowed to 8KiB and align to a 4KiB boundary for neatness. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20211125071457.2066863-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv ↵Arnaldo Carvalho de Melo
syscall To pick the changes in these csets: 6c122360cf2f4c5a ("s390: wire up sys_futex_waitv system call") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible (adapted from the x86_64 test output): # perf trace -e futex_waitv ^C# # perf trace -v -e futex_waitv event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449) ^C# # perf trace -v -e futex* --max-events 10 event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449) ? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out) 0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0 0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... 0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1 0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0 0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0 0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ... # That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl 238 common futex sys_futex sys_futex_time32 422 32 futex_time64 - sys_futex 449 common futex_waitv sys_futex_waitv sys_futex_waitv $ This addresses this perf build warnings: Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl Acked-by: Heiko Carstens <hca@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com>, Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/lkml/YZ%2F2qRW%2FTScYTP1U@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>