summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-09-28Revert "net: set proper memcg for net_init hooks allocations"Shakeel Butt
This reverts commit 1d0403d20f6c281cb3d14c5f1db5317caeec48e9. Anatoly Pugachev reported that the commit 1d0403d20f6c ("net: set proper memcg for net_init hooks allocations") is somehow causing the sparc64 VMs failed to boot and the VMs boot fine with that patch reverted. So, revert the patch for now and later we can debug the issue. Link: https://lore.kernel.org/all/20220918092849.GA10314@u164.east.ru/ Reported-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: Vasily Averin <vvs@openvz.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: cgroups@vger.kernel.org Cc: sparclinux@vger.kernel.org Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Tested-by: Anatoly Pugachev <matorola@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Fixes: 1d0403d20f6c ("net: set proper memcg for net_init hooks allocations") Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-09-28perf arm-spe: augment the data source type with neoverse_spe listJing Zhang
When synthesizing event with SPE data source, commit 4e6430cbb1a9("perf arm-spe: Use SPE data source for neoverse cores") augment the type with source information by MIDR. However, is_midr_in_range only compares the first entry in neoverse_spe. Change is_midr_in_range to is_midr_in_range_list to traverse the neoverse_spe array so that all neoverse cores synthesize event with data source packet. Fixes: 4e6430cbb1a9f1dc ("perf arm-spe: Use SPE data source for neoverse cores") Reviewed-by: Ali Saidi <alisaidi@amazon.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Timothy Hayes <timothy.hayes@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Zhuo Song <zhuo.song@linux.alibaba.com> Link: https://lore.kernel.org/r/1664197396-42672-1-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-28perf tests vmlinux-kallsyms: Update is_ignored_symbol function to match the ↵Athira Rajeev
kernel ignored list The testcase “vmlinux-kallsyms.c” fails in powerpc. vmlinux symtab matches kallsyms: FAILED! This test look at the symbols in the vmlinux DSO and check if we find all of them in the kallsyms dso. But from the powerpc logs , observed that the failure happens for: ERR : 0xc0000000000fe9c8: .Lmfspr_table not on kallsyms ERR : 0xc0000000001009c8: .Lmtspr_table not on kallsyms These are labels ( with .L) in the source code and has to be ignored. Reference code with .Lmtspr_table: arch/powerpc/xmon/spr_access.S The testcases invokes is_ignored_symbol() function to ignore hidden symbols in the dso like local symbols. This function is adapted from is_ignored_symbol() kernel function in code: scripts/kallsyms.c . The kernel function got some updates which is not reflected in the testcase function and the new updates also handles ignoring "labels". Below is the changes that went in the kernel function. /* Symbol names that begin with the following are ignored.*/ static const char * const ignored_prefixes[] = { "$", /* local symbols for ARM, MIPS, etc. */ - ".LASANPC", /* s390 kasan local symbols */ + ".L", /* local labels, .LBB,.Ltmpxxx,.L__unnamed_xx,.LASANPC, etc. */ "__crc_", /* modversions */ "__efistub_", /* arm64 EFI stub namespace */ - "__kvm_nvhe_", /* arm64 non-VHE KVM namespace */ + "__kvm_nvhe_$", /* arm64 local symbols in non-VHE KVM namespace */ + "__kvm_nvhe_.L", /* arm64 local symbols in non-VHE KVM namespace */ "__AArch64ADRPThunk_", /* arm64 lld */ "__ARMV5PILongThunk_", /* arm lld */ "__ARMV7PILongThunk_", This change is part of below commits and will handle the symbols with “.L” commit d4c858643263 ("kallsyms: ignore all local labels prefixed by '.L'") commit 6ccf9cb557bd ("KVM: arm64: Symbolize the nVHE HYP addresses") Update the testcase function to include the new changes. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220928045218.37322-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-09-28ata: libata-sata: Fix device queue depth controlDamien Le Moal
The function __ata_change_queue_depth() uses the helper ata_scsi_find_dev() to get the ata_device structure of a scsi device and set that device maximum queue depth. However, when the ata device is managed by libsas, ata_scsi_find_dev() returns NULL, turning __ata_change_queue_depth() into a nop, which prevents the user from setting the maximum queue depth of ATA devices used with libsas based HBAs. Fix this by renaming __ata_change_queue_depth() to ata_change_queue_depth() and adding a pointer to the ata_device structure of the target device as argument. This pointer is provided by ata_scsi_change_queue_depth() using ata_scsi_find_dev() in the case of a libata managed device and by sas_change_queue_depth() using sas_to_ata_dev() in the case of a libsas managed ata device. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: John Garry <john.garry@huawei.com>
2022-09-28ata: libata-scsi: Fix initialization of device queue depthDamien Le Moal
For SATA devices supporting NCQ, drivers using libsas first initialize a scsi device queue depth based on the controller and device capabilities, leading to the scsi device queue_depth field being 32 (ATA maximum queue depth) for most setup. However, if libata was loaded using the force=[ID]]noncq argument, the default queue depth should be set to 1 to reflect the fact that queuable commands will never be used. This is consistent with manually setting a device queue depth to 1 through sysfs as that disables NCQ use for the device. Fix ata_scsi_dev_config() to honor the noncq parameter by sertting the device queue depth to 1 for devices that do not have the ATA_DFLAG_NCQ flag set. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Tested-by: John Garry <john.garry@huawei.com>
2022-09-28can: c_can: don't cache TX messages for C_CAN coresMarc Kleine-Budde
As Jacob noticed, the optimization introduced in 387da6bc7a82 ("can: c_can: cache frames to operate as a true FIFO") doesn't properly work on C_CAN, but on D_CAN IP cores. The exact reasons are still unknown. For now disable caching if CAN frames in the TX path for C_CAN cores. Fixes: 387da6bc7a82 ("can: c_can: cache frames to operate as a true FIFO") Link: https://lore.kernel.org/all/20220928083354.1062321-1-mkl@pengutronix.de Link: https://lore.kernel.org/all/15a8084b-9617-2da1-6704-d7e39d60643b@gmail.com Reported-by: Jacob Kroon <jacob.kroon@gmail.com> Tested-by: Jacob Kroon <jacob.kroon@gmail.com> Cc: stable@vger.kernel.org # v5.15 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-09-27Merge tag 'wireless-2022-09-27' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== A few late-comer fixes: * locking in mac80211 MLME * non-QoS driver crash/regression * minstrel memory corruption * TX deadlock * TX queues not always enabled * HE/EHT bitrate calculation * tag 'wireless-2022-09-27' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: mac80211: mlme: Fix double unlock on assoc success handling wifi: mac80211: mlme: Fix missing unlock on beacon RX wifi: mac80211: fix memory corruption in minstrel_ht_update_rates() wifi: mac80211: fix regression with non-QoS drivers wifi: mac80211: ensure vif queues are operational after start wifi: mac80211: don't start TX with fq->lock to fix deadlock wifi: cfg80211: fix MCS divisor value ==================== Link: https://lore.kernel.org/r/20220927135923.45312-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27Merge tag 'soc-fixes-6.0-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "This should be the last set of bugfixes in the SoC tree: - Two fixes for Arm integrator, dealing with a regression caused by invalid DT properties combined with a change in dma address translation, and missing device_type annotations on the PCI bus - Fixes for drivers/reset/, addressing bugs in i.MX8MP, Sparx5 and NPCM8XX platforms - Bjorn Andersson's email address changes in the MAINTAINERS file - Multiple minor fixes to Qualcomm dts files, and a change to the remoteproc firmware filename that did not match the actual path in the linux-firmware package - Minor code fixes for the Allwinner/sunxi SRAM driver, and the broadcom STB Bus Interface Unit driver - A build fix for the sunplus sp7021 platform - Two dts fixes for TI OMAP family SoCs, addressing an extraneous usb4 device node and an incorrect DMA handle" * tag 'soc-fixes-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: ARM: dts: integrator: Fix DMA ranges ARM: dts: integrator: Tag PCI host with device_type ARM: sunplus: fix serial console kconfig and build problems reset: npcm: fix iprst2 and iprst4 setting arm64: dts: qcom: sm8350: fix UFS PHY serdes size soc: bcm: brcmstb: biuctrl: Avoid double of_node_put() arm64: dts: qcom: sc8280xp-x13s: Update firmware location soc: sunxi: sram: Fix debugfs info for A64 SRAM C soc: sunxi: sram: Fix probe function ordering issues soc: sunxi: sram: Prevent the driver from being unbound soc: sunxi: sram: Actually claim SRAM regions ARM: dts: am5748: keep usb4_tm disabled reset: microchip-sparx5: issue a reset on startup reset: imx7: Fix the iMX8MP PCIe PHY PERST support MAINTAINERS: Update Bjorn's email address arm64: dts: qcom: sc7280: move USB wakeup-source property arm64: dts: qcom: thinkpad-x13s: Fix firmware location arm64: dts: qcom: sm8150: Fix fastrpc iommu values ARM: dts: am33xx: Fix MMCHS0 dma properties
2022-09-27vdpa/mlx5: Fix MQ to support non power of two num queuesEli Cohen
RQT objects require that a power of two value be configured for both rqt_max_size and rqt_actual size. For create_rqt, make sure to round up to the power of two the value of given by the user who created the vdpa device and given by ndev->rqt_size. The actual size is also rounded up to the power of two using the current number of VQs given by ndev->cur_num_vqs. Same goes with modify_rqt where we need to make sure act size is power of two based on the new number of QPs. Without this patch, attempt to create a device with non power of two QPs would result in error from firmware. Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support") Signed-off-by: Eli Cohen <elic@nvidia.com> Message-Id: <20220912125019.833708-1-elic@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-09-27vduse: prevent uninitialized memory accessesMaxime Coquelin
If the VDUSE application provides a smaller config space than the driver expects, the driver may use uninitialized memory from the stack. This patch prevents it by initializing the buffer passed by the driver to store the config value. This fix addresses CVE-2022-2308. Cc: stable@vger.kernel.org # v5.15+ Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Reviewed-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20220831154923.97809-1-maxime.coquelin@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
2022-09-27virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()Suwan Kim
If a request fails at virtio_queue_rqs(), it is inserted to requeue_list and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs() despite the failure. [ 1.890468] ------------[ cut here ]------------ [ 1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143 blk_mq_start_request+0x8a/0xe0 [ 1.891045] Modules linked in: [ 1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44 [ 1.891504] Hardware name: ChromiumOS crosvm, BIOS 0 [ 1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0 [ 1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53 20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b 41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b 01 0f [ 1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202 [ 1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000 [ 1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000 [ 1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100 [ 1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000 [ 1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00 [ 1.894060] FS: 00007ffa143a4640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 1.894412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0 [ 1.894953] Call Trace: [ 1.895139] <TASK> [ 1.895303] virtblk_prep_rq+0x1e5/0x280 [ 1.895509] virtio_queue_rq+0x5c/0x310 [ 1.895710] ? virtqueue_add_sgs+0x95/0xb0 [ 1.895905] ? _raw_spin_unlock_irqrestore+0x16/0x30 [ 1.896133] ? virtio_queue_rqs+0x340/0x390 [ 1.896453] ? sbitmap_get+0xfa/0x220 [ 1.896678] __blk_mq_issue_directly+0x41/0x180 [ 1.896906] blk_mq_plug_issue_direct+0xd8/0x2c0 [ 1.897115] blk_mq_flush_plug_list+0x115/0x180 [ 1.897342] blk_add_rq_to_plug+0x51/0x130 [ 1.897543] blk_mq_submit_bio+0x3a1/0x570 [ 1.897750] submit_bio_noacct_nocheck+0x418/0x520 [ 1.897985] ? submit_bio_noacct+0x1e/0x260 [ 1.897989] ext4_bio_write_page+0x222/0x420 [ 1.898000] mpage_process_page_bufs+0x178/0x1c0 [ 1.899451] mpage_prepare_extent_to_map+0x2d2/0x440 [ 1.899603] ext4_writepages+0x495/0x1020 [ 1.899733] do_writepages+0xcb/0x220 [ 1.899871] ? __seccomp_filter+0x171/0x7e0 [ 1.900006] file_write_and_wait_range+0xcd/0xf0 [ 1.900167] ext4_sync_file+0x72/0x320 [ 1.900308] __x64_sys_fsync+0x66/0xa0 [ 1.900449] do_syscall_64+0x31/0x50 [ 1.900595] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 1.900747] RIP: 0033:0x7ffa16ec96ea [ 1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b 44 24 [ 1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea [ 1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016 [ 1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640 [ 1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0 [ 1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000 [ 1.902763] </TASK> [ 1.902877] ---[ end trace 0000000000000000 ]--- To avoid calling blk_mq_start_request() twice, This patch moves the execution of blk_mq_start_request() to the end of virtblk_prep_rq(). And instead of requeuing failed request to plug list in the error path of virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed request state to MQ_RQ_IDLE. Then virtblk can safely handle the request on the next trial. Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Reported-by: Alexandre Courbot <acourbot@chromium.org> Tested-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Message-Id: <20220830150153.12627-1-suwan.kim027@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
2022-09-27virtio_test: fixup for vq resetXuan Zhuo
Fix virtio test compilation failure caused by vq reset. ../../drivers/virtio/virtio_ring.c: In function ‘vring_create_virtqueue_packed’: ../../drivers/virtio/virtio_ring.c:1999:8: error: ‘struct virtqueue’ has no member named ‘reset’ 1999 | vq->vq.reset = false; | ^ ../../drivers/virtio/virtio_ring.c: In function ‘__vring_new_virtqueue’: ../../drivers/virtio/virtio_ring.c:2493:8: error: ‘struct virtqueue’ has no member named ‘reset’ 2493 | vq->vq.reset = false; | ^ ../../drivers/virtio/virtio_ring.c: In function ‘virtqueue_resize’: ../../drivers/virtio/virtio_ring.c:2587:18: error: ‘struct virtqueue’ has no member named ‘num_max’ 2587 | if (num > vq->vq.num_max) | ^ ../../drivers/virtio/virtio_ring.c:2596:11: error: ‘struct virtio_device’ has no member named ‘config’ 2596 | if (!vdev->config->disable_vq_and_reset) | ^~ ../../drivers/virtio/virtio_ring.c:2599:11: error: ‘struct virtio_device’ has no member named ‘config’ 2599 | if (!vdev->config->enable_vq_after_reset) | ^~ ../../drivers/virtio/virtio_ring.c:2602:12: error: ‘struct virtio_device’ has no member named ‘config’ 2602 | err = vdev->config->disable_vq_and_reset(_vq); | ^~ ../../drivers/virtio/virtio_ring.c:2614:10: error: ‘struct virtio_device’ has no member named ‘config’ 2614 | if (vdev->config->enable_vq_after_reset(_vq)) | ^~ make: *** [<builtin>: virtio_ring.o] Error 1 Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20220830110549.103168-1-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-09-27virtio-crypto: fix memory-leaklei he
Fix memory-leak for virtio-crypto akcipher request, this problem is introduced by 59ca6c93387d3(virtio-crypto: implement RSA algorithm). The leak can be reproduced and tested with the following script inside virtual machine: #!/bin/bash LOOP_TIMES=10000 # required module: pkcs8_key_parser, virtio_crypto modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m modprobe virtio_crypto # if CONFIG_CRYPTO_DEV_VIRTIO=m rm -rf /tmp/data dd if=/dev/random of=/tmp/data count=1 bs=230 # generate private key and self-signed cert openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem \ -outform der -out cert.der \ -subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=always.com/emailAddress=yy@always.com" # convert private key from pem to der openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der # add key PRIV_KEY_ID=`cat key.der | keyctl padd asymmetric test_priv_key @s` echo "priv key id = "$PRIV_KEY_ID PUB_KEY_ID=`cat cert.der | keyctl padd asymmetric test_pub_key @s` echo "pub key id = "$PUB_KEY_ID # query key keyctl pkey_query $PRIV_KEY_ID 0 keyctl pkey_query $PUB_KEY_ID 0 # here we only run pkey_encrypt becasuse it is the fastest interface function bench_pub() { keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.pub } # do bench_pub in loop to obtain the memory leak for (( i = 0; i < ${LOOP_TIMES}; ++i )); do bench_pub done Signed-off-by: lei he <helei.sig11@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <20220919075158.3625-1-helei.sig11@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-09-27vdpa/ifcvf: fix the calculation of queuepairAngus Chen
The q_pair_id to address a queue pair in the lm bar should be calculated by queue_id / 2 rather than queue_id / nr_vring. Fixes: 2ddae773c93b ("vDPA/ifcvf: detect and use the onboard number of queues directly") Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com> Reviewed-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Zhu Lingshan <lingshan.zhu@intel.com> Message-Id: <20220923091013.191-1-angus.chen@jaguarmicro.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-09-27drm/amdgpu: Add amdgpu suspend-resume code path under SRIOVBokun Zhang
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor during the suspend time. Furthermore, we cannot request a mode 1 reset under SRIOV as VF. Therefore, we will skip it as it is called in suspend_noirq() function. - In the resume code path, we need to send REQ_GPU_INIT to the hypervisor and also resume PSP IP block under SRIOV. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-09-27drm/amdgpu: Remove fence_process in count_emittedJiadong.Zhu
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amdgpu: Correct the position in patch_cond_execJiadong.Zhu
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec underflows when the wptr is divisible by ring->buf_mask + 1. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jiadong.Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: fill in clock values when DPM is not enabledSamson Tam
[Why] For individual feature testing, PMFW may not report all clock values back. Driver will default them to 0 but this will cause the BB table to be skipped and default to one state with max clocks. [How] Add helper function to scan through initial clock values and populate them with default clock limits so that BB table can be built. Add dpm_enabled flag to check when DPM is not enabled and to trigger helper function. Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Samson Tam <samson.tam@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Avoid unnecessary pixel rate divider programmingTaimur Hassan
[Why] Programming pixel rate divider when FIFO is enabled can cause FIFO error. [How] Skip divider programming when divider values are the same to prevent FIFO error. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Remove assert for odm transition caseEric Bernstein
Remove assert that will hit during odm transition case, since this is a valid case. Signed-off-by: Eric Bernstein <eric.bernstein@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Fix typo in get_pixel_rate_divTaimur Hassan
[Why & How] Some FIFO errors still occur due to reading wrong pixel rate divider. Fix typo to prevent FIFO error. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Fix audio on display after unplugging anotherAric Cyr
Revert "dc: skip audio setup when audio stream is enabled" This reverts commit 65fbfb02c2734cacffec5e3f492e1b4f1dabcf98 [why] We have minimal pipe split transition method to avoid pipe allocation outage.However, this method will invoke audio setup which cause audio output stuck once pipe reallocate. [how] skip audio setup for pipelines which audio stream has been enabled Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Add explicit FIFO disable for DP blankNicholas Kazlauskas
[Why] We rely on DMCUB to do this when disabling the link but it should actually come before we disable the DP VID stream. If we don't then the FIFO can end up with underflow that persists the next time it's enabled. [How] Add a DCN314 specific blank sequence that will disable the DIG FIFO first. Reviewed-by: Syed Hassan <Syed.Hassan@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Wrap OTG disable workaround with FIFO controlNicholas Kazlauskas
[Why] The DIO FIFO will underflow if we turn off the OTG before we turn off the FIFO. Since this happens as part of the OTG workaround and we don't reset the FIFO afterwards we see the error persist. [How] Add disable FIFO before the disable CRTC and enable FIFO after enabling the CRTC. Reviewed-by: Syed Hassan <Syed.Hassan@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Do DIO FIFO enable after DP video stream enableNicholas Kazlauskas
[Why] Avoids a race condition where DIO FIFO can underflow due to no incoming data available. [How] Shift the FIFO enable below stream enable. Make sure fullness level is written before the DIO reset takes place and that we're not doing it twice. Reviewed-by: Syed Hassan <Syed.Hassan@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Update DCN32 to use new SR latenciesAlvin Lee
[Description] Update to new SR latencies for DCN32 Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com> Reviewed-by: Jun Lei <Jun.Lei@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/display: Avoid avoid unnecessary pixel rate divider programmingTaimur Hassan
[Why] Programming pixel rate divider when FIFO is enabled can cause FIFO error. [How] Skip divider programming when divider values are the same to prevent FIFO error. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Jasdeep Dhillon <jdhillon@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amdkfd: fix dropped interrupt in kfd_int_process_v11Graham Sider
Shader wave interrupts were getting dropped in event_interrupt_wq_v11 if the PRIV bit was set to 1. This would often lead to a hang. Until debugger logic is upstreamed, expand comment to stop early return. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amdgpu: pass queue size and is_aql_queue to MESGraham Sider
Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also re-use gds_size for the queue size (unused for KFD). MES requires the queue size in order to compute the actual wptr offset within the queue RB since it increases monotonically for AQL queues. v2: Make is_aql_queue assign clearer Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amdkfd: fix MQD init for GFX11 in init_mqdGraham Sider
Set remaining compute_static_thread_mgmt_se* accordingly. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/pm: use adverse selection for dpm features unsupported by driverEvan Quan
It's vbios and pmfw instead of driver who decide whether some dpm features is supported or not. Driver just de-selects those features which are not permitted on user's request. Thus, we use adverse selects model. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amd/pm: enable gfxoff feature for SMU 13.0.0Evan Quan
The feature is ready with latest 78.58.0 PMFW. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27drm/amdgpu: avoid gfx register accessing during gfxoffEvan Quan
Make sure gfxoff is disabled before gfx register accessing. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-09-27ice: xsk: drop power of 2 ring size restriction for AF_XDPMaciej Fijalkowski
We had multiple customers in the past months that reported commit 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") makes them unable to use ring size of 8160 in conjunction with AF_XDP. Remove this restriction. Fixes: 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2") CC: Alasdair McWilliam <alasdair.mcwilliam@outlook.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27ice: xsk: change batched Tx descriptor cleaningMaciej Fijalkowski
AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy" way - descriptors are not cleaned immediately after send. We rather hold on with cleaning until we see that free space in ring drops below particular threshold. This was supposed to reduce the amount of unnecessary work related to cleaning and instead of keeping the ring empty, ring was rather saturated. In AF_XDP realm cleaning Tx descriptors implies producing them to CQ. This is a way of letting know user space that particular descriptor has been sent, as John points out in [0]. We tried to implement serial descriptor cleaning which would be used in conjunction with batched cleaning but it made code base more convoluted and probably harder to maintain in future. Therefore we step away from batched cleaning in a current form in favor of an approach where we set RS bit on every last descriptor from a batch and clean always at the beginning of ice_xmit_zc(). This means that we give up a bit of Tx performance, but this doesn't hurt l2fwd scenario which is way more meaningful than txonly as this can be treaten as AF_XDP based packet generator. l2fwd is not hurt due to the fact that Tx side is much faster than Rx and Rx is the one that has to catch Tx up. FWIW Tx descriptors are still produced in a batched way. [0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/ Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-09-27net: usb: qmi_wwan: Add new usb-id for Dell branded EM7455Frank Wunderlich
Add support for Dell 5811e (EM7455) with USB-id 0x413c:0x81c2. Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Cc: stable@vger.kernel.org Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20220926150740.6684-3-linux@fw-web.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27Input: snvs_pwrkey - fix SNVS_HPVIDR1 register addressSebastian Krzyszkowiak
Both i.MX6 and i.MX8 reference manuals list 0xBF8 as SNVS_HPVIDR1 (chapters 57.9 and 6.4.5 respectively). Without this, trying to read the revision number results in 0 on all revisions, causing the i.MX6 quirk to apply on all platforms, which in turn causes the driver to synthesise power button release events instead of passing the real one as they happen even on platforms like i.MX8 where that's not wanted. Fixes: 1a26c920717a ("Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q") Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm> Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm> Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/4599101.ElGaqSPkdT@pliszka Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-09-27Merge tag 'sound-6.0-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A few device-specific fixes, mostly for ASoC. All look small / trivial enough" * tag 'sound-6.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda: intel-dsp-config: add missing RaptorLake PCI IDs ASoC: tas2770: Reinit regcache on reset ASoC: nau8824: Fix semaphore is released unexpectedly ASoC: Intel: sof_sdw: add support for Dell SKU 0AFF ASoC: imx-card: Fix refcount issue with of_node_put ASoC: rt5640: Fix the issue of the abnormal JD2 status
2022-09-27Merge tag 'irqchip-fixes-6.0-2' of ↵Borislav Petkov
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull more irqchip fixes for 6.0 from Marc Zyngier: - A couple of configuration fixes for the recently merged Loongarch drivers - A fix to avoid dynamic allocation of a cpumask which was causing issues with PREEMPT_RT and the GICv3 ITS - A tightening of an error check in the stm32 exti driver Link: https://lore.kernel.org/r/20220916085158.2592518-1-maz@kernel.org
2022-09-27KVM: selftests: Skip tests that require EPT when it is not availableDavid Matlack
Skip selftests that require EPT support in the VM when it is not available. For example, if running on a machine where kvm_intel.ept=N since KVM does not offer EPT support to guests if EPT is not supported on the host. This commit causes vmx_dirty_log_test to be skipped instead of failing on hosts where kvm_intel.ept=N. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220926171457.532542-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-09-27mmc: hsq: Fix data stomping during mmc recoveryWenchao Chen
The block device uses multiple queues to access emmc. There will be up to 3 requests in the hsq of the host. The current code will check whether there is a request doing recovery before entering the queue, but it will not check whether there is a request when the lock is issued. The request is in recovery mode. If there is a request in recovery, then a read and write request is initiated at this time, and the conflict between the request and the recovery request will cause the data to be trampled. Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com> Fixes: 511ce378e16f ("mmc: Add MMC host software queue support") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220916090506.10662-1-wenchao.chen666@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-09-27selftests: Fix the if conditions of in test_extra_filter()Wang Yufen
The socket 2 bind the addr in use, bind should fail with EADDRINUSE. So if bind success or errno != EADDRINUSE, testcase should be failed. Fixes: 3ca8e4029969 ("soreuseport: BPF selection functional test") Signed-off-by: Wang Yufen <wangyufen@huawei.com> Link: https://lore.kernel.org/r/1663916557-10730-1-git-send-email-wangyufen@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27net: phy: Don't WARN for PHY_UP state in mdio_bus_phy_resume()Lukas Wunner
Commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") introduced a WARN() on resume from system sleep if a PHY is not in PHY_HALTED state. Commit 6dbe852c379f ("net: phy: Don't WARN for PHY_READY state in mdio_bus_phy_resume()") added an exemption for PHY_READY state from the WARN(). It turns out PHY_UP state needs to be exempted as well because the following may happen on suspend: mdio_bus_phy_suspend() phy_stop_machine() phydev->state = PHY_UP # if (phydev->state >= PHY_UP) Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/netdev/2b1a1588-505e-dff3-301d-bfc1fb14d685@samsung.com/ Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Cc: Xiaolei Wang <xiaolei.wang@windriver.com> Link: https://lore.kernel.org/r/8128fdb51eeebc9efbf3776a4097363a1317aaf1.1663905575.git.lukas@wunner.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27net: stmmac: power up/down serdes in stmmac_open/releaseJunxiao Chang
This commit fixes DMA engine reset timeout issue in suspend/resume with ADLink I-Pi SMARC Plus board which dmesg shows: ... [ 54.678271] PM: suspend exit [ 54.754066] intel-eth-pci 0000:00:1d.2 enp0s29f2: PHY [stmmac-3:01] driver [Maxlinear Ethernet GPY215B] (irq=POLL) [ 54.755808] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-0 ... [ 54.780482] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-7 [ 55.784098] intel-eth-pci 0000:00:1d.2: Failed to reset the dma [ 55.784111] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_hw_setup: DMA engine initialization failed [ 55.784115] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_open: Hw setup failed ... The issue is related with serdes which impacts clock. There is serdes in ADLink I-Pi SMARC board ethernet controller. Please refer to commit b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence") for detial. When issue is reproduced, DMA engine clock is not ready because serdes is not powered up. To reproduce DMA engine reset timeout issue with hardware which has serdes in GBE controller, install Ubuntu. In Ubuntu GUI, click "Power Off/Log Out" -> "Suspend" menu, it disables network interface, then goes to sleep mode. When it wakes up, it enables network interface again. Stmmac driver is called in this way: 1. stmmac_release: Stop network interface. In this function, it disables DMA engine and network interface; 2. stmmac_suspend: It is called in kernel suspend flow. But because network interface has been disabled(netif_running(ndev) is false), it does nothing and returns directly; 3. System goes into S3 or S0ix state. Some time later, system is waken up by keyboard or mouse; 4. stmmac_resume: It does nothing because network interface has been disabled; 5. stmmac_open: It is called to enable network interace again. DMA engine is initialized in this API, but serdes is not power on so there will be DMA engine reset timeout issue. Similarly, serdes powerdown should be added in stmmac_release. Network interface might be disabled by cmd "ifconfig eth0 down", DMA engine, phy and mac have been disabled in ndo_stop callback, serdes should be powered down as well. It doesn't make sense that serdes is on while other components have been turned off. If ethernet interface is in enabled state(netif_running(ndev) is true) before suspend/resume, the issue couldn't be reproduced because serdes could be powered up in stmmac_resume. Because serdes_powerup is added in stmmac_open, it doesn't need to be called in probe function. Fixes: b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence") Signed-off-by: Junxiao Chang <junxiao.chang@intel.com> Reviewed-by: Voon Weifeng <weifeng.voon@intel.com> Tested-by: Jimmy JS Chen <jimmyjs.chen@adlinktech.com> Tested-by: Looi, Hong Aun <hong.aun.looi@intel.com> Link: https://lore.kernel.org/r/20220923050448.1220250-1-junxiao.chang@intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27wifi: mac80211: mlme: Fix double unlock on assoc success handlingRafael Mendonca
Commit 6911458dc428 ("wifi: mac80211: mlme: refactor assoc success handling") moved the per-link setup out of ieee80211_assoc_success() into a new function ieee80211_assoc_config_link() but missed to remove the unlock of 'sta_mtx' in case of HE capability/operation missing on HE AP, which leads to a double unlock: ieee80211_assoc_success() { ... ieee80211_assoc_config_link() { ... if (!(link->u.mgd.conn_flags & IEEE80211_CONN_DISABLE_HE) && (!elems->he_cap || !elems->he_operation)) { mutex_unlock(&sdata->local->sta_mtx); ... } ... } ... mutex_unlock(&sdata->local->sta_mtx); ... } Fixes: 6911458dc428 ("wifi: mac80211: mlme: refactor assoc success handling") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220925143420.784975-1-rafaelmendsr@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-27wifi: mac80211: mlme: Fix missing unlock on beacon RXRafael Mendonca
Commit 98b0b467466c ("wifi: mac80211: mlme: use correct link_sta") switched to link station instead of deflink and added some checks to do that, which are done with the 'sta_mtx' mutex held. However, the error path of these checks does not unlock 'sta_mtx' before returning. Fixes: 98b0b467466c ("wifi: mac80211: mlme: use correct link_sta") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Link: https://lore.kernel.org/r/20220924184042.778676-1-rafaelmendsr@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-27wifi: mac80211: fix memory corruption in minstrel_ht_update_rates()Paweł Lenkow
During our testing of WFM200 module over SDIO on i.MX6Q-based platform, we discovered a memory corruption on the system, tracing back to the wfx driver. Using kfence, it was possible to trace it back to the root cause, which is hw->max_rates set to 8 in wfx_init_common, while the maximum defined by IEEE80211_TX_TABLE_SIZE is 4. This causes array out-of-bounds writes during updates of the rate table, as seen below: BUG: KFENCE: memory corruption in kfree_rcu_work+0x320/0x36c Corrupted memory at 0xe0a4ffe0 [ 0x03 0x03 0x03 0x03 0x01 0x00 0x00 0x02 0x02 0x02 0x09 0x00 0x21 0xbb 0xbb 0xbb ] (in kfence-#81): kfree_rcu_work+0x320/0x36c process_one_work+0x3ec/0x920 worker_thread+0x60/0x7a4 kthread+0x174/0x1b4 ret_from_fork+0x14/0x2c 0x0 kfence-#81: 0xe0a4ffc0-0xe0a4ffdf, size=32, cache=kmalloc-64 allocated by task 297 on cpu 0 at 631.039555s: minstrel_ht_update_rates+0x38/0x2b0 [mac80211] rate_control_tx_status+0xb4/0x148 [mac80211] ieee80211_tx_status_ext+0x364/0x1030 [mac80211] ieee80211_tx_status+0xe0/0x118 [mac80211] ieee80211_tasklet_handler+0xb0/0xe0 [mac80211] tasklet_action_common.constprop.0+0x11c/0x148 __do_softirq+0x1a4/0x61c irq_exit+0xcc/0x104 call_with_stack+0x18/0x20 __irq_svc+0x80/0xb0 wq_worker_sleeping+0x10/0x100 wq_worker_sleeping+0x10/0x100 schedule+0x50/0xe0 schedule_timeout+0x2e0/0x474 wait_for_completion+0xdc/0x1ec mmc_wait_for_req_done+0xc4/0xf8 mmc_io_rw_extended+0x3b4/0x4ec sdio_io_rw_ext_helper+0x290/0x384 sdio_memcpy_toio+0x30/0x38 wfx_sdio_copy_to_io+0x88/0x108 [wfx] wfx_data_write+0x88/0x1f0 [wfx] bh_work+0x1c8/0xcc0 [wfx] process_one_work+0x3ec/0x920 worker_thread+0x60/0x7a4 kthread+0x174/0x1b4 ret_from_fork+0x14/0x2c 0x0 After discussion on the wireless mailing list it was clarified that the issue has been introduced by: commit ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates") and fix shall be in minstrel_ht_update_rates in rc80211_minstrel_ht.c. Fixes: ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates") Link: https://lore.kernel.org/all/12e5adcd-8aed-f0f7-70cc-4fb7b656b829@camlingroup.com/ Link: https://lore.kernel.org/linux-wireless/20220915131445.30600-1-lech.perczak@camlingroup.com/ Cc: Jérôme Pouiller <jerome.pouiller@silabs.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Peter Seiderer <ps.report@gmx.net> Cc: Kalle Valo <kvalo@kernel.org> Cc: Krzysztof Drobiński <krzysztof.drobinski@camlingroup.com>, Signed-off-by: Paweł Lenkow <pawel.lenkow@camlingroup.com> Signed-off-by: Lech Perczak <lech.perczak@camlingroup.com> Reviewed-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Jérôme Pouiller <jerome.pouiller@silabs.com> Acked-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-27wifi: mac80211: fix regression with non-QoS driversHans de Goede
Commit 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port") changed ieee80211_tx_control_port() to aways call __ieee80211_select_queue() without checking local->hw.queues. __ieee80211_select_queue() returns a queue-id between 0 and 3, which means that now ieee80211_tx_control_port() may end up setting the queue-mapping for a skb to a value higher then local->hw.queues if local->hw.queues is less then 4. Specifically this is a problem for ralink rt2500-pci cards where local->hw.queues is 2. There this causes rt2x00queue_get_tx_queue() to return NULL and the following error to be logged: "ieee80211 phy0: rt2x00mac_tx: Error - Attempt to send packet over invalid queue 2", after which association with the AP fails. Other callers of __ieee80211_select_queue() skip calling it when local->hw.queues < IEEE80211_NUM_ACS, add the same check to ieee80211_tx_control_port(). This fixes ralink rt2500-pci and similar cards when less then 4 tx-queues no longer working. Fixes: 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port") Cc: Markus Theil <markus.theil@tu-ilmenau.de> Suggested-by: Stanislaw Gruszka <stf_xl@wp.pl> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20220918192052.443529-1-hdegoede@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-27wifi: mac80211: ensure vif queues are operational after startAlexander Wetzel
Make sure local->queue_stop_reasons and vif.txqs_stopped stay in sync. When a new vif is created the queues may end up in an inconsistent state and be inoperable: Communication not using iTXQ will work, allowing to e.g. complete the association. But the 4-way handshake will time out. The sta will not send out any skbs queued in iTXQs. All normal attempts to start the queues will fail when reaching this state. local->queue_stop_reasons will have marked all queues as operational but vif.txqs_stopped will still be set, creating an inconsistent internal state. In reality this seems to be race between the mac80211 function ieee80211_do_open() setting SDATA_STATE_RUNNING and the wake_txqs_tasklet: Depending on the driver and the timing the queues may end up to be operational or not. Cc: stable@vger.kernel.org Fixes: f856373e2f31 ("wifi: mac80211: do not wake queues on a vif that is being stopped") Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Acked-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20220915130946.302803-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2022-09-27wifi: mac80211: don't start TX with fq->lock to fix deadlockAlexander Wetzel
ieee80211_txq_purge() calls fq_tin_reset() and ieee80211_purge_tx_queue(); Both are then calling ieee80211_free_txskb(). Which can decide to TX the skb again. There are at least two ways to get a deadlock: 1) When we have a TDLS teardown packet queued in either tin or frags ieee80211_tdls_td_tx_handle() will call ieee80211_subif_start_xmit() while we still hold fq->lock. ieee80211_txq_enqueue() will thus deadlock. 2) A variant of the above happens if aggregation is up and running: In that case ieee80211_iface_work() will deadlock with the original task: The original tasks already holds fq->lock and tries to get sta->lock after kicking off ieee80211_iface_work(). But the worker can get sta->lock prior to the original task and will then spin for fq->lock. Avoid these deadlocks by not sending out any skbs when called via ieee80211_free_txskb(). Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> Link: https://lore.kernel.org/r/20220915124120.301918-1-alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>