summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-15MAINTAINERS: add reviewers for SMC SocketsJan Karcher
adding three people from Alibaba as reviewers for SMC. They are currently working on improving SMC on other architectures than s390 and help with reviewing patches on top. Thank you D. Wythe, Tony Lu and Wen Gu for your contributions and collaboration and welcome on board as reviewers! Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Jan Karcher <jaka@linux.ibm.com> Acked-by: Tony Lu <tonylu@linux.alibaba.com> Acked-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15s390/ism: Fix trying to free already-freed IRQ by repeated ism_dev_exit()Julian Ruess
This patch prevents the system from crashing when unloading the ISM module. How to reproduce: Attach an ISM device and execute 'rmmod ism'. Error-Log: - Trying to free already-free IRQ 0 - WARNING: CPU: 1 PID: 966 at kernel/irq/manage.c:1890 free_irq+0x140/0x540 After calling ism_dev_exit() for each ISM device in the exit routine, pci_unregister_driver() will execute ism_remove() for each ISM device. Because ism_remove() also calls ism_dev_exit(), free_irq(pci_irq_vector(pdev, 0), ism) is called twice for each ISM device. This results in a crash with the error 'Trying to free already-free IRQ'. In the exit routine, it is enough to call pci_unregister_driver() because it ensures that ism_dev_exit() is called once per ISM device. Cc: <stable@vger.kernel.org> # 6.3+ Fixes: 89e7d2ba61b7 ("net/ism: Add new API for client registration") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Julian Ruess <julianr@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-15LoongArch: Fix debugfs_create_dir() error checkingImmad Mir
The debugfs_create_dir() returns ERR_PTR in case of an error and the correct way of checking it is using the IS_ERR_OR_NULL inline function rather than the simple null comparision. This patch fixes the issue. Cc: stable@vger.kernel.org Suggested-By: Ivan Orlov <ivan.orlov0322@gmail.com> Signed-off-by: Immad Mir <mirimmad17@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Avoid uninitialized alignment_maskQing Zhang
The hardware monitoring points for instruction fetching and load/store operations need to align 4 bytes and 1/2/4/8 bytes respectively. Reported-by: Colin King <colin.i.king@gmail.com> Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Fix perf event id calculationHuacai Chen
LoongArch PMCFG has 10bit event id rather than 8 bit, so fix it. Cc: stable@vger.kernel.org Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Fix the write_fcsr() macroQi Hu
The "write_fcsr()" macro uses wrong the positions for val and dest in asm. Fix it! Reported-by: Miao HAO <haomiao19@mails.ucas.ac.cn> Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-15LoongArch: Let pmd_present() return true when splitting pmdHongchen Zhang
When we split a pmd into ptes, pmd_present() and pmd_trans_huge() should return true, otherwise it would be treated as a swap pmd. This is the same as arm64 does in commit b65399f6111b ("arm64/mm: Change THP helpers to comply with generic MM semantics"), we also add a new bit named _PAGE_PRESENT_INVALID for LoongArch. Signed-off-by: Hongchen Zhang <zhanghongchen@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-14net: dsa: felix: fix taprio guard band overflow at 10Mbps with jumbo framesVladimir Oltean
The DEV_MAC_MAXLEN_CFG register contains a 16-bit value - up to 65535. Plus 2 * VLAN_HLEN (4), that is up to 65543. The picos_per_byte variable is the largest when "speed" is lowest - SPEED_10 = 10. In that case it is (1000000L * 8) / 10 = 800000. Their product - 52434400000 - exceeds 32 bits, which is a problem, because apparently, a multiplication between two 32-bit factors is evaluated as 32-bit before being assigned to a 64-bit variable. In fact it's a problem for any MTU value larger than 5368. Cast one of the factors of the multiplication to u64 to force the multiplication to take place on 64 bits. Issue found by Coverity. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230613170907.2413559-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14net/sched: cls_api: Fix lockup on flushing explicitly created chainVlad Buslov
Mingshuai Ren reports: When a new chain is added by using tc, one soft lockup alarm will be generated after delete the prio 0 filter of the chain. To reproduce the problem, perform the following steps: (1) tc qdisc add dev eth0 root handle 1: htb default 1 (2) tc chain add dev eth0 (3) tc filter del dev eth0 chain 0 parent 1: prio 0 (4) tc filter add dev eth0 chain 0 parent 1: Fix the issue by accounting for additional reference to chains that are explicitly created by RTM_NEWCHAIN message as opposed to implicitly by RTM_NEWTFILTER message. Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush") Reported-by: Mingshuai Ren <renmingshuai@huawei.com> Closes: https://lore.kernel.org/lkml/87legswvi3.fsf@nvidia.com/T/ Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20230612093426.2867183-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14ice: Fix ice module unloadJakub Buchocki
Clearing the interrupt scheme before PFR reset, during the removal routine, could cause the hardware errors and possibly lead to system reboot, as the PF reset can cause the interrupt to be generated. Place the call for PFR reset inside ice_deinit_dev(), wait until reset and all pending transactions are done, then call ice_clear_interrupt_scheme(). This introduces a PFR reset to multiple error paths. Additionally, remove the call for the reset from ice_load() - it will be a part of ice_unload() now. Error example: [ 75.229328] ice 0000:ca:00.1: Failed to read Tx Scheduler Tree - User Selection data from flash [ 77.571315] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [ 77.571418] {1}[Hardware Error]: event severity: recoverable [ 77.571459] {1}[Hardware Error]: Error 0, type: recoverable [ 77.571500] {1}[Hardware Error]: section_type: PCIe error [ 77.571540] {1}[Hardware Error]: port_type: 4, root port [ 77.571580] {1}[Hardware Error]: version: 3.0 [ 77.571615] {1}[Hardware Error]: command: 0x0547, status: 0x4010 [ 77.571661] {1}[Hardware Error]: device_id: 0000:c9:02.0 [ 77.571703] {1}[Hardware Error]: slot: 25 [ 77.571736] {1}[Hardware Error]: secondary_bus: 0xca [ 77.571773] {1}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a [ 77.571821] {1}[Hardware Error]: class_code: 060400 [ 77.571858] {1}[Hardware Error]: bridge: secondary_status: 0x2800, control: 0x0013 [ 77.572490] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020 [ 77.572870] pcieport 0000:c9:02.0: [21] ACSViol (First) [ 77.573222] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID [ 77.573554] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010 [ 77.691273] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1 [ 77.691738] {2}[Hardware Error]: event severity: recoverable [ 77.691971] {2}[Hardware Error]: Error 0, type: recoverable [ 77.692192] {2}[Hardware Error]: section_type: PCIe error [ 77.692403] {2}[Hardware Error]: port_type: 4, root port [ 77.692616] {2}[Hardware Error]: version: 3.0 [ 77.692825] {2}[Hardware Error]: command: 0x0547, status: 0x4010 [ 77.693032] {2}[Hardware Error]: device_id: 0000:c9:02.0 [ 77.693238] {2}[Hardware Error]: slot: 25 [ 77.693440] {2}[Hardware Error]: secondary_bus: 0xca [ 77.693641] {2}[Hardware Error]: vendor_id: 0x8086, device_id: 0x347a [ 77.693853] {2}[Hardware Error]: class_code: 060400 [ 77.694054] {2}[Hardware Error]: bridge: secondary_status: 0x0800, control: 0x0013 [ 77.719115] pci 0000:ca:00.1: AER: can't recover (no error_detected callback) [ 77.719140] pcieport 0000:c9:02.0: AER: device recovery failed [ 77.719216] pcieport 0000:c9:02.0: AER: aer_status: 0x00200000, aer_mask: 0x00100020 [ 77.719390] pcieport 0000:c9:02.0: [21] ACSViol (First) [ 77.719557] pcieport 0000:c9:02.0: AER: aer_layer=Transaction Layer, aer_agent=Receiver ID [ 77.719723] pcieport 0000:c9:02.0: AER: aer_uncor_severity: 0x00463010 Fixes: 5b246e533d01 ("ice: split probe into smaller functions") Signed-off-by: Jakub Buchocki <jakubx.buchocki@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230612171421.21570-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-06-12 (igc, igb) This series contains updates to igc and igb drivers. Husaini clears Tx rings when interface is brought down for igc. Vinicius disables PTM and PCI busmaster when removing igc driver. Alex adds error check and path for NVM read error on igb. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: igb: fix nvm.ops.read() error handling igc: Fix possible system crash when loading module igc: Clean the TX buffer and TX descriptor ring ==================== Link: https://lore.kernel.org/r/20230612205208.115292-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14net/handshake: remove fput() that causes use-after-freeLin Ma
A reference underflow is found in TLS handshake subsystem that causes a direct use-after-free. Part of the crash log is like below: [ 2.022114] ------------[ cut here ]------------ [ 2.022193] refcount_t: underflow; use-after-free. [ 2.022288] WARNING: CPU: 0 PID: 60 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 [ 2.022432] Modules linked in: [ 2.022848] RIP: 0010:refcount_warn_saturate+0xbe/0x110 [ 2.023231] RSP: 0018:ffffc900001bfe18 EFLAGS: 00000286 [ 2.023325] RAX: 0000000000000000 RBX: 0000000000000007 RCX: 00000000ffffdfff [ 2.023438] RDX: 0000000000000000 RSI: 00000000ffffffea RDI: 0000000000000001 [ 2.023555] RBP: ffff888004c20098 R08: ffffffff82b392c8 R09: 00000000ffffdfff [ 2.023693] R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888004c200d8 [ 2.023813] R13: 0000000000000000 R14: ffff888004c20000 R15: ffffc90000013ca8 [ 2.023930] FS: 0000000000000000(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 [ 2.024062] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2.024161] CR2: ffff888003601000 CR3: 0000000002a2e000 CR4: 00000000000006f0 [ 2.024275] Call Trace: [ 2.024322] <TASK> [ 2.024367] ? __warn+0x7f/0x130 [ 2.024430] ? refcount_warn_saturate+0xbe/0x110 [ 2.024513] ? report_bug+0x199/0x1b0 [ 2.024585] ? handle_bug+0x3c/0x70 [ 2.024676] ? exc_invalid_op+0x18/0x70 [ 2.024750] ? asm_exc_invalid_op+0x1a/0x20 [ 2.024830] ? refcount_warn_saturate+0xbe/0x110 [ 2.024916] ? refcount_warn_saturate+0xbe/0x110 [ 2.024998] __tcp_close+0x2f4/0x3d0 [ 2.025065] ? __pfx_kunit_generic_run_threadfn_adapter+0x10/0x10 [ 2.025168] tcp_close+0x1f/0x70 [ 2.025231] inet_release+0x33/0x60 [ 2.025297] sock_release+0x1f/0x80 [ 2.025361] handshake_req_cancel_test2+0x100/0x2d0 [ 2.025457] kunit_try_run_case+0x4c/0xa0 [ 2.025532] kunit_generic_run_threadfn_adapter+0x15/0x20 [ 2.025644] kthread+0xe1/0x110 [ 2.025708] ? __pfx_kthread+0x10/0x10 [ 2.025780] ret_from_fork+0x2c/0x50 One can enable CONFIG_NET_HANDSHAKE_KUNIT_TEST config to reproduce above crash. The root cause of this bug is that the commit 1ce77c998f04 ("net/handshake: Unpin sock->file if a handshake is cancelled") adds one additional fput() function. That patch claims that the fput() is used to enable sock->file to be freed even when user space never calls DONE. However, it seems that the intended DONE routine will never give an additional fput() of ths sock->file. The existing two of them are just used to balance the reference added in sockfd_lookup(). This patch revert the mentioned commit to avoid the use-after-free. The patched kernel could successfully pass the KUNIT test and boot to shell. [ 0.733613] # Subtest: Handshake API tests [ 0.734029] 1..11 [ 0.734255] KTAP version 1 [ 0.734542] # Subtest: req_alloc API fuzzing [ 0.736104] ok 1 handshake_req_alloc NULL proto [ 0.736114] ok 2 handshake_req_alloc CLASS_NONE [ 0.736559] ok 3 handshake_req_alloc CLASS_MAX [ 0.737020] ok 4 handshake_req_alloc no callbacks [ 0.737488] ok 5 handshake_req_alloc no done callback [ 0.737988] ok 6 handshake_req_alloc excessive privsize [ 0.738529] ok 7 handshake_req_alloc all good [ 0.739036] # req_alloc API fuzzing: pass:7 fail:0 skip:0 total:7 [ 0.739444] ok 1 req_alloc API fuzzing [ 0.740065] ok 2 req_submit NULL req arg [ 0.740436] ok 3 req_submit NULL sock arg [ 0.740834] ok 4 req_submit NULL sock->file [ 0.741236] ok 5 req_lookup works [ 0.741621] ok 6 req_submit max pending [ 0.741974] ok 7 req_submit multiple [ 0.742382] ok 8 req_cancel before accept [ 0.742764] ok 9 req_cancel after accept [ 0.743151] ok 10 req_cancel after done [ 0.743510] ok 11 req_destroy works [ 0.743882] # Handshake API tests: pass:11 fail:0 skip:0 total:11 [ 0.744205] # Totals: pass:17 fail:0 skip:0 total:17 Acked-by: Chuck Lever <chuck.lever@oracle.com> Fixes: 1ce77c998f04 ("net/handshake: Unpin sock->file if a handshake is cancelled") Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230613083204.633896-1-linma@zju.edu.cn Link: https://lore.kernel.org/r/20230614015249.987448-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14Merge tag 'wireless-2023-06-14' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== A couple of straggler fixes, mostly in the stack: - fix fragmentation for multi-link related elements - fix callback copy/paste error - fix multi-link locking - remove double-locking of wiphy mutex - transmit only on active links, not all - activate links in the correct order - don't remove links that weren't added - disable soft-IRQs for LQ lock in iwlwifi * tag 'wireless-2023-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regression wifi: mac80211: fragment per STA profile correctly wifi: mac80211: Use active_links instead of valid_links in Tx wifi: cfg80211: remove links only on AP wifi: mac80211: take lock before setting vif links wifi: cfg80211: fix link del callback to call correct handler wifi: mac80211: fix link activation settings order wifi: cfg80211: fix double lock bug in reg_wdev_chan_valid() ==================== Link: https://lore.kernel.org/r/20230614075502.11765-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-14ext4: drop the call to ext4_error() from ext4_get_group_info()Fabio M. De Francesco
A recent patch added a call to ext4_error() which is problematic since some callers of the ext4_get_group_info() function may be holding a spinlock, whereas ext4_error() must never be called in atomic context. This triggered a report from Syzbot: "BUG: sleeping function called from invalid context in ext4_update_super" (see the link below). Therefore, drop the call to ext4_error() from ext4_get_group_info(). In the meantime use eight characters tabs instead of nine characters ones. Reported-by: syzbot+4acc7d910e617b360859@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/00000000000070575805fdc6cdb2@google.com/ Fixes: 5354b2af3406 ("ext4: allow ext4_get_group_info() to fail") Suggested-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Link: https://lore.kernel.org/r/20230614100446.14337-1-fmdefrancesco@gmail.com
2023-06-14Revert "ext4: remove unnecessary check in ext4_bg_num_gdb_nometa"Kemeng Shi
This reverts commit ad3f09be6cfe332be8ff46c78e6ec0f8839107aa. The reverted commit was intended to simpfy the code to get group descriptor block number in non-meta block group by assuming s_gdb_count is block number used for all non-meta block group descriptors. However s_gdb_count is block number used for all meta *and* non-meta group descriptors. So s_gdb_group will be > actual group descriptor block number used for all non-meta block group which should be "total non-meta block group" / "group descriptors per block", e.g. s_first_meta_bg. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://lore.kernel.org/r/20230613225025.3859522-1-shikemeng@huaweicloud.com Fixes: ad3f09be6cfe ("ext4: remove unnecessary check in ext4_bg_num_gdb_nometa") Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-06-14scsi: lpfc: Fix incorrect big endian type assignment in bsg loopback pathJustin Tee
The kernel test robot reported sparse warnings regarding incorrect type assignment for __be16 variables in bsg loopback path. Change the flagged lines to use the be16_to_cpu() and cpu_to_be16() macros appropriately. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230614175944.3577-1-justintee8345@gmail.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306110819.sDIKiGgg-lkp@intel.com/ Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14scsi: target: core: Fix error path in target_setup_session()Bob Pearson
In the error exits in target_setup_session(), if a branch is taken to free_sess: transport_free_session() may call to target_free_cmd_counter() and then fall through to call target_free_cmd_counter() a second time. This can, and does, sometimes cause seg faults since the data field in cmd_cnt->refcnt has been freed in the first call. Fix this problem by simply returning after the call to transport_free_session(). The second call is redundant for those cases. Fixes: 4edba7e4a8f3 ("scsi: target: Move cmd counter allocation") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Link: https://lore.kernel.org/r/20230613144259.12890-1-rpearsonhpe@gmail.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14scsi: storvsc: Always set no_report_opcodesMichael Kelley
Hyper-V synthetic SCSI devices do not support the MAINTENANCE_IN SCSI command, so scsi_report_opcode() always fails, resulting in messages like this: hv_storvsc <guid>: tag#205 cmd 0xa3 status: scsi 0x2 srb 0x86 hv 0xc0000001 The recently added support for command duration limits calls scsi_report_opcode() four times as each device comes online, which significantly increases the number of messages logged in a system with many disks. Fix the problem by always marking Hyper-V synthetic SCSI devices as not supporting scsi_report_opcode(). With this setting, the MAINTENANCE_IN SCSI command is not issued and no messages are logged. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1686343101-18930-1-git-send-email-mikelley@microsoft.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14scsi: aacraid: Reply queue mapping to CPUs based on IRQ affinitySagar Biradar
Fix the I/O hang that arises because of the MSIx vector not having a mapped online CPU upon receiving completion. SCSI cmds take the blk_mq route, which is setup during init. Reserved cmds fetch the vector_no from mq_map after init is complete. Before init, they have to use 0 - as per the norm. Reviewed-by: Gilbert Wu <gilbert.wu@microchip.com> Signed-off-by: Sagar Biradar <Sagar.Biradar@microchip.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20230519230834.27436-1-sagar.biradar@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-06-14clk: pxa: fix NULL pointer dereference in pxa3xx_clk_update_accrArnd Bergmann
sparse points out an embarrasing bug in an older patch of mine, which uses the register offset instead of an __iomem pointer: drivers/clk/pxa/clk-pxa3xx.c:167:9: sparse: sparse: Using plain integer as NULL pointer Unlike sparse, gcc and clang ignore this bug and fail to warn because a literal '0' is considered a valid representation of a NULL pointer. Fixes: 3c816d950a49 ("ARM: pxa: move clk register definitions to driver") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202305111301.RAHohdob-lkp@intel.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230511105845.299859-1-arnd@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2023-06-14Revert "media: dvb-core: Fix use-after-free on race condition at dvb_frontend"Mauro Carvalho Chehab
As reported by Thomas Voegtle <tv@lio96.de>, sometimes a DVB card does not initialize properly booting Linux 6.4-rc4. This is not always, maybe in 3 out of 4 attempts. After double-checking, the root cause seems to be related to the UAF fix, which is causing a race issue: [ 26.332149] tda10071 7-0005: found a 'NXP TDA10071' in cold state, will try to load a firmware [ 26.340779] tda10071 7-0005: downloading firmware from file 'dvb-fe-tda10071.fw' [ 989.277402] INFO: task vdr:743 blocked for more than 491 seconds. [ 989.283504] Not tainted 6.4.0-rc5-i5 #249 [ 989.288036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 989.295860] task:vdr state:D stack:0 pid:743 ppid:711 flags:0x00004002 [ 989.295865] Call Trace: [ 989.295867] <TASK> [ 989.295869] __schedule+0x2ea/0x12d0 [ 989.295877] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 [ 989.295881] schedule+0x57/0xc0 [ 989.295884] schedule_preempt_disabled+0xc/0x20 [ 989.295887] __mutex_lock.isra.16+0x237/0x480 [ 989.295891] ? dvb_get_property.isra.10+0x1bc/0xa50 [ 989.295898] ? dvb_frontend_stop+0x36/0x180 [ 989.338777] dvb_frontend_stop+0x36/0x180 [ 989.338781] dvb_frontend_open+0x2f1/0x470 [ 989.338784] dvb_device_open+0x81/0xf0 [ 989.338804] ? exact_lock+0x20/0x20 [ 989.338808] chrdev_open+0x7f/0x1c0 [ 989.338811] ? generic_permission+0x1a2/0x230 [ 989.338813] ? link_path_walk.part.63+0x340/0x380 [ 989.338815] ? exact_lock+0x20/0x20 [ 989.338817] do_dentry_open+0x18e/0x450 [ 989.374030] path_openat+0xca5/0xe00 [ 989.374031] ? terminate_walk+0xec/0x100 [ 989.374034] ? path_lookupat+0x93/0x140 [ 989.374036] do_filp_open+0xc0/0x140 [ 989.374038] ? __call_rcu_common.constprop.91+0x92/0x240 [ 989.374041] ? __check_object_size+0x147/0x260 [ 989.374043] ? __check_object_size+0x147/0x260 [ 989.374045] ? alloc_fd+0xbb/0x180 [ 989.374048] ? do_sys_openat2+0x243/0x310 [ 989.374050] do_sys_openat2+0x243/0x310 [ 989.374052] do_sys_open+0x52/0x80 [ 989.374055] do_syscall_64+0x5b/0x80 [ 989.421335] ? __task_pid_nr_ns+0x92/0xa0 [ 989.421337] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421339] ? do_syscall_64+0x67/0x80 [ 989.421341] ? syscall_exit_to_user_mode+0x20/0x40 [ 989.421343] ? do_syscall_64+0x67/0x80 [ 989.421345] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 989.421348] RIP: 0033:0x7fe895d067e3 [ 989.421349] RSP: 002b:00007fff933c2ba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101 [ 989.421351] RAX: ffffffffffffffda RBX: 00007fff933c2c10 RCX: 00007fe895d067e3 [ 989.421352] RDX: 0000000000000802 RSI: 00005594acdce160 RDI: 00000000ffffff9c [ 989.421353] RBP: 0000000000000802 R08: 0000000000000000 R09: 0000000000000000 [ 989.421353] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001 [ 989.421354] R13: 00007fff933c2ca0 R14: 00000000ffffffff R15: 00007fff933c2c90 [ 989.421355] </TASK> This reverts commit 6769a0b7ee0c3b31e1b22c3fadff2bfb642de23f. Fixes: 6769a0b7ee0c ("media: dvb-core: Fix use-after-free on race condition at dvb_frontend") Link: https://lore.kernel.org/all/da5382ad-09d6-20ac-0d53-611594b30861@lio96.de/ Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2023-06-14io_uring/io-wq: clear current->worker_private on exitJens Axboe
A recent fix stopped clearing PF_IO_WORKER from current->flags on exit, which meant that we can now call inc/dec running on the worker after it has been removed if it ends up scheduling in/out as part of exit. If this happens after an RCU grace period has passed, then the struct pointed to by current->worker_private may have been freed, and we can now be accessing memory that is freed. Ensure this doesn't happen by clearing the task worker_private field. Both io_wq_worker_running() and io_wq_worker_sleeping() check this field before going any further, and we don't need any accounting etc done after this worker has exited. Fixes: fd37b884003c ("io_uring/io-wq: don't clear PF_IO_WORKER on exit") Reported-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-14RDMA/rxe: Fix rxe_cq_postBob Pearson
A recent patch replaced a tasklet execution of cq->comp_handler by a direct call. While this made sense it let changes to cq->notify state be unprotected and assumed that the cq completion machinery and the ulp done callbacks were reentrant. The result is that in some cases completion events can be lost. This patch moves the cq->comp_handler call inside of the spinlock in rxe_cq_post which solves both issues. This is compatible with the matching code in the request notify verb. Fixes: 78b26a335310 ("RDMA/rxe: Remove tasklet call from rxe_cq.c") Link: https://lore.kernel.org/r/20230612155032.17036-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-14btrfs: scrub: fix a return value overwrite in scrub_stripe()Qu Wenruo
[RETURN VALUE OVERWRITE] Inside scrub_stripe(), we would submit all the remaining stripes after iterating all extents. But since flush_scrub_stripes() can return error, we need to avoid overwriting the existing @ret if there is any error. However the existing check is doing the wrong check: ret2 = flush_scrub_stripes(); if (!ret2) ret = ret2; This would overwrite the existing @ret to 0 as long as the final flush detects no critical errors. [FIX] We should check @ret other than @ret2 in that case. Fixes: 8eb3dd17eadd ("btrfs: dev-replace: error out if we have unrepaired metadata error during") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-14cifs: add a warning when the in-flight count goes negativeShyam Prasad N
We've seen the in-flight count go into negative with some internal stress testing in Microsoft. Adding a WARN when this happens, in hope of understanding why this happens when it happens. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-14cifs: fix lease break oops in xfstest generic/098Steve French
umount can race with lease break so need to check if tcon->ses->server is still valid to send the lease break response. Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Fixes: 59a556aebc43 ("SMB3: drop reference to cfile before sending oplock break") Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-14Documentation: RISC-V: patch-acceptance: mention patchwork's roleConor Dooley
Palmer suggested at some point, not sure if it was in one of the weekly linux-riscv syncs, or a conversation at FOSDEM, that we should document the role of the automation running on our patchwork instance plays in patch acceptance. Add a short note to the patch-acceptance document to that end. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20230606-rehab-monsoon-12c17bbe08e3@wendy Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-06-14selftests: forwarding: hw_stats_l3: Set addrgenmode in a separate stepDanielle Ratson
Setting the IPv6 address generation mode of a net device during its creation never worked, but after commit b0ad3c179059 ("rtnetlink: call validate_linkmsg in rtnl_create_link") it explicitly fails [1]. The failure is caused by the fact that validate_linkmsg() is called before the net device is registered, when it still does not have an 'inet6_dev'. Likewise, raising the net device before setting the address generation mode is meaningless, because by the time the mode is set, the address has already been generated. Therefore, fix the test to first create the net device, then set its IPv6 address generation mode and finally bring it up. [1] # ip link add name mydev addrgenmode eui64 type dummy RTNETLINK answers: Address family not supported by protocol Fixes: ba95e7930957 ("selftests: forwarding: hw_stats_l3: Add a new test") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/f3b05d85b2bc0c3d6168fe8f7207c6c8365703db.1686580046.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14Merge branch 'net-sched-fix-race-conditions-in-mini_qdisc_pair_swap'Paolo Abeni
Peilin Ye says: ==================== net/sched: Fix race conditions in mini_qdisc_pair_swap() These 2 patches fix race conditions for ingress and clsact Qdiscs as reported [1] by syzbot, split out from another [2] series (last 2 patches of it). Per-patch changelog omitted. Patch 1 hasn't been touched since last version; I just included everybody's tag. Patch 2 bases on patch 6 v1 of [2], with comments and commit log slightly changed. We also need rtnl_dereference() to load ->qdisc_sleeping since commit d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping"), so I changed that; please take yet another look, thanks! Patch 2 has been tested with the new reproducer Pedro posted [3]. [1] https://syzkaller.appspot.com/bug?extid=b53a9c0d1ea4ad62da8b [2] https://lore.kernel.org/r/cover.1684887977.git.peilin.ye@bytedance.com/ [3] https://lore.kernel.org/r/7879f218-c712-e9cc-57ba-665990f5f4c9@mojatatu.com/ ==================== Link: https://lore.kernel.org/r/cover.1686355297.git.peilin.ye@bytedance.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: qdisc_destroy() old ingress and clsact Qdiscs before graftingPeilin Ye
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized in ingress_init() to point to net_device::miniq_ingress. ingress Qdiscs access this per-net_device pointer in mini_qdisc_pair_swap(). Similar for clsact Qdiscs and miniq_egress. Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER requests (thanks Hillf Danton for the hint), when replacing ingress or clsact Qdiscs, for example, the old Qdisc ("@old") could access the same miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"), causing race conditions [1] including a use-after-free bug in mini_qdisc_pair_swap() reported by syzbot: BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901 ... Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106 print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319 print_report mm/kasan/report.c:430 [inline] kasan_report+0x11c/0x130 mm/kasan/report.c:536 mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573 tcf_chain_head_change_item net/sched/cls_api.c:495 [inline] tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509 tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline] tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline] tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266 ... @old and @new should not affect each other. In other words, @old should never modify miniq_{in,e}gress after @new, and @new should not update @old's RCU state. Fixing without changing sch_api.c turned out to be difficult (please refer to Closes: for discussions). Instead, make sure @new's first call always happen after @old's last call (in {ingress,clsact}_destroy()) has finished: In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests, and call qdisc_destroy() for @old before grafting @new. Introduce qdisc_refcount_dec_if_one() as the counterpart of qdisc_refcount_inc_nz() used for filter requests. Introduce a non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check, just like qdisc_put() etc. Depends on patch "net/sched: Refactor qdisc_graft() for ingress and clsact Qdiscs". [1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under TC_H_ROOT (no longer possible after commit c7cfbd115001 ("net/sched: sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8 transmission queues: Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2), then adds a flower filter X to A. Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and b2) to replace A, then adds a flower filter Y to B. Thread 1 A's refcnt Thread 2 RTM_NEWQDISC (A, RTNL-locked) qdisc_create(A) 1 qdisc_graft(A) 9 RTM_NEWTFILTER (X, RTNL-unlocked) __tcf_qdisc_find(A) 10 tcf_chain0_head_change(A) mini_qdisc_pair_swap(A) (1st) | | RTM_NEWQDISC (B, RTNL-locked) RCU sync 2 qdisc_graft(B) | 1 notify_and_destroy(A) | tcf_block_release(A) 0 RTM_NEWTFILTER (Y, RTNL-unlocked) qdisc_destroy(A) tcf_chain0_head_change(B) tcf_chain0_head_change_cb_del(A) mini_qdisc_pair_swap(B) (2nd) mini_qdisc_pair_swap(A) (3rd) | ... ... Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to its mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress packets on eth0 will not find filter Y in sch_handle_ingress(). This is just one of the possible consequences of concurrently accessing miniq_{in,e}gress pointers. Fixes: 7a096d579e8e ("net: sched: ingress: set 'unlocked' flag for Qdisc ops") Fixes: 87f373921c4e ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops") Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/ Cc: Hillf Danton <hdanton@sina.com> Cc: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: Refactor qdisc_graft() for ingress and clsact QdiscsPeilin Ye
Grafting ingress and clsact Qdiscs does not need a for-loop in qdisc_graft(). Refactor it. No functional changes intended. Tested-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14net/sched: act_ct: Fix promotion of offloaded unreplied tuplePaul Blakey
Currently UNREPLIED and UNASSURED connections are added to the nf flow table. This causes the following connection packets to be processed by the flow table which then skips conntrack_in(), and thus such the connections will remain UNREPLIED and UNASSURED even if reply traffic is then seen. Even still, the unoffloaded reply packets are the ones triggering hardware update from new to established state, and if there aren't any to triger an update and/or previous update was missed, hardware can get out of sync with sw and still mark packets as new. Fix the above by: 1) Not skipping conntrack_in() for UNASSURED packets, but still refresh for hardware, as before the cited patch. 2) Try and force a refresh by reply-direction packets that update the hardware rules from new to established state. 3) Remove any bidirectional flows that didn't failed to update in hardware for re-insertion as bidrectional once any new packet arrives. Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections") Co-developed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-14wifi: iwlwifi: mvm: spin_lock_bh() to fix lockdep regressionHugh Dickins
Lockdep on 6.4-rc on ThinkPad X1 Carbon 5th says ===================================================== WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected 6.4.0-rc5 #1 Not tainted ----------------------------------------------------- kworker/3:1/49 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire: ffff8881066fa368 (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2}, at: rs_drv_get_rate+0x46/0xe7 and this task is already holding: ffff8881066f80a8 (&sta->rate_ctrl_lock){+.-.}-{2:2}, at: rate_control_get_rate+0xbd/0x126 which would create a new lock dependency: (&sta->rate_ctrl_lock){+.-.}-{2:2} -> (&mvm_sta->deflink.lq_sta.rs_drv.pers.lock){+.+.}-{2:2} but this new dependency connects a SOFTIRQ-irq-safe lock: (&sta->rate_ctrl_lock){+.-.}-{2:2} etc. etc. etc. Changing the spin_lock() in rs_drv_get_rate() to spin_lock_bh() was not enough to pacify lockdep, but changing them all on pers.lock has worked. Fixes: a8938bc881d2 ("wifi: iwlwifi: mvm: Add locking to the rate read flow") Signed-off-by: Hugh Dickins <hughd@google.com> Link: https://lore.kernel.org/r/79ffcc22-9775-cb6d-3ffd-1a517c40beef@google.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-06-13Merge branch 'fix-small-bugs-and-annoyances-in-tc-testing'Jakub Kicinski
Vlad Buslov says: ==================== Fix small bugs and annoyances in tc-testing ==================== Link: https://lore.kernel.org/r/20230612075712.2861848-1-vladbu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Remove configs that no longer existVlad Buslov
Some qdiscs and classifiers have recently been retired from kernel. However, tc-testing config is still cluttered with them which causes noise when using merge_config.sh script to update existing config for tc-testing compatibility. Remove the config settings for affected qdiscs and classifiers. Fixes: fb38306ceb9e ("net/sched: Retire ATM qdisc") Fixes: 051d44209842 ("net/sched: Retire CBQ qdisc") Fixes: bbe77c14ee61 ("net/sched: Retire dsmark qdisc") Fixes: 265b4da82dbf ("net/sched: Retire rsvp classifier") Fixes: 8c710f75256b ("net/sched: Retire tcindex classifier") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix SFB db testVlad Buslov
Setting very small value of db like 10ms introduces rounding errors when converting to/from jiffies on some kernel configs. For example, on 250hz the actual value will be set to 12ms which causes the test to fail: # $ sudo ./tdc.py -d eth2 -e 3410 # -- ns/SubPlugin.__init__ # Test 3410: Create SFB with db setting # # All test results: # # 1..1 # not ok 1 3410 - Create SFB with db setting # Could not match regex pattern. Verify command output: # qdisc sfb 1: root refcnt 2 rehash 600s db 12ms limit 1000p max 25p target 20p increment 0.000503548 decrement 4.57771e-05 penalty_rate 10pps penalty_burst 20p Set the value to 100ms instead which currently seem to work on 100hz, 250hz, 300hz and 1000hz kernel configs. Fixes: 6ad92dc56fca ("selftests/tc-testing: add selftests for sfb qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: failed to find target LOGVlad Buslov
Add missing netfilter config dependency. Fixes following example error when running tests via tdc.sh for all XT tests: # $ sudo ./tdc.py -d eth2 -e 2029 # Test 2029: Add xt action with log-prefix # exit: 255 # exit: 0 # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # -----> teardown stage *** Could not execute: "$TC actions flush action xt" # # -----> teardown stage *** Error message: "Error: Cannot flush unknown TC action. # We have an error flushing # " # returncode 1; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', ' failed to find target LOG\n\nbad action parsing\nparse_action: bad value (7:xt)!\nIllegal "action"\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 2029 Add xt action with log-prefix stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # failed to find target LOG # # bad action parsing # parse_action: bad value (7:xt)! # Illegal "action" # # --------------- # # All test results: # # 1..1 # ok 1 2029 - Add xt action with log-prefix # skipped - "-----> teardown stage" did not complete successfully Fixes: 910d504bc187 ("selftests/tc-testings: add selftests for xt action") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13selftests/tc-testing: Fix Error: Specified qdisc kind is unknown.Vlad Buslov
All TEQL tests assume that sch_teql module is loaded. Load module in tdc.sh before running qdisc tests. Fixes following example error when running tests via tdc.sh for all TEQL tests: # $ sudo ./tdc.py -d eth2 -e 84a0 # -- ns/SubPlugin.__init__ # Test 84a0: Create TEQL with default setting # exit: 2 # exit: 0 # Error: Specified qdisc kind is unknown. # # -----> teardown stage *** Could not execute: "$TC qdisc del dev $DUMMY handle 1: root" # # -----> teardown stage *** Error message: "Error: Invalid handle. # " # returncode 2; expected [0] # # -----> teardown stage *** Aborting test run. # # <_io.BufferedReader name=3> *** stdout *** # # <_io.BufferedReader name=5> *** stderr *** # "-----> teardown stage" did not complete successfully # Exception <class '__main__.PluginMgrTestFail'> ('teardown', 'Error: Specified qdisc kind is unknown.\n', '"-----> teardown stage" did not complete successfully') (caught in test_runner, running test 2 84a0 Create TEQL with default setting stage teardown) # --------------- # traceback # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 495, in test_runner # res = run_one_test(pm, args, index, tidx) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 434, in run_one_test # prepare_env(args, pm, 'teardown', '-----> teardown stage', tidx['teardown'], procout) # File "/images/src/linux/tools/testing/selftests/tc-testing/./tdc.py", line 245, in prepare_env # raise PluginMgrTestFail( # --------------- # accumulated output for this test: # Error: Specified qdisc kind is unknown. # # --------------- # # All test results: # # 1..1 # ok 1 84a0 - Create TEQL with default setting # skipped - "-----> teardown stage" did not complete successfully Fixes: cc62fbe114c9 ("selftests/tc-testing: add selftests for teql qdisc") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Pedro Tammela <pctammela@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13net: ethernet: ti: am65-cpsw: Call of_node_put() on error pathDan Carpenter
This code returns directly but it should instead call of_node_put() to drop some reference counts. Fixes: dab2b265dd23 ("net: ethernet: ti: am65-cpsw: Add support for SERDES configuration") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/e3012f0c-1621-40e6-bf7d-03c276f6e07f@kili.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-13io_uring/net: save msghdr->msg_control for retriesJens Axboe
If the application sets ->msg_control and we have to later retry this command, or if it got queued with IOSQE_ASYNC to begin with, then we need to retain the original msg_control value. This is due to the net stack overwriting this field with an in-kernel pointer, to copy it in. Hitting that path for the second time will now fail the copy from user, as it's attempting to copy from a non-user address. Cc: stable@vger.kernel.org # 5.10+ Link: https://github.com/axboe/liburing/issues/880 Reported-and-tested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-13Merge tag 'nios2_fix_v6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull NIOS2 dts fix from Dinh Nguyen: - Fix tse_mac "max-frame-size" property * tag 'nios2_fix_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: dts: Fix tse_mac "max-frame-size" property
2023-06-13nios2: dts: Fix tse_mac "max-frame-size" propertyJanne Grunau
The given value of 1518 seems to refer to the layer 2 ethernet frame size without 802.1Q tag. Actual use of the "max-frame-size" including in the consumer of the "altr,tse-1.0" compatible is the MTU. Fixes: 95acd4c7b69c ("nios2: Device tree support") Fixes: 61c610ec61bb ("nios2: Add Max10 device tree") Cc: <stable@vger.kernel.org> Signed-off-by: Janne Grunau <j@jannau.net> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
2023-06-13drm/amd/display: limit DPIA link rate to HBR3Peichen Huang
[Why] DPIA doesn't support UHBR, driver should not enable UHBR for dp tunneling [How] limit DPIA link rate to HBR3 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Peichen Huang <peichen.huang@amd.com> Reviewed-by: Mustapha Ghaddar <Mustapha.Ghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/display: fix the system hang while disable PSRTom Chung
[Why] When the PSR enabled. If you try to adjust the timing parameters, it may cause system hang. Because the timing mismatch with the DMCUB settings. [How] Disable the PSR before adjusting timing parameters. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/display: edp do not add non-edid timingsHersen Wu
[Why] most edp support only timings from edid. applying non-edid timings, especially those timings out of edp bandwidth, may damage edp. [How] do not add non-edid timings for edp. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Acked-by: Stylon Wang <stylon.wang@amd.com> Signed-off-by: Hersen Wu <hersenxs.wu@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar ↵Arunpravin Paneer Selvam
system" This reverts commit c105518679b6e87232874ffc989ec403bee59664. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-06-13drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1Sonny Jiang
Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <sonjiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-13drm/radeon: Disable outputs when releasing fbdev clientThomas Zimmermann
Disable the modesetting pipeline before release the radeon's fbdev client. Fixes the following error: [ 17.217408] WARNING: CPU: 5 PID: 1464 at drivers/gpu/drm/ttm/ttm_bo.c:326 ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217418] Modules linked in: edac_mce_amd radeon(+) drm_ttm_helper ttm video drm_suballoc_helper drm_display_helper kvm irqbypass drm_kms_helper syscopyarea crc32_pclmul sysfillrect sha512_ssse3 sysimgblt sha512_generic cfbfillrect cfbimgblt wmi_bmof aesni_intel cfbcopyarea crypto_simd cryptd k10temp acpi_cpufreq wmi dm_mod [ 17.217432] CPU: 5 PID: 1464 Comm: systemd-udevd Not tainted 6.4.0-rc4+ #1 [ 17.217436] Hardware name: Micro-Star International Co., Ltd. MS-7A38/B450M PRO-VDH MAX (MS-7A38), BIOS B.G0 07/26/2022 [ 17.217438] RIP: 0010:ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217444] Code: 48 89 43 38 48 89 43 40 48 8b 5c 24 30 48 8b b5 40 08 00 00 48 8b 6c 24 38 48 83 c4 58 e9 7a 49 f7 e0 48 89 ef e9 6c fe ff ff <0f> 0b 48 83 7b 20 00 0f 84 b7 fd ff ff 0f 0b 0f 1f 00 e9 ad fd ff [ 17.217448] RSP: 0018:ffffc9000095fbb0 EFLAGS: 00010202 [ 17.217451] RAX: 0000000000000001 RBX: ffff8881052c8de0 RCX: 0000000000000000 [ 17.217453] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881052c8de0 [ 17.217455] RBP: ffff888104a66e00 R08: ffff8881052c8de0 R09: ffff888104a7cf08 [ 17.217457] R10: ffffc9000095fbe0 R11: ffffc9000095fbe8 R12: ffff8881052c8c78 [ 17.217458] R13: ffff8881052c8c78 R14: dead000000000100 R15: ffff88810528b108 [ 17.217460] FS: 00007f319fcbb8c0(0000) GS:ffff88881a540000(0000) knlGS:0000000000000000 [ 17.217463] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 17.217464] CR2: 000055dc8b0224a0 CR3: 000000010373d000 CR4: 0000000000750ee0 [ 17.217466] PKRU: 55555554 [ 17.217468] Call Trace: [ 17.217470] <TASK> [ 17.217472] ? __warn+0x97/0x160 [ 17.217476] ? ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217481] ? report_bug+0x1ec/0x200 [ 17.217487] ? handle_bug+0x3c/0x70 [ 17.217490] ? exc_invalid_op+0x1f/0x90 [ 17.217493] ? preempt_count_sub+0xb5/0x100 [ 17.217496] ? asm_exc_invalid_op+0x16/0x20 [ 17.217500] ? ttm_bo_release+0x27e/0x2d0 [ttm] [ 17.217505] ? ttm_resource_move_to_lru_tail+0x1ab/0x1d0 [ttm] [ 17.217511] radeon_bo_unref+0x1a/0x30 [radeon] [ 17.217547] radeon_gem_object_free+0x20/0x30 [radeon] [ 17.217579] radeon_fbdev_fb_destroy+0x57/0x90 [radeon] [ 17.217616] unregister_framebuffer+0x72/0x110 [ 17.217620] drm_client_dev_unregister+0x6d/0xe0 [ 17.217623] drm_dev_unregister+0x2e/0x90 [ 17.217626] drm_put_dev+0x26/0x90 [ 17.217628] pci_device_remove+0x44/0xc0 [ 17.217631] really_probe+0x257/0x340 [ 17.217635] __driver_probe_device+0x73/0x120 [ 17.217638] driver_probe_device+0x2c/0xb0 [ 17.217641] __driver_attach+0xa0/0x150 [ 17.217643] ? __pfx___driver_attach+0x10/0x10 [ 17.217646] bus_for_each_dev+0x67/0xa0 [ 17.217649] bus_add_driver+0x10e/0x210 [ 17.217651] driver_register+0x5c/0x120 [ 17.217653] ? __pfx_radeon_module_init+0x10/0x10 [radeon] [ 17.217681] do_one_initcall+0x44/0x220 [ 17.217684] ? kmalloc_trace+0x37/0xc0 [ 17.217688] do_init_module+0x64/0x240 [ 17.217691] __do_sys_finit_module+0xb2/0x100 [ 17.217694] do_syscall_64+0x3b/0x90 [ 17.217697] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 17.217700] RIP: 0033:0x7f319feaa5a9 [ 17.217702] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 08 0d 00 f7 d8 64 89 01 48 [ 17.217706] RSP: 002b:00007ffc6bf3e7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 17.217709] RAX: ffffffffffffffda RBX: 00005607204f3170 RCX: 00007f319feaa5a9 [ 17.217710] RDX: 0000000000000000 RSI: 00007f31a002eefd RDI: 0000000000000018 [ 17.217712] RBP: 00007f31a002eefd R08: 0000000000000000 R09: 00005607204f1860 [ 17.217714] R10: 0000000000000018 R11: 0000000000000246 R12: 0000000000020000 [ 17.217716] R13: 0000000000000000 R14: 0000560720522450 R15: 0000560720255899 [ 17.217718] </TASK> [ 17.217719] ---[ end trace 0000000000000000 ]--- The buffer object backing the fbdev emulation got pinned twice: by the fb_probe helper radeon_fbdev_create_pinned_object() and the modesetting code when the framebuffer got displayed. It only got unpinned once by the fbdev helper radeon_fbdev_destroy_pinned_object(). Hence TTM's BO- release function complains about the pin counter. Forcing the outputs off also undoes the modesettings pin increment. Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Reported-by: Borislav Petkov <bp@alien8.de> Closes: https://lore.kernel.org/dri-devel/20230603174814.GCZHt83pN+wNjf63sC@fat_crate.local/ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: amd-gfx@lists.freedesktop.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-13drm/amd/pm: workaround for compute workload type on some skusKenneth Feng
On smu 13.0.0, the compute workload type cannot be set on all the skus due to some other problems. This workaround is to make sure compute workload type can also run on some specific skus. v2: keep the variable consistent Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-06-13drm/amd: Tighten permissions on VBIOS flashing attributesMario Limonciello
Non-root users shouldn't be able to try to trigger a VBIOS flash or query the flashing status. This should be reserved for users with the appropriate permissions. Cc: stable@vger.kernel.org Fixes: 8424f2ccb3c0 ("drm/amdgpu/psp: Add vbflash sysfs interface support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>