pm24.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2023-07-17	tpm: tis_i2c: Limit read bursts to I2C_SMBUS_BLOCK_MAX (32) bytes	Alexander Sverdlin
	Underlying I2C bus drivers not always support longer transfers and imx-lpi2c for instance doesn't. SLB 9673 offers 427-bytes packets. Visible symptoms are: tpm tpm0: Error left over data tpm tpm0: tpm_transmit: tpm_recv: error -5 tpm_tis_i2c: probe of 1-002e failed with error -5 Cc: stable@vger.kernel.org # v5.20+ Fixes: bbc23a07b072 ("tpm: Add tpm_tis_i2c backend for tpm_tis_core") Tested-by: Michael Haener <michael.haener@siemens.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17	tpm_tis_spi: Release chip select when flow control fails	Peijie Shao
	The failure paths in tpm_tis_spi_transfer() do not deactivate chip select. Send an empty message (cs_select == 0) to overcome this. The patch is tested by two ways. One way needs to touch hardware: 1. force pull MISO pin down to GND, it emulates a forever 'WAIT' timing. 2. probe cs pin by an oscilloscope. 3. load tpm_tis_spi.ko. After loading, dmesg prints: "probe of spi0.0 failed with error -110" and oscilloscope shows cs pin goes high(deactivated) after the failure. Before the patch, cs pin keeps low. Second way is by writing a fake spi controller. 1. implement .transfer_one method, fill all rx buf with 0. 2. implement .set_cs method, print the state of cs pin. we can see cs goes high after the failure. Signed-off-by: Peijie Shao <shaopeijie@cestc.cn> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17	tpm: tpm_tis: Disable interrupts only for AEON UPX-i11	Peter Ujfalusi
	Further restrict with DMI_PRODUCT_VERSION. Cc: stable@vger.kernel.org # v6.4+ Link: https://lore.kernel.org/linux-integrity/20230517122931.22385-1-peter.ujfalusi@linux.intel.com/ Fixes: 95a9359ee22f ("tpm: tpm_tis: Disable interrupts for AEON UPX-i11") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17	tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation	Jarkko Sakkinen
	/dev/vtpmx is made visible before 'workqueue' is initialized, which can lead to a memory corruption in the worst case scenario. Address this by initializing 'workqueue' as the very first step of the driver initialization. Cc: stable@vger.kernel.org Fixes: 6f99612e2500 ("tpm: Proxy driver for supporting multiple emulated TPMs") Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@tuni.fi> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2023-07-17	drm/msm: Fix IS_ERR_OR_NULL() vs NULL check in a5xx_submit_in_rb()	Gaosheng Cui
	The msm_gem_get_vaddr() returns an ERR_PTR() on failure, and a null is catastrophic here, so we should use IS_ERR_OR_NULL() to check the return value. Fixes: 6a8bd08d0465 ("drm/msm: add sudo flag to submit ioctl") Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/547712/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-07-17	can: mcp251xfd: __mcp251xfd_chip_set_mode(): increase poll timeout	Fedor Ross
	The mcp251xfd controller needs an idle bus to enter 'Normal CAN 2.0 mode' or . The maximum length of a CAN frame is 736 bits (64 data bytes, CAN-FD, EFF mode, worst case bit stuffing and interframe spacing). For low bit rates like 10 kbit/s the arbitrarily chosen MCP251XFD_POLL_TIMEOUT_US of 1 ms is too small. Otherwise during polling for the CAN controller to enter 'Normal CAN 2.0 mode' the timeout limit is exceeded and the configuration fails with: \| $ ip link set dev can1 up type can bitrate 10000 \| [ 731.911072] mcp251xfd spi2.1 can1: Controller failed to enter mode CAN 2.0 Mode (6) and stays in Configuration Mode (4) (con=0x068b0760, osc=0x00000468). \| [ 731.927192] mcp251xfd spi2.1 can1: CRC read error at address 0x0e0c (length=4, data=00 00 00 00, CRC=0x0000) retrying. \| [ 731.938101] A link change request failed with some changes committed already. Interface can1 may have been left with an inconsistent configuration, please check. \| RTNETLINK answers: Connection timed out Make MCP251XFD_POLL_TIMEOUT_US timeout calculation dynamic. Use maximum of 1ms and bit time of 1 full 64 data bytes CAN-FD frame in EFF mode, worst case bit stuffing and interframe spacing at the current bit rate. For easier backporting define the macro MCP251XFD_FRAME_LEN_MAX_BITS that holds the max frame length in bits, which is 736. This can be replaced by can_frame_bits(true, true, true, true, CANFD_MAX_DLEN) in a cleanup patch later. Fixes: 55e5b97f003e8 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN") Signed-off-by: Fedor Ross <fedor.ross@ifm.com> Signed-off-by: Marek Vasut <marex@denx.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20230717-mcp251xfd-fix-increase-poll-timeout-v5-1-06600f34c684@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-07-17	iavf: fix reset task race with iavf_remove()	Ahmed Zaki
	The reset task is currently scheduled from the watchdog or adminq tasks. First, all direct calls to schedule the reset task are replaced with the iavf_schedule_reset(), which is modified to accept the flag showing the type of reset. To prevent the reset task from starting once iavf_remove() starts, we need to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now easily added to iavf_schedule_reset(). Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task. It is redundant since all callers who set the flag immediately schedules the reset task. Fixes: 3ccd54ef44eb ("iavf: Fix init state closure on remove") Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies	Ahmed Zaki
	A driver's lock (crit_lock) is used to serialize all the driver's tasks. Lockdep, however, shows a circular dependency between rtnl and crit_lock. This happens when an ndo that already holds the rtnl requests the driver to reset, since the reset task (in some paths) tries to grab rtnl to either change real number of queues of update netdev features. [566.241851] ====================================================== [566.241893] WARNING: possible circular locking dependency detected [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G OE [566.241984] ------------------------------------------------------ [566.242025] repro.sh/2604 is trying to acquire lock: [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf] [566.242167] but task is already holding lock: [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf] [566.242300] which lock already depends on the new lock. [566.242353] the existing dependency chain (in reverse order) is: [566.242401] -> #1 (rtnl_mutex){+.+.}-{3:3}: [566.242451] __mutex_lock+0xc1/0xbb0 [566.242489] iavf_init_interrupt_scheme+0x179/0x440 [iavf] [566.242560] iavf_watchdog_task+0x80b/0x1400 [iavf] [566.242627] process_one_work+0x2b3/0x560 [566.242663] worker_thread+0x4f/0x3a0 [566.242696] kthread+0xf2/0x120 [566.242730] ret_from_fork+0x29/0x50 [566.242763] -> #0 (&adapter->crit_lock){+.+.}-{3:3}: [566.242815] __lock_acquire+0x15ff/0x22b0 [566.242869] lock_acquire+0xd2/0x2c0 [566.242901] __mutex_lock+0xc1/0xbb0 [566.242934] iavf_close+0x3c/0x240 [iavf] [566.242997] __dev_close_many+0xac/0x120 [566.243036] dev_close_many+0x8b/0x140 [566.243071] unregister_netdevice_many_notify+0x165/0x7c0 [566.243116] unregister_netdevice_queue+0xd3/0x110 [566.243157] iavf_remove+0x6c1/0x730 [iavf] [566.243217] pci_device_remove+0x33/0xa0 [566.243257] device_release_driver_internal+0x1bc/0x240 [566.243299] pci_stop_bus_device+0x6c/0x90 [566.243338] pci_stop_and_remove_bus_device+0xe/0x20 [566.243380] pci_iov_remove_virtfn+0xd1/0x130 [566.243417] sriov_disable+0x34/0xe0 [566.243448] ice_free_vfs+0x2da/0x330 [ice] [566.244383] ice_sriov_configure+0x88/0xad0 [ice] [566.245353] sriov_numvfs_store+0xde/0x1d0 [566.246156] kernfs_fop_write_iter+0x15e/0x210 [566.246921] vfs_write+0x288/0x530 [566.247671] ksys_write+0x74/0xf0 [566.248408] do_syscall_64+0x58/0x80 [566.249145] entry_SYSCALL_64_after_hwframe+0x72/0xdc [566.249886] other info that might help us debug this: [566.252014] Possible unsafe locking scenario: [566.253432] CPU0 CPU1 [566.254118] ---- ---- [566.254800] lock(rtnl_mutex); [566.255514] lock(&adapter->crit_lock); [566.256233] lock(rtnl_mutex); [566.256897] lock(&adapter->crit_lock); [566.257388] * DEADLOCK * The deadlock can be triggered by a script that is continuously resetting the VF adapter while doing other operations requiring RTNL, e.g: while :; do ip link set $VF up ethtool --set-channels $VF combined 2 ip link set $VF down ip link set $VF up ethtool --set-channels $VF combined 4 ip link set $VF down done Any operation that triggers a reset can substitute "ethtool --set-channles" As a fix, add a new task "finish_config" that do all the work which needs rtnl lock. With the exception of iavf_remove(), all work that require rtnl should be called from this task. As for iavf_remove(), at the point where we need to call unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent finish_config work cannot restart after that since the task is guarded by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config(). Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	Revert "iavf: Do not restart Tx queues after reset task failure"	Marcin Szycik
	This reverts commit 08f1c147b7265245d67321585c68a27e990e0c4b. Netdev is no longer being detached during reset, so this fix can be reverted. We leave the removal of "hacky" IFF_UP flag update. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	Revert "iavf: Detach device during reset task"	Marcin Szycik
	This reverts commit aa626da947e9cd30c4cf727493903e1adbb2c0a0. Detaching device during reset was not fully fixing the rtnl locking issue, as there could be a situation where callback was already in progress before detaching netdev. Furthermore, detaching netdevice causes TX timeouts if traffic is running. To reproduce: ip netns exec ns1 iperf3 -c $PEER_IP -t 600 --logfile /dev/null & while :; do for i in 200 7000 400 5000 300 3000 ; do ip netns exec ns1 ip link set $VF1 mtu $i sleep 2 done sleep 10 done Currently, callbacks such as iavf_change_mtu() wait for the reset. If the reset fails to acquire the rtnl_lock, they schedule the netdev update for later while continuing the reset flow. Operations like MTU changes are performed under the rtnl_lock. Therefore, when the operation finishes, another callback that uses rtnl_lock can start. Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	iavf: Wait for reset in callbacks which trigger it	Marcin Szycik
	There was a fail when trying to add the interface to bonding right after changing the MTU on the interface. It was caused by bonding interface unable to open the interface due to interface being in __RESETTING state because of MTU change. Add new reset_waitqueue to indicate that reset has finished. Add waiting for reset to finish in callbacks which trigger hw reset: iavf_set_priv_flags(), iavf_change_mtu() and iavf_set_ringparam(). We use a 5000ms timeout period because on Hyper-V based systems, this operation takes around 3000-4000ms. In normal circumstances, it doesn't take more than 500ms to complete. Add a function iavf_wait_for_reset() to reuse waiting for reset code and use it also in iavf_set_channels(), which already waits for reset. We don't use error handling in iavf_set_channels() as this could cause the device to be in incorrect state if the reset was scheduled but hit timeout or the waitng function was interrupted by a signal. Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Co-developed-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	iavf: use internal state to free traffic IRQs	Ahmed Zaki
	If the system tries to close the netdev while iavf_reset_task() is running, __LINK_STATE_START will be cleared and netif_running() will return false in iavf_reinit_interrupt_scheme(). This will result in iavf_free_traffic_irqs() not being called and a leak as follows: [7632.489326] remove_proc_entry: removing non-empty directory 'irq/999', leaking at least 'iavf-enp24s0f0v0-TxRx-0' [7632.490214] WARNING: CPU: 0 PID: 10 at fs/proc/generic.c:718 remove_proc_entry+0x19b/0x1b0 is shown when pci_disable_msix() is later called. Fix by using the internal adapter state. The traffic IRQs will always exist if state == __IAVF_RUNNING. Fixes: 5b36e8d04b44 ("i40evf: Enable VF to request an alternate queue allocation") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	iavf: Fix out-of-bounds when setting channels on remove	Ding Hui
	If we set channels greater during iavf_remove(), and waiting reset done would be timeout, then returned with error but changed num_active_queues directly, that will lead to OOB like the following logs. Because the num_active_queues is greater than tx/rx_rings[] allocated actually. Reproducer: [root@host ~]# cat repro.sh #!/bin/bash pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=() function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) } function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) } function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() } trap "on_exit; exit" EXIT while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!) wait Result: [ 3506.152887] iavf 0000:41:02.0: Removing device [ 3510.400799] ================================================================== [ 3510.400820] BUG: KASAN: slab-out-of-bounds in iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400823] Read of size 8 at addr ffff88b6f9311008 by task repro.sh/55536 [ 3510.400823] [ 3510.400830] CPU: 101 PID: 55536 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 3510.400832] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 3510.400835] Call Trace: [ 3510.400851] dump_stack+0x71/0xab [ 3510.400860] print_address_description+0x6b/0x290 [ 3510.400865] ? iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400868] kasan_report+0x14a/0x2b0 [ 3510.400873] iavf_free_all_tx_resources+0x156/0x160 [iavf] [ 3510.400880] iavf_remove+0x2b6/0xc70 [iavf] [ 3510.400884] ? iavf_free_all_rx_resources+0x160/0x160 [iavf] [ 3510.400891] ? wait_woken+0x1d0/0x1d0 [ 3510.400895] ? notifier_call_chain+0xc1/0x130 [ 3510.400903] pci_device_remove+0xa8/0x1f0 [ 3510.400910] device_release_driver_internal+0x1c6/0x460 [ 3510.400916] pci_stop_bus_device+0x101/0x150 [ 3510.400919] pci_stop_and_remove_bus_device+0xe/0x20 [ 3510.400924] pci_iov_remove_virtfn+0x187/0x420 [ 3510.400927] ? pci_iov_add_virtfn+0xe10/0xe10 [ 3510.400929] ? pci_get_subsys+0x90/0x90 [ 3510.400932] sriov_disable+0xed/0x3e0 [ 3510.400936] ? bus_find_device+0x12d/0x1a0 [ 3510.400953] i40e_free_vfs+0x754/0x1210 [i40e] [ 3510.400966] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 3510.400968] ? pci_get_device+0x7c/0x90 [ 3510.400970] ? pci_get_subsys+0x90/0x90 [ 3510.400982] ? pci_vfs_assigned.part.7+0x144/0x210 [ 3510.400987] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.400996] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 3510.401001] sriov_numvfs_store+0x214/0x290 [ 3510.401005] ? sriov_totalvfs_show+0x30/0x30 [ 3510.401007] ? __mutex_lock_slowpath+0x10/0x10 [ 3510.401011] ? __check_object_size+0x15a/0x350 [ 3510.401018] kernfs_fop_write+0x280/0x3f0 [ 3510.401022] vfs_write+0x145/0x440 [ 3510.401025] ksys_write+0xab/0x160 [ 3510.401028] ? __ia32_sys_read+0xb0/0xb0 [ 3510.401031] ? fput_many+0x1a/0x120 [ 3510.401032] ? filp_close+0xf0/0x130 [ 3510.401038] do_syscall_64+0xa0/0x370 [ 3510.401041] ? page_fault+0x8/0x30 [ 3510.401043] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 3510.401073] RIP: 0033:0x7f3a9bb842c0 [ 3510.401079] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 3510.401080] RSP: 002b:00007ffc05f1fe18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 3510.401083] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3a9bb842c0 [ 3510.401085] RDX: 0000000000000002 RSI: 0000000002327408 RDI: 0000000000000001 [ 3510.401086] RBP: 0000000002327408 R08: 00007f3a9be53780 R09: 00007f3a9c8a4700 [ 3510.401086] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 3510.401087] R13: 0000000000000001 R14: 00007f3a9be52620 R15: 0000000000000001 [ 3510.401090] [ 3510.401093] Allocated by task 76795: [ 3510.401098] kasan_kmalloc+0xa6/0xd0 [ 3510.401099] __kmalloc+0xfb/0x200 [ 3510.401104] iavf_init_interrupt_scheme+0x26f/0x1310 [iavf] [ 3510.401108] iavf_watchdog_task+0x1d58/0x4050 [iavf] [ 3510.401114] process_one_work+0x56a/0x11f0 [ 3510.401115] worker_thread+0x8f/0xf40 [ 3510.401117] kthread+0x2a0/0x390 [ 3510.401119] ret_from_fork+0x1f/0x40 [ 3510.401122] 0xffffffffffffffff [ 3510.401123] In timeout handling, we should keep the original num_active_queues and reset num_req_queues to 0. Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Cc: Donglin Peng <pengdonglin@sangfor.com.cn> Cc: Huang Cun <huangcun@sangfor.com.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	iavf: Fix use-after-free in free_netdev	Ding Hui
	We do netif_napi_add() for all allocated q_vectors[], but potentially do netif_napi_del() for part of them, then kfree q_vectors and leave invalid pointers at dev->napi_list. Reproducer: [root@host ~]# cat repro.sh #!/bin/bash pf_dbsf="0000:41:00.0" vf0_dbsf="0000:41:02.0" g_pids=() function do_set_numvf() { echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs sleep $((RANDOM%3+1)) } function do_set_channel() { local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/) [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; } ifconfig $nic 192.168.18.5 netmask 255.255.255.0 ifconfig $nic up ethtool -L $nic combined 1 ethtool -L $nic combined 4 sleep $((RANDOM%3)) } function on_exit() { local pid for pid in "${g_pids[@]}"; do kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null done g_pids=() } trap "on_exit; exit" EXIT while :; do do_set_numvf ; done & g_pids+=($!) while :; do do_set_channel ; done & g_pids+=($!) wait Result: [ 4093.900222] ================================================================== [ 4093.900230] BUG: KASAN: use-after-free in free_netdev+0x308/0x390 [ 4093.900232] Read of size 8 at addr ffff88b4dc145640 by task repro.sh/6699 [ 4093.900233] [ 4093.900236] CPU: 10 PID: 6699 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1 [ 4093.900238] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021 [ 4093.900239] Call Trace: [ 4093.900244] dump_stack+0x71/0xab [ 4093.900249] print_address_description+0x6b/0x290 [ 4093.900251] ? free_netdev+0x308/0x390 [ 4093.900252] kasan_report+0x14a/0x2b0 [ 4093.900254] free_netdev+0x308/0x390 [ 4093.900261] iavf_remove+0x825/0xd20 [iavf] [ 4093.900265] pci_device_remove+0xa8/0x1f0 [ 4093.900268] device_release_driver_internal+0x1c6/0x460 [ 4093.900271] pci_stop_bus_device+0x101/0x150 [ 4093.900273] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900275] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900277] ? pci_iov_add_virtfn+0xe10/0xe10 [ 4093.900278] ? pci_get_subsys+0x90/0x90 [ 4093.900280] sriov_disable+0xed/0x3e0 [ 4093.900282] ? bus_find_device+0x12d/0x1a0 [ 4093.900290] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900298] ? i40e_reset_all_vfs+0x880/0x880 [i40e] [ 4093.900299] ? pci_get_device+0x7c/0x90 [ 4093.900300] ? pci_get_subsys+0x90/0x90 [ 4093.900306] ? pci_vfs_assigned.part.7+0x144/0x210 [ 4093.900309] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900315] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900318] sriov_numvfs_store+0x214/0x290 [ 4093.900320] ? sriov_totalvfs_show+0x30/0x30 [ 4093.900321] ? __mutex_lock_slowpath+0x10/0x10 [ 4093.900323] ? __check_object_size+0x15a/0x350 [ 4093.900326] kernfs_fop_write+0x280/0x3f0 [ 4093.900329] vfs_write+0x145/0x440 [ 4093.900330] ksys_write+0xab/0x160 [ 4093.900332] ? __ia32_sys_read+0xb0/0xb0 [ 4093.900334] ? fput_many+0x1a/0x120 [ 4093.900335] ? filp_close+0xf0/0x130 [ 4093.900338] do_syscall_64+0xa0/0x370 [ 4093.900339] ? page_fault+0x8/0x30 [ 4093.900341] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900357] RIP: 0033:0x7f16ad4d22c0 [ 4093.900359] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24 [ 4093.900360] RSP: 002b:00007ffd6491b7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 4093.900362] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f16ad4d22c0 [ 4093.900363] RDX: 0000000000000002 RSI: 0000000001a41408 RDI: 0000000000000001 [ 4093.900364] RBP: 0000000001a41408 R08: 00007f16ad7a1780 R09: 00007f16ae1f2700 [ 4093.900364] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 4093.900365] R13: 0000000000000001 R14: 00007f16ad7a0620 R15: 0000000000000001 [ 4093.900367] [ 4093.900368] Allocated by task 820: [ 4093.900371] kasan_kmalloc+0xa6/0xd0 [ 4093.900373] __kmalloc+0xfb/0x200 [ 4093.900376] iavf_init_interrupt_scheme+0x63b/0x1320 [iavf] [ 4093.900380] iavf_watchdog_task+0x3d51/0x52c0 [iavf] [ 4093.900382] process_one_work+0x56a/0x11f0 [ 4093.900383] worker_thread+0x8f/0xf40 [ 4093.900384] kthread+0x2a0/0x390 [ 4093.900385] ret_from_fork+0x1f/0x40 [ 4093.900387] 0xffffffffffffffff [ 4093.900387] [ 4093.900388] Freed by task 6699: [ 4093.900390] __kasan_slab_free+0x137/0x190 [ 4093.900391] kfree+0x8b/0x1b0 [ 4093.900394] iavf_free_q_vectors+0x11d/0x1a0 [iavf] [ 4093.900397] iavf_remove+0x35a/0xd20 [iavf] [ 4093.900399] pci_device_remove+0xa8/0x1f0 [ 4093.900400] device_release_driver_internal+0x1c6/0x460 [ 4093.900401] pci_stop_bus_device+0x101/0x150 [ 4093.900402] pci_stop_and_remove_bus_device+0xe/0x20 [ 4093.900403] pci_iov_remove_virtfn+0x187/0x420 [ 4093.900404] sriov_disable+0xed/0x3e0 [ 4093.900409] i40e_free_vfs+0x754/0x1210 [i40e] [ 4093.900415] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e] [ 4093.900416] sriov_numvfs_store+0x214/0x290 [ 4093.900417] kernfs_fop_write+0x280/0x3f0 [ 4093.900418] vfs_write+0x145/0x440 [ 4093.900419] ksys_write+0xab/0x160 [ 4093.900420] do_syscall_64+0xa0/0x370 [ 4093.900421] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 4093.900422] 0xffffffffffffffff [ 4093.900422] [ 4093.900424] The buggy address belongs to the object at ffff88b4dc144200 which belongs to the cache kmalloc-8k of size 8192 [ 4093.900425] The buggy address is located 5184 bytes inside of 8192-byte region [ffff88b4dc144200, ffff88b4dc146200) [ 4093.900425] The buggy address belongs to the page: [ 4093.900427] page:ffffea00d3705000 refcount:1 mapcount:0 mapping:ffff88bf04415c80 index:0x0 compound_mapcount: 0 [ 4093.900430] flags: 0x10000000008100(slab\|head) [ 4093.900433] raw: 0010000000008100 dead000000000100 dead000000000200 ffff88bf04415c80 [ 4093.900434] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000 [ 4093.900434] page dumped because: kasan: bad access detected [ 4093.900435] [ 4093.900435] Memory state around the buggy address: [ 4093.900436] ffff88b4dc145500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900437] ffff88b4dc145580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] >ffff88b4dc145600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900438] ^ [ 4093.900439] ffff88b4dc145680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ffff88b4dc145700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 4093.900440] ================================================================== Although the patch #2 (of 2) can avoid the issue triggered by this repro.sh, there still are other potential risks that if num_active_queues is changed to less than allocated q_vectors[] by unexpected, the mismatched netif_napi_add/del() can also cause UAF. Since we actually call netif_napi_add() for all allocated q_vectors unconditionally in iavf_alloc_q_vectors(), so we should fix it by letting netif_napi_del() match to netif_napi_add(). Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Cc: Donglin Peng <pengdonglin@sangfor.com.cn> Cc: Huang Cun <huangcun@sangfor.com.cn> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17	Revert "drm/i915: use localized __diag_ignore_all() instead of per file"	Jani Nikula
	This reverts commit 88e9664434c994e97a9f6f8cdd1535495c660cea. __diag_ignore_all() only works for GCC 8 or later. -Woverride-init (from -Wextra, enabled in i915 Makefile) combined with CONFIG_WERROR=y or W=e breaks the build for older GCC. With i386_defconfig and x86_64_defconfig enabling CONFIG_WERROR=y by default, we really need to roll back the change. An alternative would be to disable -Woverride-init in the Makefile for GCC <8, but the revert seems like the safest bet now. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8768 Reported-by: John Garry <john.g.garry@oracle.com> References: https://lore.kernel.org/r/ad2601c0-84bb-c574-3702-a83ff8faf98c@oracle.com References: https://lore.kernel.org/r/87wmzezns4.fsf@intel.com Fixes: 88e9664434c9 ("drm/i915: use localized __diag_ignore_all() instead of per file") Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Tested-by: John Garry <john.g.garry@oracle.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711110214.25093-1-jani.nikula@intel.com (cherry picked from commit 290d161045753240f2100b8f44660426ecc97be5) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-17	drm/i915/perf: add sentinel to xehp_oa_b_counters	Andrzej Hajda
	Arrays passed to reg_in_range_table should end with empty record. The patch solves KASAN detected bug with signature: BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915] Read of size 4 at addr ffffffffa1555d90 by task perf/1518 CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1 Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023 Call Trace: <TASK> ... xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915] Fixes: 0fa9349dda03 ("drm/i915/perf: complete programming whitelisting for XEHPSDV") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230711153410.1224997-1-andrzej.hajda@intel.com (cherry picked from commit 2f42c5afb34b5696cf5fe79e744f99be9b218798) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2023-07-17	can: gs_usb: fix time stamp counter initialization	Marc Kleine-Budde
	If the gs_usb device driver is unloaded (or unbound) before the interface is shut down, the USB stack first calls the struct usb_driver::disconnect and then the struct net_device_ops::ndo_stop callback. In gs_usb_disconnect() all pending bulk URBs are killed, i.e. no more RX'ed CAN frames are send from the USB device to the host. Later in gs_can_close() a reset control message is send to each CAN channel to remove the controller from the CAN bus. In this race window the USB device can still receive CAN frames from the bus and internally queue them to be send to the host. At least in the current version of the candlelight firmware, the queue of received CAN frames is not emptied during the reset command. After loading (or binding) the gs_usb driver, new URBs are submitted during the struct net_device_ops::ndo_open callback and the candlelight firmware starts sending its already queued CAN frames to the host. However, this scenario was not considered when implementing the hardware timestamp function. The cycle counter/time counter infrastructure is set up (gs_usb_timestamp_init()) after the USBs are submitted, resulting in a NULL pointer dereference if timecounter_cyc2time() (via the call chain: gs_usb_receive_bulk_callback() -> gs_usb_set_timestamp() -> gs_usb_skb_set_timestamp()) is called too early. Move the gs_usb_timestamp_init() function before the URBs are submitted to fix this problem. For a comprehensive solution, we need to consider gs_usb devices with more than 1 channel. The cycle counter/time counter infrastructure is setup per channel, but the RX URBs are per device. Once gs_can_open() of _a_ channel has been called, and URBs have been submitted, the gs_usb_receive_bulk_callback() can be called for _all_ available channels, even for channels that are not running, yet. As cycle counter/time counter has not set up, this will again lead to a NULL pointer dereference. Convert the cycle counter/time counter from a "per channel" to a "per device" functionality. Also set it up, before submitting any URBs to the device. Further in gs_usb_receive_bulk_callback(), don't process any URBs for not started CAN channels, only resubmit the URB. Fixes: 45dfa45f52e6 ("can: gs_usb: add RX and TX hardware timestamp support") Closes: https://github.com/candle-usb/candleLight_fw/issues/137#issuecomment-1623532076 Cc: stable@vger.kernel.org Cc: John Whittington <git@jbrengineering.co.uk> Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-2-9017cefcd9d5@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-07-17	can: gs_usb: gs_can_open(): improve error handling	Marc Kleine-Budde
	The gs_usb driver handles USB devices with more than 1 CAN channel. The RX path for all channels share the same bulk endpoint (the transmitted bulk data encodes the channel number). These per-device resources are allocated and submitted by the first opened channel. During this allocation, the resources are either released immediately in case of a failure or the URBs are anchored. All anchored URBs are finally killed with gs_usb_disconnect(). Currently, gs_can_open() returns with an error if the allocation of a URB or a buffer fails. However, if usb_submit_urb() fails, the driver continues with the URBs submitted so far, even if no URBs were successfully submitted. Treat every error as fatal and free all allocated resources immediately. Switch to goto-style error handling, to prepare the driver for more per-device resource allocation. Cc: stable@vger.kernel.org Cc: John Whittington <git@jbrengineering.co.uk> Link: https://lore.kernel.org/all/20230716-gs_usb-fix-time-stamp-counter-v1-1-9017cefcd9d5@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-07-17	r8169: fix ASPM-related problem for chip version 42 and 43	Heiner Kallweit
	Referenced commit missed that for chip versions 42 and 43 ASPM remained disabled in the respective rtl_hw_start_...() routines. This resulted in problems as described in the referenced bug ticket. Therefore re-instantiate the previous logic. Fixes: 5fc3f6c90cca ("r8169: consolidate disabling ASPM before EPHY access") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217635 Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-17	net: dsa: microchip: correct KSZ8795 static MAC table access	Tristram Ha
	The KSZ8795 driver code was modified to use on KSZ8863/73, which has different register definitions. Some of the new KSZ8795 register information are wrong compared to previous code. KSZ8795 also behaves differently in that the STATIC_MAC_TABLE_USE_FID and STATIC_MAC_TABLE_FID bits are off by 1 when doing MAC table reading than writing. To compensate that a special code was added to shift the register value by 1 before applying those bits. This is wrong when the code is running on KSZ8863, so this special code is only executed when KSZ8795 is detected. Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-17	regulator: da9063: fix null pointer deref with partial DT config	Martin Fuzzey
	When some of the da9063 regulators do not have corresponding DT nodes a null pointer dereference occurs on boot because such regulators have no init_data causing the pointers calculated in da9063_check_xvp_constraints() to be invalid. Do not dereference them in this case. Fixes: b8717a80e6ee ("regulator: da9063: implement setter for voltage monitoring") Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group> Link: https://lore.kernel.org/r/20230616143736.2946173-1-martin.fuzzey@flowbird.group Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-17	regmap: Account for register length in SMBus I/O limits	Mark Brown
	The SMBus I2C buses have limits on the size of transfers they can do but do not factor in the register length meaning we may try to do a transfer longer than our length limit, the core will not take care of this. Future changes will factor this out into the core but there are a number of users that assume current behaviour so let's just do something conservative here. This does not take account padding bits but practically speaking these are very rarely if ever used on I2C buses given that they generally run slowly enough to mean there's no issue. Cc: stable@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Xu Yilun <yilun.xu@intel.com> Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-2-80e2aed22e83@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-17	regmap: Drop initial version of maximum transfer length fixes	Mark Brown
	When problems were noticed with the register address not being taken into account when limiting raw transfers with I2C devices we fixed this in the core. Unfortunately it has subsequently been realised that a lot of buses were relying on the prior behaviour, partly due to unclear documentation not making it obvious what was intended in the core. This is all more involved to fix than is sensible for a fix commit so let's just drop the original fixes, a separate commit will fix the originally observed problem in an I2C specific way Fixes: 3981514180c9 ("regmap: Account for register length when chunking") Fixes: c8e796895e23 ("regmap: spi-avmm: Fix regmap_bus max_raw_write") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Xu Yilun <yilun.xu@intel.com> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230712-regmap-max-transfer-v1-1-80e2aed22e83@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-07-17	RDMA/bnxt_re: Fix hang during driver unload	Selvin Xavier
	Driver unload hits a hang during stress testing of load/unload. stack trace snippet - tasklet_kill at ffffffff9aabb8b2 bnxt_qplib_nq_stop_irq at ffffffffc0a805fb [bnxt_re] bnxt_qplib_disable_nq at ffffffffc0a80c5b [bnxt_re] bnxt_re_dev_uninit at ffffffffc0a67d15 [bnxt_re] bnxt_re_remove_device at ffffffffc0a6af1d [bnxt_re] tasklet_kill can hang if the tasklet is scheduled after it is disabled. Modified the sequences to disable the interrupt first and synchronize irq before disabling the tasklet. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1689322969-25402-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/bnxt_re: Prevent handling any completions after qp destroy	Kashyap Desai
	HW may generate completions that indicates QP is destroyed. Driver should not be scheduling any more completion handlers for this QP, after the QP is destroyed. Since CQs are active during the QP destroy, driver may still schedule completion handlers. This can cause a race where the destroy_cq and poll_cq running simultaneously. Snippet of kernel panic while doing bnxt_re driver load unload in loop. This indicates a poll after the CQ is freed. [77786.481636] Call Trace: [77786.481640] <TASK> [77786.481644] bnxt_re_poll_cq+0x14a/0x620 [bnxt_re] [77786.481658] ? kvm_clock_read+0x14/0x30 [77786.481693] __ib_process_cq+0x57/0x190 [ib_core] [77786.481728] ib_cq_poll_work+0x26/0x80 [ib_core] [77786.481761] process_one_work+0x1e5/0x3f0 [77786.481768] worker_thread+0x50/0x3a0 [77786.481785] ? __pfx_worker_thread+0x10/0x10 [77786.481790] kthread+0xe2/0x110 [77786.481794] ? __pfx_kthread+0x10/0x10 [77786.481797] ret_from_fork+0x2c/0x50 To avoid this, complete all completion handlers before returning the destroy QP. If free_cq is called soon after destroy_qp, IB stack will cancel the CQ work before invoking the destroy_cq verb and this will prevent any race mentioned. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://lore.kernel.org/r/1689322969-25402-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/mthca: Fix crash when polling CQ for shared QPs	Thomas Bogendoerfer
	Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") introduced a new struct mthca_sqp which doesn't contain struct mthca_qp any longer. Placing a pointer of this new struct into qptable leads to crashes, because mthca_poll_one() expects a qp pointer. Fix this by putting the correct pointer into qptable. Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/core: Update CMA destination address on rdma_resolve_addr	Shiraz Saleem
	8d037973d48c ("RDMA/core: Refactor rdma_bind_addr") intoduces as regression on irdma devices on certain tests which uses rdma CM, such as cmtime. No connections can be established with the MAD QP experiences a fatal error on the active side. The cma destination address is not updated with the dst_addr when ULP on active side calls rdma_bind_addr followed by rdma_resolve_addr. The id_priv state is 'bound' in resolve_prepare_src and update is skipped. This leaves the dgid passed into irdma driver to create an Address Handle (AH) for the MAD QP at 0. The create AH descriptor as well as the ARP cache entry is invalid and HW throws an asynchronous events as result. [ 1207.656888] resolve_prepare_src caller: ucma_resolve_addr+0xff/0x170 [rdma_ucm] daddr=200.0.4.28 id_priv->state=7 [....] [ 1207.680362] ice 0000:07:00.1 rocep7s0f1: caller: irdma_create_ah+0x3e/0x70 [irdma] ah_id=0 arp_idx=0 dest_ip=0.0.0.0 destMAC=00:00:64:ca:b7:52 ipvalid=1 raw=0000:0000:0000:0000:0000:ffff:0000:0000 [ 1207.682077] ice 0000:07:00.1 rocep7s0f1: abnormal ae_id = 0x401 bool qp=1 qp_id = 1, ae_src=5 [ 1207.691657] infiniband rocep7s0f1: Fatal error (1) on MAD QP (1) Fix this by updating the CMA destination address when the ULP calls a resolve address with the CM state already bound. Fixes: 8d037973d48c ("RDMA/core: Refactor rdma_bind_addr") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230712234133.1343-1-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/irdma: Fix data race on CQP request done	Shiraz Saleem
	KCSAN detects a data race on cqp_request->request_done memory location which is accessed locklessly in irdma_handle_cqp_op while being updated in irdma_cqp_ce_handler. Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any compiler optimizations like load fusing and/or KCSAN warning. [222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma] [222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5: [222808.417610] irdma_cqp_ce_handler+0x21e/0x270 [irdma] [222808.417725] cqp_compl_worker+0x1b/0x20 [irdma] [222808.417827] process_one_work+0x4d1/0xa40 [222808.417835] worker_thread+0x319/0x700 [222808.417842] kthread+0x180/0x1b0 [222808.417852] ret_from_fork+0x22/0x30 [222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1: [222808.417995] irdma_wait_event+0x1e2/0x2c0 [irdma] [222808.418099] irdma_handle_cqp_op+0xae/0x170 [irdma] [222808.418202] irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma] [222808.418308] irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma] [222808.418411] irdma_rt_deinit_hw+0x179/0x1d0 [irdma] [222808.418514] irdma_ib_dealloc_device+0x11/0x40 [irdma] [222808.418618] ib_dealloc_device+0x2a/0x120 [ib_core] [222808.418823] __ib_unregister_device+0xde/0x100 [ib_core] [222808.418981] ib_unregister_device+0x22/0x40 [ib_core] [222808.419142] irdma_ib_unregister_device+0x70/0x90 [irdma] [222808.419248] i40iw_close+0x6f/0xc0 [irdma] [222808.419352] i40e_client_device_unregister+0x14a/0x180 [i40e] [222808.419450] i40iw_remove+0x21/0x30 [irdma] [222808.419554] auxiliary_bus_remove+0x31/0x50 [222808.419563] device_remove+0x69/0xb0 [222808.419572] device_release_driver_internal+0x293/0x360 [222808.419582] driver_detach+0x7c/0xf0 [222808.419592] bus_remove_driver+0x8c/0x150 [222808.419600] driver_unregister+0x45/0x70 [222808.419610] auxiliary_driver_unregister+0x16/0x30 [222808.419618] irdma_exit_module+0x18/0x1e [irdma] [222808.419733] __do_sys_delete_module.constprop.0+0x1e2/0x310 [222808.419745] __x64_sys_delete_module+0x1b/0x30 [222808.419755] do_syscall_64+0x39/0x90 [222808.419763] entry_SYSCALL_64_after_hwframe+0x63/0xcd [222808.419829] value changed: 0x01 -> 0x03 Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/irdma: Fix data race on CQP completion stats	Shiraz Saleem
	CQP completion statistics is read lockesly in irdma_wait_event and irdma_check_cqp_progress while it can be updated in the completion thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports. Make completion statistics an atomic variable to reflect coherent updates to it. This will also avoid load/store tearing logic bug potentially possible by compiler optimizations. [77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma] [77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4: [77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma] [77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma] [77346.171835] cqp_compl_worker+0x1b/0x20 [irdma] [77346.172009] process_one_work+0x4d1/0xa40 [77346.172024] worker_thread+0x319/0x700 [77346.172037] kthread+0x180/0x1b0 [77346.172054] ret_from_fork+0x22/0x30 [77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2: [77346.172234] irdma_handle_cqp_op+0xf4/0x4b0 [irdma] [77346.172413] irdma_cqp_aeq_cmd+0x75/0xa0 [irdma] [77346.172592] irdma_create_aeq+0x390/0x45a [irdma] [77346.172769] irdma_rt_init_hw.cold+0x212/0x85d [irdma] [77346.172944] irdma_probe+0x54f/0x620 [irdma] [77346.173122] auxiliary_bus_probe+0x66/0xa0 [77346.173137] really_probe+0x140/0x540 [77346.173154] __driver_probe_device+0xc7/0x220 [77346.173173] driver_probe_device+0x5f/0x140 [77346.173190] __driver_attach+0xf0/0x2c0 [77346.173208] bus_for_each_dev+0xa8/0xf0 [77346.173225] driver_attach+0x29/0x30 [77346.173240] bus_add_driver+0x29c/0x2f0 [77346.173255] driver_register+0x10f/0x1a0 [77346.173272] __auxiliary_driver_register+0xbc/0x140 [77346.173287] irdma_init_module+0x55/0x1000 [irdma] [77346.173460] do_one_initcall+0x7d/0x410 [77346.173475] do_init_module+0x81/0x2c0 [77346.173491] load_module+0x1232/0x12c0 [77346.173506] __do_sys_finit_module+0x101/0x180 [77346.173522] __x64_sys_finit_module+0x3c/0x50 [77346.173538] do_syscall_64+0x39/0x90 [77346.173553] entry_SYSCALL_64_after_hwframe+0x63/0xcd [77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095 Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	RDMA/irdma: Add missing read barriers	Shiraz Saleem
	On code inspection, there are many instances in the driver where CEQE and AEQE fields written to by HW are read without guaranteeing that the polarity bit has been read and checked first. Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields prior to checking the polarity bit. Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs") Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17	ata: pata_parport: Add missing protocol modules description	Damien Le Moal
	Most of the protocol modules for the pata_parport driver are missing a module description, causing warnings such as: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/pata_parport/aten.o when compiling with W=1. Add the missing MODULE_DESCRIPTION() definitions to avoid these warnings. While at it, also add the missing MODULE_AUTHOR() definitions. Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2023-07-16	Merge tag 'pinctrl-v6.5-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "I'm mostly on vacation but what would vacation be without a few critical fixes so people can use their gaming laptops when hiding away from the sun (or rain)? - Fix a really annoying interrupt storm in the AMD driver affecting Asus TUF gaming notebooks - Fix device tree parsing in the Renesas driver" * tag 'pinctrl-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: amd: Unify debounce handling into amd_pinconf_set() pinctrl: amd: Drop pull up select configuration pinctrl: amd: Use amd_pinconf_set() for all config options pinctrl: amd: Only use special debounce behavior for GPIO 0 pinctrl: renesas: rzg2l: Handle non-unique subnode names pinctrl: renesas: rzv2m: Handle non-unique subnode names
2023-07-15	mtd: rawnand: rockchip: Align hwecc vs. raw page helper layouts	Johan Jonker
	Currently, read/write_page_hwecc() and read/write_page_raw() are not aligned: there is a mismatch in the OOB bytes which are not read/written at the same offset in both cases (raw vs. hwecc). This is a real problem when relying on the presence of the Page Addresses (PA) when using the NAND chip as a boot device, as the BootROM expects additional data in the OOB area at specific locations. Rockchip boot blocks are written per 4 x 512 byte sectors per page. Each page with boot blocks must have a page address (PA) pointer in OOB to the next page. Pages are written in a pattern depending on the NAND chip ID. Generate boot block page address and pattern for hwecc in user space and copy PA data to/from the already reserved last 4 bytes before ECC in the chip->oob_poi data layout. Align the different helpers. This change breaks existing jffs2 users. Fixes: 058e0e847d54 ("mtd: rawnand: rockchip: NFC driver for RK3308, RK2928 and others") Signed-off-by: Johan Jonker <jbx6244@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/5e782c08-862b-51ae-47ff-3299940928ca@gmail.com
2023-07-15	mtd: rawnand: rockchip: fix oobfree offset and description	Johan Jonker
	Rockchip boot blocks are written per 4 x 512 byte sectors per page. Each page with boot blocks must have a page address (PA) pointer in OOB to the next page. The currently advertised free OOB area starts at offset 6, like if 4 PA bytes were located right after the BBM. This is wrong as the PA bytes are located right before the ECC bytes. Fix the layout by allowing access to all bytes between the BBM and the PA bytes instead of reserving 4 bytes right after the BBM. This change breaks existing jffs2 users. Fixes: 058e0e847d54 ("mtd: rawnand: rockchip: NFC driver for RK3308, RK2928 and others") Signed-off-by: Johan Jonker <jbx6244@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/d202f12d-188c-20e8-f2c2-9cc874ad4d22@gmail.com
2023-07-15	hwmon: (aquacomputer_d5next) Fix incorrect PWM value readout	Aleksa Savic
	Commit 662d20b3a5af ("hwmon: (aquacomputer_d5next) Add support for temperature sensor offsets") changed aqc_get_ctrl_val() to return the value through a parameter instead of through the return value, but didn't fix up a case that relied on the old behavior. Fix it to use the proper received value and not the return code. Fixes: 662d20b3a5af ("hwmon: (aquacomputer_d5next) Add support for temperature sensor offsets") Cc: stable@vger.kernel.org Signed-off-by: Aleksa Savic <savicaleksa83@gmail.com> Link: https://lore.kernel.org/r/20230714120712.16721-1-savicaleksa83@gmail.com Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-07-15	hwmon: (nct6775) Fix register for nct6799	Ahmad Khalifa
	Datasheet and variable name point to 0xe6 Fixes: aee395bb1905 ("hwmon: (nct6755) Add support for NCT6799D") Signed-off-by: Ahmad Khalifa <ahmad@khalifa.ws> Link: https://lore.kernel.org/r/20230715145831.1304633-1-ahmad@khalifa.ws Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-07-15	Merge tag 'spi-fix-v6.5-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple of fairly minor driver specific fixes here, plus a bunch of maintainership and admin updates. Nothing too remarkable" * tag 'spi-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: mailmap: add entry for Jonas Gorski MAINTAINERS: add myself for spi-bcm63xx spi: s3c64xx: clear loopback bit after loopback test spi: bcm63xx: fix max prepend length MAINTAINERS: Add myself as a maintainer for Microchip SPI
2023-07-15	Merge tag 'regmap-fix-v6.5-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "One fix for an out of bounds access in the interupt code here" * tag 'regmap-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap-irq: Fix out-of-bounds access when allocating config buffers
2023-07-15	Merge tag 'iommu-fixes-v6.5-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix a regression causing a crash on sysfs access of iommu-group specific files - Fix signedness bug in SVA code * tag 'iommu-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/sva: Fix signedness bug in iommu_sva_alloc_pasid() iommu: Fix crash during syfs iommu_groups/N/type
2023-07-15	drm/msm/adreno: Fix snapshot BINDLESS_DATA size	Rob Clark
	The incorrect size was causing "CP \| AHB bus error" when snapshotting the GPU state on a6xx gen4 (a660 family). Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/26 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state") Patchwork: https://patchwork.freedesktop.org/patch/546763/
2023-07-15	drm/msm/a690: Remove revn and name	Rob Clark
	These fields are deprecated. But any userspace new enough to support a690 also knows how to identify the GPU based on chip-id. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/545552/
2023-07-15	drm/msm/adreno: Fix warn splat for devices without revn	Rob Clark
	Recently, a WARN_ON() was introduced to ensure that revn is filled before adreno_is_aXYZ is called. This however doesn't work very well when revn is 0 by design (such as for A635). Cc: Konrad Dybcio <konrad.dybcio@linaro.org> Fixes: cc943f43ece7 ("drm/msm/adreno: warn if chip revn is verified before being set") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Tested-by: Abhinav Kumar <quic_abhinavk@quicinc.com> # sc7280 Patchwork: https://patchwork.freedesktop.org/patch/545554/
2023-07-15	dma-buf/dma-resv: Stop leaking on krealloc() failure	Ville Syrjälä
	Currently dma_resv_get_fences() will leak the previously allocated array if the fence iteration got restarted and the krealloc_array() fails. Free the old array by hand, and make sure we still clear the returned fences so the caller won't end up accessing freed memory. Some (but not all) of the callers of dma_resv_get_fences() seem to still trawl through the array even when dma_resv_get_fences() failed. And let's zero out num_fences as well for good measure. Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Christian König <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Fixes: d3c80698c9f5 ("dma-buf: use new iterator in dma_resv_get_fences v3") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230713194745.1751-1-ville.syrjala@linux.intel.com Signed-off-by: Christian König <christian.koenig@amd.com>
2023-07-14	Merge tag 'scsi-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a bunch of small driver fixes and a larger rework of zone disk handling (which reaches into blk and nvme). The aacraid array-bounds fix is now critical since the security people turned on -Werror for some build tests, which now fail without it" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: storvsc: Handle SRB status value 0x30 scsi: block: Improve checks in blk_revalidate_disk_zones() scsi: block: virtio_blk: Set zone limits before revalidating zones scsi: block: nullblk: Set zone limits before revalidating zones scsi: nvme: zns: Set zone limits before revalidating zones scsi: sd_zbc: Set zone limits before revalidating zones scsi: ufs: core: Add support for qTimestamp attribute scsi: aacraid: Avoid -Warray-bounds warning scsi: ufs: ufs-mediatek: Add dependency for RESET_CONTROLLER scsi: ufs: core: Update contact email for monitor sysfs nodes scsi: scsi_debug: Remove dead code scsi: qla2xxx: Use vmalloc_array() and vcalloc() scsi: fnic: Use vmalloc_array() and vcalloc() scsi: qla2xxx: Fix error code in qla2x00_start_sp() scsi: qla2xxx: Silence a static checker warning scsi: lpfc: Fix a possible data race in lpfc_unregister_fcf_rescan()
2023-07-14	Merge tag 'block-6.5-2023-07-14' of git://git.kernel.dk/linux	Linus Torvalds
	Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - Don't require quirk to use duplicate namespace identifiers (Christoph, Sagi) - One more BOGUS_NID quirk (Pankaj) - IO timeout and error hanlding fixes for PCI (Keith) - Enhanced metadata format mask fix (Ankit) - Association race condition fix for fibre channel (Michael) - Correct debugfs error checks (Minjie) - Use PAGE_SECTORS_SHIFT where needed (Damien) - Reduce kernel logs for legacy nguid attribute (Keith) - Use correct dma direction when unmapping metadata (Ming) - Fix for a flush handling regression in this release (Christoph) - Fix for batched request time stamping (Chengming) - Fix for a regression in the mq-deadline position calculation (Bart) - Lockdep fix for blk-crypto (Eric) - Fix for a regression in the Amiga partition handling changes (Michael) * tag 'block-6.5-2023-07-14' of git://git.kernel.dk/linux: block: queue data commands from the flush state machine at the head blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq nvme-pci: fix DMA direction of unmapping integrity data nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices block/mq-deadline: Fix a bug in deadline_from_pos() nvme: ensure disabling pairs with unquiesce nvme-fc: fix race between error recovery and creating association nvme-fc: return non-zero status code when fails to create association nvme: fix parameter check in nvme_fault_inject_init() nvme: warn only once for legacy uuid attribute block: remove dead struc request->completion_data field nvme: fix the NVME_ID_NS_NVM_STS_MASK definition nvmet: use PAGE_SECTORS_SHIFT nvme: add BOGUS_NID quirk for Samsung SM953 blk-crypto: use dynamic lock class for blk_crypto_profile::lock block/partition: fix signedness issue for Amiga partitions
2023-07-14	cxl/mem: Fix a double shift bug	Dan Carpenter
	The CXL_FW_CANCEL macro is used with set/test_bit() so it should be a bit number and not the shifted value. The original code is the equivalent of using BIT(BIT(0)) so it's 0x2 instead of 0x1. This has no effect on runtime because it's done consistently and nothing else was using the 0x2 bit. Fixes: 9521875bbe00 ("cxl: add a firmware update mechanism using the sysfs firmware loader") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/a11b0c78-4717-4f4e-90be-f47f300d607c@moroto.mountain Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2023-07-14	cxl: fix CONFIG_FW_LOADER dependency	Arnd Bergmann
	When FW_LOADER is disabled, cxl fails to link: arm-linux-gnueabi-ld: drivers/cxl/core/memdev.o: in function `cxl_memdev_setup_fw_upload': memdev.c:(.text+0x90e): undefined reference to `firmware_upload_register' memdev.c:(.text+0x93c): undefined reference to `firmware_upload_unregister' In order to use the firmware_upload_register() function, both FW_LOADER and FW_UPLOAD have to be enabled, which is a bit confusing. In addition, the dependency is on the wrong symbol, as the caller is part of the cxl_core.ko module, not the cxl_mem.ko module. Fixes: 9521875bbe005 ("cxl: add a firmware update mechanism using the sysfs firmware loader") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230703112928.332321-1-arnd@kernel.org Reviewed-by: Xiao Yang <yangx.jy@fujitsu.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2023-07-14	Merge tag 'riscv-for-linus-6.5-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - fix a formatting error in the hwprobe documentation - fix a spurious warning in the RISC-V PMU driver - fix memory detection on rv32 (problem does not manifest on any known system) - avoid parsing legacy parsing of I in ACPI ISA strings * tag 'riscv-for-linus-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Don't include Zicsr or Zifencei in I from ACPI riscv: mm: fix truncation warning on RV32 perf: RISC-V: Remove PERF_HES_STOPPED flag checking in riscv_pmu_start() Documentation: RISC-V: hwprobe: Fix a formatting error
2023-07-14	Merge tag 'pm-6.5-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix hibernation (after recent changes), frequency QoS and the sparc cpufreq driver. Specifics: - Unbreak the /sys/power/resume interface after recent changes (Azat Khuzhin). - Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai Yang). - Remove __init from cpufreq callbacks in the sparc driver, because they may be called after initialization too (Viresh Kumar)" * tag 'pm-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: sparc: Don't mark cpufreq callbacks with __init PM: QoS: Restore support for default value on frequency QoS PM: hibernate: Fix writing maj:min to /sys/power/resume
2023-07-14	platform/x86: serial-multi-instantiate: Auto detect IRQ resource for CSC3551	David Xu
	The current code assumes that the CSC3551(multiple cs35l41) always have its interrupt pin connected to GPIO thus the IRQ can be acquired with acpi_dev_gpio_irq_get. However on some newer laptop models this is no longer the case as they have the CSC3551's interrupt pin connected to APIC. This causes smi_i2c_probe to fail on these machines. To support these machines, a new macro IRQ_RESOURCE_AUTO was introduced for cs35l41 smi_node, and smi_get_irq function was modified so it tries to get GPIO irq resource first and if failed, tries to get APIC irq resource for cs35l41. This patch affects only the cs35l41's probing and brings no negative influence on machines that indeed have the cs35l41's interrupt pin connected to GPIO. Signed-off-by: David Xu <xuwd1@hotmail.com> Link: https://lore.kernel.org/r/SY4P282MB18350CD8288687B87FFD2243E037A@SY4P282MB1835.AUSP282.PROD.OUTLOOK.COM Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>