summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-10-17xhci: dbc: honor usb transfer size boundaries.Mathias Nyman
Treat each completed full size write to /dev/ttyDBC0 as a separate usb transfer. Make sure the size of the TRBs matches the size of the tty write by first queuing as many max packet size TRBs as possible up to the last TRB which will be cut short to match the size of the tty write. This solves an issue where userspace writes several transfers back to back via /dev/ttyDBC0 into a kfifo before dbgtty can find available request to turn that kfifo data into TRBs on the transfer ring. The boundary between transfer was lost as xhci-dbgtty then turned everyting in the kfifo into as many 'max packet size' TRBs as possible. DbC would then send more data to the host than intended for that transfer, causing host to issue a babble error. Refuse to write more data to kfifo until previous tty write data is turned into properly sized TRBs with data size boundaries matching tty write size Tested-by: Uday M Bhat <uday.m.bhat@intel.com> Tested-by: Łukasz Bartosik <ukaszb@chromium.org> Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17usb: xhci: Fix handling errors mid TD followed by other errorsMichal Pecio
Some host controllers fail to produce the final completion event on an isochronous TD which experienced an error mid TD. We deal with it by flagging such TDs and checking if the next event points at the flagged TD or at the next one, and giving back the flagged TD if the latter. This is not enough, because the next TD may be missed by the xHC. Or there may be no next TD but a ring underrun. We also need to get such TD quickly out of the way, or errors on later TDs may be handled wrong. If the next TD experiences a Missed Service Error, we will set the skip flag on the endpoint and then attempt skipping TDs when yet another event arrives. In such scenario, we ought to report the 'error mid TD' transfer as such rather than skip it. Another problem case are Stopped events. If we see one after an error mid TD, we naively assume that it's a Force Stopped Event because it doesn't match the pending TD, but in reality it might be an ordinary Stopped event for the next TD, which we fail to recognize and handle. Fix this by moving error mid TD handling before the whole TD skipping loop. Remove unnecessary conditions, always give back the TD if the new event points to any TRB outside it or if the pointer is NULL, as may be the case in Ring Underrun and Overrun events on 1st gen hardware. Only if the pending TD isn't flagged, consider other actions like skipping. As a side effect of reordering with skip and FSE cases, error mid TD is reordered with last_td_was_short check. This is harmless, because the two cases are mutually exclusive - only one can happen in any given run of handle_tx_event(). Tested on the NEC host and a USB camera with flaky cable. Dynamic debug confirmed that Transaction Errors are sometimes seen, sometimes mid-TD, sometimes followed by Missed Service. In such cases, they were finished properly before skipping began. [Rebase on 6.12-rc1 -Mathias] Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17xhci: Mitigate failed set dequeue pointer commandsMathias Nyman
Avoid xHC host from processing a cancelled URB by always turning cancelled URB TDs into no-op TRBs before queuing a 'Set TR Deq' command. If the command fails then xHC will start processing the cancelled TD instead of skipping it once endpoint is restarted, causing issues like Babble error. This is not a complete solution as a failed 'Set TR Deq' command does not guarantee xHC TRB caches are cleared. Fixes: 4db356924a50 ("xhci: turn cancelled td cleanup to its own function") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17xhci: Fix incorrect stream context type macroMathias Nyman
The stream contex type (SCT) bitfield is used both in the stream context data structure, and in the 'Set TR Dequeue pointer' command TRB. In both cases it uses bits 3:1 The SCT_FOR_TRB(p) macro used to set the stream context type (SCT) field for the 'Set TR Dequeue pointer' command TRB incorrectly shifts the value 1 bit left before masking the three bits. Fix this by first masking and rshifting, just like the similar SCT_FOR_CTX(p) macro does This issue has not been visibile as the lost bit 3 is only used with secondary stream arrays (SSA). Xhci driver currently only supports using a primary stream array with Linear stream addressing. Fixes: 95241dbdf828 ("xhci: Set SCT field for Set TR dequeue on streams") Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241016140000.783905-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17USB: gadget: dummy-hcd: Fix "task hung" problemAlan Stern
The syzbot fuzzer has been encountering "task hung" problems ever since the dummy-hcd driver was changed to use hrtimers instead of regular timers. It turns out that the problems are caused by a subtle difference between the timer_pending() and hrtimer_active() APIs. The changeover blindly replaced the first by the second. However, timer_pending() returns True when the timer is queued but not when its callback is running, whereas hrtimer_active() returns True when the hrtimer is queued _or_ its callback is running. This difference occasionally caused dummy_urb_enqueue() to think that the callback routine had not yet started when in fact it was almost finished. As a result the hrtimer was not restarted, which made it impossible for the driver to dequeue later the URB that was just enqueued. This caused usb_kill_urb() to hang, and things got worse from there. Since hrtimers have no API for telling when they are queued and the callback isn't running, the driver must keep track of this for itself. That's what this patch does, adding a new "timer_pending" flag and setting or clearing it at the appropriate times. Reported-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-usb/6709234e.050a0220.3e960.0011.GAE@google.com/ Tested-by: syzbot+f342ea16c9d06d80b585@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler") Cc: Marcello Sylvester Bauer <sylv@sylv.io> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/2dab644e-ef87-4de8-ac9a-26f100b2c609@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16clk: test: Fix some memory leaksJinjie Ruan
CONFIG_CLK_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y and CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, the following memory leak occurs. If the KUNIT_ASSERT_*() fails, the latter (exit() or testcases) clk_put() or clk_hw_unregister() will fail to release the clk resource and cause memory leaks, use new clk_hw_register_kunit() and clk_hw_get_clk_kunit() to automatically release them. unreferenced object 0xffffff80c6af5000 (size 512): comm "kunit_try_catch", pid 371, jiffies 4294896001 hex dump (first 32 bytes): 20 4c c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff L.............. c0 75 e3 c6 80 ff ff ff 00 00 00 00 00 00 00 00 .u.............. backtrace (crc 8ca788fa): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000d1bc850c>] __clk_register+0x80/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<00000000b16d6df8>] clk_multiple_parents_mux_test_init+0x238/0x288 [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c6e37880 (size 96): comm "kunit_try_catch", pid 371, jiffies 4294896002 hex dump (first 32 bytes): 00 50 af c6 80 ff ff ff 00 00 00 00 00 00 00 00 .P.............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc b4b766dd): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4 [<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114 [<000000006fab5bfa>] clk_test_multiple_parents_mux_set_range_set_parent_get_rate+0x3c/0xa0 [<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c2b56900 (size 96): comm "kunit_try_catch", pid 395, jiffies 4294896107 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 e0 49 c0 86 e1 ff ff ff .........I...... backtrace (crc 2e59b327): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<00000000c6c715a8>] __kmalloc_noprof+0x2bc/0x3c0 [<00000000f04a7951>] __clk_register+0x70c/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<00000000cafa9563>] clk_orphan_transparent_multiple_parent_mux_test_init+0x1a8/0x1dc [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c87c9400 (size 512): comm "kunit_try_catch", pid 483, jiffies 4294896907 hex dump (first 32 bytes): a0 44 c0 86 e1 ff ff ff e0 1a c0 86 e1 ff ff ff .D.............. 20 05 a8 c8 80 ff ff ff 00 00 00 00 00 00 00 00 ............... backtrace (crc c25b43fb): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000d1bc850c>] __clk_register+0x80/0x1ecc [<00000000b08c78c5>] clk_hw_register+0xc4/0x110 [<000000002688be48>] clk_single_parent_mux_test_init+0x1a0/0x1d4 [<0000000014a7e804>] kunit_try_run_case+0x10c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c6dd2380 (size 96): comm "kunit_try_catch", pid 483, jiffies 4294896908 hex dump (first 32 bytes): 00 94 7c c8 80 ff ff ff 00 00 00 00 00 00 00 00 ..|............. 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 4401212): [<00000000e21852d0>] kmemleak_alloc+0x34/0x40 [<000000009c583f7b>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000086e7dd64>] clk_hw_create_clk.part.0.isra.0+0x58/0x2f4 [<00000000dcf1ac31>] clk_hw_get_clk+0x8c/0x114 [<0000000063eb2c90>] clk_test_single_parent_mux_set_range_disjoint_child_last+0x3c/0xa0 [<00000000c97db55a>] kunit_try_run_case+0x13c/0x3ac [<0000000026b41f03>] kunit_generic_run_threadfn_adapter+0x80/0xec [<0000000066619fb8>] kthread+0x2e8/0x374 [<00000000a1157f53>] ret_from_fork+0x10/0x20 ...... Fixes: 02cdeace1e1e ("clk: tests: Add tests for single parent mux") Fixes: 2e9cad1abc71 ("clk: tests: Add some tests for orphan with multiple parents") Fixes: 433fb8a611ca ("clk: tests: Add missing test case for ranges") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20241016022658.2131826-1-ruanjinjie@huawei.com Reviewed-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-10-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Several miscellaneous fixes. A lot of bnxt_re activity, there will be more rc patches there coming. - Many bnxt_re bug fixes - Memory leaks, kasn, NULL pointer deref, soft lockups, error unwinding and some small functional issues - Error unwind bug in rdma netlink - Two issues with incorrect VLAN detection for iWarp - skb_splice_from_iter() splat in siw - Give SRP slab caches unique names to resolve the merge window WARN_ON regression" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Fix the GID table length RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages RDMA/bnxt_re: Change the sequence of updating the CQ toggle value RDMA/bnxt_re: Fix an error path in bnxt_re_add_device RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop RDMA/bnxt_re: Fix a possible NULL pointer dereference RDMA/bnxt_re: Return more meaningful error RDMA/bnxt_re: Fix incorrect dereference of srq in async event RDMA/bnxt_re: Fix out of bound check RDMA/bnxt_re: Fix the max CQ WQEs for older adapters RDMA/srpt: Make slab cache names unique RDMA/irdma: Fix misspelling of "accept*" RDMA/cxgb4: Fix RDMA_CM_EVENT_UNREACHABLE error for iWARP RDMA/siw: Add sendpage_ok() check to disable MSG_SPLICE_PAGES RDMA/core: Fix ENODEV error for iWARP test over vlan RDMA/nldev: Fix NULL pointer dereferences issue in rdma_nl_notify_event RDMA/bnxt_re: Fix the max WQEs used in Static WQE mode RDMA/bnxt_re: Add a check for memory allocation RDMA/bnxt_re: Fix incorrect AVID type in WQE structure RDMA/bnxt_re: Fix a possible memory leak
2024-10-16powercap: intel_rapl_msr: Add PL4 support for ArrowLake-HSrinivas Pandruvada
Add ArrowLake-H to the list of processors where PL4 is supported. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Link: https://patch.msgid.link/20241016154851.1293654-1-srinivas.pandruvada@linux.intel.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-10-16Bluetooth: btusb: Fix regression with fake CSR controllers 0a12:0001Luiz Augusto von Dentz
Fake CSR controllers don't seem to handle short-transfer properly which cause command to time out: kernel: usb 1-1: new full-speed USB device number 19 using xhci_hcd kernel: usb 1-1: New USB device found, idVendor=0a12, idProduct=0001, bcdDevice=88.91 kernel: usb 1-1: New USB device strings: Mfr=0, Product=2, SerialNumber=0 kernel: usb 1-1: Product: BT DONGLE10 ... Bluetooth: hci1: Opcode 0x1004 failed: -110 kernel: Bluetooth: hci1: command 0x1004 tx timeout According to USB Spec 2.0 Section 5.7.3 Interrupt Transfer Packet Size Constraints a interrupt transfer is considered complete when the size is 0 (ZPL) or < wMaxPacketSize: 'When an interrupt transfer involves more data than can fit in one data payload of the currently established maximum size, all data payloads are required to be maximum-sized except for the last data payload, which will contain the remaining data. An interrupt transfer is complete when the endpoint does one of the following: • Has transferred exactly the amount of data expected • Transfers a packet with a payload size less than wMaxPacketSize or transfers a zero-length packet' Link: https://bugzilla.kernel.org/show_bug.cgi?id=219365 Fixes: 7b05933340f4 ("Bluetooth: btusb: Fix not handling ZPL/short-transfer") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-10-16Bluetooth: btusb: Fix not being able to reconnect after suspendLuiz Augusto von Dentz
This partially reverts 81b3e33bb054 ("Bluetooth: btusb: Don't fail external suspend requests") as it introduced a call to hci_suspend_dev that assumes the system-suspend which doesn't work well when just the device is being suspended because wakeup flag is only set for remote devices that can wakeup the system. Reported-by: Rafael J. Wysocki <rafael@kernel.org> Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Reported-by: Kenneth Crudup <kenny@panix.com> Fixes: 610712298b11 ("Bluetooth: btusb: Don't fail external suspend requests") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Rafael J. Wysocki <rafael@kernel.org>
2024-10-16drm/amdgpu/swsmu: default to fullscreen 3D profile for dGPUsAlex Deucher
This uses more aggressive hueristics than the the bootup default profile. On windows the OS has a special fullscreen 3D mode where this is used. Since we don't have the equivalent on Linux default to this profile for dGPUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1500 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3131 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 336568de918e08c825b3b1cbe2ec809f2fc26d94)
2024-10-16Input: xpad - add support for 8BitDo Ultimate 2C Wireless ControllerStefan Kerkmann
This XBOX360 compatible gamepad uses the new product id 0x310a under the 8BitDo's vendor id 0x2dc8. The change was tested using the gamepad in a wired and wireless dongle configuration. Signed-off-by: Stefan Kerkmann <s.kerkmann@pengutronix.de> Link: https://lore.kernel.org/r/20241015-8bitdo_2c_ultimate_wireless-v1-1-9c9f9db2e995@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-10-16Merge tag 'v6.12-p3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Remove bogus testmgr ENOENT error messages - Ensure algorithm is still alive before marking it as tested - Disable buggy hash algorithms in marvell/cesa * tag 'v6.12-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: marvell/cesa - Disable hash algorithms crypto: testmgr - Hide ENOENT errors better crypto: api - Fix liveliness check in crypto_alg_tested
2024-10-16ublk: don't allow user copy for unprivileged deviceMing Lei
UBLK_F_USER_COPY requires userspace to call write() on ublk char device for filling request buffer, and unprivileged device can't be trusted. So don't allow user copy for unprivileged device. Cc: stable@vger.kernel.org Fixes: 1172d5b8beca ("ublk: support user copy") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20241016134847.2911721-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-10-16drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or ↵Juha-Pekka Heikkila
greater On display ver 20 onwards tile4 is not supported with horizontal flip Bspec: 69853 Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007182841.2104740-1-juhapekka.heikkila@gmail.com (cherry picked from commit 73e8e2f9a358caa005ed6e52dcb7fa2bca59d132) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe/bmg: improve cache flushing behaviourMatthew Auld
The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled on for manual global invalidation to take effect and actually flush device cache, however this also turns on flushing for things like pipecontrol, which occurs between submissions for compute/render. This sounds like massive overkill for our needs, where we already have the manual flushing on the display side with the global invalidation. Some observations on BMG: 1. Disabling l2 caching for host writes and stubbing out the driver global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has no impact on wb-transient-vs-display IGT, which makes sense since the pipecontrol is now flushing the device cache after the render copy. Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also expected since device cache is now dirty and display engine can't see the writes. 2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global invalidation also has no impact on wb-transient-vs-display. This suggests that the global invalidation still works as expected and is flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on. With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since we no longer flush the device cache between submissions as part of pipecontrol. Edit: We now also have clarification from HW side that BSpec was indeed wrong here. v2: - Rebase and update commit message. BSpec: 71718 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vitasta Wattal <vitasta.wattal@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com (cherry picked from commit 67ec9f87bd6c57db1251bb2244d242f7ca5a0b6a) [ Fix conflict due to changed xe_mmio_write32() signature ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe/xe_sync: initialise ufence.signalledMatthew Auld
We can incorrectly think that the fence has signalled, if we get a non-zero value here from the kmalloc, which is quite plausible. Just use kzalloc to prevent stuff like this. Fixes: 977e5b82e090 ("drm/xe: Expose user fence from xe_sync_entry") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011133633.388008-2-matthew.auld@intel.com (cherry picked from commit 26f69e88dcc95fffc62ed2aea30ad7b1fdf31fdb) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe/ufence: ufence can be signaled right after wait_wokenNirmoy Das
do_comapre() can return success after a timedout wait_woken() which was treated as -ETIME. The loop calling wait_woken() sets correct err so there is no need to re-evaluate err. v2: Remove entire check that reevaluate err at the end(Matt) Fixes: e670f0b4ef24 ("drm/xe/uapi: Return correct error code for xe_wait_user_fence_ioctl") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: stable@vger.kernel.org # v6.8+ Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011151029.4160630-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit ec7e6a1d527755fc3c7a3303eaa5577aac5cf6be) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe: Use bookkeep slots for external BO's in exec IOCTLMatthew Brost
Fix external BO's dma-resv usage in exec IOCTL using bookkeep slots rather than write slots. This leaves syncing to user space rather than the KMD blindly enforcing write semantics on every external BO. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Kenneth Graunke <kenneth.w.graunke@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reported-by: Simona Vetter <simona.vetter@ffwll.ch> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2673 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240911152622.903058-1-matthew.brost@intel.com (cherry picked from commit b8b1163248759ba18509f7443a2d19b15b4c1df8) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe/query: Increase timestamp widthLucas De Marchi
Starting with Xe2 the timestamp is a full 64 bit counter, contrary to the 36 bit that was available before. Although 36 should be sufficient for any reasonable delta calculation (for Xe2, of about 30min), it's surprising to userspace to get something truncated. Also if the timestamp being compared to is coming from the GPU and the application is not careful enough to apply the width there, a delta calculation would be wrong. Extend it to full 64-bits starting with Xe2. v2: Expand width=64 to media gt, as it's just a wrong tagging in the spec - empirical tests show it goes beyond 36 bits and match the engines for the main gt Bspec: 60411 Cc: Szymon Morek <szymon.morek@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 9d559cdcb21f42188d4c3ff3b4fe42b240f4af5d) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe: Don't free job in TDRMatthew Brost
Freeing job in TDR is not safe as TDR can pass the run_job thread resulting in UAF. It is only safe for free job to naturally be called by the scheduler. Rather free job in TDR, add to pending list. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2811 Cc: Matthew Auld <matthew.auld@intel.com> Fixes: e275d61c5f3f ("drm/xe/guc: Handle timing out of signaled jobs gracefully") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-3-matthew.brost@intel.com (cherry picked from commit ea2f6a77d0c40d97f4a4dc93fee4afe15d94926d) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe: Take job list lock in xe_sched_add_pending_jobMatthew Brost
A fragile micro optimization in xe_sched_add_pending_job relied on both the GPU scheduler being stopped and fence signaling stopped to safely add a job to the pending list without the job list lock in xe_sched_add_pending_job. Remove this optimization and just take the job list lock. Fixes: 7ddb9403dd74 ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-2-matthew.brost@intel.com (cherry picked from commit 90521df5fc43980e4575bd8c5b1cb62afe1a9f5f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe: fix unbalanced rpm put() with declare_wedged()Matthew Auld
Technically the or_reset() means we call the action on failure, however that would lead to unbalanced rpm put(). Move the get() earlier to fix this. It should be extremely unlikely to ever trigger this in practice. Fixes: 90936a0a4c54 ("drm/xe: Don't suspend device upon wedge") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-4-matthew.auld@intel.com (cherry picked from commit a187c1b0a800565a4db6372268692aff99df7f53) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe: fix unbalanced rpm put() with fence_fini()Matthew Auld
Currently we can call fence_fini() twice if something goes wrong when sending the GuC CT for the tlb request, since we signal the fence and return an error, leading to the caller also calling fini() on the error path in the case of stack version of the flow, which leads to an extra rpm put() which might later cause device to enter suspend when it shouldn't. It looks like we can just drop the fini() call since the fence signaller side will already call this for us. There are known mysterious splats with device going to sleep even with an rpm ref, and this could be one candidate. v2 (Matt B): - Prefer warning if we detect double fini() Fixes: f002702290fc ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-3-matthew.auld@intel.com (cherry picked from commit cfcbc0520d5055825f0647ab922b655688605183) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpgAradhya Bhatia
Add workaround (wa) 15016589081 which applies to Xe2_v3_LPG_MD. Xe2_v3_LPG_MD is a Lunar Lake platform with GFX version: 20.04. This wa is type: permanent, and hence is applicable on all steppings. Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009065542.283151-1-aradhya.bhatia@intel.com (cherry picked from commit 8fb1da9f9bfb02f710a7f826d50781b0b030cf53) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-10-16drm/i915/dp_mst: Don't require DSC hblank quirk for a non-DSC compatible modeImre Deak
If an MST branch device doesn't support DSC for a given mode, but the MST link has enough BW for the mode, assume that the branch device does support the mode using an uncompressed stream. Fixes: 55eaef164174 ("drm/i915/dp_mst: Handle the Synaptics HBlank expansion quirk") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-2-imre.deak@intel.com (cherry picked from commit 4e75c3e208a06ad6fd9b3517fb77337460d7c2b0) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-10-16drm/i915/dp_mst: Handle error during DSC BW overhead/slice calculationImre Deak
The MST branch device may not support the number of DSC slices a mode requires, handle the error in this case. Fixes: 4e0837a8d00a ("drm/i915/dp_mst: Account for FEC and DSC overhead during BW allocation") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009110135.1216498-1-imre.deak@intel.com (cherry picked from commit 802a69b6b8a0502a9e2309afec7e1b77f67874f2) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-10-16s390/sclp_vt220: Convert newlines to CRLF instead of LFCRThomas Weißschuh
According to the VT220 specification the possible character combinations sent on RETURN are only CR or CRLF [0]. The Return key sends either a CR character (0/13) or a CR character (0/13) and an LF character (0/10), depending on the set/reset state of line feed/new line mode (LNM). The sclp/vt220 driver however uses LFCR. This can confuse tools, for example the kunit runner. Link: https://vt100.net/docs/vt220-rm/chapter3.html#S3.2 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/20241014-s390-kunit-v1-2-941defa765a6@linutronix.de Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-16s390/sclp: Deactivate sclp after all its usersThomas Weißschuh
On reboot the SCLP interface is deactivated through a reboot notifier. This happens before other components using SCLP have the chance to run their own reboot notifiers. Two of those components are the SCLP console and tty drivers which try to flush the last outstanding messages. At that point the SCLP interface is already unusable and the messages are discarded. Execute sclp_deactivate() as late as possible to avoid this issue. Fixes: 4ae46db99cd8 ("s390/consoles: improve panic notifiers reliability") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Link: https://lore.kernel.org/r/20241014-s390-kunit-v1-1-941defa765a6@linutronix.de Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-16s390/pkey_pckmo: Return with success for valid protected key typesHolger Dengler
The key_to_protkey handler function in module pkey_pckmo should return with success on all known protected key types, including the new types introduced by fd197556eef5 ("s390/pkey: Add AES xts and HMAC clear key token support"). Fixes: fd197556eef5 ("s390/pkey: Add AES xts and HMAC clear key token support") Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-10-16usb: gadget: f_uac2: fix return value for UAC2_ATTRIBUTE_STRING storeKevin Groeneveld
The configfs store callback should return the number of bytes consumed not the total number of bytes we actually stored. These could differ if for example the passed in string had a newline we did not store. If the returned value does not match the number of bytes written the writer might assume a failure or keep trying to write the remaining bytes. For example the following command will hang trying to write the final newline over and over again (tested on bash 2.05b): echo foo > function_name Fixes: 993a44fa85c1 ("usb: gadget: f_uac2: allow changing interface name via configfs") Cc: stable <stable@kernel.org> Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Link: https://lore.kernel.org/r/20241006232637.4267-1-kgroeneveld@lenbrook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16usb: dwc3: core: Fix system suspend on TI AM62 platformsRoger Quadros
Since commit 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init"), system suspend is broken on AM62 TI platforms. Before that commit, both DWC3_GUSB3PIPECTL_SUSPHY and DWC3_GUSB2PHYCFG_SUSPHY bits (hence forth called 2 SUSPHY bits) were being set during core initialization and even during core re-initialization after a system suspend/resume. These bits are required to be set for system suspend/resume to work correctly on AM62 platforms. Since that commit, the 2 SUSPHY bits are not set for DEVICE/OTG mode if gadget driver is not loaded and started. For Host mode, the 2 SUSPHY bits are set before the first system suspend but get cleared at system resume during core re-init and are never set again. This patch resovles these two issues by ensuring the 2 SUSPHY bits are set before system suspend and restored to the original state during system resume. Cc: stable@vger.kernel.org # v6.9+ Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Link: https://lore.kernel.org/all/1519dbe7-73b6-4afc-bfe3-23f4f75d772f@kernel.org/ Signed-off-by: Roger Quadros <rogerq@kernel.org> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Tested-by: Markus Schneider-Pargmann <msp@baylibre.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Link: https://lore.kernel.org/r/20241011-am62-lpm-usb-v3-1-562d445625b5@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16xhci: tegra: fix checked USB2 port numberHenry Lin
If USB virtualizatoin is enabled, USB2 ports are shared between all Virtual Functions. The USB2 port number owned by an USB2 root hub in a Virtual Function may be less than total USB2 phy number supported by the Tegra XUSB controller. Using total USB2 phy number as port number to check all PORTSC values would cause invalid memory access. [ 116.923438] Unable to handle kernel paging request at virtual address 006c622f7665642f ... [ 117.213640] Call trace: [ 117.216783] tegra_xusb_enter_elpg+0x23c/0x658 [ 117.222021] tegra_xusb_runtime_suspend+0x40/0x68 [ 117.227260] pm_generic_runtime_suspend+0x30/0x50 [ 117.232847] __rpm_callback+0x84/0x3c0 [ 117.237038] rpm_suspend+0x2dc/0x740 [ 117.241229] pm_runtime_work+0xa0/0xb8 [ 117.245769] process_scheduled_works+0x24c/0x478 [ 117.251007] worker_thread+0x23c/0x328 [ 117.255547] kthread+0x104/0x1b0 [ 117.259389] ret_from_fork+0x10/0x20 [ 117.263582] Code: 54000222 f9461ae8 f8747908 b4ffff48 (f9400100) Cc: stable@vger.kernel.org # v6.3+ Fixes: a30951d31b25 ("xhci: tegra: USB2 pad power controls") Signed-off-by: Henry Lin <henryl@nvidia.com> Link: https://lore.kernel.org/r/20241014042134.27664-1-henryl@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16usb: dwc3: Wait for EndXfer completion before restoring GUSB2PHYCFGPrashanth K
DWC3 programming guide mentions that when operating in USB2.0 speeds, if GUSB2PHYCFG[6] or GUSB2PHYCFG[8] is set, it must be cleared prior to issuing commands and may be set again after the command completes. But currently while issuing EndXfer command without CmdIOC set, we wait for 1ms after GUSB2PHYCFG is restored. This results in cases where EndXfer command doesn't get completed and causes SMMU faults since requests are unmapped afterwards. Hence restore GUSB2PHYCFG after waiting for EndXfer command completion. Cc: stable@vger.kernel.org Fixes: 1d26ba0944d3 ("usb: dwc3: Wait unconditionally after issuing EndXfer command") Signed-off-by: Prashanth K <quic_prashk@quicinc.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20240924093208.2524531-1-quic_prashk@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16usb: typec: qcom-pmic-typec: fix sink status being overwritten with RP_DEFJonathan Marek
This line is overwriting the result of the above switch-case. This fixes the tcpm driver getting stuck in a "Sink TX No Go" loop. Fixes: a4422ff22142 ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Cc: stable <stable@kernel.org> Signed-off-by: Jonathan Marek <jonathan@marek.ca> Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241005144146.2345-1-jonathan@marek.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-16usb: typec: altmode should keep reference to parentThadeu Lima de Souza Cascardo
The altmode device release refers to its parent device, but without keeping a reference to it. When registering the altmode, get a reference to the parent and put it in the release function. Before this fix, when using CONFIG_DEBUG_KOBJECT_RELEASE, we see issues like this: [ 43.572860] kobject: 'port0.0' (ffff8880057ba008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.573532] kobject: 'port0.1' (ffff8880057bd008): kobject_release, parent 0000000000000000 (delayed 1000) [ 43.574407] kobject: 'port0' (ffff8880057b9008): kobject_release, parent 0000000000000000 (delayed 3000) [ 43.575059] kobject: 'port1.0' (ffff8880057ca008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.575908] kobject: 'port1.1' (ffff8880057c9008): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.576908] kobject: 'typec' (ffff8880062dbc00): kobject_release, parent 0000000000000000 (delayed 4000) [ 43.577769] kobject: 'port1' (ffff8880057bf008): kobject_release, parent 0000000000000000 (delayed 3000) [ 46.612867] ================================================================== [ 46.613402] BUG: KASAN: slab-use-after-free in typec_altmode_release+0x38/0x129 [ 46.614003] Read of size 8 at addr ffff8880057b9118 by task kworker/2:1/48 [ 46.614538] [ 46.614668] CPU: 2 UID: 0 PID: 48 Comm: kworker/2:1 Not tainted 6.12.0-rc1-00138-gedbae730ad31 #535 [ 46.615391] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 [ 46.616042] Workqueue: events kobject_delayed_cleanup [ 46.616446] Call Trace: [ 46.616648] <TASK> [ 46.616820] dump_stack_lvl+0x5b/0x7c [ 46.617112] ? typec_altmode_release+0x38/0x129 [ 46.617470] print_report+0x14c/0x49e [ 46.617769] ? rcu_read_unlock_sched+0x56/0x69 [ 46.618117] ? __virt_addr_valid+0x19a/0x1ab [ 46.618456] ? kmem_cache_debug_flags+0xc/0x1d [ 46.618807] ? typec_altmode_release+0x38/0x129 [ 46.619161] kasan_report+0x8d/0xb4 [ 46.619447] ? typec_altmode_release+0x38/0x129 [ 46.619809] ? process_scheduled_works+0x3cb/0x85f [ 46.620185] typec_altmode_release+0x38/0x129 [ 46.620537] ? process_scheduled_works+0x3cb/0x85f [ 46.620907] device_release+0xaf/0xf2 [ 46.621206] kobject_delayed_cleanup+0x13b/0x17a [ 46.621584] process_scheduled_works+0x4f6/0x85f [ 46.621955] ? __pfx_process_scheduled_works+0x10/0x10 [ 46.622353] ? hlock_class+0x31/0x9a [ 46.622647] ? lock_acquired+0x361/0x3c3 [ 46.622956] ? move_linked_works+0x46/0x7d [ 46.623277] worker_thread+0x1ce/0x291 [ 46.623582] ? __kthread_parkme+0xc8/0xdf [ 46.623900] ? __pfx_worker_thread+0x10/0x10 [ 46.624236] kthread+0x17e/0x190 [ 46.624501] ? kthread+0xfb/0x190 [ 46.624756] ? __pfx_kthread+0x10/0x10 [ 46.625015] ret_from_fork+0x20/0x40 [ 46.625268] ? __pfx_kthread+0x10/0x10 [ 46.625532] ret_from_fork_asm+0x1a/0x30 [ 46.625805] </TASK> [ 46.625953] [ 46.626056] Allocated by task 678: [ 46.626287] kasan_save_stack+0x24/0x44 [ 46.626555] kasan_save_track+0x14/0x2d [ 46.626811] __kasan_kmalloc+0x3f/0x4d [ 46.627049] __kmalloc_noprof+0x1bf/0x1f0 [ 46.627362] typec_register_port+0x23/0x491 [ 46.627698] cros_typec_probe+0x634/0xbb6 [ 46.628026] platform_probe+0x47/0x8c [ 46.628311] really_probe+0x20a/0x47d [ 46.628605] device_driver_attach+0x39/0x72 [ 46.628940] bind_store+0x87/0xd7 [ 46.629213] kernfs_fop_write_iter+0x1aa/0x218 [ 46.629574] vfs_write+0x1d6/0x29b [ 46.629856] ksys_write+0xcd/0x13b [ 46.630128] do_syscall_64+0xd4/0x139 [ 46.630420] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 46.630820] [ 46.630946] Freed by task 48: [ 46.631182] kasan_save_stack+0x24/0x44 [ 46.631493] kasan_save_track+0x14/0x2d [ 46.631799] kasan_save_free_info+0x3f/0x4d [ 46.632144] __kasan_slab_free+0x37/0x45 [ 46.632474] kfree+0x1d4/0x252 [ 46.632725] device_release+0xaf/0xf2 [ 46.633017] kobject_delayed_cleanup+0x13b/0x17a [ 46.633388] process_scheduled_works+0x4f6/0x85f [ 46.633764] worker_thread+0x1ce/0x291 [ 46.634065] kthread+0x17e/0x190 [ 46.634324] ret_from_fork+0x20/0x40 [ 46.634621] ret_from_fork_asm+0x1a/0x30 Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20241004123738.2964524-1-cascardo@igalia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-15cpufreq/amd-pstate: Use nominal perf for limits when boost is disabledMario Limonciello
When boost has been disabled the limit for perf should be nominal perf not the highest perf. Using the latter to do calculations will lead to incorrect values that are still above nominal. Fixes: ad4caad58d91 ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") Reported-by: Peter Jung <ptr1337@cachyos.org> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219348 Reviewed-by: Perry Yuan <perry.yuan@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Tested-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Link: https://lore.kernel.org/r/20241012174519.897-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-10-15scsi: target: core: Fix null-ptr-deref in target_alloc_device()Wang Hai
There is a null-ptr-deref issue reported by KASAN: BUG: KASAN: null-ptr-deref in target_alloc_device+0xbc4/0xbe0 [target_core_mod] ... kasan_report+0xb9/0xf0 target_alloc_device+0xbc4/0xbe0 [target_core_mod] core_dev_setup_virtual_lun0+0xef/0x1f0 [target_core_mod] target_core_init_configfs+0x205/0x420 [target_core_mod] do_one_initcall+0xdd/0x4e0 ... entry_SYSCALL_64_after_hwframe+0x76/0x7e In target_alloc_device(), if allocing memory for dev queues fails, then dev will be freed by dev->transport->free_device(), but dev->transport is not initialized at that time, which will lead to a null pointer reference problem. Fixing this bug by freeing dev with hba->backend->ops->free_device(). Fixes: 1526d9f10c61 ("scsi: target: Make state_list per CPU") Signed-off-by: Wang Hai <wanghai38@huawei.com> Link: https://lore.kernel.org/r/20241011113444.40749-1-wanghai38@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-15scsi: mpi3mr: Validate SAS port assignmentsRanjan Kumar
A sanity check on phy_mask was added in commit 3668651def2c ("scsi: mpi3mr: Sanitise num_phys"). This causes warning messages when more than 64 phys are detected and devices connected to phys greater than 64 are dropped. The phy_mask bitmap is only needed for controller phys and not required for expander phys. Controller phys can go up to a maximum of 64 and therefore u64 is good enough to contain phy_mask bitmap. To suppress those warnings and allow devices to be discovered as before the offending commit, restrict the phy_mask setting and lowest phy setting only to the controller phys. Fixes: 3668651def2c ("scsi: mpi3mr: Sanitise num_phys") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410051943.Mp9o5DlF-lkp@intel.com/ Reported-by: Alexander Motin <mav@ixsystems.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241008074353.200379-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-15scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut downSeunghwan Baek
There is a history of deadlock if reboot is performed at the beginning of booting. SDEV_QUIESCE was set for all LU's scsi_devices by UFS shutdown, and at that time the audio driver was waiting on blk_mq_submit_bio() holding a mutex_lock while reading the fw binary. After that, a deadlock issue occurred while audio driver shutdown was waiting for mutex_unlock of blk_mq_submit_bio(). To solve this, set SDEV_OFFLINE for all LUs except WLUN, so that any I/O that comes down after a UFS shutdown will return an error. [ 31.907781]I[0: swapper/0: 0] 1 130705007 1651079834 11289729804 0 D( 2) 3 ffffff882e208000 * init [device_shutdown] [ 31.907793]I[0: swapper/0: 0] Mutex: 0xffffff8849a2b8b0: owner[0xffffff882e28cb00 kworker/6:0 :49] [ 31.907806]I[0: swapper/0: 0] Call trace: [ 31.907810]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.907819]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.907826]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.907834]I[0: swapper/0: 0] schedule_preempt_disabled+0x24/0x40 [ 31.907842]I[0: swapper/0: 0] __mutex_lock+0x408/0xdac [ 31.907849]I[0: swapper/0: 0] __mutex_lock_slowpath+0x14/0x24 [ 31.907858]I[0: swapper/0: 0] mutex_lock+0x40/0xec [ 31.907866]I[0: swapper/0: 0] device_shutdown+0x108/0x280 [ 31.907875]I[0: swapper/0: 0] kernel_restart+0x4c/0x11c [ 31.907883]I[0: swapper/0: 0] __arm64_sys_reboot+0x15c/0x280 [ 31.907890]I[0: swapper/0: 0] invoke_syscall+0x70/0x158 [ 31.907899]I[0: swapper/0: 0] el0_svc_common+0xb4/0xf4 [ 31.907909]I[0: swapper/0: 0] do_el0_svc+0x2c/0xb0 [ 31.907918]I[0: swapper/0: 0] el0_svc+0x34/0xe0 [ 31.907928]I[0: swapper/0: 0] el0t_64_sync_handler+0x68/0xb4 [ 31.907937]I[0: swapper/0: 0] el0t_64_sync+0x1a0/0x1a4 [ 31.908774]I[0: swapper/0: 0] 49 0 11960702 11236868007 0 D( 2) 6 ffffff882e28cb00 * kworker/6:0 [__bio_queue_enter] [ 31.908783]I[0: swapper/0: 0] Call trace: [ 31.908788]I[0: swapper/0: 0] __switch_to+0x174/0x338 [ 31.908796]I[0: swapper/0: 0] __schedule+0x5ec/0x9cc [ 31.908803]I[0: swapper/0: 0] schedule+0x7c/0xe8 [ 31.908811]I[0: swapper/0: 0] __bio_queue_enter+0xb8/0x178 [ 31.908818]I[0: swapper/0: 0] blk_mq_submit_bio+0x194/0x67c [ 31.908827]I[0: swapper/0: 0] __submit_bio+0xb8/0x19c Fixes: b294ff3e3449 ("scsi: ufs: core: Enable power management for wlun") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek <sh8267.baek@samsung.com> Link: https://lore.kernel.org/r/20240829093913.6282-2-sh8267.baek@samsung.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-15scsi: ufs: core: Requeue aborted requestPeter Wang
After the SQ cleanup fix, the CQ will receive a response with the corresponding tag marked as OCS: ABORTED. To align with the behavior of Legacy SDB mode, the handling of OCS: ABORTED has been changed to match that of OCS_INVALID_COMMAND_STATUS (SDB), with both returning a SCSI result of DID_REQUEUE. Furthermore, the workaround implemented before the SQ cleanup fix can be removed. Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-3-peter.wang@mediatek.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-15scsi: ufs: core: Fix the issue of ICU failurePeter Wang
When setting the ICU bit without using read-modify-write, SQRTCy will restart SQ again and receive an RTC return error code 2 (Failure - SQ not stopped). Additionally, the error log has been modified so that this type of error can be observed. Fixes: ab248643d3d6 ("scsi: ufs: core: Add error handling for MCQ mode") Cc: stable@vger.kernel.org Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20241001091917.6917-2-peter.wang@mediatek.com Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-15net: dsa: vsc73xx: fix reception from VLAN-unaware bridgesVladimir Oltean
Similar to the situation described for sja1105 in commit 1f9fc48fd302 ("net: dsa: sja1105: fix reception from VLAN-unaware bridges"), the vsc73xx driver uses tag_8021q and doesn't need the ds->untag_bridge_pvid request. In fact, this option breaks packet reception. The ds->untag_bridge_pvid option strips VLANs from packets received on VLAN-unaware bridge ports. But those VLANs should already be stripped by tag_vsc73xx_8021q.c as part of vsc73xx_rcv() - they are not VLANs in VLAN-unaware mode, but DSA tags. Thus, dsa_software_vlan_untag() tries to untag a VLAN that doesn't exist, corrupting the packet. Fixes: 93e4649efa96 ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges") Tested-by: Pawel Dembicki <paweldembicki@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patch.msgid.link/20241014153041.1110364-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: ravb: Only advertise Rx/Tx timestamps if hardware supports itNiklas Söderlund
Recent work moving the reporting of Rx software timestamps to the core [1] highlighted an issue where hardware time stamping was advertised for the platforms where it is not supported. Fix this by covering advertising support for hardware timestamps only if the hardware supports it. Due to the Tx implementation in RAVB software Tx timestamping is also only considered if the hardware supports hardware timestamps. This should be addressed in future, but this fix only reflects what the driver currently implements. 1. Commit 277901ee3a26 ("ravb: Remove setting of RX software timestamp") Fixes: 7e09a052dc4e ("ravb: Exclude gPTP feature support for RZ/G2L") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com> Tested-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://patch.msgid.link/20241014124343.3875285-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: microchip: vcap api: Fix memory leaks in vcap_api_encode_rule_test()Jinjie Ruan
Commit a3c1e45156ad ("net: microchip: vcap: Fix use-after-free error in kunit test") fixed the use-after-free error, but introduced below memory leaks by removing necessary vcap_free_rule(), add it to fix it. unreferenced object 0xffffff80ca58b700 (size 192): comm "kunit_try_catch", pid 1215, jiffies 4294898264 hex dump (first 32 bytes): 00 12 7a 00 05 00 00 00 0a 00 00 00 64 00 00 00 ..z.........d... 00 00 00 00 00 00 00 00 00 04 0b cc 80 ff ff ff ................ backtrace (crc 9c09c3fe): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<0000000040a01b8d>] vcap_alloc_rule+0x3cc/0x9c4 [<000000003fe86110>] vcap_api_encode_rule_test+0x1ac/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0400 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898265 hex dump (first 32 bytes): 80 04 0b cc 80 ff ff ff 18 b7 58 ca 80 ff ff ff ..........X..... 39 00 00 00 02 00 00 00 06 05 04 03 02 01 ff ff 9............... backtrace (crc daf014e9): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<00000000dfdb1e81>] vcap_api_encode_rule_test+0x224/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0700 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898265 hex dump (first 32 bytes): 80 07 0b cc 80 ff ff ff 28 b7 58 ca 80 ff ff ff ........(.X..... 3c 00 00 00 00 00 00 00 01 2f 03 b3 ec ff ff ff <......../...... backtrace (crc 8d877792): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000006eadfab7>] vcap_rule_add_action+0x2d0/0x52c [<00000000323475d1>] vcap_api_encode_rule_test+0x4d4/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0900 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898266 hex dump (first 32 bytes): 80 09 0b cc 80 ff ff ff 80 06 0b cc 80 ff ff ff ................ 7d 00 00 00 01 00 00 00 00 00 00 00 ff 00 00 00 }............... backtrace (crc 34181e56): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<00000000991e3564>] vcap_val_rule+0xcf0/0x13e8 [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80cc0b0980 (size 64): comm "kunit_try_catch", pid 1215, jiffies 4294898266 hex dump (first 32 bytes): 18 b7 58 ca 80 ff ff ff 00 09 0b cc 80 ff ff ff ..X............. 67 00 00 00 00 00 00 00 01 01 74 88 c0 ff ff ff g.........t..... backtrace (crc 275fd9be): [<0000000052a0be73>] kmemleak_alloc+0x34/0x40 [<0000000043605459>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000000ff63fd4>] vcap_rule_add_key+0x2cc/0x528 [<000000001396a1a2>] test_add_def_fields+0xb0/0x100 [<000000006e7621f0>] vcap_val_rule+0xa98/0x13e8 [<00000000fc9868e5>] vcap_api_encode_rule_test+0x678/0x16b0 [<00000000b3595fc4>] kunit_try_run_case+0x13c/0x3ac [<0000000010f5d2bf>] kunit_generic_run_threadfn_adapter+0x80/0xec [<00000000c5d82c9a>] kthread+0x2e8/0x374 [<00000000f4287308>] ret_from_fork+0x10/0x20 ...... Cc: stable@vger.kernel.org Fixes: a3c1e45156ad ("net: microchip: vcap: Fix use-after-free error in kunit test") Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20241014121922.1280583-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: phy: mdio-bcm-unimac: Add BCM6846 supportLinus Walleij
Add Unimac mdio compatible string for the special BCM6846 variant. This variant has a few extra registers compared to other versions. Suggested-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/linux-devicetree/b542b2e8-115c-4234-a464-e73aa6bece5c@broadcom.com/ Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://patch.msgid.link/20241012-bcm6846-mdio-v1-2-c703ca83e962@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15genetlink: hold RCU in genlmsg_mcast()Eric Dumazet
While running net selftests with CONFIG_PROVE_RCU_LIST=y I saw one lockdep splat [1]. genlmsg_mcast() uses for_each_net_rcu(), and must therefore hold RCU. Instead of letting all callers guard genlmsg_multicast_allns() with a rcu_read_lock()/rcu_read_unlock() pair, do it in genlmsg_mcast(). This also means the @flags parameter is useless, we need to always use GFP_ATOMIC. [1] [10882.424136] ============================= [10882.424166] WARNING: suspicious RCU usage [10882.424309] 6.12.0-rc2-virtme #1156 Not tainted [10882.424400] ----------------------------- [10882.424423] net/netlink/genetlink.c:1940 RCU-list traversed in non-reader section!! [10882.424469] other info that might help us debug this: [10882.424500] rcu_scheduler_active = 2, debug_locks = 1 [10882.424744] 2 locks held by ip/15677: [10882.424791] #0: ffffffffb6b491b0 (cb_lock){++++}-{3:3}, at: genl_rcv (net/netlink/genetlink.c:1219) [10882.426334] #1: ffffffffb6b49248 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg (net/netlink/genetlink.c:61 net/netlink/genetlink.c:57 net/netlink/genetlink.c:1209) [10882.426465] stack backtrace: [10882.426805] CPU: 14 UID: 0 PID: 15677 Comm: ip Not tainted 6.12.0-rc2-virtme #1156 [10882.426919] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [10882.427046] Call Trace: [10882.427131] <TASK> [10882.427244] dump_stack_lvl (lib/dump_stack.c:123) [10882.427335] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822) [10882.427387] genlmsg_multicast_allns (net/netlink/genetlink.c:1940 (discriminator 7) net/netlink/genetlink.c:1977 (discriminator 7)) [10882.427436] l2tp_tunnel_notify.constprop.0 (net/l2tp/l2tp_netlink.c:119) l2tp_netlink [10882.427683] l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:253) l2tp_netlink [10882.427748] genl_family_rcv_msg_doit (net/netlink/genetlink.c:1115) [10882.427834] genl_rcv_msg (net/netlink/genetlink.c:1195 net/netlink/genetlink.c:1210) [10882.427877] ? __pfx_l2tp_nl_cmd_tunnel_create (net/l2tp/l2tp_netlink.c:186) l2tp_netlink [10882.427927] ? __pfx_genl_rcv_msg (net/netlink/genetlink.c:1201) [10882.427959] netlink_rcv_skb (net/netlink/af_netlink.c:2551) [10882.428069] genl_rcv (net/netlink/genetlink.c:1220) [10882.428095] netlink_unicast (net/netlink/af_netlink.c:1332 net/netlink/af_netlink.c:1357) [10882.428140] netlink_sendmsg (net/netlink/af_netlink.c:1901) [10882.428210] ____sys_sendmsg (net/socket.c:729 (discriminator 1) net/socket.c:744 (discriminator 1) net/socket.c:2607 (discriminator 1)) Fixes: 33f72e6f0c67 ("l2tp : multicast notification to the registered listeners") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Tom Parkin <tparkin@katalix.com> Cc: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20241011171217.3166614-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15net: dsa: mv88e6xxx: Fix the max_vid definition for the MV88E6361Peter Rashleigh
According to the Marvell datasheet the 88E6361 has two VTU pages (4k VIDs per page) so the max_vid should be 8191, not 4095. In the current implementation mv88e6xxx_vtu_walk() gives unexpected results because of this error. I verified that mv88e6xxx_vtu_walk() works correctly on the MV88E6361 with this patch in place. Fixes: 12899f299803 ("net: dsa: mv88e6xxx: enable support for 88E6361 switch") Signed-off-by: Peter Rashleigh <peter@rashleigh.ca> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241014204342.5852-1-peter@rashleigh.ca Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-15drm/msm/a6xx+: Insert a fence wait before SMMU table updateRob Clark
The CP_SMMU_TABLE_UPDATE _should_ be waiting for idle, but on some devices (x1-85, possibly others), it seems to pass that barrier while there are still things in the event completion FIFO waiting to be written back to memory. Work around that by adding a fence wait before context switch. The CP_EVENT_WRITE that writes the fence is the last write from a submit, so seeing this value hit memory is a reliable indication that it is safe to proceed with the context switch. v2: Only emit CP_WAIT_TIMESTAMP on a7xx, as it is not supported on a6xx. Conversely, I've not been able to reproduce this issue on a6xx, so hopefully it is limited to a7xx, or perhaps just certain a7xx devices. Fixes: af66706accdf ("drm/msm/a6xx: Add skeleton A7xx support") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/63 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
2024-10-15net: bcmasp: fix potential memory leak in bcmasp_xmit()Wang Hai
The bcmasp_xmit() returns NETDEV_TX_OK without freeing skb in case of mapping fails, add dev_kfree_skb() to fix it. Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20241014145901.48940-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>