pm24.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2023-06-26	Merge tag 'v6.5/vfs.mount' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: "This contains the work to extend move_mount() to allow adding a mount beneath the topmost mount of a mount stack. There are two LWN articles about this. One covers the original patch series in [1]. The other in [2] summarizes the session and roughly the discussion between Al and me at LSFMM. The second article also goes into some good questions from attendees. Since all details are found in the relevant commit with a technical dive into semantics and locking at the end I'm only adding the motivation and core functionality for this from commit message and leave out the invasive details. The code is also heavily commented and annotated as well which was explicitly requested. TL;DR: > mount -t ext4 /dev/sda /mnt \| └─/mnt /dev/sda ext4 > mount --beneath -t xfs /dev/sdb /mnt \| └─/mnt /dev/sdb xfs └─/mnt /dev/sda ext4 > umount /mnt \| └─/mnt /dev/sdb xfs The longer motivation is that various distributions are adding or are in the process of adding support for system extensions and in the future configuration extensions through various tools. A more detailed explanation on system and configuration extensions can be found on the manpage which is listed below at [3]. System extension images may – dynamically at runtime — extend the /usr/ and /opt/ directory hierarchies with additional files. This is particularly useful on immutable system images where a /usr/ and/or /opt/ hierarchy residing on a read-only file system shall be extended temporarily at runtime without making any persistent modifications. When one or more system extension images are activated, their /usr/ and /opt/ hierarchies are combined via overlayfs with the same hierarchies of the host OS, and the host /usr/ and /opt/ overmounted with it ("merging"). When they are deactivated, the mount point is disassembled — again revealing the unmodified original host version of the hierarchy ("unmerging"). Merging thus makes the extension's resources suddenly appear below the /usr/ and /opt/ hierarchies as if they were included in the base OS image itself. Unmerging makes them disappear again, leaving in place only the files that were shipped with the base OS image itself. System configuration images are similar but operate on directories containing system or service configuration. On nearly all modern distributions mount propagation plays a crucial role and the rootfs of the OS is a shared mount in a peer group (usually with peer group id 1): TARGET SOURCE FSTYPE PROPAGATION MNT_ID PARENT_ID / / ext4 shared:1 29 1 On such systems all services and containers run in a separate mount namespace and are pivot_root()ed into their rootfs. A separate mount namespace is almost always used as it is the minimal isolation mechanism services have. But usually they are even much more isolated up to the point where they almost become indistinguishable from containers. Mount propagation again plays a crucial role here. The rootfs of all these services is a slave mount to the peer group of the host rootfs. This is done so the service will receive mount propagation events from the host when certain files or directories are updated. In addition, the rootfs of each service, container, and sandbox is also a shared mount in its separate peer group: TARGET SOURCE FSTYPE PROPAGATION MNT_ID PARENT_ID / / ext4 shared:24 master:1 71 47 For people not too familiar with mount propagation, the master:1 means that this is a slave mount to peer group 1. Which as one can see is the host rootfs as indicated by shared:1 above. The shared:24 indicates that the service rootfs is a shared mount in a separate peer group with peer group id 24. A service may run other services. Such nested services will also have a rootfs mount that is a slave to the peer group of the outer service rootfs mount. For containers things are just slighly different. A container's rootfs isn't a slave to the service's or host rootfs' peer group. The rootfs mount of a container is simply a shared mount in its own peer group: TARGET SOURCE FSTYPE PROPAGATION MNT_ID PARENT_ID /home/ubuntu/debian-tree / ext4 shared:99 61 60 So whereas services are isolated OS components a container is treated like a separate world and mount propagation into it is restricted to a single well known mount that is a slave to the peer group of the shared mount /run on the host: TARGET SOURCE FSTYPE PROPAGATION MNT_ID PARENT_ID /propagate/debian-tree /run/host/incoming tmpfs master:5 71 68 Here, the master:5 indicates that this mount is a slave to the peer group with peer group id 5. This allows to propagate mounts into the container and served as a workaround for not being able to insert mounts into mount namespaces directly. But the new mount api does support inserting mounts directly. For the interested reader the blogpost in [4] might be worth reading where I explain the old and the new approach to inserting mounts into mount namespaces. Containers of course, can themselves be run as services. They often run full systems themselves which means they again run services and containers with the exact same propagation settings explained above. The whole system is designed so that it can be easily updated, including all services in various fine-grained ways without having to enter every single service's mount namespace which would be prohibitively expensive. The mount propagation layout has been carefully chosen so it is possible to propagate updates for system extensions and configurations from the host into all services. The simplest model to update the whole system is to mount on top of /usr, /opt, or /etc on the host. The new mount on /usr, /opt, or /etc will then propagate into every service. This works cleanly the first time. However, when the system is updated multiple times it becomes necessary to unmount the first update on /opt, /usr, /etc and then propagate the new update. But this means, there's an interval where the old base system is accessible. This has to be avoided to protect against downgrade attacks. The vfs already exposes a mechanism to userspace whereby mounts can be mounted beneath an existing mount. Such mounts are internally referred to as "tucked". The patch series exposes the ability to mount beneath a top mount through the new MOVE_MOUNT_BENEATH flag for the move_mount() system call. This allows userspace to seamlessly upgrade mounts. After this series the only thing that will have changed is that mounting beneath an existing mount can be done explicitly instead of just implicitly. The crux is that the proposed mechanism already exists and that it is so powerful as to cover cases where mounts are supposed to be updated with new versions. Crucially, it offers an important flexibility. Namely that updates to a system may either be forced or can be delayed and the umount of the top mount be left to a service if it is a cooperative one" Link: https://lwn.net/Articles/927491 [1] Link: https://lwn.net/Articles/934094 [2] Link: https://man7.org/linux/man-pages/man8/systemd-sysext.8.html [3] Link: https://brauner.io/2023/02/28/mounting-into-mount-namespaces.html [4] Link: https://github.com/flatcar/sysext-bakery Link: https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1 Link: https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_2 Link: https://github.com/systemd/systemd/pull/26013 * tag 'v6.5/vfs.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: allow to mount beneath top mount fs: use a for loop when locking a mount fs: properly document __lookup_mnt() fs: add path_mounted()
2023-06-26	Merge tag 'v6.5/vfs.file' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs file handling updates from Christian Brauner: "This contains Amir's work to fix a long-standing problem where an unprivileged overlayfs mount can be used to avoid fanotify permission events that were requested for an inode or superblock on the underlying filesystem. Some background about files opened in overlayfs. If a file is opened in overlayfs @file->f_path will refer to a "fake" path. What this means is that while @file->f_inode will refer to inode of the underlying layer, @file->f_path refers to an overlayfs {dentry,vfsmount} pair. The reasons for doing this are out of scope here but it is the reason why the vfs has been providing the open_with_fake_path() helper for overlayfs for very long time now. So nothing new here. This is for sure not very elegant and everyone including the overlayfs maintainers agree. Improving this significantly would involve more fragile and potentially rather invasive changes. In various codepaths access to the path of the underlying filesystem is needed for such hybrid file. The best example is fsnotify where this becomes security relevant. Passing the overlayfs @file->f_path->dentry will cause fsnotify to skip generating fsnotify events registered on the underlying inode or superblock. To fix this we extend the vfs provided open_with_fake_path() concept for overlayfs to create a backing file container that holds the real path and to expose a helper that can be used by relevant callers to get access to the path of the underlying filesystem through the new file_real_path() helper. This pattern is similar to what we do in d_real() and d_real_inode(). The first beneficiary is fsnotify and fixes the security sensitive problem mentioned above. There's a couple of nice cleanups included as well. Over time, the old open_with_fake_path() helper added specifically for overlayfs a long time ago started to get used in other places such as cachefiles. Even though cachefiles have nothing to do with hybrid files. The only reason cachefiles used that concept was that files opened with open_with_fake_path() aren't charged against the caller's open file limit by raising FMODE_NOACCOUNT. It's just mere coincidence that both overlayfs and cachefiles need to ensure to not overcharge the caller for their internal open calls. So this work disentangles FMODE_NOACCOUNT use cases and backing file use-cases by adding the FMODE_BACKING flag which indicates that the file can be used to retrieve the backing file of another filesystem. (Fyi, Jens will be sending you a really nice cleanup from Christoph that gets rid of 3 FMODE_* flags otherwise this would be the last fmode_t bit we'd be using.) So now overlayfs becomes the sole user of the renamed open_with_fake_path() helper which is now named backing_file_open(). For internal kernel users such as cachefiles that are only interested in FMODE_NOACCOUNT but not in FMODE_BACKING we add a new kernel_file_open() helper which opens a file without being charged against the caller's open file limit. All new helpers are properly documented and clearly annotated to mention their special uses. We also rename vfs_tmpfile_open() to kernel_tmpfile_open() to clearly distinguish it from vfs_tmpfile() and align it the other kernel_() internal helpers" tag 'v6.5/vfs.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ovl: enable fsnotify events on underlying real files fs: use backing_file container for internal files with "fake" f_path fs: move kmem_cache_zalloc() into alloc_empty_file*() helpers fs: use a helper for opening kernel internal files fs: rename {vfs,kernel}_tmpfile_open()
2023-06-26	Merge tag 'v6.5/vfs.rename.locking' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs rename locking updates from Christian Brauner: "This contains the work from Jan to fix problems with cross-directory renames originally reported in [1]. To quickly sum it up some filesystems (so far we know at least about ext4, udf, f2fs, ocfs2, likely also reiserfs, gfs2 and others) need to lock the directory when it is being renamed into another directory. This is because we need to update the parent pointer in the directory in that case and if that races with other operations on the directory, in particular a conversion from one directory format into another, bad things can happen. So far we've done the locking in the filesystem code but recently Darrick pointed out in [2] that the RENAME_EXCHANGE case was missing. That one is particularly nasty because RENAME_EXCHANGE can arbitrarily mix regular files and directories and proper lock ordering is not achievable in the filesystems alone. This patch set adds locking into vfs_rename() so that not only parent directories but also moved inodes, regardless of whether they are directories or not, are locked when calling into the filesystem. This means establishing a locking order for unrelated directories. New helpers are added for this purpose and our documentation is updated to cover this in detail. The locking is now actually easier to follow as we now always lock source and target. We've always locked the target independent of whether it was a directory or file and we've always locked source if it was a regular file. The exact details for why this came about can be found in [3] and [4]" Link: https://lore.kernel.org/all/20230117123735.un7wbamlbdihninm@quack3 [1] Link: https://lore.kernel.org/all/20230517045836.GA11594@frogsfrogsfrogs [2] Link: https://lore.kernel.org/all/20230526-schrebergarten-vortag-9cd89694517e@brauner [3] Link: https://lore.kernel.org/all/20230530-seenotrettung-allrad-44f4b00139d4@brauner [4] * tag 'v6.5/vfs.rename.locking' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: Restrict lock_two_nondirectories() to non-directory inodes fs: Lock moved directories fs: Establish locking order for unrelated directories Revert "f2fs: fix potential corruption when moving a directory" Revert "udf: Protect rename against modification of moved directory" ext4: Remove ext4 locking of moved directory
2023-06-26	Merge tag 'v6.5/vfs.misc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "Miscellaneous features, cleanups, and fixes for vfs and individual fs Features: - Use mode 0600 for file created by cachefilesd so it can be run by unprivileged users. This aligns them with directories which are already created with mode 0700 by cachefilesd - Reorder a few members in struct file to prevent some false sharing scenarios - Indicate that an eventfd is used a semaphore in the eventfd's fdinfo procfs file - Add a missing uapi header for eventfd exposing relevant uapi defines - Let the VFS protect transitions of a superblock from read-only to read-write in addition to the protection it already provides for transitions from read-write to read-only. Protecting read-only to read-write transitions allows filesystems such as ext4 to perform internal writes, keeping writers away until the transition is completed Cleanups: - Arnd removed the architecture specific arch_report_meminfo() prototypes and added a generic one into procfs.h. Note, we got a report about a warning in amdpgpu codepaths that suggested this was bisectable to this change but we concluded it was a false positive - Remove unused parameters from split_fs_names() - Rename put_and_unmap_page() to unmap_and_put_page() to let the name reflect the order of the cleanup operation that has to unmap before the actual put - Unexport buffer_check_dirty_writeback() as it is not used outside of block device aops - Stop allocating aio rings from highmem - Protecting read-{only,write} transitions in the VFS used open-coded barriers in various places. Replace them with proper little helpers and document both the helpers and all barrier interactions involved when transitioning between read-{only,write} states - Use flexible array members in old readdir codepaths Fixes: - Use the correct type __poll_t for epoll and eventfd - Replace all deprecated strlcpy() invocations, whose return value isn't checked with an equivalent strscpy() call - Fix some kernel-doc warnings in fs/open.c - Reduce the stack usage in jffs2's xattr codepaths finally getting rid of this: fs/jffs2/xattr.c:887:1: error: the frame size of 1088 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] royally annoying compilation warning - Use __FMODE_NONOTIFY instead of FMODE_NONOTIFY where an int and not fmode_t is required to avoid fmode_t to integer degradation warnings - Create coredumps with O_WRONLY instead of O_RDWR. There's a long explanation in that commit how O_RDWR is actually a bug which we found out with the help of Linus and git archeology - Fix "no previous prototype" warnings in the pipe codepaths - Add overflow calculations for remap_verify_area() as a signed addition overflow could be triggered in xfstests - Fix a null pointer dereference in sysv - Use an unsigned variable for length calculations in jfs avoiding compilation warnings with gcc 13 - Fix a dangling pipe pointer in the watch queue codepath - The legacy mount option parser provided as a fallback by the VFS for filesystems not yet converted to the new mount api did prefix the generated mount option string with a leading ',' causing issues for some filesystems - Fix a repeated word in a comment in fs.h - autofs: Update the ctime when mtime is updated as mandated by POSIX" * tag 'v6.5/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (27 commits) readdir: Replace one-element arrays with flexible-array members fs: Provide helpers for manipulating sb->s_readonly_remount fs: Protect reconfiguration of sb read-write from racing writes eventfd: add a uapi header for eventfd userspace APIs autofs: set ctime as well when mtime changes on a dir eventfd: show the EFD_SEMAPHORE flag in fdinfo fs/aio: Stop allocating aio rings from HIGHMEM fs: Fix comment typo fs: unexport buffer_check_dirty_writeback fs: avoid empty option when generating legacy mount string watch_queue: prevent dangling pipe pointer fs.h: Optimize file struct to prevent false sharing highmem: Rename put_and_unmap_page() to unmap_and_put_page() cachefiles: Allow the cache to be non-root init: remove unused names parameter in split_fs_names() jfs: Use unsigned variable for length calculations fs/sysv: Null check to prevent null-ptr-deref bug fs: use UB-safe check for signed addition overflow in remap_verify_area procfs: consolidate arch_report_meminfo declaration fs: pipe: reveal missing function protoypes ...
2023-06-26	Merge tag 'v6.5/fs.ntfs' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull ntfs updates from Christian Brauner: "A pile of various smaller fixes for ntfs" * tag 'v6.5/fs.ntfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ntfs: do not dereference a null ctx on error ntfs: Remove unneeded semicolon ntfs: Correct spelling ntfs: remove redundant initialization to pointer cb_sb_start
2023-06-26	Merge tag 'auxdisplay-6.5' of https://github.com/ojeda/linux	Linus Torvalds
	Pull auxdisplay update from Miguel Ojeda: "A single cleanup for i2c drivers to switch them back to use '.probe()'" * tag 'auxdisplay-6.5' of https://github.com/ojeda/linux: auxdisplay: Switch i2c drivers back to use .probe()
2023-06-26	Merge tag 'rust-6.5' of https://github.com/Rust-for-Linux/linux	Linus Torvalds
	Pull rust updates from Miguel Ojeda: "A fairly small one in terms of feature additions. Most of the changes in terms of lines come from the upgrade to the new version of the toolchain (which in turn is big due to the vendored 'alloc' crate). Upgrade to Rust 1.68.2: - This is the first such upgrade, and we will try to update it often from now on, in order to remain close to the latest release, until a minimum version (which is "in the future") can be established. The upgrade brings the stabilization of 4 features we used (and 2 more that we used in our old 'rust' branch). Commit 3ed03f4da06e ("rust: upgrade to Rust 1.68.2") contains the details and rationale. pin-init API: - Several internal improvements and fixes to the pin-init API, e.g. allowing to use 'Self' in a struct definition with '#[pin_data]'. 'error' module: - New 'name()' method for the 'Error' type (with 'errname()' integration), used to implement the 'Debug' trait for 'Error'. - Add error codes from 'include/linux/errno.h' to the list of Rust 'Error' constants. - Allow specifying error type on the 'Result' type (with the default still being our usual 'Error' type). 'str' module: - 'TryFrom' implementation for 'CStr', and new 'to_cstring()' method based on it. 'sync' module: - Implement 'AsRef' trait for 'Arc', allowing to use 'Arc' in code that is generic over smart pointer types. - Add 'ptr_eq' method to 'Arc' for easier, less error prone comparison between two 'Arc' pointers. - Reword the 'Send' safety comment for 'Arc', and avoid referencing it from the 'Sync' one. 'task' module: - Implement 'Send' marker for 'Task'. 'types' module: - Implement 'Send' and 'Sync' markers for 'ARef<T>' when 'T' is 'AlwaysRefCounted', 'Send' and 'Sync'. Other changes: - Documentation improvements and '.gitattributes' change to start using the Rust diff driver" * tag 'rust-6.5' of https://github.com/Rust-for-Linux/linux: rust: error: `impl Debug` for `Error` with `errname()` integration rust: task: add `Send` marker to `Task` rust: specify when `ARef` is thread safe rust: sync: reword the `Arc` safety comment for `Sync` rust: sync: reword the `Arc` safety comment for `Send` rust: sync: implement `AsRef<T>` for `Arc<T>` rust: sync: add `Arc::ptr_eq` rust: error: add missing error codes rust: str: add conversion from `CStr` to `CString` rust: error: allow specifying error type on `Result` rust: init: update macro expansion example in docs rust: macros: replace Self with the concrete type in #[pin_data] rust: macros: refactor generics parsing of `#[pin_data]` into its own function rust: macros: fix usage of `#[allow]` in `quote!` docs: rust: point directly to the standalone installers .gitattributes: set diff driver for Rust source code files rust: upgrade to Rust 1.68.2 rust: arc: fix intra-doc link in `Arc<T>::init` rust: alloc: clarify what is the upstream version
2023-06-26	Merge tag 's390-6.4-4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Use correct type for size of memory allocated for ELF core header on kernel crash. - Fix insecure W+X mapping warning when KASAN shadow memory range is not aligned on page boundary. - Avoid allocation of short by one page KASAN shadow memory when the original memory range is less than (PAGE_SIZE << 3). - Fix virtual vs physical address confusion in physical memory enumerator. It is not a real issue, since virtual and physical addresses are currently the same. - Set CONFIG_NET_TC_SKB_EXT=y in s390 config files as it is required for offloading TC as well as bridges on switchdev capable ConnectX devices. * tag 's390-6.4-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/defconfigs: set CONFIG_NET_TC_SKB_EXT=y s390/boot: fix physmem_info virtual vs physical address confusion s390/kasan: avoid short by one page shadow memory s390/kasan: fix insecure W+X mapping warning s390/crash: use the correct type for memory allocation
2023-06-26	Merge tag 'nios2_updates_for_v6.5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull nios2 updates from Dinh Nguyen: - Convert pgtable constructor/destructors to ptdesc - Replace strlcpy with strscpy * tag 'nios2_updates_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: Replace all non-returning strlcpy with strscpy nios2: Convert __pte_free_tlb() to use ptdescs
2023-06-25	Linux 6.4v6.4	Linus Torvalds

2023-06-25	Merge tag 'i2c-for-6.4-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Nothing fancy. Two driver and one DT binding fix" * tag 'i2c-for-6.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: imx-lpi2c: fix type char overflow issue when calculating the clock cycle i2c: qup: Add missing unwind goto in qup_i2c_probe() dt-bindings: i2c: opencores: Add missing type for "regstep"
2023-06-25	Merge tag 'perf_urgent_for_v6.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Drop the __weak attribute from a function prototype as it otherwise leads to the function getting replaced by a dummy stub - Fix the umask value setup of the frontend event as former is different on two Intel cores * tag 'perf_urgent_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix the FRONTEND encoding on GNR and MTL perf/core: Drop __weak attribute from arch_perf_update_userpage() prototype
2023-06-25	Merge tag 'objtool_urgent_for_v6.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: - Add a ORC format hash to vmlinux and modules in order for other tools which use it, to detect changes to it and adapt accordingly * tag 'objtool_urgent_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/unwind/orc: Add ELF section with ORC version identifier
2023-06-25	Merge tag 'x86_urgent_for_v6.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Do not use set_pgd() when updating the KASLR trampoline pgd entry because that updates the user PGD too on KPTI builds, resulting in memory corruption - Prevent a panic in the IO-APIC setup code due to conflicting command line parameters * tag 'x86_urgent_for_v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Fix kernel panic when booting with intremap=off and x2apic_phys x86/mm: Avoid using set_pgd() outside of real PGD pages
2023-06-23	Merge tag 'drm-fixes-2023-06-23' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds
	Pull drm fixes from Dave Airlie: "Very quiet last week, just two misc fixes, one dp-mst and one qaic: qaic: - dma-buf import fix dp-mst: - fix NULL ptr deref" [ It turns out it was a quiet week because Alex Deucher hadn't sent in his pending AMD changes. So they are coming next - Linus ] * tag 'drm-fixes-2023-06-23' of git://anongit.freedesktop.org/drm/drm: drm: use mgr->dev in drm_dbg_kms in drm_dp_add_payload_part2 accel/qaic: Call DRM helper function to destroy prime GEM
2023-06-23	Merge tag 'arm-fixes-6.4-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "The final bug fixes for Qualcomm and Rockchips came in, all of them for devicetree files: - Devices on Qualcomm SC7180/SC7280 that are cache coherent are now marked so correctly to fix a regression after a change in kernel behavior - Rockchips has a few minor changes for correctness of regulator and cache properties, as well as fixes for incorrect behavior of the RK3568 PCI controller and reset pins on two boards" * tag 'arm-fixes-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent arm64: dts: rockchip: Fix rk356x PCIe register and range mappings arm64: dts: rockchip: fix button reset pin for nanopi r5c arm64: dts: rockchip: fix nEXTRST on SOQuartz arm64: dts: rockchip: add missing cache properties arm64: dts: rockchip: fix USB regulator on ROCK64
2023-06-23	Merge tag 'for-6.4-rc7-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "Unfortunately the recent u32 overflow fix was not complete, there was one conversion left, assertion not triggered by my tests but caught by Qu's fstests case. The "cleanup for later" has been promoted to a proper fix and wraps all uses of the stripe left shift so the diffstat has grown but leaves no potentially problematic uses. We should have done it that way before, sorry" * tag 'for-6.4-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix remaining u32 overflows when left shifting stripe_nr
2023-06-23	Merge tag 'block-6.4-2023-06-23' of git://git.kernel.dk/linux	Linus Torvalds
	Pull block fix from Jens Axboe: "It's apparently the week of 'fixup something from last week', because the same is true for this block pull request. Fix up a lock grab that needs to be IRQ saving, rather than just IRQ disabling, in the block cgroup code" * tag 'block-6.4-2023-06-23' of git://git.kernel.dk/linux: block: make sure local irq is disabled when calling __blkcg_rstat_flush
2023-06-23	Merge tag 'iommu-fix-v6.4-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fix from Joerg Roedel: - Fix potential memory leak in AMD IOMMU domain allocation path * tag 'iommu-fix-v6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix possible memory leak of 'domain'
2023-06-23	Merge tag 'sound-6.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Three oneliner fixes: one for a thinko in SOF SoundWire code and two HD-audio quirks for ASUS laptops. All device-specific and should be safe to apply" * tag 'sound-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Add quirk for ASUS ROG GV601V ALSA: hda/realtek: Add quirk for ASUS ROG G634Z ASoC: intel: sof_sdw: Fixup typo in device link checking
2023-06-23	Merge tag 'gpio-fixes-for-v6.4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix IRQ initialization in gpiochip_irqchip_add_domain() - add a missing return value check for platform_get_irq() in gpio-sifive - don't free irq_domains which GPIOLIB does not manage * tag 'gpio-fixes-for-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpiolib: Fix irq_domain resource tracking for gpiochip_irqchip_add_domain() gpio: sifive: add missing check for platform_get_irq gpiolib: Fix GPIO chip IRQ initialization restriction
2023-06-23	Merge tag 'qcom-arm64-fixes-for-6.4-2' of ↵	Arnd Bergmann
	https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes One last Qualcomm ARM64 DeviceTree fix for v6.4 Changes related to cache management for DMA memory caused WiFi to stop work on SC7180 and SC7280 based products, using TF-A. These changes marks the relevant device dma-coherent to correct the behavior. * tag 'qcom-arm64-fixes-for-6.4-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent Link: https://lore.kernel.org/r/20230622203248.106422-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-23	workqueue: clean up WORK_* constant types, clarify masking	Linus Torvalds
	Dave Airlie reports that gcc-13.1.1 has started complaining about some of the workqueue code in 32-bit arm builds: kernel/workqueue.c: In function ‘get_work_pwq’: kernel/workqueue.c:713:24: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 713 \| return (void )(data & WORK_STRUCT_WQ_DATA_MASK); \| ^ [ ... a couple of other cases ... ] and while it's not immediately clear exactly why gcc started complaining about it now, I suspect it's some C23-induced enum type handlign fixup in gcc-13 is the cause. Whatever the reason for starting to complain, the code and data types are indeed disgusting enough that the complaint is warranted. The wq code ends up creating various "helper constants" (like that WORK_STRUCT_WQ_DATA_MASK) using an enum type, which is all kinds of confused. The mask needs to be 'unsigned long', not some unspecified enum type. To make matters worse, the actual "mask and cast to a pointer" is repeated a couple of times, and the cast isn't even always done to the right pointer, but - as the error case above - to a 'void ' with then the compiler finishing the job. That's now how we roll in the kernel. So create the masks using the proper types rather than some ambiguous enumeration, and use a nice helper that actually does the type conversion in one well-defined place. Incidentally, this magically makes clang generate better code. That, admittedly, is really just a sign of clang having been seriously confused before, and cleaning up the typing unconfuses the compiler too. Reported-by: Dave Airlie <airlied@gmail.com> Link: https://lore.kernel.org/lkml/CAPM=9twNnV4zMCvrPkw3H-ajZOH-01JVh_kDrxdPYQErz8ZTdA@mail.gmail.com/ Cc: Arnd Bergmann <arnd@arndb.de> Cc: Tejun Heo <tj@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-23	i2c: imx-lpi2c: fix type char overflow issue when calculating the clock cycle	Clark Wang
	Claim clkhi and clklo as integer type to avoid possible calculation errors caused by data overflow. Fixes: a55fa9d0e42e ("i2c: imx-lpi2c: add low power i2c bus driver") Signed-off-by: Clark Wang <xiaoning.wang@nxp.com> Signed-off-by: Carlos Song <carlos.song@nxp.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-06-23	i2c: qup: Add missing unwind goto in qup_i2c_probe()	Shuai Jiang
	Smatch Warns: drivers/i2c/busses/i2c-qup.c:1784 qup_i2c_probe() warn: missing unwind goto? The goto label "fail_runtime" and "fail" will disable qup->pclk, but here qup->pclk failed to obtain, in order to be consistent, change the direct return to goto label "fail_dma". Fixes: 9cedf3b2f099 ("i2c: qup: Add bam dma capabilities") Signed-off-by: Shuai Jiang <d202180596@hust.edu.cn> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org> Cc: <stable@vger.kernel.org> # v4.6+
2023-06-23	dt-bindings: i2c: opencores: Add missing type for "regstep"	Rob Herring
	"regstep" may be deprecated, but it still needs a type. Fixes: 8ad69f490516 ("dt-bindings: i2c: convert ocores binding to yaml") Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2023-06-23	Merge tag 'drm-misc-fixes-2023-06-21' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes for v6.4: - Qaic imported dma-buf fix. - Fix null pointer deref when printing a dp-mst message. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <dev@lankhorst.se> Link: https://patchwork.freedesktop.org/patch/msgid/e96b1965-ba67-7cc5-2358-826eb5b9b998@lankhorst.se
2023-06-22	Merge tag 'net-6.4-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from ipsec, bpf, mptcp and netfilter. Current release - regressions: - netfilter: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain - eth: mlx5e: - fix scheduling of IPsec ASO query while in atomic - free IRQ rmap and notifier on kernel shutdown Current release - new code bugs: - phy: manual remove LEDs to ensure correct ordering Previous releases - regressions: - mptcp: fix possible divide by zero in recvmsg() - dsa: revert "net: phy: dp83867: perform soft reset and retain established link" Previous releases - always broken: - sched: netem: acquire qdisc lock in netem_change() - bpf: - fix verifier id tracking of scalars on spill - fix NULL dereference on exceptions - accept function names that contain dots - netfilter: disallow element updates of bound anonymous sets - mptcp: ensure listener is unhashed before updating the sk status - xfrm: - add missed call to delete offloaded policies - fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets - selftests: fixes for FIPS mode - dsa: mt7530: fix multiple CPU ports, BPDU and LLDP handling - eth: sfc: use budget for TX completions Misc: - wifi: iwlwifi: add support for SO-F device with PCI id 0x7AF0" * tag 'net-6.4-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits) revert "net: align SO_RCVMARK required privileges with SO_MARK" net: wwan: iosm: Convert single instance struct member to flexible array sch_netem: acquire qdisc lock in netem_change() selftests: forwarding: Fix race condition in mirror installation wifi: mac80211: report all unusable beacon frames mptcp: ensure listener is unhashed before updating the sk status mptcp: drop legacy code around RX EOF mptcp: consolidate fallback and non fallback state machine mptcp: fix possible list corruption on passive MPJ mptcp: fix possible divide by zero in recvmsg() mptcp: handle correctly disconnect() failures bpf: Force kprobe multi expected_attach_type for kprobe_multi link bpf/btf: Accept function names that contain dots Revert "net: phy: dp83867: perform soft reset and retain established link" net: mdio: fix the wrong parameters netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets ...
2023-06-22	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "ARM: - Correctly save/restore PMUSERNR_EL0 when host userspace is using PMU counters directly - Fix GICv2 emulation on GICv3 after the locking rework - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and document why Generic: - Avoid setting page table entries pointing to a deleted memslot if a host page table entry is changed concurrently with the deletion" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Avoid illegal stage2 mapping on invalid memory slot KVM: arm64: Use raw_smp_processor_id() in kvm_pmu_probe_armpmu() KVM: arm64: Restore GICv2-on-GICv3 functionality KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded KVM: arm64: PMU: Restore the host's PMUSERENR_EL0
2023-06-22	Merge tag 'powerpc-6.4-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Disable IRQs when switching mm in exit_lazy_flush_tlb() called from exit_mmap() Thanks to Nicholas Piggin and Sachin Sant. * tag 'powerpc-6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/radix: Fix exit lazy tlb mm switch with irqs enabled
2023-06-22	Merge tag 'pci-v6.4-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Transfer Intel LGM GW PCIe maintenance from Rahul Tanwar to Chuanhua Lei (Zhu YiXin) * tag 'pci-v6.4-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Chuanhua Lei as Intel LGM GW PCIe maintainer
2023-06-22	Merge tag 'mmc-v6.4-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - Fix support for deferred probing for several host drivers - litex_mmc: Use async probe as it's common for all mmc hosts - meson-gx: Fix bug when scheduling while atomic - mmci_stm32: Fix max busy timeout calculation - sdhci-msm: Disable broken 64-bit DMA on MSM8916 * tag 'mmc-v6.4-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: usdhi60rol0: fix deferred probing mmc: sunxi: fix deferred probing mmc: sh_mmcif: fix deferred probing mmc: sdhci-spear: fix deferred probing mmc: sdhci-acpi: fix deferred probing mmc: owl: fix deferred probing mmc: omap_hsmmc: fix deferred probing mmc: omap: fix deferred probing mmc: mvsdio: fix deferred probing mmc: mtk-sd: fix deferred probing mmc: meson-gx: fix deferred probing mmc: bcm2835: fix deferred probing mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUS mmc: meson-gx: remove redundant mmc_request_done() call from irq context mmc: mmci: stm32: fix max busy timeout calculation mmc: sdhci-msm: Disable broken 64-bit DMA on MSM8916
2023-06-22	Merge tag 'platform-drivers-x86-v6.4-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Hans de Goede: "One small fix for an AMD PMF driver issue which is causing issues for users of just released AMD laptop models" * tag 'platform-drivers-x86-v6.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/amd/pmf: Register notify handler only if SPS is enabled
2023-06-22	Merge tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux	Linus Torvalds
	Pull io_uring fixes from Jens Axboe: "A fix for a race condition with poll removal and linked timeouts, and then a few followup fixes/tweaks for the msg_control patch from last week. Not super important, particularly the sparse fixup, as it was broken before that recent commit. But let's get it sorted for real for this release, rather than just have it broken a bit differently" * tag 'io_uring-6.4-2023-06-21' of git://git.kernel.dk/linux: io_uring/net: use the correct msghdr union member in io_sendmsg_copy_hdr io_uring/net: disable partial retries for recvmsg with cmsg io_uring/net: clear msg_controllen on partial sendmsg retry io_uring/poll: serialize poll linked timer start with poll removal
2023-06-22	Merge tag 'cgroup-for-6.4-rc7-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "It's late but here are two bug fixes. Both fix problems which can be severe but are very confined in scope. The risk to most use cases should be minimal. - Fix for an old bug which triggers if a cgroup subsystem is remounted to a different hierarchy while someone is reading its cgroup.procs/tasks file. The risk is pretty low given how seldom cgroup subsystems are moved across hierarchies. - We moved cpus_read_lock() outside of cgroup internal locks a while ago but forgot to update the legacy_freezer leading to lockdep triggers. Fixed" * tag 'cgroup-for-6.4-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Do not corrupt task iteration when rebinding subsystem cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}()
2023-06-22	Merge tag 'kvmarm-fixes-6.4-4' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.4, take #4 - Correctly save/restore PMUSERNR_EL0 when host userspace is using PMU counters directly - Fix GICv2 emulation on GICv3 after the locking rework - Don't use smp_processor_id() in kvm_pmu_probe_armpmu(), and document why...
2023-06-22	arm64: dts: qcom: sc7280: Mark SCM as dma-coherent for chrome devices	Douglas Anderson
	Just like for sc7180 devices using the Chrome bootflow (AKA trogdor and IDP), sc7280 devices using the Chrome bootflow also need their firmware marked dma-coherent. On sc7280 this wasn't causing WiFi to fail to startup, since WiFi works differently there. However, on sc7280 devices we were still getting the message at bootup after commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""): qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 9c900000.memory: assign memory failed qcom_rmtfs_mem: probe of 9c900000.memory failed with error -22 We should mark SCM properly just like we did for trogdor. Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: 7a1f4e7f740d ("arm64: dts: qcom: sc7280: Add basic dts/dtsi files for sc7280 soc") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.4.I21dc14a63327bf81c6bb58fe8ed91dbdc9849ee2@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22	arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for trogdor	Douglas Anderson
	Trogdor devices use firmware backed by TF-A instead of Qualcomm's normal TZ. On TF-A we end up mapping memory as cacheable. Specifically, you can see in Trogdor's TF-A code [1] in qti_sip_mem_assign() that we call qti_mmap_add_dynamic_region() with MT_RO_DATA. This translates down to MT_MEMORY instead of MT_NON_CACHEABLE or MT_DEVICE. Apparently Qualcomm's normal TZ implementation maps the memory as non-cacheable. Let's add the "dma-coherent" attribute to the SCM for trogdor. Adding "dma-coherent" like this fixes WiFi on sc7180-trogdor devices. WiFi was broken as of commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""). Specifically at bootup we'd get: qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 94600000.memory: assign memory failed qcom_rmtfs_mem: probe of 94600000.memory failed with error -22 From discussion on the mailing lists [2] and over IRC [3], it was determined that we should always have been tagging the SCM as dma-coherent on trogdor but that the old "invalidate" happened to make things work most of the time. Tagging it properly like this is a much more robust solution. [1] https://chromium.googlesource.com/chromiumos/third_party/arm-trusted-firmware/+/refs/heads/firmware-trogdor-13577.B/plat/qti/common/src/qti_syscall.c [2] https://lore.kernel.org/r/20230614165904.1.I279773c37e2c1ed8fbb622ca6d1397aea0023526@changeid [3] https://oftc.irclog.whitequark.org/linux-msm/2023-06-15 Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: 7ec3e67307f8 ("arm64: dts: qcom: sc7180-trogdor: add initial trogdor and lazor dt") Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.3.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22	arm64: dts: qcom: sc7180: Mark SCM as dma-coherent for IDP	Douglas Anderson
	sc7180-idp is, for most intents and purposes, a trogdor device. Specifically, sc7180-idp is designed to run the same style of firmware as trogdor devices. This can be seen from the fact that IDP has the same "Reserved memory changes" in its device tree that trogdor has. Recently it was realized that we need to mark SCM as dma-coherent to match what trogdor's style of firmware (based on TF-A) does [1]. That means we need this dma-coherent tag on IDP as well. Without this, on newer versions of Linux, specifically those with commit 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"""), WiFi will fail to work. At bootup you'll see: qcom_scm firmware:scm: Assign memory protection call failed -22 qcom_rmtfs_mem 94600000.memory: assign memory failed qcom_rmtfs_mem: probe of 94600000.memory failed with error -22 [1] https://lore.kernel.org/r/20230615145253.1.Ic62daa649b47b656b313551d646c4de9a7da4bd4@changeid Fixes: 7bd6680b47fa ("Revert "Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()""") Fixes: f5ab220d162c ("arm64: dts: qcom: sc7180: Add remoteproc enablers") Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230616081440.v2.2.I3c17d546d553378aa8a0c68c3fe04bccea7cba17@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22	dt-bindings: firmware: qcom,scm: Document that SCM can be dma-coherent	Douglas Anderson
	Trogdor devices use firmware backed by TF-A instead of Qualcomm's normal TZ. On TF-A we end up mapping memory as cacheable. Specifically, you can see in Trogdor's TF-A code [1] in qti_sip_mem_assign() that we call qti_mmap_add_dynamic_region() with MT_RO_DATA. This translates down to MT_MEMORY instead of MT_NON_CACHEABLE or MT_DEVICE. Let's allow devices like trogdor to be described properly by allowing "dma-coherent" in the SCM node. Signed-off-by: Douglas Anderson <dianders@chromium.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230616081440.v2.1.Ie79b5f0ed45739695c9970df121e11d724909157@changeid Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-06-22	KVM: Avoid illegal stage2 mapping on invalid memory slot	Gavin Shan
	We run into guest hang in edk2 firmware when KSM is kept as running on the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash device (TYPE_PFLASH_CFI01) during the operation of sector erasing or buffered write. The status is returned by reading the memory region of the pflash device and the read request should have been forwarded to QEMU and emulated by it. Unfortunately, the read request is covered by an illegal stage2 mapping when the guest hang issue occurs. The read request is completed with QEMU bypassed and wrong status is fetched. The edk2 firmware runs into an infinite loop with the wrong status. The illegal stage2 mapping is populated due to same page sharing by KSM at (C) even the associated memory slot has been marked as invalid at (B) when the memory slot is requested to be deleted. It's notable that the active and inactive memory slots can't be swapped when we're in the middle of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count is elevated, and kvm_swap_active_memslots() will busy loop until it reaches to zero again. Besides, the swapping from the active to the inactive memory slots is also avoided by holding &kvm->srcu in __kvm_handle_hva_range(), corresponding to synchronize_srcu_expedited() in kvm_swap_active_memslots(). CPU-A CPU-B ----- ----- ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION) kvm_vm_ioctl_set_memory_region kvm_set_memory_region __kvm_set_memory_region kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE) kvm_invalidate_memslot kvm_copy_memslot kvm_replace_memslot kvm_swap_active_memslots (A) kvm_arch_flush_shadow_memslot (B) same page sharing by KSM kvm_mmu_notifier_invalidate_range_start : kvm_mmu_notifier_change_pte kvm_handle_hva_range __kvm_handle_hva_range kvm_set_spte_gfn (C) : kvm_mmu_notifier_invalidate_range_end Fix the issue by skipping the invalid memory slot at (C) to avoid the illegal stage2 mapping so that the read request for the pflash's status is forwarded to QEMU and emulated by it. In this way, the correct pflash's status can be returned from QEMU to break the infinite loop in the edk2 firmware. We tried a git-bisect and the first problematic commit is cd4c71835228 (" KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this, clean_dcache_guest_page() is called after the memory slots are iterated in kvm_mmu_notifier_change_pte(). clean_dcache_guest_page() is called before the iteration on the memory slots before this commit. This change literally enlarges the racy window between kvm_mmu_notifier_change_pte() and memory slot removal so that we're able to reproduce the issue in a practical test case. However, the issue exists since commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup"). Cc: stable@vger.kernel.org # v3.9+ Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup") Reported-by: Shuai Hu <hshuai@redhat.com> Reported-by: Zhenyu Zhang <zhenyzha@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Message-Id: <20230615054259.14911-1-gshan@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-22	btrfs: fix remaining u32 overflows when left shifting stripe_nr	Qu Wenruo
	There was regression caused by a97699d1d610 ("btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN") and supposedly fixed by a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr"). To avoid code churn the fix was open coding the type casts but unfortunately missed one which was still possible to hit [1]. The missing place was assignment of bioc->full_stripe_logical inside btrfs_map_block(). Fix it by adding a helper that does the safe calculation of the offset and use it everywhere even though it may not be strictly necessary due to already using u64 types. This replaces all remaining "<< BTRFS_STRIPE_LEN_SHIFT" calls. [1] https://lore.kernel.org/linux-btrfs/20230622065438.86402-1-wqu@suse.com/ Fixes: a7299a18a179 ("btrfs: fix u32 overflows when left shifting stripe_nr") Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-22	block: make sure local irq is disabled when calling __blkcg_rstat_flush	Ming Lei
	When __blkcg_rstat_flush() is called from cgroup_rstat_flush*() code path, interrupt is always disabled. When we start to flush blkcg per-cpu stats list in __blkg_release() for avoiding to leak blkcg_gq's reference in commit 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq"), local irq isn't disabled yet, then lockdep warning may be triggered because the dependent cgroup locks may be acquired from irq(soft irq) handler. Fix the issue by disabling local irq always. Fixes: 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq") Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-block/pz2wzwnmn5tk3pwpskmjhli6g3qly7eoknilb26of376c7kwxy@qydzpvt6zpis/T/#u Cc: stable@vger.kernel.org Cc: Jay Shin <jaeshin@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20230622084249.1208005-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-22	s390/defconfigs: set CONFIG_NET_TC_SKB_EXT=y	Niklas Schnelle
	As made explicit by commit 03a283cdc8c8 ("net/mlx5: Kconfig: Make tc offload depend on tc skb extension") tc skb extension is required for offloading tc as well as bridges on switchdev capable ConnectX devices. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-22	Merge tag 'nf-23-06-21' of ↵	Paolo Abeni
	git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net This is v3, including a crash fix for patch 01/14. The following patchset contains Netfilter/IPVS fixes for net: 1) Fix UDP segmentation with IPVS tunneled traffic, from Terin Stock. 2) Fix chain binding transaction logic, add a bound flag to rule transactions. Remove incorrect logic in nft_data_hold() and nft_data_release(). 3) Add a NFT_TRANS_PREPARE_ERROR deactivate state to deal with releasing the set/chain as a follow up to 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE") 4) Drop map element references from preparation phase instead of set destroy path, otherwise bogus EBUSY with transactions such as: flush chain ip x y delete chain ip x w where chain ip x y contains jump/goto from set elements. 5) Pipapo set type does not regard generation mask from the walk iteration. 6) Fix reference count underflow in set element reference to stateful object. 7) Several patches to tighten the nf_tables API: - disallow set element updates of bound anonymous set - disallow unbound anonymous set/chain at the end of transaction. - disallow updates of anonymous set. - disallow timeout configuration for anonymous sets. 8) Fix module reference leak in chain updates. 9) Fix nfnetlink_osf module autoload. 10) Fix deletion of basechain when NFTA_CHAIN_HOOK is specified as in iptables-nft. This Netfilter batch is larger than usual at this stage, I am aware we are fairly late in the -rc cycle, if you prefer to route them through net-next, please let me know. netfilter pull request 23-06-21 * tag 'nf-23-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Fix for deleting base chains with payload netfilter: nfnetlink_osf: fix module autoload netfilter: nf_tables: drop module reference after updating chain netfilter: nf_tables: disallow timeout for anonymous sets netfilter: nf_tables: disallow updates of anonymous sets netfilter: nf_tables: reject unbound chain set before commit phase netfilter: nf_tables: reject unbound anonymous set before commit phase netfilter: nf_tables: disallow element updates of bound anonymous sets netfilter: nf_tables: fix underflow in object reference counter netfilter: nft_set_pipapo: .walk does not deal with generations netfilter: nf_tables: drop map element references from preparation phase netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain netfilter: nf_tables: fix chain binding transaction logic ipvs: align inner_mac_header for encapsulation ==================== Link: https://lore.kernel.org/r/20230621100731.68068-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	revert "net: align SO_RCVMARK required privileges with SO_MARK"	Maciej Żenczykowski
	This reverts commit 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") because the reasoning in the commit message is not really correct: SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check and retrieves the socket mark, rather than 'setsockopt(SO_MARK) which sets the socket mark and does require privs. Additionally incoming skb->mark may already be visible if sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled. Furthermore, it is easier to block the getsockopt via bpf (either cgroup setsockopt hook, or via syscall filters) then to unblock it if it requires CAP_NET_RAW/ADMIN. On Android the socket mark is (among other things) used to store the network identifier a socket is bound to. Setting it is privileged, but retrieving it is not. We'd like unprivileged userspace to be able to read the network id of incoming packets (where mark is set via iptables [to be moved to bpf])... An alternative would be to add another sysctl to control whether setting SO_RCVMARK is privilged or not. (or even a MASK of which bits in the mark can be exposed) But this seems like over-engineering... Note: This is a non-trivial revert, due to later merged commit e42c7beee71d ("bpf: net: Consider has_current_bpf_ctx() when testing capable() in sk_setsockopt()") which changed both 'ns_capable' into 'sockopt_ns_capable' calls. Fixes: 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") Cc: Larysa Zaremba <larysa.zaremba@intel.com> Cc: Simon Horman <simon.horman@corigine.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Eyal Birger <eyal.birger@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Patrick Rohr <prohr@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	net: wwan: iosm: Convert single instance struct member to flexible array	Kees Cook
	struct mux_adth actually ends with multiple struct mux_adth_dg members. This is seen both in the comments about the member: /** * struct mux_adth - Structure of the Aggregated Datagram Table Header. ... * @dg: datagramm table with variable length / and in the preparation for populating it: adth_dg_size = offsetof(struct mux_adth, dg) + ul_adb->dg_count[i] sizeof(*dg); ... adth_dg_size -= offsetof(struct mux_adth, dg); memcpy(&adth->dg, ul_adb->dg[i], adth_dg_size); This was reported as a run-time false positive warning: memcpy: detected field-spanning write (size 16) of single field "&adth->dg" at drivers/net/wwan/iosm/iosm_ipc_mux_codec.c:852 (size 8) Adjust the struct mux_adth definition and associated sizeof() math; no binary output differences are observed in the resulting object file. Reported-by: Florian Klink <flokli@flokli.de> Closes: https://lore.kernel.org/lkml/dbfa25f5-64c8-5574-4f5d-0151ba95d232@gmail.com/ Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Cc: M Chetan Kumar <m.chetan.kumar@intel.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Cc: Intel Corporation <linuxwwan@intel.com> Cc: Loic Poulain <loic.poulain@linaro.org> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: netdev@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620194234.never.023-kees@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	sch_netem: acquire qdisc lock in netem_change()	Eric Dumazet
	syzbot managed to trigger a divide error [1] in netem. It could happen if q->rate changes while netem_enqueue() is running, since q->rate is read twice. It turns out netem_change() always lacked proper synchronization. [1] divide error: 0000 [#1] SMP KASAN CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 RIP: 0010:div64_u64 include/linux/math64.h:69 [inline] RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline] RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576 Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03 RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246 RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000 RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297 R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358 R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000 FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0 Call Trace: <TASK> [<ffffffff84714385>] __dev_xmit_skb net/core/dev.c:3931 [inline] [<ffffffff84714385>] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290 [<ffffffff84d22df2>] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [<ffffffff84d22df2>] neigh_hh_output include/net/neighbour.h:531 [inline] [<ffffffff84d22df2>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff84d22df2>] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235 [<ffffffff84d21e63>] __ip_finish_output+0xc3/0x2b0 [<ffffffff84d10a81>] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323 [<ffffffff84d10f14>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff84d10f14>] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437 [<ffffffff84d123b5>] dst_output include/net/dst.h:444 [inline] [<ffffffff84d123b5>] ip_local_out net/ipv4/ip_output.c:127 [inline] [<ffffffff84d123b5>] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542 [<ffffffff84d12fdc>] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-22	platform/x86/amd/pmf: Register notify handler only if SPS is enabled	Shyam Sundar S K
	Power source notify handler is getting registered even when none of the PMF feature in enabled leading to a crash. ... [ 22.592162] Call Trace: [ 22.592164] <TASK> [ 22.592164] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592166] ? __warn+0x81/0x130 [ 22.592171] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592172] ? report_bug+0x171/0x1a0 [ 22.592175] ? prb_read_valid+0x1b/0x30 [ 22.592177] ? handle_bug+0x3c/0x80 [ 22.592178] ? exc_invalid_op+0x17/0x70 [ 22.592179] ? asm_exc_invalid_op+0x1a/0x20 [ 22.592182] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592183] ? acpi_ut_delete_object_desc+0x86/0xb0 [ 22.592186] ? acpi_ut_update_ref_count.part.0+0x22d/0x930 [ 22.592187] __schedule+0xc0/0x1410 [ 22.592189] ? ktime_get+0x3c/0xa0 [ 22.592191] ? lapic_next_event+0x1d/0x30 [ 22.592193] ? hrtimer_start_range_ns+0x25b/0x350 [ 22.592196] schedule+0x5e/0xd0 [ 22.592197] schedule_hrtimeout_range_clock+0xbe/0x140 [ 22.592199] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 22.592200] usleep_range_state+0x64/0x90 [ 22.592203] amd_pmf_send_cmd+0x106/0x2a0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592207] amd_pmf_update_slider+0x56/0x1b0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592210] amd_pmf_set_sps_power_limits+0x72/0x80 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592213] amd_pmf_pwr_src_notify_call+0x49/0x90 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592216] notifier_call_chain+0x5a/0xd0 [ 22.592218] atomic_notifier_call_chain+0x32/0x50 ... Fix this by moving the registration of source change notify handler only when SPS(Static Slider) is advertised as supported. Reported-by: Allen Zhong <allen@atr.me> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571 Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature") Tested-by: Patil Rajesh Reddy <Patil.Reddy@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Link: https://lore.kernel.org/r/20230622060309.310001-1-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-06-22	selftests: forwarding: Fix race condition in mirror installation	Danielle Ratson
	When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly. If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail. Fix this by setting the neighbor's state to permanent in a couple of tests, so that it is always valid. Fixes: 35c31d5c323f ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d") Fixes: 239e754af854 ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/268816ac729cb6028c7a34d4dda6f4ec7af55333.1687264607.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>