summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-12-08Merge tag 'mm-hotfixes-stable-2024-12-07-22-39' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "24 hotfixes. 17 are cc:stable. 15 are MM and 9 are non-MM. The usual bunch of singletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2024-12-07-22-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (24 commits) iio: magnetometer: yas530: use signed integer type for clamp limits sched/numa: fix memory leak due to the overwritten vma->numab_state mm/damon: fix order of arguments in damos_before_apply tracepoint lib: stackinit: hide never-taken branch from compiler mm/filemap: don't call folio_test_locked() without a reference in next_uptodate_folio() scatterlist: fix incorrect func name in kernel-doc mm: correct typo in MMAP_STATE() macro mm: respect mmap hint address when aligning for THP mm: memcg: declare do_memsw_account inline mm/codetag: swap tags when migrate pages ocfs2: update seq_file index in ocfs2_dlm_seq_next stackdepot: fix stack_depot_save_flags() in NMI context mm: open-code page_folio() in dump_page() mm: open-code PageTail in folio_flags() and const_folio_flags() mm: fix vrealloc()'s KASAN poisoning logic Revert "readahead: properly shorten readahead when falling back to do_page_cache_ra()" selftests/damon: add _damon_sysfs.py to TEST_FILES selftest: hugetlb_dio: fix test naming ocfs2: free inode when ocfs2_get_init_inode() fails nilfs2: fix potential out-of-bounds memory access in nilfs_find_entry() ...
2024-12-07Merge tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - DFS fix (for race with tree disconnect and dfs cache worker) - Four fixes for SMB3.1.1 posix extensions: - improve special file support e.g. to Samba, retrieving the file type earlier - reduce roundtrips (e.g. on ls -l, in some cases) * tag '6.13-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix potential race in cifs_put_tcon() smb3.1.1: fix posix mounts to older servers fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse points fs/smb/client: Implement new SMB3 POSIX type fs/smb/client: avoid querying SMB2_OP_QUERY_WSL_EA for SMB3 POSIX
2024-12-07Merge tag 'ubifs-for-linus-6.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull jffs2 fix from Richard Weinberger: - Fixup rtime compressor bounds checking * tag 'ubifs-for-linus-6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: jffs2: Fix rtime decompressor
2024-12-06Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: "Nothing major, some left-overs from the recent merging window (MTE, coco) and some newly found issues like the ptrace() ones. - MTE/hugetlbfs: - Set VM_MTE_ALLOWED in the arch code and remove it from the core code for hugetlbfs mappings - Fix copy_highpage() warning when the source is a huge page but not MTE tagged, taking the wrong small page path - drivers/virt/coco: - Add the pKVM and Arm CCA drivers under the arm64 maintainership - Fix the pkvm driver to fall back to ioremap() (and warn) if the MMIO_GUARD hypercall fails - Keep the Arm CCA driver default 'n' rather than 'm' - A series of fixes for the arm64 ptrace() implementation, potentially leading to the kernel consuming uninitialised stack variables when PTRACE_SETREGSET is invoked with a length of 0 - Fix zone_dma_limit calculation when RAM starts below 4GB and ZONE_DMA is capped to this limit - Fix early boot warning with CONFIG_DEBUG_VIRTUAL=y triggered by a call to page_to_phys() (from patch_map()) which checks pfn_valid() before vmemmap has been set up - Do not clobber bits 15:8 of the ASID used for TTBR1_EL1 and TLBI ops when the kernel assumes 8-bit ASIDs but running under a hypervisor on a system that implements 16-bit ASIDs (found running Linux under Parallels on Apple M4) - ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A as it is using the same SMMU PMCG as HIP09 and suffers from the same errata - Add GCS to cpucap_is_possible(), missed in the recent merge" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: ptrace: fix partial SETREGSET for NT_ARM_GCS arm64: ptrace: fix partial SETREGSET for NT_ARM_POE arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMR arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRL arm64: cpufeature: Add GCS to cpucap_is_possible() coco: virt: arm64: Do not enable cca guest driver by default arm64: mte: Fix copy_highpage() warning on hugetlb folios arm64: Ensure bits ASID[15:8] are masked out when the kernel uses 8-bit ASIDs ACPI/IORT: Add PMCG platform information for HiSilicon HIP09A MAINTAINERS: Add CCA and pKVM CoCO guest support to the ARM64 entry drivers/virt: pkvm: Don't fail ioremap() call if MMIO_GUARD fails arm64: patching: avoid early page_to_phys() arm64: mm: Fix zone_dma_limit calculation arm64: mte: set VM_MTE_ALLOWED for hugetlbfs at correct place
2024-12-06smb: client: fix potential race in cifs_put_tcon()Paulo Alcantara
dfs_cache_refresh() delayed worker could race with cifs_put_tcon(), so make sure to call list_replace_init() on @tcon->dfs_ses_list after kworker is cancelled or finished. Fixes: 4f42a8b54b5c ("smb: client: fix DFS interlink failover") Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-06smb3.1.1: fix posix mounts to older serversSteve French
Some servers which implement the SMB3.1.1 POSIX extensions did not set the file type in the mode in the infolevel 100 response. With the recent changes for checking the file type via the mode field, this can cause the root directory to be reported incorrectly and mounts (e.g. to ksmbd) to fail. Fixes: 6a832bc8bbb2 ("fs/smb/client: Implement new SMB3 POSIX type") Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Cc: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-05ocfs2: update seq_file index in ocfs2_dlm_seq_nextWengang Wang
The following INFO level message was seen: seq_file: buggy .next function ocfs2_dlm_seq_next [ocfs2] did not update position index Fix: Update *pos (so m->index) to make seq_read_iter happy though the index its self makes no sense to ocfs2_dlm_seq_next. Link: https://lkml.kernel.org/r/20241119174500.9198-1-wen.gang.wang@oracle.com Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-05ocfs2: free inode when ocfs2_get_init_inode() failsTetsuo Handa
syzbot is reporting busy inodes after unmount, for commit 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") forgot to call iput() when new_inode() succeeded and dquot_initialize() failed. Link: https://lkml.kernel.org/r/e68c0224-b7c6-4784-b4fa-a9fc8c675525@I-love.SAKURA.ne.jp Fixes: 9c89fe0af826 ("ocfs2: Handle error from dquot_initialize()") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0af00f6a2cba2058b5db Tested-by: syzbot+0af00f6a2cba2058b5db@syzkaller.appspotmail.com Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-05nilfs2: fix potential out-of-bounds memory access in nilfs_find_entry()Ryusuke Konishi
Syzbot reported that when searching for records in a directory where the inode's i_size is corrupted and has a large value, memory access outside the folio/page range may occur, or a use-after-free bug may be detected if KASAN is enabled. This is because nilfs_last_byte(), which is called by nilfs_find_entry() and others to calculate the number of valid bytes of directory data in a page from i_size and the page index, loses the upper 32 bits of the 64-bit size information due to an inappropriate type of local variable to which the i_size value is assigned. This caused a large byte offset value due to underflow in the end address calculation in the calling nilfs_find_entry(), resulting in memory access that exceeds the folio/page size. Fix this issue by changing the type of the local variable causing the bit loss from "unsigned int" to "u64". The return value of nilfs_last_byte() is also of type "unsigned int", but it is truncated so as not to exceed PAGE_SIZE and no bit loss occurs, so no change is required. Link: https://lkml.kernel.org/r/20241119172403.9292-1-konishi.ryusuke@gmail.com Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+96d5d14c47d97015c624@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=96d5d14c47d97015c624 Tested-by: syzbot+96d5d14c47d97015c624@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-05fs/proc/vmcore.c: fix warning when CONFIG_MMU=nAndrew Morton
>> fs/proc/vmcore.c:424:19: warning: 'mmap_vmcore_fault' defined but not used [-Wunused-function] 424 | static vm_fault_t mmap_vmcore_fault(struct vm_fault *vmf) Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202411140156.2o0nS4fl-lkp@intel.com/ Cc: Qi Xi <xiqi2@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-12-05Merge tag 'v6.13-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - Three fixes for potential out of bound accesses in read and write paths (e.g. when alternate data streams enabled) - GCC 15 build fix * tag 'v6.13-rc1-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: align aux_payload_buf to avoid OOB reads in cryptographic operations ksmbd: fix Out-of-Bounds Write in ksmbd_vfs_stream_write ksmbd: fix Out-of-Bounds Read in ksmbd_vfs_stream_read smb: server: Fix building with GCC 15
2024-12-05jffs2: Fix rtime decompressorRichard Weinberger
The fix for a memory corruption contained a off-by-one error and caused the compressor to fail in legit cases. Cc: Kinsey Moore <kinsey.moore@oarcorp.com> Cc: stable@vger.kernel.org Fixes: fe051552f5078 ("jffs2: Prevent rtime decompress memory corruption") Signed-off-by: Richard Weinberger <richard@nod.at>
2024-12-04ksmbd: align aux_payload_buf to avoid OOB reads in cryptographic operationsNorbert Szetei
The aux_payload_buf allocation in SMB2 read is performed without ensuring alignment, which could result in out-of-bounds (OOB) reads during cryptographic operations such as crypto_xor or ghash. This patch aligns the allocation of aux_payload_buf to prevent these issues. (Note that to add this patch to stable would require modifications due to recent patch "ksmbd: use __GFP_RETRY_MAYFAIL") Signed-off-by: Norbert Szetei <norbert@doyensec.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04fs/smb/client: cifs_prime_dcache() for SMB3 POSIX reparse pointsRalph Boehme
Spares an extra revalidation request Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04fs/smb/client: Implement new SMB3 POSIX typeRalph Boehme
Fixes special files against current Samba. On the Samba server: insgesamt 20 131958 brw-r--r-- 1 root root 0, 0 15. Nov 12:04 blockdev 131965 crw-r--r-- 1 root root 1, 1 15. Nov 12:04 chardev 131966 prw-r--r-- 1 samba samba 0 15. Nov 12:05 fifo 131953 -rw-rwxrw-+ 2 samba samba 4 18. Nov 11:37 file 131953 -rw-rwxrw-+ 2 samba samba 4 18. Nov 11:37 hardlink 131957 lrwxrwxrwx 1 samba samba 4 15. Nov 12:03 symlink -> file 131954 -rwxrwxr-x+ 1 samba samba 0 18. Nov 15:28 symlinkoversmb Before: ls: cannot access '/mnt/smb3unix/posix/blockdev': No data available ls: cannot access '/mnt/smb3unix/posix/chardev': No data available ls: cannot access '/mnt/smb3unix/posix/symlinkoversmb': No data available ls: cannot access '/mnt/smb3unix/posix/fifo': No data available ls: cannot access '/mnt/smb3unix/posix/symlink': No data available total 16 ? -????????? ? ? ? ? ? blockdev ? -????????? ? ? ? ? ? chardev ? -????????? ? ? ? ? ? fifo 131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 file 131953 -rw-rwxrw- 2 root samba 4 Nov 18 11:37 hardlink ? -????????? ? ? ? ? ? symlink ? -????????? ? ? ? ? ? symlinkoversmb After: insgesamt 21 131958 brw-r--r-- 1 root root 0, 0 15. Nov 12:04 blockdev 131965 crw-r--r-- 1 root root 1, 1 15. Nov 12:04 chardev 131966 prw-r--r-- 1 root samba 0 15. Nov 12:05 fifo 131953 -rw-rwxrw- 2 root samba 4 18. Nov 11:37 file 131953 -rw-rwxrw- 2 root samba 4 18. Nov 11:37 hardlink 131957 lrwxrwxrwx 1 root samba 4 15. Nov 12:03 symlink -> file 131954 lrwxrwxr-x 1 root samba 23 18. Nov 15:28 symlinkoversmb -> mnt/smb3unix/posix/file Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-04fs/smb/client: avoid querying SMB2_OP_QUERY_WSL_EA for SMB3 POSIXRalph Boehme
Avoid extra roundtrip Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Ralph Boehme <slow@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-03Merge tag 'for-6.13-rc1-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - add lockdep annotations for io_uring/encoded read integration, inode lock is held when returning to userspace - properly reflect experimental config option to sysfs - handle NULL root in case the rescue mode accepts invalid/damaged tree roots (rescue=ibadroot) - regression fix of a deadlock between transaction and extent locks - fix pending bio accounting bug in encoded read ioctl - fix NOWAIT mode when checking references for NOCOW files - fix use-after-free in a rb-tree cleanup in ref-verify debugging tool * tag 'for-6.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix lockdep warnings on io_uring encoded reads btrfs: ref-verify: fix use-after-free after invalid ref action btrfs: add a sanity check for btrfs root in btrfs_search_slot() btrfs: don't loop for nowait writes when checking for cross references btrfs: sysfs: advertise experimental features only if CONFIG_BTRFS_EXPERIMENTAL=y btrfs: fix deadlock between transaction commits and extent locks btrfs: fix use-after-free in btrfs_encoded_read_endio()
2024-12-03Merge tag 'fs_for_v6.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota and udf fixes from Jan Kara: "Two small UDF fixes for better handling of corrupted filesystem and a quota fix to fix handling of filesystem freezing" * tag 'fs_for_v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Verify inode link counts before performing rename udf: Skip parent dir link count update if corrupted quota: flush quota_release_work upon quota writeback
2024-12-03Merge tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: - Use xchg() in xlog_cil_insert_pcp_aggregate() - Fix ABBA deadlock on a race between mount and log shutdown - Fix quota softlimit incoherency on delalloc - Fix sparse inode limits on runt AG - remove unknown compat feature checks in SB write valdation - Eliminate a lockdep false positive * tag 'xfs-fixes-6.13-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't call xfs_bmap_same_rtgroup in xfs_bmap_add_extent_hole_delay xfs: Use xchg() in xlog_cil_insert_pcp_aggregate() xfs: prevent mount and log shutdown race xfs: delalloc and quota softlimit timers are incoherent xfs: fix sparse inode limits on runt AG xfs: remove unknown compat feature check in superblock write validation xfs: eliminate lockdep false positives in xfs_attr_shortform_list
2024-12-02module: Convert symbol namespace to string literalPeter Zijlstra
Clean up the existing export namespace code along the same lines of commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify\(ns\)/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ && $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]*\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02arm64: mte: set VM_MTE_ALLOWED for hugetlbfs at correct placeYang Shi
The commit 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") moved vm flags validation before fop->mmap for file mappings. But when commit 25c17c4b55de ("hugetlb: arm64: add mte support") was rebased on top of it, the hugetlbfs part was missed. Mmapping hugetlbfs file may not have MAP_HUGETLB set. Fixes: 25c17c4b55de ("hugetlb: arm64: add mte support") Signed-off-by: Yang Shi <yang@os.amperecomputing.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241119200914.1145249-1-yang@os.amperecomputing.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-01ksmbd: fix Out-of-Bounds Write in ksmbd_vfs_stream_writeJordy Zomer
An offset from client could be a negative value, It could allows to write data outside the bounds of the allocated buffer. Note that this issue is coming when setting 'vfs objects = streams_xattr parameter' in ksmbd.conf. Cc: stable@vger.kernel.org # v5.15+ Reported-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-01ksmbd: fix Out-of-Bounds Read in ksmbd_vfs_stream_readJordy Zomer
An offset from client could be a negative value, It could lead to an out-of-bounds read from the stream_buf. Note that this issue is coming when setting 'vfs objects = streams_xattr parameter' in ksmbd.conf. Cc: stable@vger.kernel.org # v5.15+ Reported-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Jordy Zomer <jordyzomer@google.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-01smb: server: Fix building with GCC 15Brahmajit Das
GCC 15 introduces -Werror=unterminated-string-initialization by default, this results in the following build error fs/smb/server/smb_common.c:21:35: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-ini tialization] 21 | static const char basechars[43] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ_-!@#$%"; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors To this we are replacing char basechars[43] with a character pointer and then using strlen to get the length. Signed-off-by: Brahmajit Das <brahmajit.xyz@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-30Merge tag 'uml-for-linus-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Lots of cleanups, mostly from Benjamin Berg and Tiwei Bie - Removal of unused code - Fix for sparse warnings - Cleanup around stub_exe() * tag 'uml-for-linus-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (68 commits) hostfs: Fix the NULL vs IS_ERR() bug for __filemap_get_folio() um: move thread info into task um: Always dump trace for specified task in show_stack um: vector: Do not use drvdata in release um: net: Do not use drvdata in release um: ubd: Do not use drvdata in release um: ubd: Initialize ubd's disk pointer in ubd_add um: virtio_uml: query the number of vqs if supported um: virtio_uml: fix call_fd IRQ allocation um: virtio_uml: send SET_MEM_TABLE message with the exact size um: remove broken double fault detection um: remove duplicate UM_NSEC_PER_SEC definition um: remove file sync for stub data um: always include kconfig.h and compiler-version.h um: set DONTDUMP and DONTFORK flags on KASAN shadow memory um: fix sparse warnings in signal code um: fix sparse warnings from regset refactor um: Remove double zero check um: fix stub exe build with CONFIG_GCOV um: Use os_set_pdeathsig helper in winch thread/process ...
2024-11-30Merge tag 'ubifs-for-linus-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull JFFS2, UBI and UBIFS updates from Richard Weinberger: "JFFS2: - Bug fix for rtime compression - Various cleanups UBI: - Cleanups for fastmap and wear leveling UBIFS: - Add support for FS_IOC_GETFSSYSFSPATH - Remove dead ioctl code - Fix UAF in ubifs_tnc_end_commit()" * tag 'ubifs-for-linus-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: (25 commits) ubifs: Fix uninitialized use of err in ubifs_jnl_write_inode() jffs2: Prevent rtime decompress memory corruption jffs2: remove redundant check on outpos > pos fs: jffs2: Fix inconsistent indentation in jffs2_mark_node_obsolete jffs2: Correct some typos in comments jffs2: fix use of uninitialized variable jffs2: Use str_yes_no() helper function mtd: ubi: remove redundant check on bytes_left at end of function mtd: ubi: fix unreleased fwnode_handle in find_volume_fwnode() ubifs: authentication: Fix use-after-free in ubifs_tnc_end_commit ubi: fastmap: Fix duplicate slab cache names while attaching ubifs: xattr: remove unused anonymous enum ubifs: Reduce kfree() calls in ubifs_purge_xattrs() ubifs: Call iput(xino) only once in ubifs_purge_xattrs() ubi: wl: Close down wear-leveling before nand is suspended mtd: ubi: Rmove unused declaration in header file ubifs: Correct the total block count by deducting journal reservation ubifs: Convert to use ERR_CAST() ubifs: add support for FS_IOC_GETFSSYSFSPATH ubifs: remove unused ioctl flags GETFLAGS/SETFLAGS ...
2024-11-30Merge tag '9p-for-6.13-rc1' of https://github.com/martinetd/linuxLinus Torvalds
Pull 9p updates from Dominique Martinet: - usbg: fix alloc failure handling & build-as-module - xen: couple of fixes - v9fs_cache_register/unregister code cleanup * tag '9p-for-6.13-rc1' of https://github.com/martinetd/linux: net/9p/usbg: allow building as standalone module 9p/xen: fix release of IRQ 9p/xen: fix init sequence net/9p/usbg: fix handling of the failed kzalloc() memory allocation fs/9p: replace functions v9fs_cache_{register|unregister} with direct calls
2024-11-30Merge tag 'ceph-for-6.13-rc1' of https://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph updates from Ilya Dryomov: "A fix for the mount "device" string parser from Patrick and two cred reference counting fixups from Max, marked for stable. Also included a number of cleanups and a tweak to MAINTAINERS to avoid unnecessarily CCing netdev list" * tag 'ceph-for-6.13-rc1' of https://github.com/ceph/ceph-client: ceph: fix cred leak in ceph_mds_check_access() ceph: pass cred pointer to ceph_mds_auth_match() ceph: improve caps debugging output ceph: correct ceph_mds_cap_peer field name ceph: correct ceph_mds_cap_item field name ceph: miscellaneous spelling fixes ceph: Use strscpy() instead of strcpy() in __get_snap_name() ceph: Use str_true_false() helper in status_show() ceph: requalify some char pointers as const ceph: extract entity name from device id MAINTAINERS: exclude net/ceph from networking ceph: Remove fs/ceph deadcode libceph: Remove unused ceph_crypto_key_encode libceph: Remove unused ceph_osdc_watch_check libceph: Remove unused pagevec functions libceph: Remove unused ceph_pagelist functions
2024-11-30Merge tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client updates from Trond Myklebust: "Bugfixes: - nfs/localio: fix for a memory corruption in nfs_local_read_done - Revert "nfs: don't reuse partially completed requests in nfs_lock_and_join_requests" - nfsv4: - ignore SB_RDONLY when mounting nfs - Fix a use-after-free problem in open() - sunrpc: - clear XPRT_SOCK_UPD_TIMEOUT when reseting the transport - timeout and cancel TLS handshake with -ETIMEDOUT - fix one UAF issue caused by sunrpc kernel tcp socket - Fix a hang in TLS sock_close if sk_write_pending - pNFS/blocklayout: Fix device registration issues Features and cleanups: - localio cleanups from Mike Snitzer - Clean up refcounting on the nfs version modules - __counted_by() annotations - nfs: make processes that are waiting for an I/O lock killable" * tag 'nfs-for-6.13-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (24 commits) fs/nfs/io: make nfs_start_io_*() killable nfs/blocklayout: Limit repeat device registration on failure nfs/blocklayout: Don't attempt unregister for invalid block device sunrpc: fix one UAF issue caused by sunrpc kernel tcp socket SUNRPC: timeout and cancel TLS handshake with -ETIMEDOUT sunrpc: clear XPRT_SOCK_UPD_TIMEOUT when reset transport nfs: ignore SB_RDONLY when mounting nfs Revert "nfs: don't reuse partially completed requests in nfs_lock_and_join_requests" Revert "fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private" nfs/localio: must clear res.replen in nfs_local_read_done NFSv4.0: Fix a use-after-free problem in the asynchronous open() NFSv4.0: Fix the wake up of the next waiter in nfs_release_seqid() SUNRPC: Fix a hang in TLS sock_close if sk_write_pending sunrpc: remove newlines from tracepoints nfs: Annotate struct pnfs_commit_array with __counted_by() nfs/localio: eliminate need for nfs_local_fsync_work forward declaration nfs/localio: remove extra indirect nfs_to call to check {read,write}_iter nfs/localio: eliminate unnecessary kref in nfs_local_fsync_ctx nfs/localio: remove redundant suid/sgid handling NFS: Implement get_nfs_version() ...
2024-11-30Merge tag '6.13-rc-part2-smb3-client-fixes' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - directory lease fixes - password rotation fixes - reconnect fix - fix for SMB3.02 mounts - DFS (global namespace) fixes - fixes for special file handling (most relating to better handling various types of symlinks) - two minor cleanups * tag '6.13-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: (22 commits) cifs: update internal version number cifs: unlock on error in smb3_reconfigure() cifs: during remount, make sure passwords are in sync cifs: support mounting with alternate password to allow password rotation smb: Initialize cfid->tcon before performing network ops smb: During unmount, ensure all cached dir instances drop their dentry smb: client: fix noisy message when mounting shares smb: client: don't try following DFS links in cifs_tree_connect() smb: client: allow reconnect when sending ioctl smb: client: get rid of @nlsc param in cifs_tree_connect() smb: client: allow more DFS referrals to be cached cifs: Fix parsing reparse point with native symlink in SMB1 non-UNICODE session cifs: Validate content of WSL reparse point buffers cifs: Improve guard for excluding $LXDEV xattr cifs: Add support for parsing WSL-style symlinks cifs: Validate content of native symlink cifs: Fix parsing native symlinks relative to the export smb: client: fix NULL ptr deref in crypto_aead_setkey() Update misleading comment in cifs_chan_update_iface smb: client: change return value in open_cached_dir_by_dentry() if !cfids ...
2024-11-30Merge tag '6.13-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server updates from Steve French: - fix use after free due to race in ksmd workqueue handler - debugging improvements - fix incorrectly formatted response when client attempts SMB1 - improve memory allocation to reduce chance of OOM - improve delays between retries when killing sessions * tag '6.13-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd: ksmbd: fix use-after-free in SMB request handling ksmbd: add debug print for pending request during server shutdown ksmbd: add netdev-up/down event debug print ksmbd: add debug prints to know what smb2 requests were received ksmbd: add debug print for rdma capable ksmbd: use msleep instaed of schedule_timeout_interruptible() ksmbd: use __GFP_RETRY_MAYFAIL ksmbd: fix malformed unsupported smb1 negotiate response
2024-11-29Merge tag 'driver-core-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of driver core changes for 6.13-rc1. Nothing major for this merge cycle, except for the two simple merge conflicts are here just to make life interesting. Included in here are: - sysfs core changes and preparations for more sysfs api cleanups that can come through all driver trees after -rc1 is out - fw_devlink fixes based on many reports and debugging sessions - list_for_each_reverse() removal, no one was using it! - last-minute seq_printf() format string bug found and fixed in many drivers all at once. - minor bugfixes and changes full details in the shortlog" * tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Fix a potential abuse of seq_printf() format string in drivers cpu: Remove spurious NULL in attribute_group definition s390/con3215: Remove spurious NULL in attribute_group definition perf: arm-ni: Remove spurious NULL in attribute_group definition driver core: Constify bin_attribute definitions sysfs: attribute_group: allow registration of const bin_attribute firmware_loader: Fix possible resource leak in fw_log_firmware_info() drivers: core: fw_devlink: Fix excess parameter description in docstring driver core: class: Correct WARN() message in APIs class_(for_each|find)_device() cacheinfo: Use of_property_present() for non-boolean properties cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap() drivers: core: fw_devlink: Make the error message a bit more useful phy: tegra: xusb: Set fwnode for xusb port devices drm: display: Set fwnode for aux bus devices driver core: fw_devlink: Stop trying to optimize cycle detection logic driver core: Constify attribute arguments of binary attributes sysfs: bin_attribute: add const read/write callback variants sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR() sysfs: treewide: constify attribute callback of bin_attribute::llseek() sysfs: treewide: constify attribute callback of bin_attribute::mmap() ...
2024-11-29btrfs: fix lockdep warnings on io_uring encoded readsMark Harmstone
Lockdep doesn't like the fact that btrfs_uring_read_extent() returns to userspace still holding the inode lock, even though we release it once the I/O finishes. Add calls to rwsem_release() and rwsem_acquire_read() to work round this. Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: ref-verify: fix use-after-free after invalid ref actionFilipe Manana
At btrfs_ref_tree_mod() after we successfully inserted the new ref entry (local variable 'ref') into the respective block entry's rbtree (local variable 'be'), if we find an unexpected action of BTRFS_DROP_DELAYED_REF, we error out and free the ref entry without removing it from the block entry's rbtree. Then in the error path of btrfs_ref_tree_mod() we call btrfs_free_ref_cache(), which iterates over all block entries and then calls free_block_entry() for each one, and there we will trigger a use-after-free when we are called against the block entry to which we added the freed ref entry to its rbtree, since the rbtree still points to the block entry, as we didn't remove it from the rbtree before freeing it in the error path at btrfs_ref_tree_mod(). Fix this by removing the new ref entry from the rbtree before freeing it. Syzbot report this with the following stack traces: BTRFS error (device loop0 state EA): Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_insert_empty_items+0x9c/0x1a0 fs/btrfs/ctree.c:4314 btrfs_insert_empty_item fs/btrfs/ctree.h:669 [inline] btrfs_insert_orphan_item+0x1f1/0x320 fs/btrfs/orphan.c:23 btrfs_orphan_add+0x6d/0x1a0 fs/btrfs/inode.c:3482 btrfs_unlink+0x267/0x350 fs/btrfs/inode.c:4293 vfs_unlink+0x365/0x650 fs/namei.c:4469 do_unlinkat+0x4ae/0x830 fs/namei.c:4533 __do_sys_unlinkat fs/namei.c:4576 [inline] __se_sys_unlinkat fs/namei.c:4569 [inline] __x64_sys_unlinkat+0xcc/0xf0 fs/namei.c:4569 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f BTRFS error (device loop0 state EA): Ref action 1, root 5, ref_root 5, parent 0, owner 260, offset 0, num_refs 1 __btrfs_mod_ref+0x76b/0xac0 fs/btrfs/extent-tree.c:2521 update_ref_for_cow+0x96a/0x11f0 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 BTRFS error (device loop0 state EA): Ref action 2, root 5, ref_root 0, parent 8564736, owner 0, offset 0, num_refs 18446744073709551615 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 ================================================================== BUG: KASAN: slab-use-after-free in rb_first+0x69/0x70 lib/rbtree.c:473 Read of size 8 at addr ffff888042d1af38 by task syz.0.0/5329 CPU: 0 UID: 0 PID: 5329 Comm: syz.0.0 Not tainted 6.12.0-rc7-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 rb_first+0x69/0x70 lib/rbtree.c:473 free_block_entry+0x78/0x230 fs/btrfs/ref-verify.c:248 btrfs_free_ref_cache+0xa3/0x100 fs/btrfs/ref-verify.c:917 btrfs_ref_tree_mod+0x139f/0x15e0 fs/btrfs/ref-verify.c:898 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f996df7e719 RSP: 002b:00007f996ede7038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f996e135f80 RCX: 00007f996df7e719 RDX: 0000000020000180 RSI: 00000000c4009420 RDI: 0000000000000004 RBP: 00007f996dff139e R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f996e135f80 R15: 00007fff79f32e68 </TASK> Allocated by task 5329: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 poison_kmalloc_redzone mm/kasan/common.c:377 [inline] __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:394 kasan_kmalloc include/linux/kasan.h:257 [inline] __kmalloc_cache_noprof+0x19c/0x2c0 mm/slub.c:4295 kmalloc_noprof include/linux/slab.h:878 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] btrfs_ref_tree_mod+0x264/0x15e0 fs/btrfs/ref-verify.c:701 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 5329: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_hook mm/slub.c:2342 [inline] slab_free mm/slub.c:4579 [inline] kfree+0x1a0/0x440 mm/slub.c:4727 btrfs_ref_tree_mod+0x136c/0x15e0 btrfs_free_extent+0x33c/0x380 fs/btrfs/extent-tree.c:3544 __btrfs_mod_ref+0x7dd/0xac0 fs/btrfs/extent-tree.c:2523 update_ref_for_cow+0x9cd/0x11f0 fs/btrfs/ctree.c:512 btrfs_force_cow_block+0x9f6/0x1da0 fs/btrfs/ctree.c:594 btrfs_cow_block+0x35e/0xa40 fs/btrfs/ctree.c:754 btrfs_search_slot+0xbdd/0x30d0 fs/btrfs/ctree.c:2116 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:411 __btrfs_update_delayed_inode+0x1e7/0xb90 fs/btrfs/delayed-inode.c:1030 btrfs_update_delayed_inode fs/btrfs/delayed-inode.c:1114 [inline] __btrfs_commit_inode_delayed_items+0x2318/0x24a0 fs/btrfs/delayed-inode.c:1137 __btrfs_run_delayed_items+0x213/0x490 fs/btrfs/delayed-inode.c:1171 btrfs_commit_transaction+0x8a8/0x3740 fs/btrfs/transaction.c:2313 prepare_to_relocate+0x3c4/0x4c0 fs/btrfs/relocation.c:3586 relocate_block_group+0x16c/0xd40 fs/btrfs/relocation.c:3611 btrfs_relocate_block_group+0x77d/0xd90 fs/btrfs/relocation.c:4081 btrfs_relocate_chunk+0x12c/0x3b0 fs/btrfs/volumes.c:3377 __btrfs_balance+0x1b0f/0x26b0 fs/btrfs/volumes.c:4161 btrfs_balance+0xbdc/0x10c0 fs/btrfs/volumes.c:4538 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff888042d1af00 which belongs to the cache kmalloc-64 of size 64 The buggy address is located 56 bytes inside of freed 64-byte region [ffff888042d1af00, ffff888042d1af40) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x42d1a anon flags: 0x4fff00000000000(node=1|zone=1|lastcpupid=0x7ff) page_type: f5(slab) raw: 04fff00000000000 ffff88801ac418c0 0000000000000000 dead000000000001 raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 5055, tgid 5055 (dhcpcd-run-hook), ts 40377240074, free_ts 40376848335 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1541 prep_new_page mm/page_alloc.c:1549 [inline] get_page_from_freelist+0x3649/0x3790 mm/page_alloc.c:3459 __alloc_pages_noprof+0x292/0x710 mm/page_alloc.c:4735 alloc_pages_mpol_noprof+0x3e8/0x680 mm/mempolicy.c:2265 alloc_slab_page+0x6a/0x140 mm/slub.c:2412 allocate_slab+0x5a/0x2f0 mm/slub.c:2578 new_slab mm/slub.c:2631 [inline] ___slab_alloc+0xcd1/0x14b0 mm/slub.c:3818 __slab_alloc+0x58/0xa0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] __do_kmalloc_node mm/slub.c:4263 [inline] __kmalloc_noprof+0x25a/0x400 mm/slub.c:4276 kmalloc_noprof include/linux/slab.h:882 [inline] kzalloc_noprof include/linux/slab.h:1014 [inline] tomoyo_encode2 security/tomoyo/realpath.c:45 [inline] tomoyo_encode+0x26f/0x540 security/tomoyo/realpath.c:80 tomoyo_realpath_from_path+0x59e/0x5e0 security/tomoyo/realpath.c:283 tomoyo_get_realpath security/tomoyo/file.c:151 [inline] tomoyo_check_open_permission+0x255/0x500 security/tomoyo/file.c:771 security_file_open+0x777/0x990 security/security.c:3109 do_dentry_open+0x369/0x1460 fs/open.c:945 vfs_open+0x3e/0x330 fs/open.c:1088 do_open fs/namei.c:3774 [inline] path_openat+0x2c84/0x3590 fs/namei.c:3933 page last free pid 5055 tgid 5055 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1112 [inline] free_unref_page+0xcfb/0xf20 mm/page_alloc.c:2642 free_pipe_info+0x300/0x390 fs/pipe.c:860 put_pipe_info fs/pipe.c:719 [inline] pipe_release+0x245/0x320 fs/pipe.c:742 __fput+0x23f/0x880 fs/file_table.c:431 __do_sys_close fs/open.c:1567 [inline] __se_sys_close fs/open.c:1552 [inline] __x64_sys_close+0x7f/0x110 fs/open.c:1552 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Memory state around the buggy address: ffff888042d1ae00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff888042d1ae80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc >ffff888042d1af00: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ^ ffff888042d1af80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc ffff888042d1b000: 00 00 00 00 00 fc fc 00 00 00 00 00 fc fc 00 00 Reported-by: syzbot+7325f164162e200000c1@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/673723eb.050a0220.1324f8.00a8.GAE@google.com/T/#u Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: add a sanity check for btrfs root in btrfs_search_slot()Lizhi Xu
Syzbot reports a null-ptr-deref in btrfs_search_slot(). The reproducer is using rescue=ibadroots, and the extent tree root is corrupted thus the extent tree is NULL. When scrub tries to search the extent tree to gather the needed extent info, btrfs_search_slot() doesn't check if the target root is NULL or not, resulting the null-ptr-deref. Add sanity check for btrfs root before using it in btrfs_search_slot(). Reported-by: syzbot+3030e17bd57a73d39bd7@syzkaller.appspotmail.com Fixes: 42437a6386ff ("btrfs: introduce mount option rescue=ignorebadroots") Link: https://syzkaller.appspot.com/bug?extid=3030e17bd57a73d39bd7 CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Qu Wenruo <wqu@suse.com> Tested-by: syzbot+3030e17bd57a73d39bd7@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-29btrfs: don't loop for nowait writes when checking for cross referencesFilipe Manana
When checking for delayed refs when verifying if there are cross references for a data extent, we stop if the path has nowait set and we can't try lock the delayed ref head's mutex, returning -EAGAIN with the goal of making a write fallback to a blocking context. However we ignore the -EAGAIN at btrfs_cross_ref_exist() when check_delayed_ref() returns it, and keep looping instead of immediately returning the -EAGAIN to the caller. Fix this by not looping if we get -EAGAIN and we have a nowait path. Fixes: 26ce91144631 ("btrfs: make can_nocow_extent nowait compatible") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-28btrfs: sysfs: advertise experimental features only if ↵Filipe Manana
CONFIG_BTRFS_EXPERIMENTAL=y We are advertising experimental features through sysfs if CONFIG_BTRFS_DEBUG is set, without looking if CONFIG_BTRFS_EXPERIMENTAL is set. This is wrong as it will result in reporting experimental features as supported when CONFIG_BTRFS_EXPERIMENTAL is not set but CONFIG_BTRFS_DEBUG is set. Fix this by checking for CONFIG_BTRFS_EXPERIMENTAL instead of CONFIG_BTRFS_DEBUG. Fixes: 67cd3f221769 ("btrfs: split out CONFIG_BTRFS_EXPERIMENTAL from CONFIG_BTRFS_DEBUG") Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Qu Wenruo <wqu@suse.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-28btrfs: fix deadlock between transaction commits and extent locksFilipe Manana
When running a workload with fsstress and duperemove (generic/561) we can hit a deadlock related to transaction commits and locking extent ranges, as described below. Task A hanging during a transaction commit, waiting for all other writers to complete: [178317.334817] INFO: task fsstress:555623 blocked for more than 120 seconds. [178317.335693] Not tainted 6.12.0-rc6-btrfs-next-179+ #1 [178317.336528] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [178317.337673] task:fsstress state:D stack:0 pid:555623 tgid:555623 ppid:555620 flags:0x00004002 [178317.337679] Call Trace: [178317.337681] <TASK> [178317.337685] __schedule+0x364/0xbe0 [178317.337691] schedule+0x26/0xa0 [178317.337695] btrfs_commit_transaction+0x5c5/0x1050 [btrfs] [178317.337769] ? start_transaction+0xc4/0x800 [btrfs] [178317.337815] ? __pfx_autoremove_wake_function+0x10/0x10 [178317.337819] btrfs_mksubvol+0x381/0x640 [btrfs] [178317.337878] btrfs_mksnapshot+0x7a/0xb0 [btrfs] [178317.337935] __btrfs_ioctl_snap_create+0x1bb/0x1d0 [btrfs] [178317.337995] btrfs_ioctl_snap_create_v2+0x103/0x130 [btrfs] [178317.338053] btrfs_ioctl+0x29b/0x2a90 [btrfs] [178317.338118] ? kmem_cache_alloc_noprof+0x5f/0x2c0 [178317.338126] ? getname_flags+0x45/0x1f0 [178317.338133] ? _raw_spin_unlock+0x15/0x30 [178317.338145] ? __x64_sys_ioctl+0x88/0xc0 [178317.338149] __x64_sys_ioctl+0x88/0xc0 [178317.338152] do_syscall_64+0x4a/0x110 [178317.338160] entry_SYSCALL_64_after_hwframe+0x76/0x7e [178317.338190] RIP: 0033:0x7f13c28e271b Which corresponds to line 2361 of transaction.c: $ cat -n fs/btrfs/transaction.c (...) 2162 int btrfs_commit_transaction(struct btrfs_trans_handle *trans) 2163 { (...) 2349 spin_lock(&fs_info->trans_lock); 2350 add_pending_snapshot(trans); 2351 cur_trans->state = TRANS_STATE_COMMIT_DOING; 2352 spin_unlock(&fs_info->trans_lock); 2353 2354 /* 2355 * The thread has started/joined the transaction thus it holds the 2356 * lockdep map as a reader. It has to release it before acquiring the 2357 * lockdep map as a writer. 2358 */ 2359 btrfs_lockdep_release(fs_info, btrfs_trans_num_writers); 2360 btrfs_might_wait_for_event(fs_info, btrfs_trans_num_writers); 2361 wait_event(cur_trans->writer_wait, 2362 atomic_read(&cur_trans->num_writers) == 1); (...) The transaction is in the TRANS_STATE_COMMIT_DOING state and so it's waiting for all other existing writers to complete and release their transaction handle. Task B is running ordered extent completion and blocked waiting to lock an extent range in an inode's io tree: [178317.327411] INFO: task kworker/u48:8:554545 blocked for more than 120 seconds. [178317.328630] Not tainted 6.12.0-rc6-btrfs-next-179+ #1 [178317.329635] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [178317.330872] task:kworker/u48:8 state:D stack:0 pid:554545 tgid:554545 ppid:2 flags:0x00004000 [178317.330878] Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] [178317.330944] Call Trace: [178317.330945] <TASK> [178317.330947] __schedule+0x364/0xbe0 [178317.330952] schedule+0x26/0xa0 [178317.330955] __lock_extent+0x337/0x3a0 [btrfs] [178317.331014] ? __pfx_autoremove_wake_function+0x10/0x10 [178317.331017] btrfs_finish_one_ordered+0x47a/0xaa0 [btrfs] [178317.331074] ? psi_group_change+0x132/0x2d0 [178317.331078] btrfs_work_helper+0xbd/0x370 [btrfs] [178317.331140] process_scheduled_works+0xd3/0x460 [178317.331144] ? __pfx_worker_thread+0x10/0x10 [178317.331146] worker_thread+0x121/0x250 [178317.331149] ? __pfx_worker_thread+0x10/0x10 [178317.331151] kthread+0xe9/0x120 [178317.331154] ? __pfx_kthread+0x10/0x10 [178317.331157] ret_from_fork+0x2d/0x50 [178317.331159] ? __pfx_kthread+0x10/0x10 [178317.331162] ret_from_fork_asm+0x1a/0x30 This extent range locking happens after joining the current transaction, so task A is waiting for task B to release its transaction handle (decrementing the transaction's num_writers counter). Task C while doing a fiemap it tries to join the current transaction: [242682.812815] task:pool state:D stack:0 pid:560767 tgid:560724 ppid:555622 flags:0x00004006 [242682.812827] Call Trace: [242682.812856] <TASK> [242682.812864] __schedule+0x364/0xbe0 [242682.812879] ? _raw_spin_unlock_irqrestore+0x23/0x40 [242682.812897] schedule+0x26/0xa0 [242682.812909] wait_current_trans+0xd6/0x130 [btrfs] [242682.813148] ? __pfx_autoremove_wake_function+0x10/0x10 [242682.813162] start_transaction+0x3d4/0x800 [btrfs] [242682.813399] btrfs_is_data_extent_shared+0xd2/0x440 [btrfs] [242682.813723] fiemap_process_hole+0x2a2/0x300 [btrfs] [242682.813995] extent_fiemap+0x9b8/0xb80 [btrfs] [242682.814249] btrfs_fiemap+0x78/0xc0 [btrfs] [242682.814501] do_vfs_ioctl+0x2db/0xa50 [242682.814519] __x64_sys_ioctl+0x6a/0xc0 [242682.814531] do_syscall_64+0x4a/0x110 [242682.814544] entry_SYSCALL_64_after_hwframe+0x76/0x7e [242682.814556] RIP: 0033:0x7efff595e71b It tries to join the current transaction, but it can't because the transaction is in the TRANS_STATE_COMMIT_DOING state, so join_transaction() returns -EBUSY to start_transaction() and makes it wait for the current transaction to complete. And while it's waiting for the transaction to complete, it's holding an extent range locked in the same inode that task B is operating, which causes a deadlock between these 3 tasks. The extent range for the inode was locked at the start of the fiemap operation, early at extent_fiemap(). In short these tasks deadlock because: 1) Task A is waiting for task B to release its transaction handle; 2) Task B is waiting to lock an extent range for an inode while holding a transaction handle open; 3) Task C is waiting for the current transaction to complete (for task A to finish the transaction commit) while holding the extent range for the inode locked, so task B can't progress and release its transaction handle. This results in an ABBA deadlock involving transaction commits and extent locks. Extent locks are higher level locks, like inode VFS locks, and should always be acquired before joining or starting a transaction, but recently commit 2206265f41e9 ("btrfs: remove code duplication in ordered extent finishing") accidentally changed btrfs_finish_one_ordered() to do the transaction join before locking the extent range. Fix this by making sure that btrfs_finish_one_ordered() always locks the extent before joining a transaction and add an explicit comment about the need for this order. Fixes: 2206265f41e9 ("btrfs: remove code duplication in ordered extent finishing") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-28btrfs: fix use-after-free in btrfs_encoded_read_endio()Johannes Thumshirn
Shinichiro reported the following use-after free that sometimes is happening in our CI system when running fstests' btrfs/284 on a TCMU runner device: BUG: KASAN: slab-use-after-free in lock_release+0x708/0x780 Read of size 8 at addr ffff888106a83f18 by task kworker/u80:6/219 CPU: 8 UID: 0 PID: 219 Comm: kworker/u80:6 Not tainted 6.12.0-rc6-kts+ #15 Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 Workqueue: btrfs-endio btrfs_end_bio_work [btrfs] Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 ? lock_release+0x708/0x780 print_report+0x174/0x505 ? lock_release+0x708/0x780 ? __virt_addr_valid+0x224/0x410 ? lock_release+0x708/0x780 kasan_report+0xda/0x1b0 ? lock_release+0x708/0x780 ? __wake_up+0x44/0x60 lock_release+0x708/0x780 ? __pfx_lock_release+0x10/0x10 ? __pfx_do_raw_spin_lock+0x10/0x10 ? lock_is_held_type+0x9a/0x110 _raw_spin_unlock_irqrestore+0x1f/0x60 __wake_up+0x44/0x60 btrfs_encoded_read_endio+0x14b/0x190 [btrfs] btrfs_check_read_bio+0x8d9/0x1360 [btrfs] ? lock_release+0x1b0/0x780 ? trace_lock_acquire+0x12f/0x1a0 ? __pfx_btrfs_check_read_bio+0x10/0x10 [btrfs] ? process_one_work+0x7e3/0x1460 ? lock_acquire+0x31/0xc0 ? process_one_work+0x7e3/0x1460 process_one_work+0x85c/0x1460 ? __pfx_process_one_work+0x10/0x10 ? assign_work+0x16c/0x240 worker_thread+0x5e6/0xfc0 ? __pfx_worker_thread+0x10/0x10 kthread+0x2c3/0x3a0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0xaa/0xb0 btrfs_encoded_read_regular_fill_pages+0x16c/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 3661: kasan_save_stack+0x30/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x70 __kasan_slab_free+0x4f/0x70 kfree+0x143/0x490 btrfs_encoded_read_regular_fill_pages+0x531/0x6d0 [btrfs] send_extent_data+0xf0f/0x24a0 [btrfs] process_extent+0x48a/0x1830 [btrfs] changed_cb+0x178b/0x2ea0 [btrfs] btrfs_ioctl_send+0x3bf9/0x5c20 [btrfs] _btrfs_ioctl_send+0x117/0x330 [btrfs] btrfs_ioctl+0x184a/0x60a0 [btrfs] __x64_sys_ioctl+0x12e/0x1a0 do_syscall_64+0x95/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e The buggy address belongs to the object at ffff888106a83f00 which belongs to the cache kmalloc-rnd-07-96 of size 96 The buggy address is located 24 bytes inside of freed 96-byte region [ffff888106a83f00, ffff888106a83f60) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888106a83800 pfn:0x106a83 flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000000 ffff888100053680 ffffea0004917200 0000000000000004 raw: ffff888106a83800 0000000080200019 00000001f5000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888106a83e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888106a83e80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc >ffff888106a83f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888106a83f80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888106a84000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Further analyzing the trace and the crash dump's vmcore file shows that the wake_up() call in btrfs_encoded_read_endio() is calling wake_up() on the wait_queue that is in the private data passed to the end_io handler. Commit 4ff47df40447 ("btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()") moved 'struct btrfs_encoded_read_private' off the stack. Before that commit one can see a corruption of the private data when analyzing the vmcore after a crash: *(struct btrfs_encoded_read_private *)0xffff88815626eec8 = { .wait = (wait_queue_head_t){ .lock = (spinlock_t){ .rlock = (struct raw_spinlock){ .raw_lock = (arch_spinlock_t){ .val = (atomic_t){ .counter = (int)-2005885696, }, .locked = (u8)0, .pending = (u8)157, .locked_pending = (u16)40192, .tail = (u16)34928, }, .magic = (unsigned int)536325682, .owner_cpu = (unsigned int)29, .owner = (void *)__SCT__tp_func_btrfs_transaction_commit+0x0 = 0x0, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .__padding = (u8 [24]){ 0, 157, 112, 136, 50, 174, 247, 31, 29 }, .dep_map = (struct lockdep_map){ .key = (struct lock_class_key *)0xffff8881575a3b6c, .class_cache = (struct lock_class *[2]){ 0xffff8882a71985c0, 0xffffea00066f5d40 }, .name = (const char *)0xffff88815626f100 = "", .wait_type_outer = (u8)37, .wait_type_inner = (u8)178, .lock_type = (u8)154, }, }, .head = (struct list_head){ .next = (struct list_head *)0x112cca, .prev = (struct list_head *)0x47, }, }, .pending = (atomic_t){ .counter = (int)-1491499288, }, .status = (blk_status_t)130, } Here we can see several indicators of in-memory data corruption, e.g. the large negative atomic values of ->pending or ->wait->lock->rlock->raw_lock->val, as well as the bogus spinlock magic 0x1ff7ae32 (decimal 536325682 above) instead of 0xdead4ead or the bogus pointer values for ->wait->head. To fix this, change atomic_dec_return() to atomic_dec_and_test() to fix the corruption, as atomic_dec_return() is defined as two instructions on x86_64, whereas atomic_dec_and_test() is defined as a single atomic operation. This can lead to a situation where counter value is already decremented but the if statement in btrfs_encoded_read_endio() is not completely processed, i.e. the 0 test has not completed. If another thread continues executing btrfs_encoded_read_regular_fill_pages() the atomic_dec_return() there can see an already updated ->pending counter and continues by freeing the private data. Continuing in the endio handler the test for 0 succeeds and the wait_queue is woken up, resulting in a use-after-free. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Suggested-by: Damien Le Moal <Damien.LeMoal@wdc.com> Fixes: 1881fba89bd5 ("btrfs: add BTRFS_IOC_ENCODED_READ ioctl") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-28fs/nfs/io: make nfs_start_io_*() killableMax Kellermann
This allows killing processes that wait for a lock when one process is stuck waiting for the NFS server. This aims to complete the coverage of NFS operations being killable, like nfs_direct_wait() does, for example. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28nfs/blocklayout: Limit repeat device registration on failureBenjamin Coddington
Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the device which can drive GETDEVINFO, then finally may need to prep the device with a reservation. This slow work makes a mess of IO latencies if one of the later steps is going to fail for awhile. If we're unable to register a SCSI device, ensure we mark the device as unavailable so that it will timeout and be re-added via GETDEVINFO. This avoids repeated doomed attempts to register a device in the IO path. Add some clarifying comments as well. Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28nfs/blocklayout: Don't attempt unregister for invalid block deviceBenjamin Coddington
Since commit d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") an unmount of a pNFS SCSI layout-enabled NFS may dereference a NULL block_device in: bl_unregister_scsi+0x16/0xe0 [blocklayoutdriver] bl_free_device+0x70/0x80 [blocklayoutdriver] bl_free_deviceid_node+0x12/0x30 [blocklayoutdriver] nfs4_put_deviceid_node+0x60/0xc0 [nfsv4] nfs4_deviceid_purge_client+0x132/0x190 [nfsv4] unset_pnfs_layoutdriver+0x59/0x60 [nfsv4] nfs4_destroy_server+0x36/0x70 [nfsv4] nfs_free_server+0x23/0xe0 [nfs] deactivate_locked_super+0x30/0xb0 cleanup_mnt+0xba/0x150 task_work_run+0x59/0x90 syscall_exit_to_user_mode+0x217/0x220 do_syscall_64+0x8e/0x160 This happens because even though we were able to create the nfs4_deviceid_node, the lookup for the device was unable to attach the block device to the pnfs_block_dev. If we never found a block device to register, we can avoid this case with the PNFS_BDEV_REGISTERED flag. Move the deref behind the test for the flag. Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration") Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28nfs: ignore SB_RDONLY when mounting nfsLi Lingfeng
When exporting only one file system with fsid=0 on the server side, the client alternately uses the ro/rw mount options to perform the mount operation, and a new vfsmount is generated each time. It can be reproduced as follows: [root@localhost ~]# mount /dev/sda /mnt2 [root@localhost ~]# echo "/mnt2 *(rw,no_root_squash,fsid=0)" >/etc/exports [root@localhost ~]# systemctl restart nfs-server [root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o ro,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount -t nfs -o rw,vers=4 127.0.0.1:/ /mnt/sdaa [root@localhost ~]# mount | grep nfs4 127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (ro,relatime,vers=4.2,rsize=1048576,... 127.0.0.1:/ on /mnt/sdaa type nfs4 (rw,relatime,vers=4.2,rsize=1048576,... [root@localhost ~]# We expected that after mounting with the ro option, using the rw option to mount again would return EBUSY, but the actual situation was not the case. As shown above, when mounting for the first time, a superblock with the ro flag will be generated, and at the same time, in do_new_mount_fc --> do_add_mount, it detects that the superblock corresponding to the current target directory is inconsistent with the currently generated one (path->mnt->mnt_sb != newmnt->mnt.mnt_sb), and a new vfsmount will be generated. When mounting with the rw option for the second time, since no matching superblock can be found in the fs_supers list, a new superblock with the rw flag will be generated again. The superblock in use (ro) is different from the newly generated superblock (rw), and a new vfsmount will be generated again. When mounting with the ro option for the third time, the superblock (ro) is found in fs_supers, the superblock in use (rw) is different from the found superblock (ro), and a new vfsmount will be generated again. We can switch between ro/rw through remount, and only one superblock needs to be generated, thus avoiding the problem of repeated generation of vfsmount caused by switching superblocks. Furthermore, This can also resolve the issue described in the link. Fixes: 275a5d24bf56 ("NFS: Error when mounting the same filesystem with different options") Link: https://lore.kernel.org/all/20240604112636.236517-3-lilingfeng@huaweicloud.com/ Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-11-28Merge tag 'ntfs3_for_6.13' of ↵Linus Torvalds
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: - additional checks to address issues identified by syzbot - continuation of the transition from 'page' to 'folio' * tag 'ntfs3_for_6.13' of https://github.com/Paragon-Software-Group/linux-ntfs3: fs/ntfs3: Accumulated refactoring changes fs/ntfs3: Switch to folio to release resources fs/ntfs3: Add check in ntfs_extend_initialized_size fs/ntfs3: Add more checks in mi_enum_attr (part 2) fs/ntfs3: Equivalent transition from page to folio fs/ntfs3: Fix case when unmarked clusters intersect with zone fs/ntfs3: Fix warning in ni_fiemap
2024-11-28Merge tag 'exfat-for-6.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - If the start cluster of stream entry is invalid, treat it as the empty directory - Valid size of steam entry cannot be greater than data size. If valid_size is invalid, use data_size - Move Direct-IO alignment check to before extending the valid size - Fix uninit-value issue reported by syzbot - Optimize finding directory entry-set in write_inode, rename, unlink * tag 'exfat-for-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: reduce FAT chain traversal exfat: code cleanup for exfat_readdir() exfat: remove argument 'p_dir' from exfat_add_entry() exfat: move exfat_chain_set() out of __exfat_resolve_path() exfat: add exfat_get_dentry_set_by_ei() helper exfat: rename argument name for exfat_move_file and exfat_rename_file exfat: remove unnecessary read entry in __exfat_rename() exfat: fix file being changed by unaligned direct write exfat: fix uninit-value in __exfat_get_dentry_set exfat: fix out-of-bounds access of directory entries
2024-11-28cifs: update internal version numberSteve French
To 2.52 Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-28cifs: unlock on error in smb3_reconfigure()Dan Carpenter
Unlock before returning if smb3_sync_session_ctx_passwords() fails. Fixes: 7e654ab7da03 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-28cifs: during remount, make sure passwords are in syncShyam Prasad N
This fixes scenarios where remount can overwrite the only currently working password, breaking reconnect. We recently introduced a password2 field in both ses and ctx structs. This was done so as to allow the client to rotate passwords for a mount without any downtime. However, when the client transparently handles password rotation, it can swap the values of the two password fields in the ses struct, but not in smb3_fs_context struct that hangs off cifs_sb. This can lead to a situation where a remount unintentionally overwrites a working password in the ses struct. In order to fix this, we first get the passwords in ctx struct in-sync with ses struct, before replacing them with what the passwords that could be passed as a part of remount. Also, in order to avoid race condition between smb2_reconnect and smb3_reconfigure, we make sure to lock session_mutex before changing password and password2 fields of the ses structure. Fixes: 35f834265e0d ("smb3: fix broken reconnect when password changing on the server by allowing password rotation") Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-28cifs: support mounting with alternate password to allow password rotationMeetakshi Setiya
Fixes the case for example where the password specified on mount is a recently expired password, but password2 is valid. Without this patch this mount scenario would fail. This patch introduces the following changes to support password rotation on mount: 1. If an existing session is not found and the new session setup results in EACCES, EKEYEXPIRED or EKEYREVOKED, swap password and password2 (if available), and retry the mount. 2. To match the new mount with an existing session, add conditions to check if a) password and password2 of the new mount and the existing session are the same, or b) password of the new mount is the same as the password2 of the existing session, and password2 of the new mount is the same as the password of the existing session. 3. If an existing session is found, but needs reconnect, retry the session setup after swapping password and password2 (if available), in case the previous attempt results in EACCES, EKEYEXPIRED or EKEYREVOKED. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>