summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2024-11-12Merge tag 'incore-rtgroups-6.13_2024-11-05' of ↵Carlos Maiolino
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into staging-merge xfs: create incore rt allocation groups [v5.5 04/10] Add in-memory data structures for sharding the realtime volume into independent allocation groups. For existing filesystems, the entire rt volume is modelled as having a single large group, with (potentially) a number of rt extents exceeding 2^32 blocks, though these are not likely to exist because the codebase has been a bit broken for decades. The next series fills in the ondisk format and other supporting structures. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-11-12Merge tag 'metadata-directory-tree-6.13_2024-11-05' of ↵Carlos Maiolino
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into staging-merge xfs: metadata inode directory trees [v5.5 03/10] This series delivers a new feature -- metadata inode directories. This is a separate directory tree (rooted in the superblock) that contains only inodes that contain filesystem metadata. Different metadata objects can be looked up with regular paths. Start by creating xfs_imeta{dir,file}* functions to mediate access to the metadata directory tree. By the end of this mega series, all existing metadata inodes (rt+quota) will use this directory tree instead of the superblock. Next, define the metadir on-disk format, which consists of marking inodes with a new iflag that says they're metadata. This prevents bulkstat and friends from ever getting their hands on fs metadata files. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-11-12Merge tag 'generic-groups-6.13_2024-11-05' of ↵Carlos Maiolino
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into staging-merge xfs: create a generic allocation group structure [v5.5 02/10] Soon we'll be sharding the realtime volume into separate allocation groups. These rt groups will /mostly/ behave the same as the ones on the data device, but since rt groups don't have quite the same set of struct fields as perags, let's hoist the parts that will be shared by both into a common xfs_group object. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-11-12Merge tag 'perag-xarray-6.13_2024-11-05' of ↵Carlos Maiolino
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into staging-merge xfs: convert perag to use xarrays [v5.5 01/10] Convert the xfs_mount perag tree to use an xarray instead of a radix tree. There should be no functional changes here. With a bit of luck, this should all go splendidly. Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2024-11-12bcachefs: Fix assertion pop in bch2_ptr_swab()Kent Overstreet
This runs on extents that haven't yet been validated, so we don't want to assert that we have a valid entry type. Reported-by: syzbot+4f29c3f12f864d8a8d17@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-11-12bcachefs: Fix journal_entry_dev_usage_to_text() overrunKent Overstreet
If the jset_entry_dev_usage is malformed, and too small, our nr_entries calculation will be incorrect - just bail out. Reported-by: syzbot+05d7520be047c9be86e0@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-11-11eventpoll: Control irq suspension for prefer_busy_pollMartin Karsten
When events are reported to userland and prefer_busy_poll is set, irqs are temporarily suspended using napi_suspend_irqs. If no events are found and ep_poll would go to sleep, irq suspension is cancelled using napi_resume_irqs. Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Co-developed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://patch.msgid.link/20241109050245.191288-5-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11eventpoll: Trigger napi_busy_loop, if prefer_busy_poll is setMartin Karsten
Setting prefer_busy_poll now leads to an effectively nonblocking iteration though napi_busy_loop, even when busy_poll_usecs is 0. Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Co-developed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://patch.msgid.link/20241109050245.191288-4-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12erofs: add SEEK_{DATA,HOLE} supportGao Xiang
Many userspace programs (including erofs-utils itself) use SEEK_DATA / SEEK_HOLE to parse hole extents in addition to FIEMAP. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20241011065128.2097377-1-hsiangkao@linux.alibaba.com
2024-11-11mm/list_lru: simplify the list_lru walk callback functionKairui Song
Now isolation no longer takes the list_lru global node lock, only use the per-cgroup lock instead. And this lock is inside the list_lru_one being walked, no longer needed to pass the lock explicitly. Link: https://lkml.kernel.org/r/20241104175257.60853-7-ryncsn@gmail.com Signed-off-by: Kairui Song <kasong@tencent.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11mm/list_lru: split the lock to per-cgroup scopeKairui Song
Currently, every list_lru has a per-node lock that protects adding, deletion, isolation, and reparenting of all list_lru_one instances belonging to this list_lru on this node. This lock contention is heavy when multiple cgroups modify the same list_lru. This lock can be split into per-cgroup scope to reduce contention. To achieve this, we need a stable list_lru_one for every cgroup. This commit adds a lock to each list_lru_one and introduced a helper function lock_list_lru_of_memcg, making it possible to pin the list_lru of a memcg. Then reworked the reparenting process. Reparenting will switch the list_lru_one instances one by one. By locking each instance and marking it dead using the nr_items counter, reparenting ensures that all items in the corresponding cgroup (on-list or not, because items have a stable cgroup, see below) will see the list_lru_one switch synchronously. Objcg reparent is also moved after list_lru reparent so items will have a stable mem cgroup until all list_lru_one instances are drained. The only caller that doesn't work the *_obj interfaces are direct calls to list_lru_{add,del}. But it's only used by zswap and that's also based on objcg, so it's fine. This also changes the bahaviour of the isolation function when LRU_RETRY or LRU_REMOVED_RETRY is returned, because now releasing the lock could unblock reparenting and free the list_lru_one, isolation function will have to return withoug re-lock the lru. prepare() { mkdir /tmp/test-fs modprobe brd rd_nr=1 rd_size=33554432 mkfs.xfs -f /dev/ram0 mount -t xfs /dev/ram0 /tmp/test-fs for i in $(seq 1 512); do mkdir "/tmp/test-fs/$i" for j in $(seq 1 10240); do echo TEST-CONTENT > "/tmp/test-fs/$i/$j" done & done; wait } do_test() { read_worker() { sleep 1 tar -cv "$1" &>/dev/null } read_in_all() { cd "/tmp/test-fs" && ls for i in $(seq 1 512); do (exec sh -c 'echo "$PPID"') > "/sys/fs/cgroup/benchmark/$i/cgroup.procs" read_worker "$i" & done; wait } for i in $(seq 1 512); do mkdir -p "/sys/fs/cgroup/benchmark/$i" done echo +memory > /sys/fs/cgroup/benchmark/cgroup.subtree_control echo 512M > /sys/fs/cgroup/benchmark/memory.max echo 3 > /proc/sys/vm/drop_caches time read_in_all } Above script simulates compression of small files in multiple cgroups with memory pressure. Run prepare() then do_test for 6 times: Before: real 0m7.762s user 0m11.340s sys 3m11.224s real 0m8.123s user 0m11.548s sys 3m2.549s real 0m7.736s user 0m11.515s sys 3m11.171s real 0m8.539s user 0m11.508s sys 3m7.618s real 0m7.928s user 0m11.349s sys 3m13.063s real 0m8.105s user 0m11.128s sys 3m14.313s After this commit (about ~15% faster): real 0m6.953s user 0m11.327s sys 2m42.912s real 0m7.453s user 0m11.343s sys 2m51.942s real 0m6.916s user 0m11.269s sys 2m43.957s real 0m6.894s user 0m11.528s sys 2m45.346s real 0m6.911s user 0m11.095s sys 2m43.168s real 0m6.773s user 0m11.518s sys 2m40.774s Link: https://lkml.kernel.org/r/20241104175257.60853-6-ryncsn@gmail.com Signed-off-by: Kairui Song <kasong@tencent.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11ocfs2: fix UBSAN warning in ocfs2_verify_volume()Dmitry Antipov
Syzbot has reported the following splat triggered by UBSAN: UBSAN: shift-out-of-bounds in fs/ocfs2/super.c:2336:10 shift exponent 32768 is too large for 32-bit type 'int' CPU: 2 UID: 0 PID: 5255 Comm: repro Not tainted 6.12.0-rc4-syzkaller-00047-gc2ee9f594da8 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x241/0x360 ? __pfx_dump_stack_lvl+0x10/0x10 ? __pfx__printk+0x10/0x10 ? __asan_memset+0x23/0x50 ? lockdep_init_map_type+0xa1/0x910 __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 ocfs2_fill_super+0xf9c/0x5750 ? __pfx_ocfs2_fill_super+0x10/0x10 ? __pfx_validate_chain+0x10/0x10 ? __pfx_validate_chain+0x10/0x10 ? validate_chain+0x11e/0x5920 ? __lock_acquire+0x1384/0x2050 ? __pfx_validate_chain+0x10/0x10 ? string+0x26a/0x2b0 ? widen_string+0x3a/0x310 ? string+0x26a/0x2b0 ? bdev_name+0x2b1/0x3c0 ? pointer+0x703/0x1210 ? __pfx_pointer+0x10/0x10 ? __pfx_format_decode+0x10/0x10 ? __lock_acquire+0x1384/0x2050 ? vsnprintf+0x1ccd/0x1da0 ? snprintf+0xda/0x120 ? __pfx_lock_release+0x10/0x10 ? do_raw_spin_lock+0x14f/0x370 ? __pfx_snprintf+0x10/0x10 ? set_blocksize+0x1f9/0x360 ? sb_set_blocksize+0x98/0xf0 ? setup_bdev_super+0x4e6/0x5d0 mount_bdev+0x20c/0x2d0 ? __pfx_ocfs2_fill_super+0x10/0x10 ? __pfx_mount_bdev+0x10/0x10 ? vfs_parse_fs_string+0x190/0x230 ? __pfx_vfs_parse_fs_string+0x10/0x10 legacy_get_tree+0xf0/0x190 ? __pfx_ocfs2_mount+0x10/0x10 vfs_get_tree+0x92/0x2b0 do_new_mount+0x2be/0xb40 ? __pfx_do_new_mount+0x10/0x10 __se_sys_mount+0x2d6/0x3c0 ? __pfx___se_sys_mount+0x10/0x10 ? do_syscall_64+0x100/0x230 ? __x64_sys_mount+0x20/0xc0 do_syscall_64+0xf3/0x230 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f37cae96fda Code: 48 8b 0d 51 ce 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1e ce 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007fff6c1aa228 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007fff6c1aa240 RCX: 00007f37cae96fda RDX: 00000000200002c0 RSI: 0000000020000040 RDI: 00007fff6c1aa240 RBP: 0000000000000004 R08: 00007fff6c1aa280 R09: 0000000000000000 R10: 00000000000008c0 R11: 0000000000000206 R12: 00000000000008c0 R13: 00007fff6c1aa280 R14: 0000000000000003 R15: 0000000001000000 </TASK> For a really damaged superblock, the value of 'i_super.s_blocksize_bits' may exceed the maximum possible shift for an underlying 'int'. So add an extra check whether the aforementioned field represents the valid block size, which is 512 bytes, 1K, 2K, or 4K. Link: https://lkml.kernel.org/r/20241106092100.2661330-1-dmantipov@yandex.ru Fixes: ccd979bdbce9 ("[PATCH] OCFS2: The Second Oracle Cluster Filesystem") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+56f7cd1abe4b8e475180@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=56f7cd1abe4b8e475180 Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11nilfs2: fix null-ptr-deref in block_dirty_buffer tracepointRyusuke Konishi
When using the "block:block_dirty_buffer" tracepoint, mark_buffer_dirty() may cause a NULL pointer dereference, or a general protection fault when KASAN is enabled. This happens because, since the tracepoint was added in mark_buffer_dirty(), it references the dev_t member bh->b_bdev->bd_dev regardless of whether the buffer head has a pointer to a block_device structure. In the current implementation, nilfs_grab_buffer(), which grabs a buffer to read (or create) a block of metadata, including b-tree node blocks, does not set the block device, but instead does so only if the buffer is not in the "uptodate" state for each of its caller block reading functions. However, if the uptodate flag is set on a folio/page, and the buffer heads are detached from it by try_to_free_buffers(), and new buffer heads are then attached by create_empty_buffers(), the uptodate flag may be restored to each buffer without the block device being set to bh->b_bdev, and mark_buffer_dirty() may be called later in that state, resulting in the bug mentioned above. Fix this issue by making nilfs_grab_buffer() always set the block device of the super block structure to the buffer head, regardless of the state of the buffer's uptodate flag. Link: https://lkml.kernel.org/r/20241106160811.3316-3-konishi.ryusuke@gmail.com Fixes: 5305cb830834 ("block: add block_{touch|dirty}_buffer tracepoint") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Ubisectech Sirius <bugreport@valiantsec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11nilfs2: fix null-ptr-deref in block_touch_buffer tracepointRyusuke Konishi
Patch series "nilfs2: fix null-ptr-deref bugs on block tracepoints". This series fixes null pointer dereference bugs that occur when using nilfs2 and two block-related tracepoints. This patch (of 2): It has been reported that when using "block:block_touch_buffer" tracepoint, touch_buffer() called from __nilfs_get_folio_block() causes a NULL pointer dereference, or a general protection fault when KASAN is enabled. This happens because since the tracepoint was added in touch_buffer(), it references the dev_t member bh->b_bdev->bd_dev regardless of whether the buffer head has a pointer to a block_device structure. In the current implementation, the block_device structure is set after the function returns to the caller. Here, touch_buffer() is used to mark the folio/page that owns the buffer head as accessed, but the common search helper for folio/page used by the caller function was optimized to mark the folio/page as accessed when it was reimplemented a long time ago, eliminating the need to call touch_buffer() here in the first place. So this solves the issue by eliminating the touch_buffer() call itself. Link: https://lkml.kernel.org/r/20241106160811.3316-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20241106160811.3316-2-konishi.ryusuke@gmail.com Fixes: 5305cb830834 ("block: add block_{touch|dirty}_buffer tracepoint") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: Ubisectech Sirius <bugreport@valiantsec.com> Closes: https://lkml.kernel.org/r/86bd3013-887e-4e38-960f-ca45c657f032.bugreport@valiantsec.com Reported-by: syzbot+9982fb8d18eba905abe2@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9982fb8d18eba905abe2 Tested-by: syzbot+9982fb8d18eba905abe2@syzkaller.appspotmail.com Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11ocfs2: fix uninitialized value in ocfs2_file_read_iter()Dmitry Antipov
Syzbot has reported the following KMSAN splat: BUG: KMSAN: uninit-value in ocfs2_file_read_iter+0x9a4/0xf80 ocfs2_file_read_iter+0x9a4/0xf80 __io_read+0x8d4/0x20f0 io_read+0x3e/0xf0 io_issue_sqe+0x42b/0x22c0 io_wq_submit_work+0xaf9/0xdc0 io_worker_handle_work+0xd13/0x2110 io_wq_worker+0x447/0x1410 ret_from_fork+0x6f/0x90 ret_from_fork_asm+0x1a/0x30 Uninit was created at: __alloc_pages_noprof+0x9a7/0xe00 alloc_pages_mpol_noprof+0x299/0x990 alloc_pages_noprof+0x1bf/0x1e0 allocate_slab+0x33a/0x1250 ___slab_alloc+0x12ef/0x35e0 kmem_cache_alloc_bulk_noprof+0x486/0x1330 __io_alloc_req_refill+0x84/0x560 io_submit_sqes+0x172f/0x2f30 __se_sys_io_uring_enter+0x406/0x41c0 __x64_sys_io_uring_enter+0x11f/0x1a0 x64_sys_call+0x2b54/0x3ba0 do_syscall_64+0xcd/0x1e0 entry_SYSCALL_64_after_hwframe+0x77/0x7f Since an instance of 'struct kiocb' may be passed from the block layer with 'private' field uninitialized, introduce 'ocfs2_iocb_init_rw_locked()' and use it from where 'ocfs2_dio_end_io()' might take care, i.e. in 'ocfs2_file_read_iter()' and 'ocfs2_file_write_iter()'. Link: https://lkml.kernel.org/r/20241029091736.1501946-1-dmantipov@yandex.ru Fixes: 7cdfc3a1c397 ("ocfs2: Remember rw lock level during direct io") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+a73e253cca4f0230a5a5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a73e253cca4f0230a5a5 Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11nfsd: have nfsd4_deleg_getattr_conflict pass back write deleg pointerJeff Layton
Currently we pass back the size and whether it has been modified, but those just mirror values tracked inside the delegation. In a later patch, we'll need to get at the timestamps in the delegation too, so just pass back a reference to the write delegation, and use that to properly override values in the iattr. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11nfsd: drop the nfsd4_fattr_args "size" fieldJeff Layton
We already have a slot for this in the kstat structure. Just overwrite that instead of keeping a copy. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11nfsd: drop the ncf_cb_bmap fieldJeff Layton
This is always the same value, and in a later patch we're going to need to set bits in WORD2. We can simplify this code and save a little space in the delegation too. Just hardcode the bitmap in the callback encode function. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11nfsd: drop inode parameter from nfsd4_change_attribute()Jeff Layton
The inode that nfs4_open_delegation() passes to this function is wrong, which throws off the result. The inode will end up getting a directory-style change attr instead of a regular-file-style one. Fix up nfs4_delegation_stat() to fetch STATX_MODE, and then drop the inode parameter from nfsd4_change_attribute(), since it's no longer needed. Fixes: c5967721e106 ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11NFSD: Remove unused function parameterChuck Lever
Clean up: Commit 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd") moved the fh_verify() call site out of nfsd_open(). That was the only user of nfsd_open's @rqstp parameter, so that parameter can be removed. Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11NFSD: Remove unnecessary posix_acl_entry pointer initializationThorsten Blum
The posix_acl_entry pointer pe is already initialized by the FOREACH_ACL_ENTRY() macro. Remove the unnecessary initialization. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11NFSD: Prevent a potential integer overflowChuck Lever
If the tag length is >= U32_MAX - 3 then the "length + 4" addition can result in an integer overflow. Address this by splitting the decoding into several steps so that decode_cb_compound4res() does not have to perform arithmetic on the unsafe length value. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-11-11netfs/fscache: Add a memory barrier for FSCACHE_VOLUME_CREATINGZizhi Wo
In fscache_create_volume(), there is a missing memory barrier between the bit-clearing operation and the wake-up operation. This may cause a situation where, after a wake-up, the bit-clearing operation hasn't been detected yet, leading to an indefinite wait. The triggering process is as follows: [cookie1] [cookie2] [volume_work] fscache_perform_lookup fscache_create_volume fscache_perform_lookup fscache_create_volume fscache_create_volume_work cachefiles_acquire_volume clear_and_wake_up_bit test_and_set_bit test_and_set_bit goto maybe_wait goto no_wait In the above process, cookie1 and cookie2 has the same volume. When cookie1 enters the -no_wait- process, it will clear the bit and wake up the waiting process. If a barrier is missing, it may cause cookie2 to remain in the -wait- process indefinitely. In commit 3288666c7256 ("fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work()"), barriers were added to similar operations in fscache_create_volume_work(), but fscache_create_volume() was missed. By combining the clear and wake operations into clear_and_wake_up_bit() to fix this issue. Fixes: bfa22da3ed65 ("fscache: Provide and use cache methods to lookup/create/free a volume") Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Link: https://lore.kernel.org/r/20241107110649.3980193-6-wozizhi@huawei.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11cachefiles: Fix NULL pointer dereference in object->fileZizhi Wo
At present, the object->file has the NULL pointer dereference problem in ondemand-mode. The root cause is that the allocated fd and object->file lifetime are inconsistent, and the user-space invocation to anon_fd uses object->file. Following is the process that triggers the issue: [write fd] [umount] cachefiles_ondemand_fd_write_iter fscache_cookie_state_machine cachefiles_withdraw_cookie if (!file) return -ENOBUFS cachefiles_clean_up_object cachefiles_unmark_inode_in_use fput(object->file) object->file = NULL // file NULL pointer dereference! __cachefiles_write(..., file, ...) Fix this issue by add an additional reference count to the object->file before write/llseek, and decrement after it finished. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Link: https://lore.kernel.org/r/20241107110649.3980193-5-wozizhi@huawei.com Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11cachefiles: Clean up in cachefiles_commit_tmpfile()Zizhi Wo
Currently, cachefiles_commit_tmpfile() will only be called if object->flags is set to CACHEFILES_OBJECT_USING_TMPFILE. Only cachefiles_create_file() and cachefiles_invalidate_cookie() set this flag. Both of these functions replace object->file with the new tmpfile, and both are called by fscache_cookie_state_machine(), so there are no concurrency issues. So the equation "d_backing_inode(dentry) == file_inode(object->file)" in cachefiles_commit_tmpfile() will never hold true according to the above conditions. This patch removes this part of the redundant code and does not involve any other logical changes. Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Link: https://lore.kernel.org/r/20241107110649.3980193-4-wozizhi@huawei.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11cachefiles: Fix missing pos updates in cachefiles_ondemand_fd_write_iter()Zizhi Wo
In the erofs on-demand loading scenario, read and write operations are usually delivered through "off" and "len" contained in read req in user mode. Naturally, pwrite is used to specify a specific offset to complete write operations. However, if the write(not pwrite) syscall is called multiple times in the read-ahead scenario, we need to manually update ki_pos after each write operation to update file->f_pos. This step is currently missing from the cachefiles_ondemand_fd_write_iter function, added to address this issue. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Link: https://lore.kernel.org/r/20241107110649.3980193-3-wozizhi@huawei.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11cachefiles: Fix incorrect length return value in ↵Zizhi Wo
cachefiles_ondemand_fd_write_iter() cachefiles_ondemand_fd_write_iter() function first aligns "pos" and "len" to block boundaries. When calling __cachefiles_write(), the aligned "pos" is passed in, but "len" is the original unaligned value(iter->count). Additionally, the returned length of the write operation is the modified "len" aligned by block size, which is unreasonable. The alignment of "pos" and "len" is intended only to check whether the cache has enough space. But the modified len should not be used as the return value of cachefiles_ondemand_fd_write_iter() because the length we passed to __cachefiles_write() is the previous "len". Doing so would result in a mismatch in the data written on-demand. For example, if the length of the user state passed in is not aligned to the block size (the preread scene/DIO writes only need 512 alignment/Fault injection), the length of the write will differ from the actual length of the return. To solve this issue, since the __cachefiles_prepare_write() modifies the size of "len", we pass "aligned_len" to __cachefiles_prepare_write() to calculate the free blocks and use the original "len" as the return value of cachefiles_ondemand_fd_write_iter(). Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Zizhi Wo <wozizhi@huawei.com> Link: https://lore.kernel.org/r/20241107110649.3980193-2-wozizhi@huawei.com Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11iomap: drop an obsolete comment in iomap_dio_bio_iterChristoph Hellwig
No more zone append special casing in iomap for quite a while. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241111121340.1390540-1-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-11btrfs: send: check for read-only send root under critical sectionFilipe Manana
We're checking if the send root is read-only without being under the protection of the root's root_item_lock spinlock, which is what protects the root's flags when clearing the read-only flag, done at btrfs_ioctl_subvol_setflags(). Furthermore, it should be done in the same critical section that increments the root's send_in_progress counter, as btrfs_ioctl_subvol_setflags() clears the read-only flag in the same critical section that checks the counter's value. So fix this by moving the read-only check under the critical section delimited by the root's root_item_lock which also increments the root's send_in_progress counter. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: send: check for dead send root under critical sectionFilipe Manana
We're checking if the send root is dead without the protection of the root's root_item_lock spinlock, which is what protects the root's flags. The inverse, setting the dead flag on a root, is done under the protection of that lock, at btrfs_delete_subvolume(). Also checking and updating the root's send_in_progress counter is supposed to be done in the same critical section as checking for or setting the root dead flag, so that these operations are done atomically as a single step (which is correctly done by btrfs_delete_subvolume()). So fix this by checking if the send root is dead in the same critical section that updates the send_in_progress counter, which is protected by the root's root_item_lock spinlock. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove check for NULL fs_info at btrfs_folio_end_lock_bitmap()Filipe Manana
Smatch complains about possibly dereferencing a NULL fs_info at btrfs_folio_end_lock_bitmap(): fs/btrfs/subpage.c:332 btrfs_folio_end_lock_bitmap() warn: variable dereferenced before check 'fs_info' (see line 326) because we access fs_info to set the 'start_bit' variable before doing the check for a NULL fs_info. However fs_info is never NULL, since in the only caller of btrfs_folio_end_lock_bitmap() is extent_writepage(), where we have an inode which always as a non-NULL fs_info. So remove the check for a NULL fs_info at btrfs_folio_end_lock_bitmap(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: fix warning on PTR_ERR() against NULL device at btrfs_control_ioctl()Filipe Manana
Smatch complains about calling PTR_ERR() against a NULL pointer: fs/btrfs/super.c:2272 btrfs_control_ioctl() warn: passing zero to 'PTR_ERR' Fix this by calling PTR_ERR() against the device pointer only if it contains an error. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: fix a typo in btrfs_use_zone_appendChristoph Hellwig
REQ_OP_ZONE_APPNED -> REQ_OP_ZONE_APPEND. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: avoid superfluous calls to free_extent_map() in btrfs_encoded_read()Mark Harmstone
Change the control flow of btrfs_encoded_read() so that it doesn't call free_extent_map() when we know that this has already been done. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Mark Harmstone <maharmstone@fb.com> Suggested-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: simplify logic to decrement snapshot counter at btrfs_mksnapshot()Filipe Manana
There's no point in having a 'snapshot_force_cow' variable to track if we need to decrement the root->snapshot_force_cow counter, as we never jump to the 'out' label after incrementing the counter. Simplify this by removing the variable and always decrementing the counter before the 'out' label, right after the call to btrfs_mksubvol(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove hole from struct btrfs_delayed_nodeFilipe Manana
On x86_64 and a release kernel, there's a 4 bytes hole in the structure after the ref count field: struct btrfs_delayed_node { u64 inode_id; /* 0 8 */ u64 bytes_reserved; /* 8 8 */ struct btrfs_root * root; /* 16 8 */ struct list_head n_list; /* 24 16 */ struct list_head p_list; /* 40 16 */ struct rb_root_cached ins_root; /* 56 16 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ struct rb_root_cached del_root; /* 72 16 */ struct mutex mutex; /* 88 32 */ struct btrfs_inode_item inode_item; /* 120 160 */ /* --- cacheline 4 boundary (256 bytes) was 24 bytes ago --- */ refcount_t refs; /* 280 4 */ /* XXX 4 bytes hole, try to pack */ u64 index_cnt; /* 288 8 */ long unsigned int flags; /* 296 8 */ int count; /* 304 4 */ u32 curr_index_batch_size; /* 308 4 */ u32 index_item_leaves; /* 312 4 */ /* size: 320, cachelines: 5, members: 15 */ /* sum members: 312, holes: 1, sum holes: 4 */ /* padding: 4 */ }; Move the 'count' field, which is 4 bytes long, to just below the ref count field, so we eliminate the hole and reduce the structure size from 320 bytes down to 312 bytes: struct btrfs_delayed_node { u64 inode_id; /* 0 8 */ u64 bytes_reserved; /* 8 8 */ struct btrfs_root * root; /* 16 8 */ struct list_head n_list; /* 24 16 */ struct list_head p_list; /* 40 16 */ struct rb_root_cached ins_root; /* 56 16 */ /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */ struct rb_root_cached del_root; /* 72 16 */ struct mutex mutex; /* 88 32 */ struct btrfs_inode_item inode_item; /* 120 160 */ /* --- cacheline 4 boundary (256 bytes) was 24 bytes ago --- */ refcount_t refs; /* 280 4 */ int count; /* 284 4 */ u64 index_cnt; /* 288 8 */ long unsigned int flags; /* 296 8 */ u32 curr_index_batch_size; /* 304 4 */ u32 index_item_leaves; /* 308 4 */ /* size: 312, cachelines: 5, members: 15 */ /* last cacheline: 56 bytes */ }; This now allows to have 13 delayed nodes per 4K page instead of 12. Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: update stale comment for struct btrfs_delayed_ref_node::add_listFilipe Manana
The comment refers to a list in the respective delayed ref head that no longer exists (ref_list), it was replaced with a rbtree (ref_tree) in commit 0e0adbcfdc90 ("btrfs: track refs in a rb_tree instead of a list"). So update the stale comment to refer to the rbtree instead of the old list. Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: add new ioctl to wait for cleaned subvolumesDavid Sterba
Add a new unprivileged ioctl that will let the command 'btrfs subvolume sync' work without the (privileged) SEARCH_TREE ioctl. There are several modes of operation, where the most common ones are to wait on a specific subvolume or all currently queued for cleaning. This is utilized e.g. in backup applications that delete subvolumes and wait until they're cleaned to check for remaining space. The other modes are for flexibility, e.g. for monitoring or checkpoints in the queue of deleted subvolumes, again without the need to use SEARCH_TREE. Notes: - waiting is interruptible, the timeout is set to 1 second and is not configurable - repeated calls to the ioctl see a different state, so this is inherently racy when using e.g. the count or peek next/last Use cases: - a subvolume A was deleted, wait for cleaning (WAIT_FOR_ONE) - a bunch of subvolumes were deleted, wait for all (WAIT_FOR_QUEUED or PEEK_LAST + WAIT_FOR_ONE) - count how many are queued (not blocking), for monitoring purposes - report progress (PEEK_NEXT), may miss some if cleaning is quick - own waiting in user space (PEEK_LAST until it's 0) Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: simplify range tracking in cow_file_range()Haisu Wang
Simplify tracking of the range processed by using cur_alloc_size only to store the reserved part that may fail to the allocated extent. Remove the ram_size as well since it is always equal to cur_alloc_size in the context. Advance the start in normal path until extent allocation succeeds and keep the start unchanged in the error handling path. Passed the fstest generic/475 test for a hundred times with quota enabled. And a modified generic/475 test by removing the sleep time for a hundred times. About one tenth of the tests do enter the error handling path due to fail to reserve extent. Suggested-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Haisu Wang <haisuwang@tencent.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove conditional path allocation in btrfs_read_locked_inode()Leo Martins
Remove conditional path allocation from btrfs_read_locked_inode(). Add an ASSERT(path) to indicate it should never be called with a NULL path. Call btrfs_read_locked_inode() directly from btrfs_iget(). This causes code duplication between btrfs_iget() and btrfs_iget_path(), but I think this is justifiable as it removes the need for conditionally allocating the path inside of btrfs_read_locked_inode(). This makes the code easier to reason about and makes it clear who has the responsibility of allocating and freeing the path. Signed-off-by: Leo Martins <loemra.dev@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: push cleanup into btrfs_read_locked_inode()Leo Martins
Move btrfs_add_inode_to_root() so it can be called from btrfs_read_locked_inode(), no changes were made to the function. Move cleanup code from btrfs_iget_path() to btrfs_read_locked_inode. This improves readability and improves a leaky abstraction. Previously btrfs_iget_path() had to handle a positive error case as a result of a call to btrfs_search_slot(), but it makes more sense to handle this closer to the source of the call. Signed-off-by: Leo Martins <loemra.dev@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: add struct io_btrfs_cmd as type for io_uring_cmd_to_pdu()Mark Harmstone
Add struct io_btrfs_cmd as a wrapper type for io_uring_cmd_to_pdu(), rather than using a raw pointer. Suggested-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)Mark Harmstone
Add an io_uring command for encoded reads, using the same interface as the existing BTRFS_IOC_ENCODED_READ ioctl. btrfs_uring_encoded_read() is an io_uring version of btrfs_ioctl_encoded_read(), which validates the user input and calls btrfs_encoded_read() to read the appropriate metadata. If we determine that we need to read an extent from disk, we call btrfs_encoded_read_regular_fill_pages() through btrfs_uring_read_extent() to prepare the bio. The existing btrfs_encoded_read_regular_fill_pages() is changed so that if it is passed a valid uring_ctx, rather than waking up any waiting threads it calls btrfs_uring_read_extent_endio(). This in turn copies the read data back to userspace, and calls io_uring_cmd_done() to complete the io_uring command. Because we're potentially doing a non-blocking read, btrfs_uring_read_extent() doesn't clean up after itself if it returns -EIOCBQUEUED. Instead, it allocates a priv struct, populates the fields there that we will need to unlock the inode and free our allocations, and defers this to the btrfs_uring_read_finished() that gets called when the bio completes. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: move priv off stack in btrfs_encoded_read_regular_fill_pages()Mark Harmstone
Change btrfs_encoded_read_regular_fill_pages() so that the priv struct is allocated rather than stored on the stack, in preparation for adding an asynchronous mode to the function. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: don't sleep in btrfs_encoded_read() if IOCB_NOWAIT is setMark Harmstone
Change btrfs_encoded_read() so that it returns -EAGAIN rather than sleeps if IOCB_NOWAIT is set in iocb->ki_flags. The conditions that require sleeping are: inode lock, writeback, extent lock, ordered range. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: change btrfs_encoded_read() so that reading of extent is done by callerMark Harmstone
Change the behaviour of btrfs_encoded_read() so that if it needs to read an extent from disk, it leaves the extent and inode locked and returns -EIOCBQUEUED. The caller is then responsible for doing the I/O via btrfs_encoded_read_regular() and unlocking the extent and inode. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove pointless iocb::ki_pos addition in btrfs_encoded_read()Mark Harmstone
iocb->ki_pos isn't used after this function, so there's no point in changing its value. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: remove no longer used delayed ref head search functionalityFilipe Manana
After the previous patch, which converted the rb-tree used to track delayed ref heads into an xarray, the find_ref_head() function is now used only by one caller which always passes false to the 'return_bigger' argument. So remove the 'return_bigger' logic, simplifying the function, and move all the function code to the single caller. Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: track delayed ref heads in an xarrayFilipe Manana
Currently we use a red black tree (rb-tree) to track the delayed ref heads (in struct btrfs_delayed_ref_root::href_root). This however is not very efficient when the number of delayed ref heads is large (and it's very common to be at least in the order of thousands) since rb-trees are binary trees. For example for 10K delayed ref heads, the tree has a depth of 13. Besides that, inserting into the tree requires navigating through it and pulling useless cache lines in the process since the red black tree nodes are embedded within the delayed ref head structure - on the other hand, by being embedded, it requires no extra memory allocations. We can improve this by using an xarray instead which has a much higher branching factor than a red black tree (binary balanced tree) and is more cache friendly and behaves like a resizable array, with a much better search and insertion complexity than a red black tree. This only has one small disadvantage which is that insertion will sometimes require allocating memory for the xarray - which may fail (not that often since it uses a kmem_cache) - but on the other hand we can reduce the delayed ref head structure size by 24 bytes (from 152 down to 128 bytes) after removing the embedded red black tree node, meaning than we can now fit 32 delayed ref heads per 4K page instead of 26, and that gain compensates for the occasional memory allocations needed for the xarray nodes. We also end up using only 2 cache lines instead of 3 per delayed ref head. Running the following fs_mark test showed some improvements: $ cat test.sh #!/bin/bash DEV=/dev/nullb0 MNT=/mnt/nullb0 MOUNT_OPTIONS="-o ssd" FILES=100000 THREADS=$(nproc --all) echo "performance" | \ tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor mkfs.btrfs -f $DEV mount $MOUNT_OPTIONS $DEV $MNT OPTS="-S 0 -L 5 -n $FILES -s 0 -t $THREADS -k" for ((i = 1; i <= $THREADS; i++)); do OPTS="$OPTS -d $MNT/d$i" done fs_mark $OPTS umount $MNT Before this patch: FSUse% Count Size Files/sec App Overhead 10 1200000 0 171845.7 12253839 16 2400000 0 230898.7 12308254 23 3600000 0 212292.9 12467768 30 4800000 0 195737.8 12627554 46 6000000 0 171055.2 12783329 After this patch: FSUse% Count Size Files/sec App Overhead 10 1200000 0 173835.0 12246131 16 2400000 0 233537.8 12271746 23 3600000 0 220398.7 12307737 30 4800000 0 204483.6 12392318 40 6000000 0 182923.3 12771843 Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11btrfs: add comments regarding locking to struct btrfs_delayed_ref_rootFilipe Manana
Add some comments to struct btrfs_delayed_ref_root's fields to mention what its spinlock protects. Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>