path: root/fs/bcachefs/fs-io.c
Age  Commit message  Author
2023-10-22  bcachefs: bkey_on_stack  Kent Overstreet
This implements code for storing small bkeys on the stack and allocating out of a mempool if they're too big. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
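The pattern described above (keep small objects in a fixed inline buffer, fall back to a separate allocation when they are too big) can be sketched in userspace C. This is only an illustration: `key_buf`, `ONSTACK_BYTES`, and the helper names are hypothetical, and `malloc()` stands in for the kernel mempool the commit refers to.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity; not taken from bcachefs. */
#define ONSTACK_BYTES 64

/* Hypothetical stand-in for an on-stack key buffer. */
struct key_buf {
	void		*k;	/* points at onstack[] or at a heap allocation */
	size_t		size;	/* bytes currently stored */
	unsigned char	onstack[ONSTACK_BYTES];
};

static void key_buf_init(struct key_buf *b)
{
	b->k = b->onstack;
	b->size = 0;
}

/* True while the data lives in the inline buffer. */
static int key_buf_is_inline(const struct key_buf *b)
{
	return b->k == (const void *) b->onstack;
}

/*
 * Store a copy of src. Small keys use the inline buffer; large ones use
 * malloc() (an assumption standing in for a mempool). Returns 0 or -1.
 */
static int key_buf_set(struct key_buf *b, const void *src, size_t bytes)
{
	void *old = b->k;
	void *dst = b->onstack;

	if (bytes > ONSTACK_BYTES) {
		dst = malloc(bytes);
		if (!dst)
			return -1;
	}

	/* memmove: src may overlap the inline buffer */
	memmove(dst, src, bytes);
	if (old != (void *) b->onstack)
		free(old);

	b->k = dst;
	b->size = bytes;
	return 0;
}

/* Release any heap allocation and return to the inline buffer. */
static void key_buf_exit(struct key_buf *b)
{
	if (!key_buf_is_inline(b))
		free(b->k);
	key_buf_init(b);
}
```

Because `k` points into the struct itself, a `key_buf` in this sketch must not be copied by value while it is in use.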
2023-10-22  bcachefs: Use wbc_to_write_flags()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Some reflink fixes  Kent Overstreet
len might fit into a loff_t when aligned_len does not - make sure we use a u64 for aligned_len. Also, we weren't always extending the inode correctly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
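The overflow described above is easy to reproduce in isolation. The sketch below is not the bcachefs code: `int64_t` stands in for `loff_t`, and `round_up_u64()` and `aligned_len_exceeds_signed()` are hypothetical helpers. It shows that a length just below the signed 64-bit maximum rounds up to a value that only fits in an unsigned 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of align; align must be a power of two. */
static uint64_t round_up_u64(uint64_t v, uint64_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Returns 1 if len, rounded up to block_bytes, no longer fits in a
 * signed 64-bit offset (int64_t here is an assumed stand-in for loff_t).
 */
static int aligned_len_exceeds_signed(int64_t len, uint64_t block_bytes)
{
	return round_up_u64((uint64_t) len, block_bytes) > (uint64_t) INT64_MAX;
}
```

Doing the rounding in `uint64_t` keeps the arithmetic well defined; the same addition in a signed type would overflow, which is undefined behaviour in C.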
2023-10-22  bcachefs: Eliminate function calls in DIO fastpaths  Kent Overstreet
We can assume that usually buffered and O_DIRECT IO won't be mixed, and the calls to flush the page cache won't be needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: DIO write path only needs to shoot down pagecache once, not twice  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Add pagecache_add lock to buffered IO path, fault path  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Don't hold inode lock longer than necessary in dio write path  Kent Overstreet
In theory we should be able to do (non appending/extending) dio writes without taking the inode lock at all - but this gets us most of the way there. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Avoid atomics in write fast path  Kent Overstreet
This adds some horrible hacks, but the atomic ops for closures were getting to be a pretty expensive part of the write path. We don't want to rip out closures entirely from the write path, because they're used for e.g. waiting on the allocator, or waiting on the journal flush, and that stuff would get really ugly without closures. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix an error path race  Kent Overstreet
On IO error, bch2_writepages_io_done() will set the page state to indicate nothing's already reserved (since the write didn't happen, we don't know what's already reserved). This can race with the buffered IO path, in between getting a disk reservation and calling bch2_set_page_dirty(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Refactor bch2_trans_commit() path  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Limit bios in writepages path to 256M  Kent Overstreet
This works around a bug where bio_full() doesn't check for bio->bi_iter.bi_size overflowing - and, we don't really want to build bios that are that big anyways. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
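A rough sketch of the kind of check involved, assuming the byte counter is a 32-bit unsigned field (as `bi_size` is in the kernels I am aware of). `MAX_IO_BYTES`, `would_overflow_u32()`, and `can_append()` are hypothetical names for illustration; this is not the actual writepages code.

```c
#include <assert.h>
#include <stdint.h>

/* 256M cap, matching the limit named in the commit title. */
#define MAX_IO_BYTES (256U << 20)

/* Would cur + add wrap around a 32-bit unsigned counter? */
static int would_overflow_u32(uint32_t cur, uint32_t add)
{
	return add > UINT32_MAX - cur;
}

/*
 * May `add` more bytes be appended to an IO that already holds `cur`
 * bytes without exceeding the cap? Written so that the comparison
 * itself cannot overflow.
 */
static int can_append(uint32_t cur, uint32_t add)
{
	return add <= MAX_IO_BYTES && cur <= MAX_IO_BYTES - add;
}
```

Capping well below `UINT32_MAX` means the counter can never wrap, whether or not the lower layer checks for it.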
2023-10-22  bcachefs: Kill bchfs_extent_update()  Kent Overstreet
The generic IO path now handles inode updates for i_size and i_sectors - this means we can drop a fair amount of code from fs-io.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Convert bch2_fpunch to bch2_extent_update()  Kent Overstreet
As before - we're moving non Linux specific code out of fs-io.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Split out bchfs_extent_update()  Kent Overstreet
The next few patches are going to be more moving the logic around i_size/i_sectors updates to io.c, and better separating the Linux VFS specific code from core bcachefs code, to better support the fuse port. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill some dependencies on ei_inode  Kent Overstreet
Moving bch2_extent_update() to io.c will be greatly simplified if we no longer have to keep ei_inode.bi_size/bi_sectors up to date. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Check if extending inode differently  Kent Overstreet
In bch2_extent_update(), we have to update the inode if i_size is changing (the file is being extended) or if i_sectors is changing, but we want to avoid touching the inode if it's not necessary. Change sum_sector_overwrites() to also check if there's already data above where we're writing to - this means we're definitely not extending the file. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Add a lock to bch_page_state  Kent Overstreet
We can't use the page lock to protect it, because on writeback IO error we need to access the page state before calling end_page_writeback(), and the page lock semantics are completely insane, so taking it there deadlocks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: bch2_extent_atomic_end() now traverses iter  Kent Overstreet
This fixes a bug in io.c bch2_write_index_default() - it was missing the traverse call, but bch2_extent_atomic_end returns an error now and can just call it itself. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: bch2_inode_peek()/bch2_inode_write()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix __bch2_buffered_write() returning -ENOMEM  Kent Overstreet
When grab_cache_page_write_begin() fails but we did pin some pages, we shouldn't return -ENOMEM, we should do a partial write. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
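The control flow described above can be sketched as follows. Everything here is hypothetical scaffolding rather than the real `__bch2_buffered_write()`: `pin_pages()` simulates page allocation that fails after `fail_at` pages, and `buffered_write_sketch()` returns either the number of bytes it can write or a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pinning page cache pages: pins up to `want`
 * pages, but allocation "fails" once `fail_at` pages have been pinned.
 * Returns the number of pages actually pinned.
 */
static size_t pin_pages(size_t want, size_t fail_at)
{
	return want < fail_at ? want : fail_at;
}

/*
 * Returns bytes that can be written, or -ENOMEM only when no page at
 * all could be pinned. A shortfall after some pages were pinned turns
 * into a partial (short) write instead of an error.
 */
static long buffered_write_sketch(size_t nr_pages, size_t page_bytes,
				  size_t fail_at)
{
	size_t pinned;

	if (!nr_pages)
		return 0;

	pinned = pin_pages(nr_pages, fail_at);
	if (!pinned)
		return -ENOMEM;

	return (long) (pinned * page_bytes);
}
```

A caller that loops until all data is written will retry the remainder after a short write, so reporting partial progress loses nothing.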
2023-10-22  bcachefs: Rework btree iterator lifetimes  Kent Overstreet
The btree_trans struct needs to memoize/cache btree iterators, so that on transaction restart we don't have to completely redo btree lookups, and so that we can do them all at once in the correct order when the transaction had to restart to avoid a deadlock. This switches the btree iterator lookups to work based on iterator position, instead of trying to match them up based on the stack trace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill deferred btree updates  Kent Overstreet
Will be replaced by cached btree iterators. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix for partial buffered writes  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: BTREE_ITER_SLOTS isn't a type of btree iter  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Trivial cleanup  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Convert a BUG_ON() to a warning  Kent Overstreet
We shouldn't ever be writing past i_size - but, apparently there's still a bug to track down. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Handle bio_iov_iter_get_pages() returning unaligned bio  Kent Overstreet
If the user buffer isn't aligned to the filesystem block size, on a large enough IO - where it won't fit into a single bio - bio_iov_iter_get_pages() won't necessarily return a bio with the proper alignment. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
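The commit message does not say how the unaligned case is handled. One common approach, shown here purely as an assumption, is to trim each IO down to a whole number of blocks and carry the remainder into the next one. `trim_to_block()` and `leftover_bytes()` are hypothetical helpers, not bcachefs functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round an IO size down to a multiple of the block size.
 * block_bytes must be a power of two.
 */
static uint32_t trim_to_block(uint32_t io_bytes, uint32_t block_bytes)
{
	return io_bytes & ~(block_bytes - 1);
}

/* Bytes that must be carried over into the next IO after trimming. */
static uint32_t leftover_bytes(uint32_t io_bytes, uint32_t block_bytes)
{
	return io_bytes - trim_to_block(io_bytes, block_bytes);
}
```

For example, a 10000-byte IO on a filesystem with 4096-byte blocks would be trimmed to 8192 bytes, with 1808 bytes left for the following IO.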
2023-10-22  bcachefs: Add support for FALLOC_FL_INSERT_RANGE  Kent Overstreet
Somewhat tricky and ugly, because iterating over extents backwards is a pain. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Don't write past eof  Kent Overstreet
When converting from PAGE_SIZE to block_size, the .mkwrite path was missed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Improved bch2_fcollapse()  Kent Overstreet
Move extents instead of copying them - this way, we can iterate over only live extents, not the entire keyspace. Also, this means we can mostly skip running triggers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Inline some fast paths  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Check alignment in write path  Kent Overstreet
Also - fix alignment in bch2_set_page_dirty() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Reflink  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Refactor bch2_extent_trim_atomic() for reflink  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Mark space as unallocated on write failure  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Truncate/fpunch now works on block boundaries, not page  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Count reserved extents as holes  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Handle partial pages in seek data/hole  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Change buffered write path to write to partial pages  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Change __bch2_writepage() to not write to holes  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Fix bch2_seek_data()  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Refactor various code to not be extent specific  Kent Overstreet
With reflink, various code now has to handle both KEY_TYPE_extent and KEY_TYPE_reflink_v - so, convert it to be generic across all keys with pointers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Don't call bch2_trans_begin_updates() in bch2_extent_update()  Kent Overstreet
Prep work for reflink - for reflink, we're going to be using bch2_extent_update() with other updates in the same transaction. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Add offset_into_extent param to bch2_read_extent()  Kent Overstreet
With reflink, we'll no longer be able to calculate the offset of the data we want within the extent we're reading from using only the extent pos and the iter pos - we'll have to pass it in separately. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Track dirtyness at sector level, not page  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill page_state_cmpxchg  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Always touch page state with page locked  Kent Overstreet
This will mean we don't have to use cmpxchg for modifying page state, which will simplify a fair amount of code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Kill direct access to bi_io_vec  Kent Overstreet
Switch to always using bio_add_page(), which merges contiguous pages now that we have multipage bvecs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: More work to avoid transaction restarts  Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22  bcachefs: Delete duplicate code  Kent Overstreet
Also rename for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>