summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-22bcachefs: Journal updates to interior nodesKent Overstreet
Previously, the btree has always been self contained and internally consistent on disk without anything from the journal - the journal just contained pointers to the btree roots. However, this meant that btree node split or compact operations - i.e. anything that changes btree node topology and involves updates to interior nodes - would require that interior btree node to be written immediately, which means emitting a btree node write that's mostly empty (using 4k of space on disk if the filesystemm blocksize is 4k to only write perhaps ~100 bytes of new keys). More importantly, this meant most btree node writes had to be FUA, and consumer drives have a history of slow and/or buggy FUA support - other filesystes have been bit by this. This patch changes the interior btree update path to journal updates to interior nodes, after the writes for the new btree nodes have completed. Best of all, it turns out to simplify the interior node update path somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Replay interior node keysKent Overstreet
This slightly modifies the journal replay code so that it can replay updates to interior nodes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: trans_commit() path can now insert to interior nodesKent Overstreet
This will be needed for the upcoming patches to journal updates to interior btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Disable extent mergingKent Overstreet
Extent merging is currently broken, and will be reimplemented differently soon - right now it only happens when btree nodes are being compacted, which makes it difficult to test. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix a locking bug in fsckKent Overstreet
This works around a btree locking issue - we can't be holding read locks while taking write locks, which currently means we can't have live iterators holding read locks at commit time. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix count_iters_for_insert()Kent Overstreet
This fixes a transaction iterator overflow. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix an iterator bugKent Overstreet
We were incorrectly not restarting the transaction when re-traversing iterators. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Shut down quickerKent Overstreet
Internal writes (i.e. copygc/rebalance operations) shouldn't be blocking on the allocator when we're going RO. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BCH_FEATURE_new_extent_overwrite is now requiredKent Overstreet
The patch "bcachefs: Move extent overwrite handling out of core btree code" should have been flipping on this feature bit; extent btree nodes in the old format have to be rewritten before we can insert into them with the new extent update path. Not turning on this feature bit was causing us to go into an infinite loop where we keep rewriting btree nodes over and over. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Clear BCH_FEATURE_extents_above_btree_updates on clean shutdownKent Overstreet
This is needed so that users can roll back to before "d9bb516b2d bcachefs: Move extent overwrite handling out of core btree code", which it appears may still be buggy. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix another iterator leakKent Overstreet
This updates bch2_rbio_narrow_crcs() to the current style for transactional btree code, and fixes a rare panic on iterator overflow. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't use peek_filter() unnecessarilyKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix a use after free in dio write pathKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Drop unused exportKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Move extent overwrite handling out of core btree codeKent Overstreet
Ever since the btree code was first written, handling of overwriting existing extents - including partially overwriting and splittin existing extents - was handled as part of the core btree insert path. The modern transaction and iterator infrastructure didn't exist then, so that was the only way for it to be done. This patch moves that outside of the core btree code to a pass that runs at transaction commit time. This is a significant simplification to the btree code and overall reduction in code size, but more importantly it gets us much closer to the core btree code being completely independent of extents and is important prep work for snapshots. This introduces a new feature bit; the old and new extent update models are incompatible when the filesystem needs journal replay. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_iter_peek_with_updates()Kent Overstreet
Introduce a new iterator method that provides a consistent view of the btree plus uncommitted updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix build when CONFIG_BCACHEFS_DEBUG=nKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: More btree iter invariantsKent Overstreet
Ensure that iter->pos always lies between the start and end of iter->k (the last key returned). Also, bch2_btree_iter_set_pos() now invalidates the key that peek() or next() returned. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Simplify bch2_btree_iter_peek_slot()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Iterator debug code improvementsKent Overstreet
More aggressively checking iterator invariants, and fixing the resulting bugs. Also greatly simplifying iter_next() and iter_next_slot() - they were hyper optimized before, but the optimizations were getting too brittle. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Skip 0 size deleted extents in journal replayKent Overstreet
These are created by the new extent update path, but not used yet by the recovery code and they break the existing recovery code, so we can just skip them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Traverse iterator in journal replayKent Overstreet
This fixes a bug where we end up spinning in journal replay - in theory this shouldn't be necessary though, transaction reset should be re-traversing all iterators. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Don't log errors that are expected during shutdownKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix bch2_dump_bset()Kent Overstreet
It's used in the write path when the bset isn't in the btree node buffer. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix another iterator leakKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix off by one error in bch2_extent_crc_append()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix extent_sort_fix_overlapping()Kent Overstreet
Recently the extent update path started emmiting 0 size whiteouts on extent overwrite, as part of transitioning to moving extent handling out of the core btree code. Unfortunately, this broke the old code path that handles overlapping extents when reading in btree nodes - it relies on sorting incomming extents by start position, but the 0 size whiteouts broke that ordering. Skipping over them before the main algorithm sees them fixes this. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Some btree iterator improvementsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Journal pin cleanupsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Dont't del sysfs dir until after we go ROKent Overstreet
This will help for debugging hangs during unmount Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix error message on bucket sector count overflowKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve an error messageKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BCH_SB_FEATURES_ALLKent Overstreet
BCH_FEATURE_btree_ptr_v2 wasn't getting set on new filesystems, oops Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: fix setting btree_node_accessed()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Use btree_ptr_v2.mem_ptr to avoid hash table lookupKent Overstreet
Nice performance optimization Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix incorrect initialization of btree_node_old_extent_overwrite()Kent Overstreet
b->level and b->btree_id weren't set when the code was checking btree_node_is_extents() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Issue discards when needed to allocate journal writeKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Kill TRANS_RESET_MEM|TRANS_RESET_ITERSKent Overstreet
All iterators should be released now with bch2_trans_iter_put(), so TRANS_RESET_ITERS shouldn't be needed anymore, and TRANS_RESET_MEM is always used. Also convert more code to __bch2_trans_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Seralize btree_update operations at btree_update_nodes_written()Kent Overstreet
Prep work for journalling updates to interior nodes - enforcing ordering will greatly simplify those changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_ptr_v2Kent Overstreet
Add a new btree ptr type which contains the sequence number (random 64 bit cookie, actually) for that btree node - this lets us verify that when we read in a btree node it really is the btree node we wanted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: introduce b->hash_valKent Overstreet
This is partly prep work for introducing bch_btree_ptr_v2, but it'll also be a bit of a performance boost by moving the full key out of the hot part of struct btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix traversing to interior nodesKent Overstreet
NULL is used to mean "reach end of traversal" - we were only initializing the leaf node in the iterator to the right sentinal value. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Check for bad key version numberKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix bch2_ptr_swab for indirect extentsKent Overstreet
bch2_ptr_swab was never updated when the code for generic keys with pointers was added - it assumed the entire val was only used for pointers. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make BTREE_ITER_IS_EXTENTS private to iter codeKent Overstreet
Prep work for changing the core btree update path to handle extents like regular keys; we need to reduce the scope of what BTREE_ITER_IS_EXTENTS means Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: __bch2_btree_iter_set_pos()Kent Overstreet
This one takes an additional argument for whether we're searching for >= or > the search key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: btree_and_journal_iterKent Overstreet
Introduce a new iterator that iterates over keys in the btree with keys from the journal overlaid on top. This factors out what the erasure coding init code was doing manually. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Make sure we're releasing btree iteratorsKent Overstreet
This wasn't originally required, but this is the model we're moving towards. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Improve an insert path optimizationKent Overstreet
The insert path had an optimization to short circuit lookup table/iterator fixups when overwriting an existing key with the same size value - but it was incorrect when other key fields (size/version) were changing. This is important for the upcoming rework to have extent updates use the same insert path as regular keys. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Fix an uninitialized field in bch_write_opKent Overstreet
Regression from "bcachefs: Track incompressible data" Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>