author | Qu Wenruo <wqu@suse.com> | 2022-11-07 15:32:29 +0800 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2022-12-05 18:00:55 +0100 |
commit | 2942a50dea74126bf3395b3060a808fb046136fc (patch) | |
tree | 6d6519b8bfb72e0bb9b936a7be5e732f8650745f /fs/btrfs/raid56.h | |
parent | e55cf7ca85e323028774feeb117ad94358a78070 (diff) |
btrfs: raid56: introduce btrfs_raid_bio::error_bitmap
Currently btrfs raid56 uses btrfs_raid_bio::faila and failb to indicate
which stripe(s) had IO errors.
But that has some problems:
- If one sector fails the csum check, the whole stripe containing the
corruption is marked as an error.
This can reduce the chance of a successful recovery, like this:
         0        4K       8K
Data 1   |XXXXXXXX|        |
Data 2   |        |XXXXXXXX|
Parity   |        |        |
In the above case, 0~4K in data 1 should be recovered using data 2 and
parity, while 4K~8K in data 2 should be recovered using data 1 and
parity.
Currently, if we trigger a read on 0~4K of data 1, we will also recover
4K~8K of data 1 using the corrupted data 2 and parity, leaving a wrong
result in the rbio cache.
- Harder to expand for a future M-N scheme
As we're limited to just faila/b, i.e. at most two corruptions.
- Harder to expand to handle extra csum errors
This can be problematic if we start to do csum verification.
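The first problem in the list above can be reproduced in miniature. The following is only an illustrative userspace sketch, not the btrfs code: it assumes a two-data-plus-parity XOR layout, uses 4-byte stand-ins for 4K sectors, and every name in it (`xor_rebuild`, `rebuild_whole_stripe_bad_sectors`, the constants) is hypothetical. It rebuilds the whole of data 1 from data 2 and parity while the second sector of data 2 is corrupted, then reports which rebuilt sectors are wrong.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 4	/* tiny stand-in for a 4K sector (assumption) */
#define NR_SECTORS  2	/* sectors per stripe: 0~4K and 4K~8K */
#define STRIPE_LEN  (SECTOR_SIZE * NR_SECTORS)

/* Single-parity rebuild of a byte range: out = other ^ parity. */
static void xor_rebuild(uint8_t *out, const uint8_t *other,
			const uint8_t *parity, size_t len)
{
	for (size_t i = 0; i < len; i++)
		out[i] = other[i] ^ parity[i];
}

/*
 * Rebuild the whole stripe of data 1 at stripe granularity and return
 * a bitmask of the sectors that came back wrong (bit N = sector N).
 */
static unsigned int rebuild_whole_stripe_bad_sectors(void)
{
	const uint8_t good1[STRIPE_LEN] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	const uint8_t good2[STRIPE_LEN] = { 9, 10, 11, 12, 13, 14, 15, 16 };
	uint8_t parity[STRIPE_LEN];
	uint8_t data2[STRIPE_LEN];
	uint8_t rebuilt[STRIPE_LEN];
	unsigned int bad = 0;

	for (size_t i = 0; i < STRIPE_LEN; i++)
		parity[i] = good1[i] ^ good2[i];

	/* Data 2 is corrupted in its second sector, as in the diagram. */
	memcpy(data2, good2, STRIPE_LEN);
	memset(data2 + SECTOR_SIZE, 0xff, SECTOR_SIZE);

	/* Stripe-granular recovery: all of data 1 from data 2 + parity. */
	xor_rebuild(rebuilt, data2, parity, STRIPE_LEN);

	for (unsigned int s = 0; s < NR_SECTORS; s++) {
		if (memcmp(rebuilt + s * SECTOR_SIZE,
			   good1 + s * SECTOR_SIZE, SECTOR_SIZE))
			bad |= 1u << s;
	}
	return bad;
}
```

In this sketch only the first sector is rebuilt correctly. The second one inherits the corruption from data 2, which corresponds to the wrong rbio cache contents described in the first item.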
This patch introduces an extra @error_bitmap, where each bit
represents an error that happened for the corresponding sector.
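As a rough illustration of per-sector tracking, here is a minimal userspace sketch. It is not the kernel implementation: the `demo_rbio` structure, the helper names, and the `stripe * STRIPE_NSECTORS + sector` bit layout are all assumptions made for this example. It records the two corruptions from the diagram and checks that each vertical sector has at most one error, which is what a single parity can repair.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_STRIPES	3	/* data 1, data 2, parity (assumption) */
#define STRIPE_NSECTORS	2	/* sectors per stripe */
#define NR_BITS		(NR_STRIPES * STRIPE_NSECTORS)
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS	((NR_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the relevant part of struct btrfs_raid_bio. */
struct demo_rbio {
	unsigned long error_bitmap[NR_WORDS];
};

/* Assumed layout: one bit per sector, stripes laid out back to back. */
static unsigned int sector_bit(unsigned int stripe, unsigned int sector)
{
	return stripe * STRIPE_NSECTORS + sector;
}

static void mark_error(struct demo_rbio *rbio, unsigned int stripe,
		       unsigned int sector)
{
	unsigned int bit = sector_bit(stripe, sector);

	rbio->error_bitmap[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static bool has_error(const struct demo_rbio *rbio, unsigned int stripe,
		      unsigned int sector)
{
	unsigned int bit = sector_bit(stripe, sector);

	return rbio->error_bitmap[bit / BITS_PER_WORD] &
	       (1UL << (bit % BITS_PER_WORD));
}

/* Count how many stripes have an error in the given vertical sector. */
static unsigned int errors_in_column(const struct demo_rbio *rbio,
				     unsigned int sector)
{
	unsigned int count = 0;

	for (unsigned int stripe = 0; stripe < NR_STRIPES; stripe++)
		count += has_error(rbio, stripe, sector);
	return count;
}

/* With a single parity, one bad sector per column can be rebuilt. */
static bool column_recoverable(const struct demo_rbio *rbio,
			       unsigned int sector)
{
	return errors_in_column(rbio, sector) <= 1;
}

/* The situation from the diagram: data 1 sector 0, data 2 sector 1. */
static struct demo_rbio demo_from_diagram(void)
{
	struct demo_rbio rbio = { { 0 } };

	mark_error(&rbio, 0, 0);
	mark_error(&rbio, 1, 1);
	return rbio;
}
```

With stripe-granular marks both data stripes would count as failed, which exceeds what a single parity tolerates. With one bit per sector, each column in this sketch has exactly one error and stays recoverable.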
The choice to introduce a new error bitmap rather than reusing
sector_ptr is to avoid an extra search between rbio::stripe_sectors[] and
rbio::bio_sectors[].
Since we can submit a bio using sectors from both arrays, doing a proper
search on both arrays would be more complex.
Although the new bitmap will take extra memory, later we can remove
things like @error and faila/b to save some memory.
Currently the new error bitmap and the faila/b mechanism coexist; the error
bitmap is only updated at endio time and on entering recovery.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs/raid56.h')
-rw-r--r-- | fs/btrfs/raid56.h | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/fs/btrfs/raid56.h b/fs/btrfs/raid56.h
index 9fae97b7a2a5..e38da4fa76d6 100644
--- a/fs/btrfs/raid56.h
+++ b/fs/btrfs/raid56.h
@@ -126,6 +126,17 @@ struct btrfs_raid_bio {
 
 	/* Allocated with real_stripes-many pointers for finish_*() calls */
 	void **finish_pointers;
+
+	/*
+	 * The bitmap recording where IO errors happened.
+	 * Each bit is corresponding to one sector in either bio_sectors[] or
+	 * stripe_sectors[] array.
+	 *
+	 * The reason we don't use another bit in sector_ptr is, we have two
+	 * arrays of sectors, and a lot of IO can use sectors in both arrays.
+	 * Thus making it much harder to iterate.
+	 */
+	unsigned long *error_bitmap;
 };
 
 /*