summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-03-29Btrfs: fix an integer overflow checkDan Carpenter
This isn't super serious because you need CAP_ADMIN to run this code. I added this integer overflow check last year but apparently I am rubbish at writing integer overflow checks... There are two issues. First, access_ok() works on unsigned long type and not u64 so on 32 bit systems the access_ok() could be checking a truncated size. The other issue is that we should be using a stricter limit so we don't overflow the kzalloc() setting ctx->clone_roots later in the function after the access_ok(): alloc_size = sizeof(struct clone_root) * (arg->clone_sources_count + 1); sctx->clone_roots = kzalloc(alloc_size, GFP_KERNEL | __GFP_NOWARN); Fixes: f5ecec3ce21f ("btrfs: send: silence an integer overflow warning") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ added comment ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-29btrfs: Change qgroup_meta_rsv to 64bitGoldwyn Rodrigues
Using an int value is causing qg->reserved to become negative and exclusive -EDQUOT to be reached prematurely. This affects exclusive qgroups only. TEST CASE: DEVICE=/dev/vdb MOUNTPOINT=/mnt SUBVOL=$MOUNTPOINT/tmp umount $SUBVOL umount $MOUNTPOINT mkfs.btrfs -f $DEVICE mount /dev/vdb $MOUNTPOINT btrfs quota enable $MOUNTPOINT btrfs subvol create $SUBVOL umount $MOUNTPOINT mount /dev/vdb $MOUNTPOINT mount -o subvol=tmp $DEVICE $SUBVOL btrfs qgroup limit -e 3G $SUBVOL btrfs quota rescan /mnt -w for i in `seq 1 44000`; do dd if=/dev/zero of=/mnt/tmp/test_$i bs=10k count=1 if [[ $? > 0 ]]; then btrfs qgroup show -pcref $SUBVOL exit 1 fi done Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> [ add reproducer to changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-29Btrfs: bring back repair during readLiu Bo
Commit 20a7db8ab3f2 ("btrfs: add dummy callback for readpage_io_failed and drop checks") made a cleanup around readpage_io_failed_hook, and it was supposed to keep the original sematics, but it also unexpectedly disabled repair during read for dup, raid1 and raid10. This fixes the problem by letting data's inode call the generic readpage_io_failed callback by returning -EAGAIN from its readpage_io_failed_hook in order to notify end_bio_extent_readpage to do the rest. We don't call it directly because the generic one takes an offset from end_bio_extent_readpage() to calculate the index in the checksum array and inode's readpage_io_failed_hook doesn't offer that offset. Cc: David Sterba <dsterba@suse.cz> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ keep the const function attribute ] Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-17btrfs: add missing memset while reading compressed inline extentsZygo Blaxell
This is a story about 4 distinct (and very old) btrfs bugs. Commit c8b978188c ("Btrfs: Add zlib compression support") added three data corruption bugs for inline extents (bugs #1-3). Commit 93c82d5750 ("Btrfs: zero page past end of inline file items") fixed bug #1: uncompressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read. The fix was to add a memset in btrfs_get_extent to zero out the hole. Commit 166ae5a418 ("btrfs: fix inline compressed read err corruption") fixed bug #2: compressed inline extents which contained non-zero bytes might be replaced with zero bytes in some cases. This patch removed an unhelpful memset from uncompress_inline, but the case where memset is required was missed. There is also a memset in the decompression code, but this only covers decompressed data that is shorter than the ram_bytes from the extent ref record. This memset doesn't cover the region between the end of the decompressed data and the end of the page. It has also moved around a few times over the years, so there's no single patch to refer to. This patch fixes bug #3: compressed inline extents followed by a hole and more extents could get non-zero data in the hole as they were read (i.e. bug #3 is the same as bug #1, but s/uncompressed/compressed/). The fix is the same: zero out the hole in the compressed case too, by putting a memset back in uncompress_inline, but this time with correct parameters. The last and oldest bug, bug #0, is the cause of the offending inline extent/hole/extent pattern. Bug #0 is a subtle and mostly-harmless quirk of behavior somewhere in the btrfs write code. In a few special cases, an inline extent and hole are allowed to persist where they normally would be combined with later extents in the file. A fast reproducer for bug #0 is presented below. A few offending extents are also created in the wild during large rsync transfers with the -S flag. A Linux kernel build (git checkout; make allyesconfig; make -j8) will produce a handful of offending files as well. Once an offending file is created, it can present different content to userspace each time it is read. Bug #0 is at least 4 and possibly 8 years old. I verified every vX.Y kernel back to v3.5 has this behavior. There are fossil records of this bug's effects in commits all the way back to v2.6.32. I have no reason to believe bug #0 wasn't present at the beginning of btrfs compression support in v2.6.29, but I can't easily test kernels that old to be sure. It is not clear whether bug #0 is worth fixing. A fix would likely require injecting extra reads into currently write-only paths, and most of the exceptional cases caused by bug #0 are already handled now. Whether we like them or not, bug #0's inline extents followed by holes are part of the btrfs de-facto disk format now, and we need to be able to read them without data corruption or an infoleak. So enough about bug #0, let's get back to bug #3 (this patch). An example of on-disk structure leading to data corruption found in the wild: item 61 key (606890 INODE_ITEM 0) itemoff 9662 itemsize 160 inode generation 50 transid 50 size 47424 nbytes 49141 block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0 flags 0x0(none) item 62 key (606890 INODE_REF 603050) itemoff 9642 itemsize 20 inode ref index 3 namelen 10 name: DB_File.so item 63 key (606890 EXTENT_DATA 0) itemoff 8280 itemsize 1362 inline extent data size 1341 ram 4085 compress(zlib) item 64 key (606890 EXTENT_DATA 4096) itemoff 8227 itemsize 53 extent data disk byte 5367308288 nr 20480 extent data offset 0 nr 45056 ram 45056 extent compression(zlib) Different data appears in userspace during each read of the 11 bytes between 4085 and 4096. The extent in item 63 is not long enough to fill the first page of the file, so a memset is required to fill the space between item 63 (ending at 4085) and item 64 (beginning at 4096) with zero. Here is a reproducer from Liu Bo, which demonstrates another method of creating the same inline extent and hole pattern: Using 'page_poison=on' kernel command line (or enable CONFIG_PAGE_POISONING) run the following: # touch foo # chattr +c foo # xfs_io -f -c "pwrite -W 0 1000" foo # xfs_io -f -c "falloc 4 8188" foo # od -x foo # echo 3 >/proc/sys/vm/drop_caches # od -x foo This produce the following on my box: Correct output: file contains 1000 data bytes followed by zeros: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 0000 0000 0000 0000 0001760 0000 0000 0000 0000 0000 0000 0000 0000 * 0020000 Actual output: the data after the first 1000 bytes will be different each run: 0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd * 0001740 cdcd cdcd cdcd cdcd 6c63 7400 635f 006d 0001760 5f74 6f43 7400 435f 0053 5f74 7363 7400 0002000 435f 0056 5f74 6164 7400 645f 0062 5f74 (...) Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Chris Mason <clm@fb.com> Signed-off-by: Chris Mason <clm@fb.com>
2017-03-17Btrfs: fix regression in lock_delalloc_pagesLiu Bo
The bug is a regression after commit (da2c7009f6ca "btrfs: teach __process_pages_contig about PAGE_LOCK operation") and commit (76c0021db8fd "Btrfs: use helper to simplify lock/unlock pages"). So if the dirty pages which are under writeback got truncated partially before we lock the dirty pages, we couldn't find all pages mapping to the delalloc range, and the bug didn't return an error so it kept going on and found that the delalloc range got truncated and got to unlock the dirty pages, and then the ASSERT could caught the error, and showed ----------------------------------------------------------------------------- assertion failed: page_ops & PAGE_LOCK, file: fs/btrfs/extent_io.c, line: 1716 ----------------------------------------------------------------------------- This fixes the bug by returning the proper -EAGAIN. Cc: David Sterba <dsterba@suse.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-03-07btrfs: remove btrfs_err_str function from uapi/linux/btrfs.hDmitry V. Levin
btrfs_err_str function is not called from anywhere and is replicated in the userspace headers for btrfs-progs. It's removal also fixes the following linux/btrfs.h userspace compilation error: /usr/include/linux/btrfs.h: In function 'btrfs_err_str': /usr/include/linux/btrfs.h:740:11: error: 'NULL' undeclared (first use in this function) return NULL; Suggested-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28Merge branch 'for-chris-4.11-part2' of ↵Chris Mason
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.11
2017-02-28btrfs: add dummy callback for readpage_io_failed and drop checksDavid Sterba
Make extent_io_ops::readpage_io_failed_hook callback mandatory and define a dummy function for btrfs_extent_io_ops. As the failed IO callback is not performance critical, the branch vs extra trade off does not hurt. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: drop checks for mandatory extent_io_ops callbacksDavid Sterba
We know that eadpage_end_io_hook, submit_bio_hook and merge_bio_hook are always defined so we can drop the checks before we call them. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: document existence of extent_io ops callbacksDavid Sterba
Some of the callbacks defined in btree_extent_io_ops and btrfs_extent_io_ops do always exist so we don't need to check the existence before each call. This patch just reorders the definition and documents which are mandatory/optional. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: let writepage_end_io_hook return voidDavid Sterba
There's no error path in any of the instances, always return 0. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: do proper error handling in btrfs_insert_xattr_itemDavid Sterba
The space check in btrfs_insert_xattr_item is duplicated in it's caller (do_setxattr) so we won't hit the BUG_ON. Continuing without any check could be disasterous so turn it to a proper error handling. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: handle allocation error in update_dev_stat_itemDavid Sterba
Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: remove BUG_ON from __tree_mod_log_insertDavid Sterba
All callers dereference the 'tm' parameter before it gets to this function, the NULL check does not make much sense here. Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: derive maximum output size in the compression implementationDavid Sterba
The value of max_out can be calculated from the parameters passed to the compressors, which is number of pages and the page size, and we don't have to needlessly pass it around. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: use predefined limits for calculating maximum number of pages for ↵David Sterba
compression Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: export compression buffer limits in a headerDavid Sterba
Move the buffer limit definitions out of compress_file_range. Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: merge nr_pages input and output parameter in compress_pagesDavid Sterba
The parameter saying how many pages can be allocated at maximum can be merged with the output page counter, to save some stack space. The compression implementation will sink the parameter to a local variable so everything works as before. The nr_pages variables can also be simply merged in compress_file_range into one. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: merge length input and output parameter in compress_pagesDavid Sterba
The length parameter is basically duplicated for input and output in the top level caller of the compress_pages chain. We can simply use one variable for that and reduce stack consumption. The compression implementation will sink the parameter to a local variable so everything works as before. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: constify name of subvolume in creation helpersDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: constify buffers used by compression helpersDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: constify input buffer of btrfs_csum_dataDavid Sterba
The function does not modify the input buffer, also update a typecast in one caller. Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: constify device path passed to relevant helpersDavid Sterba
Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_inode_resume_unlocked_dio take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_inode_block_unlocked_dio take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_add_nondir take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_add_link take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_del_delalloc_inode take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make get_extent_t take btrfs_inodeNikolay Borisov
In addition to changing the signature, this patch also switches all the functions which are used as an argument to also take btrfs_inode. Namely those are: btrfs_get_extent and btrfs_get_extent_filemap. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make check_extent_to_block take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make clone_update_extent_map take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_clear_bit_hook take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_extent_item_to_extent_map take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_log_inode_parent take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make check_parent_dirs_for_sync take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_orphan_add take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_orphan_del take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_free_io_failure_record take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make clean_io_failure take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make repair_io_failure take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make check_compressed_csum take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make btrfs_print_data_csum_error take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: make free_io_failure take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make lock_and_cleanup_extent_if_need take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make check_can_nocow take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_lookup_ordered_range take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_mark_extent_written take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make fill_holes take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make hole_mergeable take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2017-02-28btrfs: Make btrfs_drop_extent_cache take btrfs_inodeNikolay Borisov
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>