summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-03-14ext4: use flexible-array member for xattr structsGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309180813.GA3347@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: use flexible-array member in struct fnameGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200309154838.GA31559@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14Documentation: correct the description of FIEMAP_EXTENT_LASTRitesh Harjani
Currently FIEMAP_EXTENT_LAST is not working consistently across different filesystem's fiemap implementations. So add more information about how else this flag could set in other implementation. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/5a00e8d4283d6849e0b8f408c8365b31fbc1d153.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: move ext4_fiemap to use iomap frameworkRitesh Harjani
This patch moves ext4_fiemap to use iomap framework. For xattr a new 'ext4_iomap_xattr_ops' is added. Reported-by: kbuild test robot <lkp@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/b9f45c885814fcdd0631747ff0fe08886270828c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: make ext4_ind_map_blocks work with fiemapRitesh Harjani
For indirect block mapping if the i_block > max supported block in inode then ext4_ind_map_blocks() returns a -EIO error. But in case of fiemap this could be a valid query to ->iomap_begin call. So check if the offset >= s_bitmap_maxbytes in ext4_iomap_begin_report(), then simply skip calling ext4_map_blocks(). Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/87fa0ddc5967fa707656212a3b66a7233425325c.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: move ext4 bmap to use iomap infrastructureRitesh Harjani
ext4_iomap_begin is already implemented which provides ext4_map_blocks, so just move the API from generic_block_bmap to iomap_bmap for iomap conversion. Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/8bbd53bd719d5ccfecafcce93f2bf1d7955a44af.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: optimize ext4_ext_precache for 0 depthRitesh Harjani
This patch avoids the memory alloc & free path when depth is 0, since anyway there is no extra caching done in that case. So on checking depth 0, simply return early. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/93da0d0f073c73358e85bb9849d8a5378d1da539.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-14ext4: add IOMAP_F_MERGED for non-extent based mappingRitesh Harjani
IOMAP_F_MERGED needs to be set in case of non-extent based mapping. This is needed in later patches for conversion of ext4_fiemap to use iomap. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Link: https://lore.kernel.org/r/a4764c91c08c16d4d4a4b36defb2a08625b0e9b3.1582880246.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: fix a data race at inode->i_disksizeQiujun Huang
KCSAN find inode->i_disksize could be accessed concurrently. BUG: KCSAN: data-race in ext4_mark_iloc_dirty / ext4_write_end write (marked) to 0xffff8b8932f40090 of 8 bytes by task 66792 on cpu 0: ext4_write_end+0x53f/0x5b0 ext4_da_write_end+0x237/0x510 generic_perform_write+0x1c4/0x2a0 ext4_buffered_write_iter+0x13a/0x210 ext4_file_write_iter+0xe2/0x9b0 new_sync_write+0x29c/0x3a0 __vfs_write+0x92/0xa0 vfs_write+0xfc/0x2a0 ksys_write+0xe8/0x140 __x64_sys_write+0x4c/0x60 do_syscall_64+0x8a/0x2a0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff8b8932f40090 of 8 bytes by task 14414 on cpu 1: ext4_mark_iloc_dirty+0x716/0x1190 ext4_mark_inode_dirty+0xc9/0x360 ext4_convert_unwritten_extents+0x1bc/0x2a0 ext4_convert_unwritten_io_end_vec+0xc5/0x150 ext4_put_io_end+0x82/0x130 ext4_writepages+0xae7/0x16f0 do_writepages+0x64/0x120 __writeback_single_inode+0x7d/0x650 writeback_sb_inodes+0x3a4/0x860 __writeback_inodes_wb+0xc4/0x150 wb_writeback+0x43f/0x510 wb_workfn+0x3b2/0x8a0 process_one_work+0x39b/0x7e0 worker_thread+0x88/0x650 kthread+0x1d4/0x1f0 ret_from_fork+0x35/0x40 The plain read is outside of inode->i_data_sem critical section which results in a data race. Fix it by adding READ_ONCE(). Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Link: https://lore.kernel.org/r/1582556566-3909-1-git-send-email-hqjagain@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: fix a data race at inode->i_blocksQian Cai
inode->i_blocks could be accessed concurrently as noticed by KCSAN, BUG: KCSAN: data-race in ext4_do_update_inode [ext4] / inode_add_bytes write to 0xffff9a00d4b982d0 of 8 bytes by task 22100 on cpu 118: inode_add_bytes+0x65/0xf0 __inode_add_bytes at fs/stat.c:689 (inlined by) inode_add_bytes at fs/stat.c:702 ext4_mb_new_blocks+0x418/0xca0 [ext4] ext4_ext_map_blocks+0x1a6b/0x27b0 [ext4] ext4_map_blocks+0x1a9/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block_unwritten+0x33/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] ext4_da_write_begin+0x35f/0x8f0 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe read to 0xffff9a00d4b982d0 of 8 bytes by task 8 on cpu 65: ext4_do_update_inode+0x4a0/0xf60 [ext4] ext4_inode_blocks_set at fs/ext4/inode.c:4815 ext4_mark_iloc_dirty+0xaf/0x160 [ext4] ext4_mark_inode_dirty+0x129/0x3e0 [ext4] ext4_convert_unwritten_extents+0x253/0x2d0 [ext4] ext4_convert_unwritten_io_end_vec+0xc5/0x150 [ext4] ext4_end_io_rsv_work+0x22c/0x350 [ext4] process_one_work+0x54f/0xb90 worker_thread+0x80/0x5f0 kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 4 locks held by kworker/u256:0/8: #0: ffff9a025abc4328 ((wq_completion)ext4-rsv-conversion){+.+.}, at: process_one_work+0x443/0xb90 #1: ffffab5a862dbe20 ((work_completion)(&ei->i_rsv_conversion_work)){+.+.}, at: process_one_work+0x443/0xb90 #2: ffff9a025a9d0f58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff9a00d4b985d8 (&(&ei->i_raw_lock)->rlock){+.+.}, at: ext4_do_update_inode+0xaa/0xf60 [ext4] irq event stamp: 3009267 hardirqs last enabled at (3009267): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (3009266): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (3009230): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (3009223): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 65 PID: 8 Comm: kworker/u256:0 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work [ext4] The plain read is outside of inode->i_lock critical section which results in a data race. Fix it by adding READ_ONCE() there. Link: https://lore.kernel.org/r/20200222043258.2279-1-cai@lca.pw Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2020-03-05ext4: clean up error return for convert_initialized_extent()Eric Whitney
Although convert_initialized_extent() can potentially return an error code with a negative value, its returned value is assigned to an unsigned variable containing a block count in ext4_ext_map_blocks() and then returned to that function's caller. The code currently works, though the way this happens is obscure. The code would be more readable if it followed the error handling convention used elsewhere in ext4_ext_map_blocks(). This patch does not address any known test failure or bug report - it's simply a cleanup. It also addresses a nearby coding standard issue. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200218202656.21561-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05jbd2: improve comments about freeing data buffers whose page mapping is NULLzhangyi (F)
Improve comments in jbd2_journal_commit_transaction() to describe why we don't need to clear the buffer_mapped bit for freeing file mapping buffers whose page mapping is NULL. Link: https://lore.kernel.org/r/20200217112706.20085-1-yi.zhang@huawei.com Fixes: c96dceeabf76 ("jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer") Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: use flexible-array members in struct dx_node and struct dx_rootGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Link: https://lore.kernel.org/r/20200213160648.GA7054@embeddedor Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: use built-in RCU list checking in mballocMadhuparna Bhowmik
list_for_each_entry_rcu() has built-in RCU and lock checking. Pass cond argument to list_for_each_entry_rcu() to silence false lockdep warning when CONFIG_PROVE_RCU_LIST is enabled by default. Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Link: https://lore.kernel.org/r/20200213152558.7070-1-madhuparnabhowmik10@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: delete declaration for ext4_split_extent()Eric Whitney
There are no forward references for ext4_split_extent() in extents.c, so delete its unnecessary declaration. Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200212162141.22381-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: remove EXT4_EOFBLOCKS_FL and associated codeEric Whitney
The EXT4_EOFBLOCKS_FL inode flag is used to indicate whether a file contains unwritten blocks past i_size. It's set when ext4_fallocate is called with the KEEP_SIZE flag to extend a file with an unwritten extent. However, this flag hasn't been useful functionally since March, 2012, when a decision was made to remove it from ext4. All traces of EXT4_EOFBLOCKS_FL were removed from e2fsprogs version 1.42.2 by commit 010dc7b90d97 ("e2fsck: remove EXT4_EOFBLOCKS_FL flag handling") at that time. Now that enough time has passed to make e2fsprogs versions containing this modification common, this patch now removes the code associated with EXT4_EOFBLOCKS_FL from the kernel as well. This change has two implications. First, because pre-1.42.2 e2fsck versions only look for a problem if EXT4_EOFBLOCKS_FL is set, and because that bit will never be set by newer kernels containing this patch, old versions of e2fsck won't have a compatibility problem with files created by newer kernels. Second, newer kernels will not clear EXT4_EOFBLOCKS_FL inode flag bits belonging to a file written by an older kernel. If set, it will remain in that state until the file is deleted. Because e2fsck versions since 1.42.2 don't check the flag at all, no adverse effect is expected. However, pre-1.42.2 e2fsck versions that do check the flag may report that it is set when it ought not to be after a file has been truncated or had its unwritten blocks written. In this case, the old version of e2fsck will offer to clear the flag. No adverse effect would then occur whether the user chooses to clear the flag or not. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20200211210216.24960-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: code cleanup for ext4_statfs_project()Chengguang Xu
Calling min_not_zero() to simplify complicated prjquota limit comparison in ext4_statfs_project(). Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Link: https://lore.kernel.org/r/20200210082445.2379-1-cgxu519@mykernel.net Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: start to support iopoll methodXiaoguang Wang
Since commit "b1b4705d54ab ext4: introduce direct I/O read using iomap infrastructure", we can easily make ext4 support iopoll method, just use iomap_dio_iopoll(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com> Link: https://lore.kernel.org/r/20200207120758.2411-1-xiaoguang.wang@linux.alibaba.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-05ext4: force buffer up-to-date while marking it dirtyHarshad Shirwadkar
Writeback errors can leave buffer in not up-to-date state when there are errors during background writes. Force buffer up-to-date while marking it dirty. Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/20191224190940.157952-1-harshadshirwadkar@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-03-01Linux 5.6-rc4v5.6-rc4Linus Torvalds
2020-03-01Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Two more bug fixes (including a regression) for 5.6" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: potential crash on allocation error in ext4_alloc_flex_bg_array() jbd2: fix data races at struct journal_head
2020-03-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "More bugfixes, including a few remaining "make W=1" issues such as too large frame sizes on some configurations. On the ARM side, the compiler was messing up shadow stacks between EL1 and EL2 code, which is easily fixed with __always_inline" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: check descriptor table exits on instruction emulation kvm: x86: Limit the number of "kvm: disabled by bios" messages KVM: x86: avoid useless copy of cpufreq policy KVM: allow disabling -Werror KVM: x86: allow compiling as non-module with W=1 KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipis KVM: Introduce pv check helpers KVM: let declaration of kvm_get_running_vcpus match implementation KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter arm64: Ask the compiler to __always_inline functions used by KVM at HYP KVM: arm64: Define our own swab32() to avoid a uapi static inline KVM: arm64: Ask the compiler to __always_inline functions used at HYP kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe() KVM: arm/arm64: Fix up includes for trace.h
2020-03-01KVM: VMX: check descriptor table exits on instruction emulationOliver Upton
KVM emulates UMIP on hardware that doesn't support it by setting the 'descriptor table exiting' VM-execution control and performing instruction emulation. When running nested, this emulation is broken as KVM refuses to emulate L2 instructions by default. Correct this regression by allowing the emulation of descriptor table instructions if L1 hasn't requested 'descriptor table exiting'. Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode") Reported-by: Jan Kiszka <jan.kiszka@web.de> Cc: stable@vger.kernel.org Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Jim Mattson <jmattson@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-29Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "I2C has three driver bugfixes for you. We agreed on the Mac regression to go in via I2C" * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: macintosh: therm_windtunnel: fix regression when instantiating devices i2c: altera: Fix potential integer overflow i2c: jz4780: silence log flood on txabrt
2020-02-29ext4: potential crash on allocation error in ext4_alloc_flex_bg_array()Dan Carpenter
If sbi->s_flex_groups_allocated is zero and the first allocation fails then this code will crash. The problem is that "i--" will set "i" to -1 but when we compare "i >= sbi->s_flex_groups_allocated" then the -1 is type promoted to unsigned and becomes UINT_MAX. Since UINT_MAX is more than zero, the condition is true so we call kvfree(new_groups[-1]). The loop will carry on freeing invalid memory until it crashes. Fixes: 7c990728b99e ("ext4: fix potential race between s_flex_groups online resizing and access") Reviewed-by: Suraj Jitindar Singh <surajjs@amazon.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@kernel.org Link: https://lore.kernel.org/r/20200228092142.7irbc44yaz3by7nb@kili.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29macintosh: therm_windtunnel: fix regression when instantiating devicesWolfram Sang
Removing attach_adapter from this driver caused a regression for at least some machines. Those machines had the sensors described in their DT, too, so they didn't need manual creation of the sensor devices. The old code worked, though, because manual creation came first. Creation of DT devices then failed later and caused error logs, but the sensors worked nonetheless because of the manually created devices. When removing attach_adaper, manual creation now comes later and loses the race. The sensor devices were already registered via DT, yet with another binding, so the driver could not be bound to it. This fix refactors the code to remove the race and only manually creates devices if there are no DT nodes present. Also, the DT binding is updated to match both, the DT and manually created devices. Because we don't know which device creation will be used at runtime, the code to start the kthread is moved to do_probe() which will be called by both methods. Fixes: 3e7bed52719d ("macintosh: therm_windtunnel: drop using attach_adapter") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201723 Reported-by: Erhard Furtner <erhard_f@mailbox.org> Tested-by: Erhard Furtner <erhard_f@mailbox.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org # v4.19+
2020-02-29jbd2: fix data races at struct journal_headQian Cai
journal_head::b_transaction and journal_head::b_next_transaction could be accessed concurrently as noticed by KCSAN, LTP: starting fsync04 /dev/zero: Can't open blockdev EXT4-fs (loop0): mounting ext3 file system using the ext4 subsystem EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null) ================================================================== BUG: KCSAN: data-race in __jbd2_journal_refile_buffer [jbd2] / jbd2_write_access_granted [jbd2] write to 0xffff99f9b1bd0e30 of 8 bytes by task 25721 on cpu 70: __jbd2_journal_refile_buffer+0xdd/0x210 [jbd2] __jbd2_journal_refile_buffer at fs/jbd2/transaction.c:2569 jbd2_journal_commit_transaction+0x2d15/0x3f20 [jbd2] (inlined by) jbd2_journal_commit_transaction at fs/jbd2/commit.c:1034 kjournald2+0x13b/0x450 [jbd2] kthread+0x1cd/0x1f0 ret_from_fork+0x27/0x50 read to 0xffff99f9b1bd0e30 of 8 bytes by task 25724 on cpu 68: jbd2_write_access_granted+0x1b2/0x250 [jbd2] jbd2_write_access_granted at fs/jbd2/transaction.c:1155 jbd2_journal_get_write_access+0x2c/0x60 [jbd2] __ext4_journal_get_write_access+0x50/0x90 [ext4] ext4_mb_mark_diskspace_used+0x158/0x620 [ext4] ext4_mb_new_blocks+0x54f/0xca0 [ext4] ext4_ind_map_blocks+0xc79/0x1b40 [ext4] ext4_map_blocks+0x3b4/0x950 [ext4] _ext4_get_block+0xfc/0x270 [ext4] ext4_get_block+0x3b/0x50 [ext4] __block_write_begin_int+0x22e/0xae0 __block_write_begin+0x39/0x50 ext4_write_begin+0x388/0xb50 [ext4] generic_perform_write+0x15d/0x290 ext4_buffered_write_iter+0x11f/0x210 [ext4] ext4_file_write_iter+0xce/0x9e0 [ext4] new_sync_write+0x29c/0x3b0 __vfs_write+0x92/0xa0 vfs_write+0x103/0x260 ksys_write+0x9d/0x130 __x64_sys_write+0x4c/0x60 do_syscall_64+0x91/0xb05 entry_SYSCALL_64_after_hwframe+0x49/0xbe 5 locks held by fsync04/25724: #0: ffff99f9911093f8 (sb_writers#13){.+.+}, at: vfs_write+0x21c/0x260 #1: ffff99f9db4c0348 (&sb->s_type->i_mutex_key#15){+.+.}, at: ext4_buffered_write_iter+0x65/0x210 [ext4] #2: ffff99f5e7dfcf58 (jbd2_handle){++++}, at: start_this_handle+0x1c1/0x9d0 [jbd2] #3: ffff99f9db4c0168 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0x176/0x950 [ext4] #4: ffffffff99086b40 (rcu_read_lock){....}, at: jbd2_write_access_granted+0x4e/0x250 [jbd2] irq event stamp: 1407125 hardirqs last enabled at (1407125): [<ffffffff980da9b7>] __find_get_block+0x107/0x790 hardirqs last disabled at (1407124): [<ffffffff980da8f9>] __find_get_block+0x49/0x790 softirqs last enabled at (1405528): [<ffffffff98a0034c>] __do_softirq+0x34c/0x57c softirqs last disabled at (1405521): [<ffffffff97cc67a2>] irq_exit+0xa2/0xc0 Reported by Kernel Concurrency Sanitizer on: CPU: 68 PID: 25724 Comm: fsync04 Tainted: G L 5.6.0-rc2-next-20200221+ #7 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019 The plain reads are outside of jh->b_state_lock critical section which result in data races. Fix them by adding pairs of READ|WRITE_ONCE(). Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/r/20200222043111.2227-1-cai@lca.pw Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2020-02-29Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes. Three are in drivers for fairly obvious bugs. The fourth is a set of regressions introduced by the compat_ioctl changes because some of the compat updates wrongly replaced .ioctl instead of .compat_ioctl" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: compat_ioctl: cdrom: Replace .ioctl with .compat_ioctl in four appropriate places scsi: zfcp: fix wrong data and display format of SFP+ temperature scsi: sd_sbc: Fix sd_zbc_report_zones() scsi: libfc: free response frame from GPN_ID
2020-02-28Merge tag 'pci-v5.6-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix build issue on 32-bit ARM with old compilers (Marek Szyprowski) - Update MAINTAINERS for recent Cadence driver file move (Lukas Bulwahn) * tag 'pci-v5.6-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Correct Cadence PCI driver path PCI: brcmstb: Fix build on 32bit ARM platforms with older compilers
2020-02-28Merge tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Passthrough insertion fix (Ming) - Kill off some unused arguments (John) - blktrace RCU fix (Jan) - Dead fields removal for null_blk (Dongli) - NVMe polled IO fix (Bijan) * tag 'block-5.6-2020-02-28' of git://git.kernel.dk/linux-block: nvme-pci: Hold cq_poll_lock while completing CQEs blk-mq: Remove some unused function arguments null_blk: remove unused fields in 'nullb_cmd' blktrace: Protect q->blk_trace with RCU blk-mq: insert passthrough request into hctx->dispatch directly
2020-02-28Merge tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix for a race with IOPOLL used with SQPOLL (Xiaoguang) - Only show ->fdinfo if procfs is enabled (Tobias) - Fix for a chain with multiple personalities in the SQEs - Fix for a missing free of personality idr on exit - Removal of the spin-for-work optimization - Fix for next work lookup on request completion - Fix for non-vec read/write result progation in case of links - Fix for a fileset references on switch - Fix for a recvmsg/sendmsg 32-bit compatability mode * tag 'io_uring-5.6-2020-02-28' of git://git.kernel.dk/linux-block: io_uring: fix 32-bit compatability with sendmsg/recvmsg io_uring: define and set show_fdinfo only if procfs is enabled io_uring: drop file set ref put/get on switch io_uring: import_single_range() returns 0/-ERROR io_uring: pick up link work on submit reference drop io-wq: ensure work->task_pid is cleared on init io-wq: remove spin-for-work optimization io_uring: fix poll_list race for SETUP_IOPOLL|SETUP_SQPOLL io_uring: fix personality idr leak io_uring: handle multiple personalities in link chains
2020-02-28Merge branch 'nvme-5.6-rc4' of git://git.infradead.org/nvme into block-5.6Jens Axboe
Pull NVMe fix from Keith. * 'nvme-5.6-rc4' of git://git.infradead.org/nvme: nvme-pci: Hold cq_poll_lock while completing CQEs
2020-02-28Merge tag 'acpi-5.6-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Fix a couple of configuration issues in the ACPI watchdog (WDAT) driver (Mika Westerberg) and make it possible to disable that driver at boot time in case it still does not work as expected (Jean Delvare)" * tag 'acpi-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: watchdog: Set default timeout in probe ACPI: watchdog: Fix gas->access_width usage ACPICA: Introduce ACPI_ACCESS_BYTE_WIDTH() macro ACPI: watchdog: Allow disabling WDAT at boot
2020-02-28Merge tag 'pm-5.6-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Fix a recent cpufreq initialization regression (Rafael Wysocki), revert a devfreq commit that made incompatible changes and broke user land on some systems (Orson Zhai), drop a stale reference to a document that has gone away recently (Jonathan Neuschäfer), and fix a typo in a hibernation code comment (Alexandre Belloni)" * tag 'pm-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Fix policy initialization for internal governor drivers Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs" PM / hibernate: fix typo "reserverd_size" -> "reserved_size" Documentation: power: Drop reference to interface.rst
2020-02-28Merge tag 'zonefs-5.6-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: "Two fixes in here: - Revert the initial decision to silently ignore IOCB_NOWAIT for asynchronous direct IOs to sequential zone files. Instead, return an error to the user to signal that the feature is not supported (from Christoph) - A fix to zonefs Kconfig to select FS_IOMAP to avoid build failures if no other file system already selected this option (from Johannes)" * tag 'zonefs-5.6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: select FS_IOMAP zonefs: fix IOCB_NOWAIT handling
2020-02-28Merge tag 'kvmarm-fixes-5.6-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm fixes for 5.6, take #1 - Fix compilation on 32bit - Move VHE guest entry/exit into the VHE-specific entry code - Make sure all functions called by the non-VHE HYP code is tagged as __always_inline
2020-02-28kvm: x86: Limit the number of "kvm: disabled by bios" messagesErwan Velu
In older version of systemd(219), at boot time, udevadm is called with : /usr/bin/udevadm trigger --type=devices --action=add" This program generates an echo "add" in /sys/devices/system/cpu/cpu<x>/uevent, leading to the "kvm: disabled by bios" message in case of your Bios disabled the virtualization extensions. On a modern system running up to 256 CPU threads, this pollutes the Kernel logs. This patch offers to ratelimit this message to avoid any userspace program triggering this uevent printing this message too often. This patch is only a workaround but greatly reduce the pollution without breaking the current behavior of printing a message if some try to instantiate KVM on a system that doesn't support it. Note that recent versions of systemd (>239) do not have trigger this behavior. This patch will be useful at least for some using older systemd with recent Kernels. Signed-off-by: Erwan Velu <e.velu@criteo.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28Merge branches 'pm-sleep' and 'pm-devfreq'Rafael J. Wysocki
* pm-sleep: PM / hibernate: fix typo "reserverd_size" -> "reserved_size" Documentation: power: Drop reference to interface.rst * pm-devfreq: Revert "PM / devfreq: Modify the device name as devfreq(X) for sysfs"
2020-02-28KVM: x86: avoid useless copy of cpufreq policyPaolo Bonzini
struct cpufreq_policy is quite big and it is not a good idea to allocate one on the stack. Just use cpufreq_cpu_get and cpufreq_cpu_put which is even simpler. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: allow disabling -WerrorPaolo Bonzini
Restrict -Werror to well-tested configurations and allow disabling it via Kconfig. Reported-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: x86: allow compiling as non-module with W=1Valdis Klētnieks
Compile error with CONFIG_KVM_INTEL=y and W=1: CC arch/x86/kvm/vmx/vmx.o arch/x86/kvm/vmx/vmx.c:68:32: error: 'vmx_cpu_id' defined but not used [-Werror=unused-const-variable=] 68 | static const struct x86_cpu_id vmx_cpu_id[] = { | ^~~~~~~~~~ cc1: all warnings being treated as errors When building with =y, the MODULE_DEVICE_TABLE macro doesn't generate a reference to the structure (or any code at all). This makes W=1 compiles unhappy. Wrap both in a #ifdef to avoid the issue. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> [Do the same for CONFIG_KVM_AMD. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: Pre-allocate 1 cpumask variable per cpu for both pv tlb and pv ipisWanpeng Li
Nick Desaulniers Reported: When building with: $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000 The following warning is observed: arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=] static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector) ^ Debugging with: https://github.com/ClangBuiltLinux/frame-larger-than via: $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \ kvm_send_ipi_mask_allbutself points to the stack allocated `struct cpumask newmask` in `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as 8192, making a single instance of a `struct cpumask` 1024 B. This patch fixes it by pre-allocate 1 cpumask variable per cpu and use it for both pv tlb and pv ipis.. Reported-by: Nick Desaulniers <ndesaulniers@google.com> Acked-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: Introduce pv check helpersWanpeng Li
Introduce some pv check helpers for consistency. Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: let declaration of kvm_get_running_vcpus match implementationChristian Borntraeger
Sparse notices that declaration and implementation do not match: arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: warning: incorrect type in return expression (different address spaces) arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: expected struct kvm_vcpu [noderef] <asn:3> ** arch/s390/kvm/../../../virt/kvm/kvm_main.c:4435:17: got struct kvm_vcpu *[noderef] <asn:3> * Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-28KVM: SVM: allocate AVIC data structures based on kvm_amd module parameterPaolo Bonzini
Even if APICv is disabled at startup, the backing page and ir_list need to be initialized in case they are needed later. The only case in which this can be skipped is for userspace irqchip, and that must be done because avic_init_backing_page dereferences vcpu->arch.apic (which is NULL for userspace irqchip). Tested-by: rmuncrief@humanavance.com Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206579 Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-27Merge tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Just some fixes for this week: amdgpu, radeon and i915. The main i915 one is a regression Gen7 (Ivybridge/Haswell), this moves them back from trying to use the full-ppgtt support to the aliasing version it used to use due to gpu hangs. Otherwise it's pretty quiet. amdgpu: - Drop DRIVER_USE_AGP - Fix memory leak in GPU reset - Resume fix for raven radeon: - Drop DRIVER_USE_AGP i915: - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs - shrinker fix - pmu leak and double free fixes - gvt user after free and virtual display reset fixes - randconfig build fix" * tag 'drm-fixes-2020-02-28' of git://anongit.freedesktop.org/drm/drm: drm/radeon: Inline drm_get_pci_dev drm/amdgpu: Drop DRIVER_USE_AGP drm/i915: Avoid recursing onto active vma from the shrinker drm/i915/pmu: Avoid using globals for PMU events drm/i915/pmu: Avoid using globals for CPU hotplug state drm/i915/gtt: Downgrade gen7 (ivb, byt, hsw) back to aliasing-ppgtt drm/i915: fix header test with GCOV amdgpu/gmc_v9: save/restore sdpif regs during S3 drm/amdgpu: fix memory leak during TDR test(v2) drm/i915/gvt: Fix orphan vgpu dmabuf_objs' lifetime drm/i915/gvt: Separate display reset from ALL_ENGINES reset
2020-02-28Merge tag 'drm-intel-fixes-2020-02-27' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v5.6-rc4: - downgrade gen7 back to aliasing-ppgtt to avoid GPU hangs - shrinker fix - pmu leak and double free fixes - gvt user after free and virtual display reset fixes - randconfig build fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/874kvcsh00.fsf@intel.com
2020-02-28Merge tag 'amd-drm-fixes-5.6-2020-02-26' of ↵Dave Airlie
git://people.freedesktop.org/~agd5f/linux into drm-fixes amd-drm-fixes-5.6-2020-02-26: amdgpu: - Drop DRIVER_USE_AGP - Fix memory leak in GPU reset - Resume fix for raven radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227034106.3912-1-alexander.deucher@amd.com
2020-02-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix leak in nl80211 AP start where we leak the ACL memory, from Johannes Berg. 2) Fix double mutex unlock in mac80211, from Andrei Otcheretianski. 3) Fix RCU stall in ipset, from Jozsef Kadlecsik. 4) Fix devlink locking in devlink_dpipe_table_register, from Madhuparna Bhowmik. 5) Fix race causing TX hang in ll_temac, from Esben Haabendal. 6) Stale eth hdr pointer in br_dev_xmit(), from Nikolay Aleksandrov. 7) Fix TX hash calculation bounds checking wrt. tc rules, from Amritha Nambiar. 8) Size netlink responses properly in schedule action code to take into consideration TCA_ACT_FLAGS. From Jiri Pirko. 9) Fix firmware paths for mscc PHY driver, from Antoine Tenart. 10) Don't register stmmac notifier multiple times, from Aaro Koskinen. 11) Various rmnet bug fixes, from Taehee Yoo. 12) Fix vsock deadlock in vsock transport release, from Stefano Garzarella. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits) net: dsa: mv88e6xxx: Fix masking of egress port mlxsw: pci: Wait longer before accessing the device after reset sfc: fix timestamp reconstruction at 16-bit rollover points vsock: fix potential deadlock in transport->release() unix: It's CONFIG_PROC_FS not CONFIG_PROCFS net: rmnet: fix packet forwarding in rmnet bridge mode net: rmnet: fix bridge mode bugs net: rmnet: use upper/lower device infrastructure net: rmnet: do not allow to change mux id if mux id is duplicated net: rmnet: remove rcu_read_lock in rmnet_force_unassociate_device() net: rmnet: fix suspicious RCU usage net: rmnet: fix NULL pointer dereference in rmnet_changelink() net: rmnet: fix NULL pointer dereference in rmnet_newlink() net: phy: marvell: don't interpret PHY status unless resolved mlx5: register lag notifier for init network namespace only unix: define and set show_fdinfo only if procfs is enabled hinic: fix a bug of rss configuration hinic: fix a bug of setting hw_ioctxt hinic: fix a irq affinity bug net/smc: check for valid ib_client_data ...
2020-02-27MAINTAINERS: Correct Cadence PCI driver pathLukas Bulwahn
de80f95ccb9c ("PCI: cadence: Move all files to per-device cadence directory") moved files of the PCI cadence drivers, but did not update the MAINTAINERS entry. Since then, ./scripts/get_maintainer.pl --self-test complains: warning: no file matches F: drivers/pci/controller/pcie-cadence* Repair the MAINTAINERS entry. Link: https://lore.kernel.org/r/20200221185402.4703-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>