Age  Commit message  Author
2020-09-08  mailmap, MAINTAINERS: move to tycho.pizza  Tycho Andersen
I've changed my e-mail address to tycho.pizza, so let's reflect that in these files. Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200902014017.934315-2-tycho@tycho.pizza Signed-off-by: Kees Cook <keescook@chromium.org>
2020-09-08  seccomp: don't leak memory when filter install races  Tycho Andersen
In seccomp_set_mode_filter() with TSYNC | NEW_LISTENER, we first initialize the listener fd, then check to see if we can actually use it later in seccomp_may_assign_mode(), which can fail if anyone else in our thread group has installed a filter and caused some divergence. If we can't, we partially clean up the newly allocated file: we put the fd, put the file, but don't actually clean up the *memory* that was allocated at filter->notif. Let's clean that up too. To accomplish this, let's hoist the actual "detach a notifier from a filter" code to its own helper out of seccomp_notify_release(), so that in case anyone adds stuff to init_listener(), they only have to add the cleanup code in one spot. This does a bit of extra locking and such on the failure path when the filter is not attached, but it's a slow failure path anyway. Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together") Reported-by: syzbot+3ad9614a12f80994c32e@syzkaller.appspotmail.com Signed-off-by: Tycho Andersen <tycho@tycho.pizza> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/r/20200902014017.934315-1-tycho@tycho.pizza Signed-off-by: Kees Cook <keescook@chromium.org>
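The "one helper for both the release path and the failure path" idea can be sketched in userspace C. This is only an illustration under stated assumptions: struct filter, struct notification, notify_detach() and the live_notifs counter are hypothetical stand-ins, not the kernel's seccomp types or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the kernel's types. */
struct notification { int next_id; };
struct filter { struct notification *notif; };

int live_notifs; /* counts outstanding allocations for the demo */

int init_listener(struct filter *f)
{
	f->notif = calloc(1, sizeof(*f->notif));
	if (!f->notif)
		return -1;
	live_notifs++;
	return 0;
}

/* Single place that undoes init_listener(); safe to call twice. */
void notify_detach(struct filter *f)
{
	assert(f != NULL);
	if (!f->notif)
		return;
	free(f->notif);
	f->notif = NULL;
	live_notifs--;
}

/* Release path: used when the listener is closed. */
void notify_release(struct filter *f)
{
	notify_detach(f);
}

/* Install path: may_assign == 0 models losing the race. */
int set_mode_filter(struct filter *f, int may_assign)
{
	if (init_listener(f))
		return -1;
	if (!may_assign) {
		notify_detach(f); /* without this call, f->notif would leak */
		return -1;
	}
	return 0;
}
```

Because both paths go through notify_detach(), anything later added to init_listener() needs its cleanup in only one spot.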
2020-09-08  Merge tag 'drm-fixes-2020-09-08' of git://anongit.freedesktop.org/drm/drm  Linus Torvalds
Pull drm fixes from Dave Airlie: "The i915 reverts are going to be a bit of a conflict mess for next, so I decided to dequeue them now, along with some msm fixes for a ring corruption issue, that Rob sent over the weekend. Summary: i915: - revert gpu relocation changes due to regression msm: - fixes for RPTR corruption issue" * tag 'drm-fixes-2020-09-08' of git://anongit.freedesktop.org/drm/drm: Revert "drm/i915/gem: Delete unused code" Revert "drm/i915/gem: Async GPU relocations only" Revert "drm/i915: Remove i915_gem_object_get_dirty_page()" drm/msm: Disable the RPTR shadow drm/msm: Disable preemption on all 5xx targets drm/msm: Enable expanded apriv support for a650 drm/msm: Split the a5xx preemption record
2020-09-08  Merge tag 'livepatching-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching  Linus Torvalds
Pull livepatching fix from Petr Mladek: "Workaround for 'unreachable instruction' objtool warnings that happen with some compiler versions" * tag 'livepatching-for-5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: Revert "kbuild: use -flive-patching when CONFIG_LIVEPATCH is enabled"
2020-09-08  nvme-tcp: cancel async events before freeing event struct  David Milburn
Cancel async event work in case async event has been queued up, and nvme_tcp_submit_async_event() runs after event has been freed. Signed-off-by: David Milburn <dmilburn@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-08  nvme-rdma: cancel async events before freeing event struct  David Milburn
Cancel async event work in case async event has been queued up, and nvme_rdma_submit_async_event() runs after event has been freed. Signed-off-by: David Milburn <dmilburn@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-08  nvme-fc: cancel async events before freeing event struct  David Milburn
Cancel async event work in case async event has been queued up, and nvme_fc_submit_async_event() runs after event has been freed. Signed-off-by: David Milburn <dmilburn@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-08  nvme: Revert: Fix controller creation races with teardown flow  James Smart
The indicated patch introduced a barrier in the sysfs_delete attribute for the controller that rejects the request if the controller isn't created. "Created" is defined as at least 1 call to nvme_start_ctrl(). This is problematic in error-injection testing. If an error occurs on the initial attempt to create an association and the controller enters reconnect(s) attempts, the admin cannot delete the controller until either there is a successful association created or ctrl_loss_tmo times out. Where this issue is particularly hurtful is when the "admin" is the nvme-cli, it is performing a connection to a discovery controller, and it is initiated via auto-connect scripts. With the FC transport, if the first connection attempt fails, the controller enters a normal reconnect state but returns control to the cli thread that created the controller. In this scenario, the cli attempts to read the discovery log via ioctl, which fails, causing the cli to see it as an empty log and then proceeds to delete the discovery controller. The delete is rejected and the controller is left live. If the discovery controller reconnect then succeeds, there is no action to delete it, and it sits live doing nothing. Cc: <stable@vger.kernel.org> # v5.7+ Fixes: ce1518139e69 ("nvme: Fix controller creation races with teardown flow") Signed-off-by: James Smart <james.smart@broadcom.com> CC: Israel Rukshin <israelr@mellanox.com> CC: Max Gurtovoy <maxg@mellanox.com> CC: Christoph Hellwig <hch@lst.de> CC: Keith Busch <kbusch@kernel.org> CC: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-08  Merge tag 'usb-serial-5.9-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus  Greg Kroah-Hartman
Johan writes: USB-serial fixes for 5.9-rc5 Here are some new device ids for 5.9. All have been in linux-next with no reported issues. * tag 'usb-serial-5.9-rc5' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: option: support dynamic Quectel USB compositions USB: serial: option: add support for SIM7070/SIM7080/SIM7090 modules USB: serial: ftdi_sio: add IDs for Xsens Mti USB converter
2020-09-08  spi: spi-cadence-quadspi: Fix mapping of buffers for DMA reads  Vignesh Raghavendra
Buffers need to be mapped to the DMA channel's device pointer instead of the SPI controller's device pointer, as it is the system DMA that actually does the data transfer. Data inconsistencies have been reported when reading from flash without this fix. Fixes: ffa639e069fb ("mtd: spi-nor: cadence-quadspi: Add DMA support for direct mode reads") Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Tested-by: Jan Kiszka <jan.kiszka@siemens.com> Link: https://lore.kernel.org/r/20200831130720.4524-1-vigneshr@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-08  block: restore a specific error code in bdev_del_partition  Christoph Hellwig
mdadm relies on the fact that deleting an invalid partition returns -ENXIO or -ENOTTY to detect if a block device is a partition or a whole device. Fixes: 08fc1ab6d748 ("block: fix locking in bdev_del_partition") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
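A userspace check that depends on these error codes might look roughly like the following. This is a guess at the shape of such a test, not mdadm's actual code; the function name and the convention of passing the return value and errno explicitly are assumptions made so the logic can be exercised without a real block device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Assumed convention: ret is 0 on success or -1 on failure, and err is the
 * errno left behind by a "delete partition" request that names a bogus
 * partition number.
 */
bool looks_like_whole_device(int ret, int err)
{
	assert(ret == 0 || ret == -1);
	if (ret == 0)
		return true; /* request was accepted, so this is not a partition */
	/* The two codes the commit message says userspace depends on. */
	return err == ENXIO || err == ENOTTY;
}
```

If the kernel starts returning a different code for the same request, a check of this kind silently flips its answer, which is why the specific value matters.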
2020-09-08  drm/i915: fix regression leading to display audio probe failure on GLK  Kai Vehmanen
In commit 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking to separate function") the order of the force_min_cdclk_changed check and intel_modeset_checks() was reversed. This broke the mechanism to immediately force a new CDCLK minimum, and led to driver probe errors for display audio on GLK platform with 5.9-rc1 kernel. Fix the issue by moving intel_modeset_checks() call later. [vsyrjala: It also broke the ability of planes to bump up the cdclk and thus could lead to underruns when eg. flipping from 32bpp to 64bpp framebuffer. To be clear, we still compute the new cdclk correctly but fail to actually program it to the hardware due to intel_set_cdclk_{pre,post}_plane_update() not getting called on account of state->modeset==false.] Fixes: 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking to separate function") BugLink: https://github.com/thesofproject/linux/issues/2410 Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200901151036.1312357-1-kai.vehmanen@linux.intel.com (cherry picked from commit cf696856bc54a31f78e6538b84c8f7a006b6108b) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-09-08  i2c: npcm7xx: Fix timeout calculation  Tali Perry
timeout_usec value calculation was wrong, the calculated value was in msec instead of usec. Fixes: 56a1485b102e ("i2c: npcm7xx: Add Nuvoton NPCM I2C controller driver") Signed-off-by: Tali Perry <tali.perry1@gmail.com> Reviewed-by: Avi Fishman <avifishman70@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Alex Qiu <xqiu@google.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
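The unit mix-up can be illustrated with a small worked calculation. This is not the driver's formula; the 9 clock cycles per byte (8 data bits plus ACK) and the function names are assumptions chosen to show how the wrong scale constant produces milliseconds where microseconds are expected.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL
#define MSEC_PER_SEC 1000ULL

/* Assumption: 9 clock cycles per byte (8 data bits plus ACK). */
#define CYCLES_PER_BYTE 9ULL

/* Transfer time in microseconds for nbytes at bus_freq_hz. */
uint64_t xfer_time_usec(uint64_t bus_freq_hz, uint64_t nbytes)
{
	assert(bus_freq_hz > 0);
	return CYCLES_PER_BYTE * nbytes * USEC_PER_SEC / bus_freq_hz;
}

/* Same formula with the wrong scale constant: the result is milliseconds. */
uint64_t xfer_time_msec(uint64_t bus_freq_hz, uint64_t nbytes)
{
	assert(bus_freq_hz > 0);
	return CYCLES_PER_BYTE * nbytes * MSEC_PER_SEC / bus_freq_hz;
}
```

For 10 bytes at 100 kHz the first function gives 900 (microseconds) while the second truncates to 0 (milliseconds), so a value computed the second way but stored in a variable named for microseconds is a thousand times too small.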
2020-09-08  Revert "drm/i915/gem: Delete unused code"  Dave Airlie
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 7ac2d2536dfa71c275a74813345779b1e7522c91. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08  Revert "drm/i915/gem: Async GPU relocations only"  Dave Airlie
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 9e0f9464e2ab36b864359a59b0e9058fdef0ce47. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08  Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"  Dave Airlie
These commits caused a regression on Lenovo t520 sandybridge machine belonging to reporter. We are reverting them for 5.10 for other reasons, so just do it for 5.9 as well. This reverts commit 763fedd6a216f94c2eb98d2f7ca21be3d3806e69. Reported-by: Harald Arnesen <harald@skogtun.org> Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08  Merge tag 'drm-msm-fixes-2020-09-04' of https://gitlab.freedesktop.org/drm/msm into drm-fixes  Dave Airlie
A few fixes for a potential RPTR corruption issue. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGvnr6Nhz2J0sjv2G+j7iceVtaDiJDT8T88uW6jiBfOGKQ@mail.gmail.com
2020-09-07  btrfs: fix NULL pointer dereference after failure to create snapshot  Filipe Manana
When trying to get a new fs root for a snapshot during the transaction at transaction.c:create_pending_snapshot(), if btrfs_get_new_fs_root() fails we leave "pending->snap" pointing to an error pointer, and then later at ioctl.c:create_snapshot() we dereference that pointer, resulting in a crash: [12264.614689] BUG: kernel NULL pointer dereference, address: 00000000000007c4 [12264.615650] #PF: supervisor write access in kernel mode [12264.616487] #PF: error_code(0x0002) - not-present page [12264.617436] PGD 0 P4D 0 [12264.618328] Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [12264.619150] CPU: 0 PID: 2310635 Comm: fsstress Tainted: G W 5.9.0-rc3-btrfs-next-67 #1 [12264.619960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [12264.621769] RIP: 0010:btrfs_mksubvol+0x438/0x4a0 [btrfs] [12264.622528] Code: bc ef ff ff (...) [12264.624092] RSP: 0018:ffffaa6fc7277cd8 EFLAGS: 00010282 [12264.624669] RAX: 00000000fffffff4 RBX: ffff9d3e8f151a60 RCX: 0000000000000000 [12264.625249] RDX: 0000000000000001 RSI: ffffffff9d56c9be RDI: fffffffffffffff4 [12264.625830] RBP: ffff9d3e8f151b48 R08: 0000000000000000 R09: 0000000000000000 [12264.626413] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000fffffff4 [12264.626994] R13: ffff9d3ede380538 R14: ffff9d3ede380500 R15: ffff9d3f61b2eeb8 [12264.627582] FS: 00007f140d5d8200(0000) GS:ffff9d3fb5e00000(0000) knlGS:0000000000000000 [12264.628176] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12264.628773] CR2: 00000000000007c4 CR3: 000000020f8e8004 CR4: 00000000003706f0 [12264.629379] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [12264.629994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [12264.630594] Call Trace: [12264.631227] btrfs_mksnapshot+0x7b/0xb0 [btrfs] [12264.631840] __btrfs_ioctl_snap_create+0x16f/0x1a0 [btrfs] [12264.632458] btrfs_ioctl_snap_create_v2+0xb0/0xf0 [btrfs] [12264.633078] 
btrfs_ioctl+0x1864/0x3130 [btrfs] [12264.633689] ? do_sys_openat2+0x1a7/0x2d0 [12264.634295] ? kmem_cache_free+0x147/0x3a0 [12264.634899] ? __x64_sys_ioctl+0x83/0xb0 [12264.635488] __x64_sys_ioctl+0x83/0xb0 [12264.636058] do_syscall_64+0x33/0x80 [12264.636616] entry_SYSCALL_64_after_hwframe+0x44/0xa9 (gdb) list *(btrfs_mksubvol+0x438) 0x7c7b8 is in btrfs_mksubvol (fs/btrfs/ioctl.c:858). 853 ret = 0; 854 pending_snapshot->anon_dev = 0; 855 fail: 856 /* Prevent double freeing of anon_dev */ 857 if (ret && pending_snapshot->snap) 858 pending_snapshot->snap->anon_dev = 0; 859 btrfs_put_root(pending_snapshot->snap); 860 btrfs_subvolume_release_metadata(root, &pending_snapshot->block_rsv); 861 free_pending: 862 if (pending_snapshot->anon_dev) So fix this by setting "pending->snap" to NULL if we get an error from the call to btrfs_get_new_fs_root() at transaction.c:create_pending_snapshot(). Fixes: 2dfb1e43f57dd3 ("btrfs: preallocate anon block device at first phase of snapshot creation") Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
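The failure mode, an error pointer that passes a plain NULL check, can be reproduced in a few lines of userspace C. The helpers below imitate the kernel's ERR_PTR convention; every name here (err_ptr, is_err, struct root, create_pending, cleanup) is a hypothetical stand-in rather than btrfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
void *err_ptr(long err) { return (void *)(intptr_t)err; }
long ptr_err(const void *p) { return (long)(intptr_t)p; }
int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }

struct root { int anon_dev; };
struct pending_snapshot { struct root *snap; };

/* Stand-in for the root lookup; fail != 0 models the error path. */
struct root *get_new_root(struct root *r, int fail)
{
	return fail ? err_ptr(-ENOMEM) : r;
}

int create_pending(struct pending_snapshot *p, struct root *r, int fail)
{
	assert(p != NULL);
	p->snap = get_new_root(r, fail);
	if (is_err(p->snap)) {
		int ret = (int)ptr_err(p->snap);
		p->snap = NULL; /* never leave an error pointer behind */
		return ret;
	}
	return 0;
}

/* Caller-side cleanup that guards only against NULL, as in the listing. */
void cleanup(struct pending_snapshot *p, int ret)
{
	if (ret && p->snap)
		p->snap->anon_dev = 0;
}
```

Without the assignment to NULL in create_pending(), cleanup() would treat the error value as a valid pointer and write through it.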
2020-09-07  usb: typec: intel_pmc_mux: Do not configure SBU and HSL Orientation in Alternate modes  Utkarsh Patel
According to the PMC Type C Subsystem (TCSS) Mux programming guide rev 0.7, bits 4 and 5 are reserved in Alternate modes. SBU Orientation and HSL Orientation need to be configured only during initial cable detection in USB connect flow based on device property of "sbu-orientation" and "hsl-orientation". Configuring these reserved bits in the Alternate modes may result in delay in display link training or some unexpected behaviour. So do not configure them while issuing Alternate Mode requests. Fixes: ff4a30d5e243 ("usb: typec: mux: intel_pmc_mux: Support for static SBU/HSL orientation") Signed-off-by: Utkarsh Patel <utkarsh.h.patel@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200907142152.35678-3-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
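A minimal sketch of keeping reserved bits clear in one request type while allowing them in another. The macro names, the 32-bit request word and the assignment of bit 4 to SBU and bit 5 to HSL are assumptions for illustration; the commit message only says that bits 4 and 5 are reserved in Alternate modes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: which of bits 4/5 is SBU vs HSL is an assumption. */
#define REQ_ORI_SBU      (1u << 4)
#define REQ_ORI_HSL      (1u << 5)
#define REQ_ALT_RESERVED (REQ_ORI_SBU | REQ_ORI_HSL)

/* Connect request: orientation bits may be set from device properties. */
uint32_t build_connect_req(uint32_t base, int sbu_flipped, int hsl_flipped)
{
	uint32_t req = base;

	if (sbu_flipped)
		req |= REQ_ORI_SBU;
	if (hsl_flipped)
		req |= REQ_ORI_HSL;
	return req;
}

/* Alternate-mode request: the reserved bits are always left clear. */
uint32_t build_altmode_req(uint32_t base)
{
	uint32_t req = base & ~REQ_ALT_RESERVED;

	assert((req & REQ_ALT_RESERVED) == 0);
	return req;
}
```

Masking in the builder, rather than trusting each caller, keeps the rule in one place.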
2020-09-07  usb: typec: intel_pmc_mux: Do not configure Altmode HPD High  Utkarsh Patel
According to the PMC Type C Subsystem (TCSS) Mux programming guide rev 0.7, bit 14 is reserved in Alternate mode. In DP Alternate Mode state, if the HPD_STATE (bit 7) field in the status update command VDO is set to HPD_HIGH, HPD is configured via separate HPD mode request after configuring DP Alternate mode request. Configuring reserved bit may show unexpected behaviour. So do not configure them while issuing the Alternate Mode request. Fixes: 7990be48ef4d ("usb: typec: mux: intel: Handle alt mode HPD_HIGH") Cc: stable@vger.kernel.org Signed-off-by: Utkarsh Patel <utkarsh.h.patel@intel.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20200907142152.35678-2-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-07  scripts/tags.sh: exclude tools directory from tags generation  Rustam Kovhaev
when COMPILED_SOURCE is set, running 'make ARCH=x86_64 COMPILED_SOURCE=1 cscope tags' in KBUILD_OUTPUT directory produces lots of "No such file or directory" warnings: ... realpath: sigchain.h: No such file or directory realpath: orc_gen.c: No such file or directory realpath: objtool.c: No such file or directory ... let's exclude tools directory from tags generation Fixes: 4f491bb6ea2a ("scripts/tags.sh: collect compiled source precisely") Link: https://lore.kernel.org/lkml/20200809210056.GA1344537@thinkpad Signed-off-by: Rustam Kovhaev <rkovhaev@gmail.com> Link: https://lore.kernel.org/r/20200810153650.1822316-1-rkovhaev@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-07  btrfs: free data reloc tree on failed mount  Josef Bacik
While testing a weird problem with -o degraded, I noticed I was getting leaked root errors BTRFS warning (device loop0): writable mount is not allowed due to too many missing devices BTRFS error (device loop0): open_ctree failed BTRFS error (device loop0): leaked root -9-0 refcount 1 This is the DATA_RELOC root, which gets read before the other fs roots, but is included in the fs roots radix tree. Handle this by adding a btrfs_drop_and_free_fs_root() on the data reloc root if it exists. This is ok to do here if we fail further up because we will only drop the ref if we delete the root from the radix tree, and all other cleanup won't be duplicated. CC: stable@vger.kernel.org # 5.8+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-09-07  btrfs: require only sector size alignment for parent eb bytenr  Qu Wenruo
[BUG] A completely sane converted fs will cause kernel warning at balance time: [ 1557.188633] BTRFS info (device sda7): relocating block group 8162107392 flags data [ 1563.358078] BTRFS info (device sda7): found 11722 extents [ 1563.358277] BTRFS info (device sda7): leaf 7989321728 gen 95 total ptrs 213 free space 3458 owner 2 [ 1563.358280] item 0 key (7984947200 169 0) itemoff 16250 itemsize 33 [ 1563.358281] extent refs 1 gen 90 flags 2 [ 1563.358282] ref#0: tree block backref root 4 [ 1563.358285] item 1 key (7985602560 169 0) itemoff 16217 itemsize 33 [ 1563.358286] extent refs 1 gen 93 flags 258 [ 1563.358287] ref#0: shared block backref parent 7985602560 [ 1563.358288] (parent 7985602560 is NOT ALIGNED to nodesize 16384) [ 1563.358290] item 2 key (7985635328 169 0) itemoff 16184 itemsize 33 ... [ 1563.358995] BTRFS error (device sda7): eb 7989321728 invalid extent inline ref type 182 [ 1563.358996] ------------[ cut here ]------------ [ 1563.359005] WARNING: CPU: 14 PID: 2930 at 0xffffffff9f231766 Then with transaction abort, and obviously failed to balance the fs. [CAUSE] That mentioned inline ref type 182 is completely sane, it's BTRFS_SHARED_BLOCK_REF_KEY, it's some extra check making kernel to believe it's invalid. Commit 64ecdb647ddb ("Btrfs: add one more sanity check for shared ref type") introduced extra checks for backref type. One of the requirement is, parent bytenr must be aligned to node size, which is not correct. One example is like this: 0 1G 1G+4K 2G 2G+4K | |///////////////////|//| <- A chunk starts at 1G+4K | | <- A tree block get reserved at bytenr 1G+4K Then we have a valid tree block at bytenr 1G+4K, but not aligned to nodesize (16K). Such chunk is not ideal, but current kernel can handle it pretty well. We may warn about such tree block in the future, but should not reject them. [FIX] Change the alignment requirement from node size alignment to sector size alignment. 
Also, to make our lives a little easier, also output @iref when btrfs_get_extent_inline_ref_type() failed, so we can locate the item easier. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205475 Fixes: 64ecdb647ddb ("Btrfs: add one more sanity check for shared ref type") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> [ update comments and messages ] Signed-off-by: David Sterba <dsterba@suse.com>
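The difference between the two alignment requirements can be checked with plain arithmetic. The sketch below assumes a 4K sector size and the 16K node size mentioned in the log; is_aligned() is a local helper written for this example, not a kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4K  4096ULL        /* assumed sector size */
#define SZ_16K 16384ULL       /* node size from the log above */
#define SZ_1G  (1ULL << 30)

/* True if value is a multiple of align; align must be a power of two. */
bool is_aligned(uint64_t value, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (value & (align - 1)) == 0;
}
```

A tree block at 1G+4K fails is_aligned(SZ_1G + SZ_4K, SZ_16K) but passes is_aligned(SZ_1G + SZ_4K, SZ_4K). The same holds for the parent bytenr 7985602560 reported in the warning: it is a multiple of 4096 but not of 16384.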
2020-09-07  btrfs: fix lockdep splat in add_missing_dev  Josef Bacik
Nikolay reported a lockdep splat in generic/476 that I could reproduce with btrfs/187. ====================================================== WARNING: possible circular locking dependency detected 5.9.0-rc2+ #1 Tainted: G W ------------------------------------------------------ kswapd0/100 is trying to acquire lock: ffff9e8ef38b6268 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330 but task is already holding lock: ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x65/0x80 slab_pre_alloc_hook.constprop.0+0x20/0x200 kmem_cache_alloc_trace+0x3a/0x1a0 btrfs_alloc_device+0x43/0x210 add_missing_dev+0x20/0x90 read_one_chunk+0x301/0x430 btrfs_read_sys_array+0x17b/0x1b0 open_ctree+0xa62/0x1896 btrfs_mount_root.cold+0x12/0xea legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 vfs_kern_mount.part.0+0x71/0xb0 btrfs_mount+0x10d/0x379 legacy_get_tree+0x30/0x50 vfs_get_tree+0x28/0xc0 path_mount+0x434/0xc00 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #1 (&fs_info->chunk_mutex){+.+.}-{3:3}: __mutex_lock+0x7e/0x7e0 btrfs_chunk_alloc+0x125/0x3a0 find_free_extent+0xdf6/0x1210 btrfs_reserve_extent+0xb3/0x1b0 btrfs_alloc_tree_block+0xb0/0x310 alloc_tree_block_no_bg_flush+0x4a/0x60 __btrfs_cow_block+0x11a/0x530 btrfs_cow_block+0x104/0x220 btrfs_search_slot+0x52e/0x9d0 btrfs_lookup_inode+0x2a/0x8f __btrfs_update_delayed_inode+0x80/0x240 btrfs_commit_inode_delayed_inode+0x119/0x120 btrfs_evict_inode+0x357/0x500 evict+0xcf/0x1f0 vfs_rmdir.part.0+0x149/0x160 do_rmdir+0x136/0x1a0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&delayed_node->mutex){+.+.}-{3:3}: __lock_acquire+0x1184/0x1fa0 lock_acquire+0xa4/0x3d0 __mutex_lock+0x7e/0x7e0 __btrfs_release_delayed_node.part.0+0x3f/0x330 
btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 kthread+0x138/0x160 ret_from_fork+0x1f/0x30 other info that might help us debug this: Chain exists of: &delayed_node->mutex --> &fs_info->chunk_mutex --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(&fs_info->chunk_mutex); lock(fs_reclaim); lock(&delayed_node->mutex); *** DEADLOCK *** 3 locks held by kswapd0/100: #0: ffffffffa9d74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 #1: ffffffffa9d65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290 #2: ffff9e8e9da260e0 (&type->s_umount_key#48){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0 stack backtrace: CPU: 1 PID: 100 Comm: kswapd0 Tainted: G W 5.9.0-rc2+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack+0x92/0xc8 check_noncircular+0x12d/0x150 __lock_acquire+0x1184/0x1fa0 lock_acquire+0xa4/0x3d0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 __mutex_lock+0x7e/0x7e0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? lock_acquire+0xa4/0x3d0 ? btrfs_evict_inode+0x11e/0x500 ? find_held_lock+0x2b/0x80 __btrfs_release_delayed_node.part.0+0x3f/0x330 btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 ? _raw_spin_unlock_irqrestore+0x46/0x60 ? add_wait_queue_exclusive+0x70/0x70 ? balance_pgdat+0x670/0x670 kthread+0x138/0x160 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x1f/0x30 This is because we are holding the chunk_mutex when we call btrfs_alloc_device, which does a GFP_KERNEL allocation. 
We don't want to switch that to a GFP_NOFS lock because this is the only place where it matters. So instead use memalloc_nofs_save() around the allocation in order to avoid the lockdep splat. Reported-by: Nikolay Borisov <nborisov@suse.com> CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
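The scoped save/restore pattern used around the allocation can be modelled in userspace. This is a simplified imitation, assuming a single per-task flag word; TASK_NOFS, task_flags and the function names are hypothetical and only mirror the shape of the kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

#define TASK_NOFS 0x1u      /* hypothetical per-task flag */
unsigned int task_flags;    /* stands in for the current task's flags */

/* Enter a no-filesystem-recursion scope; returns the previous state. */
unsigned int nofs_save(void)
{
	unsigned int old = task_flags & TASK_NOFS;

	task_flags |= TASK_NOFS;
	return old;
}

/* Leave the scope, putting back whatever state was saved. */
void nofs_restore(unsigned int old)
{
	assert((old & ~TASK_NOFS) == 0);
	task_flags = (task_flags & ~TASK_NOFS) | old;
}

/* What an allocator would consult before recursing into the filesystem. */
bool alloc_may_recurse_into_fs(void)
{
	return !(task_flags & TASK_NOFS);
}
```

Returning the old state from the save call is what makes nesting safe: an inner restore does not clear a restriction set by an outer scope.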
2020-09-07  PM: <linux/device.h>: fix @em_pd kernel-doc warning  Randy Dunlap
Fix kernel-doc warning in <linux/device.h>: ../include/linux/device.h:613: warning: Function parameter or member 'em_pd' not described in 'device' Fixes: 1bc138c62295 ("PM / EM: add support for other devices than CPUs in Energy Model") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://lore.kernel.org/r/d97f40ad-3033-703a-c3cb-2843ce0f6371@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-07  openrisc: Fix cache API compile issue when not inlining  Stafford Horne
I found this when compiling a kbuild random config with GCC 11. The config enables CONFIG_DEBUG_SECTION_MISMATCH, which sets CFLAGS -fno-inline-functions-called-once. This causes the call to cache_loop in cache.c to not be inlined causing the below compile error. In file included from arch/openrisc/mm/cache.c:13: arch/openrisc/mm/cache.c: In function 'cache_loop': ./arch/openrisc/include/asm/spr.h:16:27: warning: 'asm' operand 0 probably does not match constraints 16 | #define mtspr(_spr, _val) __asm__ __volatile__ ( \ | ^~~~~~~ arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr' 25 | mtspr(reg, line); | ^~~~~ ./arch/openrisc/include/asm/spr.h:16:27: error: impossible constraint in 'asm' 16 | #define mtspr(_spr, _val) __asm__ __volatile__ ( \ | ^~~~~~~ arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr' 25 | mtspr(reg, line); | ^~~~~ make[1]: *** [scripts/Makefile.build:283: arch/openrisc/mm/cache.o] Error 1 The asm constraint "K" requires a immediate constant argument to mtspr, however because of no inlining a register argument is passed causing a failure. Fix this by using __always_inline. Link: https://lore.kernel.org/lkml/202008200453.ohnhqkjQ%25lkp@intel.com/ Signed-off-by: Stafford Horne <shorne@gmail.com>
2020-09-07  openrisc: Reserve memblock for initrd  Stafford Horne
Recently OpenRISC added support for external initrd images, but I found some instability when using larger buildroot initrd images. It turned out that I forgot to reserve the memblock space for the initrd image. This patch fixes the instability issue by reserving memblock space. Fixes: ff6c923dbec3 ("openrisc: Add support for external initrd images") Signed-off-by: Stafford Horne <shorne@gmail.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
2020-09-07  spi: stm32: Rate-limit the 'Communication suspended' message  Marek Vasut
The 'spi_stm32 44004000.spi: Communication suspended' message means that when using PIO, the kernel did not read the FIFO fast enough and so the SPI controller paused the transfer. Currently, this is printed on every single such event, so if the kernel is busy and the controller is pausing the transfers often, the kernel will be all the more busy scrolling this message into the log buffer every few milliseconds. That is not helpful. Instead, rate-limit the message and print it every once in a while. It is not possible to use the default dev_warn_ratelimited(), because that is still too verbose, as it prints 10 lines (DEFAULT_RATELIMIT_BURST) every 5 seconds (DEFAULT_RATELIMIT_INTERVAL). The policy here is to print 1 line every 50 seconds (DEFAULT_RATELIMIT_INTERVAL * 10), because 1 line is more than enough and the cycles saved on printing are better left to the CPU to handle the SPI. However, dev_warn_once() is also not useful, as the user should be aware that this condition is possibly recurring or ongoing. Thus the custom rate-limit policy. Finally, turn the message from dev_warn() to dev_dbg(), since the system does not suffer any sort of malfunction if this message appears, it is just slowing down. This further reduces the printing into the log buffer and frees the CPU to do useful work. Fixes: dcbe0d84dfa5 ("spi: add driver for STM32 SPI controller") Signed-off-by: Marek Vasut <marex@denx.de> Cc: Alexandre Torgue <alexandre.torgue@st.com> Cc: Amelie Delaunay <amelie.delaunay@st.com> Cc: Antonio Borneo <borneo.antonio@gmail.com> Cc: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20200905151913.117775-1-marex@denx.de Signed-off-by: Mark Brown <broonie@kernel.org>
2020-09-07  rbd: require global CAP_SYS_ADMIN for mapping and unmapping  Ilya Dryomov
It turns out that currently we rely only on sysfs attribute permissions: $ ll /sys/bus/rbd/{add*,remove*} --w------- 1 root root 4096 Sep 3 20:37 /sys/bus/rbd/add --w------- 1 root root 4096 Sep 3 20:37 /sys/bus/rbd/add_single_major --w------- 1 root root 4096 Sep 3 20:37 /sys/bus/rbd/remove --w------- 1 root root 4096 Sep 3 20:38 /sys/bus/rbd/remove_single_major This means that images can be mapped and unmapped (i.e. block devices can be created and deleted) by a UID 0 process even after it drops all privileges or by any process with CAP_DAC_OVERRIDE in its user namespace as long as UID 0 is mapped into that user namespace. Be consistent with other virtual block devices (loop, nbd, dm, md, etc) and require CAP_SYS_ADMIN in the initial user namespace for mapping and unmapping, and also for dumping the configuration string and refreshing the image header. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2020-09-07  kobject: Drop unneeded conditional in __kobject_del()  Andy Shevchenko
__kobject_del() is called from two places, in one where kobj is dereferenced before and thus can't be NULL, and in the other the NULL check is done before call. Drop unneeded conditional in __kobject_del(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20200803083520.5460-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-07  mmc: sdio: Use mmc_pre_req() / mmc_post_req()  Adrian Hunter
SDHCI changed from using a tasklet to finish requests, to using an IRQ thread i.e. commit c07a48c2651965 ("mmc: sdhci: Remove finish_tasklet"). Because this increased the latency to complete requests, a preparatory change was made to complete the request from the IRQ handler if possible i.e. commit 19d2f695f4e827 ("mmc: sdhci: Call mmc_request_done() from IRQ handler if possible"). That alleviated the situation for MMC block devices because the MMC block driver makes use of mmc_pre_req() and mmc_post_req() so that successful requests are completed in the IRQ handler and any DMA unmapping is handled separately in mmc_post_req(). However SDIO was still affected, and an example has been reported with up to 20% degradation in performance. Looking at SDIO I/O helper functions, sdio_io_rw_ext_helper() appeared to be a possible candidate for making use of asynchronous requests within its I/O loops, but analysis revealed that these loops almost never iterate more than once, so the complexity of the change would not be warranted. Instead, mmc_pre_req() and mmc_post_req() are added before and after I/O submission (mmc_wait_for_req) in mmc_io_rw_extended(). This still has the potential benefit of reducing the duration of interrupt handlers, as well as addressing the latency issue for SDHCI. It also seems a more reasonable solution than forcing drivers to do everything in the IRQ handler. Reported-by: Dmitry Osipenko <digetx@gmail.com> Fixes: c07a48c2651965 ("mmc: sdhci: Remove finish_tasklet") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200903082007.18715-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-07mmc: sdhci-of-esdhc: Don't walk device-tree on every interruptChris Packham
Commit b214fe592ab7 ("mmc: sdhci-of-esdhc: add erratum eSDHC7 support") added code to check for a specific compatible string in the device-tree on every esdhc interrupt. Instead of doing this, record the quirk in struct sdhci_esdhc and look up the struct in esdhc_irq. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://lore.kernel.org/r/20200903012029.25673-1-chris.packham@alliedtelesis.co.nz Fixes: b214fe592ab7 ("mmc: sdhci-of-esdhc: add erratum eSDHC7 support") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-07mmc: mmc_spi: Allow the driver to be built when CONFIG_HAS_DMA is unsetUlf Hansson
The commit cd57d07b1e4e ("sh: don't allow non-coherent DMA for NOMMU") caused CONFIG_NO_DMA to be set for some platforms, for good reasons. Consequently, CONFIG_HAS_DMA doesn't get set, which causes the DMA mapping interface to be built as stub functions, but also prevents the mmc_spi driver from being built, as it depends on CONFIG_HAS_DMA. It turns out that for some odd cases, the driver still relied on the DMA mapping interface, even if DMA was not actively being used. To fix the behaviour, let's drop the build dependency on CONFIG_HAS_DMA. Moreover, to allow the driver to probe successfully, let's move the DMA initializations behind "#ifdef CONFIG_HAS_DMA". Fixes: cd57d07b1e4e ("sh: don't allow non-coherent DMA for NOMMU") Reported-by: Rich Felker <dalias@libc.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Rich Felker <dalias@libc.org> Link: https://lore.kernel.org/r/20200901150438.228887-1-ulf.hansson@linaro.org
2020-09-07mmc: sdhci-msm: Add retries when all tuning phases are found validDouglas Anderson
As the comments in this patch say, if we tune and find all phases are valid it's _almost_ as bad as no phases being found valid. Probably all phases are not really reliable but we didn't detect where the unreliable place is. That means we'll essentially be guessing and hoping we get a good phase. This is not just a problem in theory. It was causing real problems on a real board. On that board, most often phase 10 is found as the only invalid phase, though sometimes 10 and 11 are invalid and sometimes just 11. Some percentage of the time, however, all phases are found to be valid. When this happens, the current logic will decide to use phase 11. Since phase 11 is sometimes found to be invalid, this is a bad choice. Sure enough, when phase 11 is picked we often get mmc errors later in boot. I have seen cases where all phases were found to be valid 3 times in a row, so increase the retry count to 10 just to be extra sure. Fixes: 415b5a75da43 ("mmc: sdhci-msm: Add platform_execute_tuning implementation") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Veerabhadrarao Badiganti <vbadigan@codeaurora.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20200827075809.1.If179abf5ecb67c963494db79c3bc4247d987419b@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-07mmc: sdhci-acpi: Clear amd_sdhci_host on resetRaul E Rangel
The commit 61d7437ed1390 ("mmc: sdhci-acpi: Fix HS400 tuning for AMDI0040") broke resume for eMMC HS400. When the system suspends, the eMMC controller is powered down. So, on resume we need to reinitialize the controller. However, amd_sdhci_host was not getting cleared, so the DLL was never re-enabled on resume. This results in HS400 being non-functional. To fix the problem, this change clears the tuned_clock flag, clears the dll_enabled flag and disables the DLL on reset. Fixes: 61d7437ed1390 ("mmc: sdhci-acpi: Fix HS400 tuning for AMDI0040") Signed-off-by: Raul E Rangel <rrangel@chromium.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20200831150517.1.I93c78bfc6575771bb653c9d3fca5eb018a08417d@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-06cifs: fix DFS mount with cifsacl/modefromsidRonnie Sahlberg
RHBZ: 1871246 If during cifs_lookup()/get_inode_info() we encounter a DFS link and we use the cifsacl or modefromsid mount options, we must suppress any -EREMOTE errors that are triggered, or else we will not be able to follow the DFS link and automount the target. This fixes an issue with modefromsid/cifsacl where these mount options would break DFS and we would no longer be able to access the share. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com>
2020-09-06Linux 5.9-rc4v5.9-rc4Linus Torvalds
2020-09-06Merge tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull more io_uring fixes from Jens Axboe: "Two followup fixes. One is fixing a regression from this merge window, the other is two commits fixing cancelation of deferred requests. Both have gone through full testing, and both spawned a few new regression test additions to liburing. - Don't play games with const, properly store the output iovec and assign it as needed. - Deferred request cancelation fix (Pavel)" * tag 'io_uring-5.9-2020-09-06' of git://git.kernel.dk/linux-block: io_uring: fix linked deferred ->files cancellation io_uring: fix cancel of deferred reqs with ->files io_uring: fix explicit async read/write mapping for large segments
2020-09-06Merge tag 'iommu-fixes-v5.9-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - three Intel VT-d fixes to fix address handling on 32bit, fix a NULL pointer dereference bug and serialize a hardware register access as required by the VT-d spec. - two patches for AMD IOMMU to force AMD GPUs into translation mode when memory encryption is active and disallow using IOMMUv2 functionality. This makes the AMDGPU driver work when memory encryption is active. - two more fixes for AMD IOMMU to fix updating the Interrupt Remapping Table Entries. - MAINTAINERS file update for the Qualcomm IOMMU driver. * tag 'iommu-fixes-v5.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Handle 36bit addressing for x86-32 iommu/amd: Do not use IOMMUv2 functionality when SME is active iommu/amd: Do not force direct mapping when SME is active iommu/amd: Use cmpxchg_double() when updating 128-bit IRTE iommu/amd: Restore IRTE.RemapEn bit after programming IRTE iommu/vt-d: Fix NULL pointer dereference in dev_iommu_priv_set() iommu/vt-d: Serialize IOMMU GCMD register modifications MAINTAINERS: Update QUALCOMM IOMMU after Arm SMMU drivers move
2020-09-06Merge tag 'x86-urgent-2020-09-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - more generic entry code ABI fallout - debug register handling bugfixes - fix vmalloc mappings on 32-bit kernels - kprobes instrumentation output fix on 32-bit kernels - fix over-eager WARN_ON_ONCE() on !SMAP hardware - NUMA debugging fix - fix Clang related crash on !RETPOLINE kernels * tag 'x86-urgent-2020-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Unbreak 32bit fast syscall x86/debug: Allow a single level of #DB recursion x86/entry: Fix AC assertion tracing/kprobes, x86/ptrace: Fix regs argument order for i386 x86, fakenuma: Fix invalid starting node ID x86/mm/32: Bring back vmalloc faulting on x86_32 x86/cmdline: Disable jump tables for cmdline.c
2020-09-06Merge tag 'for-linus-5.9-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "A small series for fixing a problem with Xen PVH guests when running as backends (e.g. as dom0). Mapping other guests' memory is now working via ZONE_DEVICE, thus not requiring to abuse the memory hotplug functionality for that purpose" * tag 'for-linus-5.9-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: add helpers to allocate unpopulated memory memremap: rename MEMORY_DEVICE_DEVDAX to MEMORY_DEVICE_GENERIC xen/balloon: add header guard
2020-09-05io_uring: fix linked deferred ->files cancellationPavel Begunkov
While looking for ->files in ->defer_list, consider that requests there may actually be links. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-05io_uring: fix cancel of deferred reqs with ->filesPavel Begunkov
While trying to cancel requests with ->files, it also should look for requests in ->defer_list, otherwise it might end up hanging a thread. Cancel all requests in ->defer_list up to the last request there with matching ->files, that's needed to follow drain ordering semantics. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
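The "cancel everything up to the last match" rule can be illustrated with a small userspace sketch. It uses a plain array in place of the kernel's ->defer_list, and `req_sketch` and `cancel_deferred()` are hypothetical names invented for this example; it is not the io_uring implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a deferred request. */
struct req_sketch {
	const void *files;	/* owner the request is tied to */
	int cancelled;		/* set to 1 once the request is cancelled */
};

/*
 * Cancel every request from the head of the list up to and including the
 * last one whose ->files matches. Earlier non-matching requests are
 * cancelled too, because a drained request must not overtake the ones
 * queued before it. Returns the number of requests cancelled.
 */
static size_t cancel_deferred(struct req_sketch *list, size_t n,
			      const void *files)
{
	size_t last = n;	/* n means "no matching request found" */

	assert(list != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (list[i].files == files)
			last = i;

	if (last == n)
		return 0;

	for (size_t i = 0; i <= last; i++)
		list[i].cancelled = 1;

	return last + 1;
}
```

For a list with owners A, B, A, B, cancelling for A marks the first three entries and leaves the trailing B request in place.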
2020-09-05Merge tags 'auxdisplay-for-linus-v5.9-rc4', ↵Linus Torvalds
'clang-format-for-linus-v5.9-rc4' and 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux Pull misc fixes from Miguel Ojeda: "A trivial patch for auxdisplay: - Replace HTTP links with HTTPS ones (Alexander A. Klimov) The usual clang-format trivial update: - Update with the latest for_each macro list (Miguel Ojeda) And Luc requested me to pick a sparse fix on my queue, so here it goes along with other two trivial Compiler Attributes ones (also from Luc). - sparse: use static inline for __chk_{user,io}_ptr() (Luc Van Oostenryck) - Compiler Attributes: fix comment concerning GCC 4.6 (Luc Van Oostenryck) - Compiler Attributes: remove comment about sparse not supporting __has_attribute (Luc Van Oostenryck)" * tag 'auxdisplay-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: auxdisplay: Replace HTTP links with HTTPS ones * tag 'clang-format-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: clang-format: Update with the latest for_each macro list * tag 'compiler-attributes-for-linus-v5.9-rc4' of git://github.com/ojeda/linux: sparse: use static inline for __chk_{user,io}_ptr() Compiler Attributes: fix comment concerning GCC 4.6 Compiler Attributes: remove comment about sparse not supporting __has_attribute
2020-09-05Merge tag 'arc-5.9-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - HSDK-4xd Dev system: perf driver updates for sampling interrupt - HSDK* Dev System: Ethernet broken [Evgeniy Didin] - HIGHMEM broken (2 memory banks) [Mike Rapoport] - show_regs() rewrite once and for all - Other minor fixes * tag 'arc-5.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [plat-hsdk]: Switch ethernet phy-mode to rgmii-id arc: fix memory initialization for systems with two memory banks irqchip/eznps: Fix build error for !ARC700 builds ARC: show_regs: fix r12 printing and simplify ARC: HSDK: wireup perf irq ARC: perf: don't bail setup if pct irq missing in device-tree ARC: pgalloc.h: delete a duplicated word + other fixes
2020-09-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "19 patches. Subsystems affected by this patch series: MAINTAINERS, ipc, fork, checkpatch, lib, and mm (memcg, slub, pagemap, madvise, migration, hugetlb)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: include/linux/log2.h: add missing () around n in roundup_pow_of_two() mm/khugepaged.c: fix khugepaged's request size in collapse_file mm/hugetlb: fix a race between hugetlb sysctl handlers mm/hugetlb: try preferred node first when alloc gigantic page from cma mm/migrate: preserve soft dirty in remove_migration_pte() mm/migrate: remove unnecessary is_zone_device_page() check mm/rmap: fixup copying of soft dirty and uffd ptes mm/migrate: fixup setting UFFD_WP flag mm: madvise: fix vma user-after-free checkpatch: fix the usage of capture group ( ... ) fork: adjust sysctl_max_threads definition to match prototype ipc: adjust proc_ipc_sem_dointvec definition to match prototype mm: track page table modifications in __apply_to_page_range() MAINTAINERS: IA64: mark Status as Odd Fixes only MAINTAINERS: add LLVM maintainers MAINTAINERS: update Cavium/Marvell entries mm: slub: fix conversion of freelist_corrupted() mm: memcg: fix memcg reclaim soft lockup memcg: fix use-after-free in uncharge_batch
2020-09-05include/linux/log2.h: add missing () around n in roundup_pow_of_two()Jason Gunthorpe
Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c170945 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
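As a general illustration of why macro arguments need parentheses, here is a minimal sketch. `IS_ONE_UNSAFE` and `IS_ONE_SAFE` are hypothetical, simplified macros and not the kernel's roundup_pow_of_two() definition; the sketch shows the operator-precedence hazard behind this class of fix rather than the specific gcc warning the commit refers to.

```c
#include <assert.h>

/* Hypothetical simplified macros; not the kernel's roundup_pow_of_two(). */
#define IS_ONE_UNSAFE(n) (n == 1)	/* argument pasted in without parentheses */
#define IS_ONE_SAFE(n)   ((n) == 1)	/* argument evaluated as a unit first */

static int is_one_unsafe(int flag, int a, int b)
{
	/* Expands to (flag ? a : b == 1), which parses as flag ? a : (b == 1). */
	return IS_ONE_UNSAFE(flag ? a : b);
}

static int is_one_safe(int flag, int a, int b)
{
	/* Expands to ((flag ? a : b) == 1), which is what the caller meant. */
	return IS_ONE_SAFE(flag ? a : b);
}

/* Small self-check showing where the two expansions disagree. */
static void check_macros(void)
{
	assert(is_one_unsafe(1, 2, 1) == 2);	/* comparison never applied to a */
	assert(is_one_safe(1, 2, 1) == 0);	/* 2 == 1 is false */
	assert(is_one_safe(0, 2, 1) == 1);	/* 1 == 1 is true */
}
```

The same reasoning applies to any macro that compares or shifts its argument: wrapping each use of the parameter in parentheses keeps a compound argument from being split by the surrounding operators.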
2020-09-05mm/khugepaged.c: fix khugepaged's request size in collapse_fileDavid Howells
collapse_file() in khugepaged passes PAGE_SIZE as the number of pages to be read to page_cache_sync_readahead(). The intent was probably to read a single page. Fix it to use the number of pages to the end of the window instead. Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Song Liu <songliubraving@fb.com> Acked-by: Yang Shi <shy828301@gmail.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Eric Biggers <ebiggers@google.com> Link: https://lkml.kernel.org/r/20200903140844.14194-2-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
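The unit mix-up is easy to see with a little arithmetic. The sketch below is a userspace illustration under stated assumptions: a 4 KiB page size, and hypothetical helper names `pages_to_window_end()` and `request_bytes()` with `index` and `end` as page offsets. It does not reproduce the kernel's readahead API.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Number of pages from the current page index to the end of the window. */
static unsigned long pages_to_window_end(unsigned long index, unsigned long end)
{
	assert(index <= end);
	return end - index;
}

/* How many bytes a readahead request for nr_pages pages would cover. */
static unsigned long request_bytes(unsigned long nr_pages)
{
	return nr_pages * SKETCH_PAGE_SIZE;
}
```

Under these assumptions, passing the page size where a page count is expected asks for 4096 pages, that is `request_bytes(SKETCH_PAGE_SIZE)` or 16 MiB, whereas a count derived from the window is bounded by the window itself.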
2020-09-05mm/hugetlb: fix a race between hugetlb sysctl handlersMuchun Song
There is a race between the assignment of `table->data` and write value to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on the other thread. CPU0: CPU1: proc_sys_write hugetlb_sysctl_handler proc_sys_call_handler hugetlb_sysctl_handler_common hugetlb_sysctl_handler table->data = &tmp; hugetlb_sysctl_handler_common table->data = &tmp; proc_doulongvec_minmax do_proc_doulongvec_minmax sysctl_head_finish __do_proc_doulongvec_minmax unuse_table i = table->data; *i = val; // corrupt CPU1's stack Fix this by duplicating the `table`, and only update the duplicate of it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to simplify the code. The following oops was seen: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page Code: Bad RIP value. ... Call Trace: ? set_max_huge_pages+0x3da/0x4f0 ? alloc_pool_huge_page+0x150/0x150 ? proc_doulongvec_minmax+0x46/0x60 ? hugetlb_sysctl_handler_common+0x1c7/0x200 ? nr_hugepages_store+0x20/0x20 ? copy_fd_bitmaps+0x170/0x170 ? hugetlb_sysctl_handler+0x1e/0x20 ? proc_sys_call_handler+0x2f1/0x300 ? unregister_sysctl_table+0xb0/0xb0 ? __fd_install+0x78/0x100 ? proc_sys_write+0x14/0x20 ? __vfs_write+0x4d/0x90 ? vfs_write+0xef/0x240 ? ksys_write+0xc0/0x160 ? __ia32_sys_read+0x50/0x50 ? __close_fd+0x129/0x150 ? __x64_sys_write+0x43/0x50 ? do_syscall_64+0x6c/0x200 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: e5ff215941d5 ("hugetlb: multiple hstates for multiple page sizes") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20200828031146.43035-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
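The "duplicate the table and only update the duplicate" idea can be sketched in userspace C. `table_sketch`, `store_ulong()` and `handle_write()` are hypothetical names for this illustration; the real fix works on struct ctl_table through the proc_hugetlb_doulongvec_minmax() helper named above, whose body is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for a sysctl table entry. */
struct table_sketch {
	void *data;	/* where the parsed value gets written */
	size_t maxlen;	/* size of the object behind data */
};

/* Stand-in for the generic parser: writes through table->data. */
static void store_ulong(const struct table_sketch *t, unsigned long val)
{
	assert(t->data != NULL);
	*(unsigned long *)t->data = val;
}

/*
 * Racy pattern (not shown): point shared->data at a local variable, so a
 * second thread can observe and write through a pointer into this stack.
 * Fixed pattern: copy the shared entry and redirect only the private copy.
 */
static unsigned long handle_write(const struct table_sketch *shared,
				  unsigned long val)
{
	unsigned long tmp = 0;
	struct table_sketch dup = *shared;	/* private duplicate */

	dup.data = &tmp;
	dup.maxlen = sizeof(tmp);
	store_ulong(&dup, val);

	return tmp;	/* the shared entry was never modified */
}
```

Because the shared entry is taken as a pointer to const and never written, a concurrent caller cannot pick up a pointer to another thread's stack variable.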
2020-09-05mm/hugetlb: try preferred node first when alloc gigantic page from cmaLi Xinhai
Since commit cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma"), a gigantic page may be allocated from a node other than the preferred node, even though pages are available on the preferred node. The reason is that the nid parameter has been ignored in alloc_gigantic_page(). Besides, __GFP_THISNODE also needs to be checked in case the user requires allocation only from the preferred node. After this patch, the preferred node is tried first before other allowed nodes, and other nodes are not tried if __GFP_THISNODE is specified. If the user doesn't specify a preferred node, the current node will be used as the preferred node, which ensures consistent behavior when allocating gigantic and non-gigantic hugetlb pages. Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma") Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <guro@fb.com> Link: https://lkml.kernel.org/r/20200902025016.697260-1-lixinhai.lxh@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>