summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-05-30PM / Domains: Drop genpd as in-param for pm_genpd_remove_device()Ulf Hansson
There is no need to pass a genpd struct to pm_genpd_remove_device(), as we already have the information about the PM domain (genpd) through the device structure. Additionally, we don't allow to remove a PM domain from a device, other than the one it may have assigned to it, so really it does not make sense to have a separate in-param for it. For these reason, drop it and update the current only call to pm_genpd_remove_device() from amdgpu_acp. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30PM / Domains: Drop __pm_genpd_add_device()Ulf Hansson
There are still a few non-DT existing users of genpd, however neither of them uses __pm_genpd_add_device(), hence let's drop it. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30PM / Domains: Drop extern declarations of functions in pm_domain.hUlf Hansson
Using "extern" to declare a function in a public header file is somewhat pointless, but also doesn't hurt. However, to make all the function declarations in pm_domain.h to be consistent, let's drop the use of "extern". Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30Merge branch 'pm-domains' into pm-oppRafael J. Wysocki
2018-05-30PM / domains: Add perf_state attribute to genpd debugfsRajendra Nayak
Now that genpd supports performance states, add this additional attribute as part of the power domains debugfs entry, to display the current performance state for the Power domain. Suggested-by: David Collins <collinsd@codeaurora.org> Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-30Merge branch 'opp/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp Pull more OPP updates for v4.18 from Viresh Kumar. * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: OPP: Allow same OPP table to be used for multiple genpd PM / OPP: Fix shared OPP table support in dev_pm_opp_register_set_opp_helper() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_regulators() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_prop_name() PM / OPP: Fix shared OPP table support in dev_pm_opp_set_supported_hw()
2018-05-30OPP: Allow same OPP table to be used for multiple genpdViresh Kumar
The OPP binding says: Property: operating-points-v2 ... This can contain more than one phandle for power domain providers that provide multiple power domains. That is, one phandle for each power domain. If only one phandle is available, then the same OPP table will be used for all power domains provided by the power domain provider. But the OPP core isn't allowing the same OPP table to be used for multiple domains. Update dev_pm_opp_of_add_table_indexed() to allow that. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Rajendra Nayak <rnayak@codeaurora.org>
2018-05-24PM / Domain: Return 0 on error from of_genpd_opp_to_performance_state()Viresh Kumar
of_genpd_opp_to_performance_state() should return 0 on errors, as its doc comment describes. While it follows that mostly, it returns a negative error number on one of the failures. Fix that. Fixes: 6e41766a6a50 "PM / Domain: Implement of_genpd_opp_to_performance_state()" Reported-by: Rajendra Nayak <rnayak@codeaurora.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-22PM / OPP: Fix shared OPP table support in dev_pm_opp_register_set_opp_helper()Viresh Kumar
It should be fine to call dev_pm_opp_register_set_opp_helper() for all possible CPUs, even if some of them share the OPP table as the caller may not be aware of sharing policy. Lets increment the reference count of the OPP table and return its pointer. The caller need to call dev_pm_opp_register_put_opp_helper() the same number of times later on to drop all the references. To avoid adding another counter to count how many times dev_pm_opp_register_set_opp_helper() is called for the same OPP table, dev_pm_opp_register_put_opp_helper() frees the resources on the very first call made to it, assuming that the caller would be calling it sequentially for all the CPUs. We can revisit that if that assumption is broken in the future. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-05-22PM / OPP: Fix shared OPP table support in dev_pm_opp_set_regulators()Viresh Kumar
It should be fine to call dev_pm_opp_set_regulators() for all possible CPUs, even if some of them share the OPP table as the caller may not be aware of sharing policy. Lets increment the reference count of the OPP table and return its pointer. The caller need to call dev_pm_opp_put_regulators() the same number of times later on to drop all the references. To avoid adding another counter to count how many times dev_pm_opp_set_regulators() is called for the same OPP table, dev_pm_opp_put_regulators() frees the resources on the very first call made to it, assuming that the caller would be calling it sequentially for all the CPUs. We can revisit that if that assumption is broken in the future. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-05-22PM / OPP: Fix shared OPP table support in dev_pm_opp_set_prop_name()Viresh Kumar
It should be fine to call dev_pm_opp_set_prop_name() for all possible CPUs, even if some of them share the OPP table as the caller may not be aware of sharing policy. Lets increment the reference count of the OPP table and return its pointer. The caller need to call dev_pm_opp_put_prop_name() the same number of times later on to drop all the references. To avoid adding another counter to count how many times dev_pm_opp_set_prop_name() is called for the same OPP table, dev_pm_opp_put_prop_name() frees the resources on the very first call made to it, assuming that the caller would be calling it sequentially for all the CPUs. We can revisit that if that assumption is broken in the future. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-05-22PM / OPP: Fix shared OPP table support in dev_pm_opp_set_supported_hw()Viresh Kumar
It should be fine to call dev_pm_opp_set_supported_hw() for all possible CPUs, even if some of them share the OPP table as the caller may not be aware of sharing policy. Lets increment the reference count of the OPP table and return its pointer. The caller need to call dev_pm_opp_put_supported_hw() the same number of times later on to drop all the references. To avoid adding another counter to count how many times dev_pm_opp_set_supported_hw() is called for the same OPP table, dev_pm_opp_put_supported_hw() frees the resources on the very first call made to it, assuming that the caller would be calling it sequentially for all the CPUs. We can revisit that if that assumption is broken in the future. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-05-17PM / domains: Improve wording of dev_pm_domain_attach() commentGeert Uytterhoeven
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-17Merge branch 'opp/linux-next' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm into pm-opp Merge an OPP fix for v4.18 from Viresh Kumar. * 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: PM / OPP: silence an uninitialized variable warning
2018-05-16PM / OPP: silence an uninitialized variable warningDan Carpenter
Smatch complains that it's possible we print "rate" in the debug output when it hasn't been initialized. It should be zero on that path. Fixes: a1e8c13600bf ("PM / OPP: "opp-hz" is optional for power domains") [ Viresh: Added the Fixes tag ] Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2018-05-15PM / Domains: Don't return -EEXIST at attach when PM domain existsUlf Hansson
As dev_pm_domain_attach() isn't the only way to assign PM domain pointers to devices, clearly we must allow a device to have the pointer already being assigned. For this reason, return 0 instead of -EEXIST. Reported-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Tested-by: Krzysztof Kozlowski <krzk@kernel.org> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14Merge branch 'opp/genpd-pstate-updates' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull Operating Performance Points (OPP) library changes for v4.18 from Viresh Kumar. * 'opp/genpd-pstate-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: PM / OPP: Remove dev_pm_opp_{un}register_get_pstate_helper() PM / OPP: Get performance state using genpd helper PM / Domain: Implement of_genpd_opp_to_performance_state() PM / Domain: Add support to parse domain's OPP table PM / Domain: Add struct device to genpd PM / OPP: Implement dev_pm_opp_get_of_node() PM / OPP: Implement of_dev_pm_opp_find_required_opp() PM / OPP: Implement dev_pm_opp_of_add_table_indexed() PM / OPP: "opp-hz" is optional for power domains PM / OPP: dt-bindings: Make "opp-hz" optional for power domains PM / OPP: dt-bindings: Rename "required-opp" as "required-opps" soc/tegra: pmc: Don't allocate struct tegra_powergate on stack
2018-05-14spi: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Mark Brown <broonie@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14soundwire: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14mmc: sdio: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14i2c: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14driver core: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14amba: Respect all error codes from dev_pm_domain_attach()Ulf Hansson
The limitation of being able to check only for -EPROBE_DEFER from dev_pm_domain_attach() has been removed. Hence let's respect all error codes and bail out accordingly. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14PM / Domains: Allow a better error handling of dev_pm_domain_attach()Ulf Hansson
The callers of dev_pm_domain_attach() currently checks the returned error code for -EPROBE_DEFER and needs to ignore other error codes. This is an unnecessary limitation, which also leads to a rather strange behaviour in the error path. Address this limitation, by changing the return codes from acpi_dev_pm_attach() and genpd_dev_pm_attach(). More precisely, let them return 0, when no PM domain is needed for the device and then return 1, in case the device was successfully attached to its PM domain. In this way, dev_pm_domain_attach(), gets a better understanding of what happens in the attach attempts and also allowing its caller to better act on real errors codes. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14PM / Domains: Check for existing PM domain in dev_pm_domain_attach()Ulf Hansson
Instead of checking if an existing PM domain pointer has been assigned in genpd_dev_pm_attach() and acpi_dev_pm_attach(), move the check to the common path in dev_pm_domain_attach(), thus potentially avoid one unnecessary check. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14PM / Domains: Drop redundant code in genpd while attaching devicesUlf Hansson
The driver core together with the PM core, nowadays deals with deferring all probes during the device system sleep phases. Therefore genpd no longer need to care about this situation, so let's drop the corresponding code. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14PM / Domains: Drop comment in genpd about legacy Samsung DT bindingUlf Hansson
The parsing of the Samsung specific DT binding is gone, but the comment in the function header remained. Let's drop the comment to avoid confusions. Fixes: 001d50c9a14f (PM / Domains: Remove obsolete "samsung,power-domain" check) Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-14PM / Domains: Fix error path during attach in genpdUlf Hansson
In case the PM domain fails to be powered on in genpd_dev_pm_attach(), it returns -EPROBE_DEFER, but keeping the device attached to its PM domain. This leads to problems when the next attempt to attach is re-tried. More precisely, in that situation an -EEXIST error code is returned, because the device already has its PM domain pointer assigned, from the first attempt. Now, because of the sloppy error handling by the existing callers of dev_pm_domain_attach(), probing is allowed to continue when -EEXIST is returned. However, in such case there are no guarantees that the PM domain is powered on by genpd, which may lead to hangs when buses/drivers tried to access their devices. Let's fix this behaviour, simply by detaching the device when powering on fails in genpd_dev_pm_attach(). Cc: v4.11+ <stable@vger.kernel.org> # v4.11+ Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-05-13Linux 4.17-rc5v4.17-rc5Linus Torvalds
2018-05-13Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "A mixed bag of fixes and updates for the ghosts which are hunting us. The scheduler fixes have been pulled into that branch to avoid conflicts. - A set of fixes to address a khread_parkme() race which caused lost wakeups and loss of state. - A deadlock fix for stop_machine() solved by moving the wakeups outside of the stopper_lock held region. - A set of Spectre V1 array access restrictions. The possible problematic spots were discuvered by Dan Carpenters new checks in smatch. - Removal of an unused file which was forgotten when the rest of that functionality was removed" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Remove unused file perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map() perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_* perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[] sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Introduce set_special_state() kthread, sched/wait: Fix kthread_parkme() completion issue kthread, sched/wait: Fix kthread_parkme() wait-loop sched/fair: Fix the update of blocked load when newly idle stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
2018-05-13Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "Revert the new NUMA aware placement approach which turned out to create more problems than it solved" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"
2018-05-13Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Thomas Gleixner: "Another small set of perf tooling fixes and updates: - Revert "perf pmu: Fix pmu events parsing rule", as it broke Intel PT event description parsing (Arnaldo Carvalho de Melo) - Sync x86's cpufeatures.h and kvm UAPI headers with the kernel sources, suppressing the ABI drift warnings (Arnaldo Carvalho de Melo) - Remove duplicated entry for westmereep-dp in Intel's mapfile.csv (William Cohen) - Fix typo in 'perf bench numa' options description (Yisheng Xie)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "perf pmu: Fix pmu events parsing rule" tools headers kvm: Sync ARM UAPI headers with the kernel sources tools headers kvm: Sync uapi/linux/kvm.h with the kernel sources tools headers: Sync x86 cpufeatures.h with the kernel sources perf vendor events intel: Remove duplicated entry for westmereep-dp in mapfile.csv perf bench numa: Fix typo in options
2018-05-13Merge tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fix from Christoph Hellwig: "Just one little fix from Jean to avoid a harmless but very annoying warning, especially for the drm code" * tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: silent unwanted warning "buffer is full"
2018-05-12Merge tag '4.17-rc4-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Some small SMB3 fixes for 4.17-rc5, some for stable" * tag '4.17-rc4-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: directory sync should not return an error cifs: smb2ops: Fix listxattr() when there are no EAs cifs: smbd: Enable signing with smbdirect cifs: Allocate validate negotiation request through kmalloc
2018-05-12Merge branch 'next' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal fixes from Zhang Rui: - fix NULL pointer dereference on module load/probe for int3403_thermal driver - fix an emergency shutdown issue on exynos thermal driver * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: exynos: Propagate error value from tmu_read() thermal: exynos: Reading temperature makes sense only when TMU is turned on thermal: int3403_thermal: Fix NULL pointer deref on module load / probe
2018-05-12Merge tag 'for-linus-20180511' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Just a few NVMe fixes this round - one fixing a use-after-free, one fixes the return value after controller reset, and the last one fixes an issue where some drives will spuriously EIO. We should get these into 4.17" * tag 'for-linus-20180511' of git://git.kernel.dk/linux-block: nvme: add quirk to force medium priority for SQ creation nvme: Fix sync controller reset return nvme: fix use-after-free in nvme_free_ns_head
2018-05-12swiotlb: silent unwanted warning "buffer is full"Jean Delvare
If DMA_ATTR_NO_WARN is passed to swiotlb_alloc_buffer(), it should be passed further down to swiotlb_tbl_map_single(). Otherwise we escape half of the warnings but still log the other half. This is one of the multiple causes of spurious warnings reported at: https://bugs.freedesktop.org/show_bug.cgi?id=104082 Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 0176adb00406 ("swiotlb: refactor coherent buffer allocation") Cc: Christoph Hellwig <hch@lst.de> Cc: Christian König <christian.koenig@amd.com> Cc: Michel Dänzer <michel@daenzer.net> Cc: Takashi Iwai <tiwai@suse.de> Cc: stable@vger.kernel.org # v4.16
2018-05-12Revert "sched/numa: Delay retrying placement for automatic NUMA balance ↵Mel Gorman
after wake_affine()" This reverts commit 7347fc87dfe6b7315e74310ee1243dc222c68086. Srikar Dronamra pointed out that while the commit in question did show a performance improvement on ppc64, it did so at the cost of disabling active CPU migration by automatic NUMA balancing which was not the intent. The issue was that a serious flaw in the logic failed to ever active balance if SD_WAKE_AFFINE was disabled on scheduler domains. Even when it's enabled, the logic is still bizarre and against the original intent. Investigation showed that fixing the patch in either the way he suggested, using the correct comparison for jiffies values or introducing a new numa_migrate_deferred variable in task_struct all perform similarly to a revert with a mix of gains and losses depending on the workload, machine and socket count. The original intent of the commit was to handle a problem whereby wake_affine, idle balancing and automatic NUMA balancing disagree on the appropriate placement for a task. This was particularly true for cases where a single task was a massive waker of tasks but where wake_wide logic did not apply. This was particularly noticeable when a futex (a barrier) woke all worker threads and tried pulling the wakees to the waker nodes. In that specific case, it could be handled by tuning MPI or openMP appropriately, but the behavior is not illogical and was worth attempting to fix. However, the approach was wrong. Given that we're at rc4 and a fix is not obvious, it's better to play safe, revert this commit and retry later. Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: efault@gmx.de Cc: ggherdovich@suse.cz Cc: hpa@zytor.com Cc: matt@codeblueprint.co.uk Cc: mpe@ellerman.id.au Link: http://lkml.kernel.org/r/20180509163115.6fnnyeg4vdm2ct4v@techsingularity.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-05-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: rbtree: include rcu.h scripts/faddr2line: fix error when addr2line output contains discriminator ocfs2: take inode cluster lock before moving reflinked inode from orphan dir mm, oom: fix concurrent munlock and oom reaper unmap, v3 mm: migrate: fix double call of radix_tree_replace_slot() proc/kcore: don't bounds check against address 0 mm: don't show nr_indirectly_reclaimable in /proc/vmstat mm: sections are not offlined during memory hotremove z3fold: fix reclaim lock-ups init: fix false positives in W+X checking lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() KASAN: prohibit KASAN+STRUCTLEAK combination MAINTAINERS: update Shuah's email address
2018-05-11rbtree: include rcu.hSebastian Andrzej Siewior
Since commit c1adf20052d8 ("Introduce rb_replace_node_rcu()") rbtree_augmented.h uses RCU related data structures but does not include the header file. It works as long as it gets somehow included before that and fails otherwise. Link: http://lkml.kernel.org/r/20180504103159.19938-1-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11scripts/faddr2line: fix error when addr2line output contains discriminatorChangbin Du
When addr2line output contains discriminator, the current awk script cannot parse it. This patch fixes it by extracting key words using regex which is more reliable. $ scripts/faddr2line vmlinux tlb_flush_mmu_free+0x26 tlb_flush_mmu_free+0x26/0x50: tlb_flush_mmu_free at mm/memory.c:258 (discriminator 3) scripts/faddr2line: eval: line 173: unexpected EOF while looking for matching `)' Link: http://lkml.kernel.org/r/1525323379-25193-1-git-send-email-changbin.du@intel.com Fixes: 6870c0165feaa5 ("scripts/faddr2line: show the code context") Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: NeilBrown <neilb@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Kate Stewart <kstewart@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11ocfs2: take inode cluster lock before moving reflinked inode from orphan dirAshish Samant
While reflinking an inode, we create a new inode in orphan directory, then take EX lock on it, reflink the original inode to orphan inode and release EX lock. Once the lock is released another node could request it in EX mode from ocfs2_recover_orphans() which causes downconvert of the lock, on this node, to NL mode. Later we attempt to initialize security acl for the orphan inode and move it to the reflink destination. However, while doing this we dont take EX lock on the inode. This could potentially cause problems because we could be starting transaction, accessing journal and modifying metadata of the inode while holding NL lock and with another node holding EX lock on the inode. Fix this by taking orphan inode cluster lock in EX mode before initializing security and moving orphan inode to reflink destination. Use the __tracker variant while taking inode lock to avoid recursive locking in the ocfs2_init_security_and_acl() call chain. Link: http://lkml.kernel.org/r/1523475107-7639-1-git-send-email-ashish.samant@oracle.com Signed-off-by: Ashish Samant <ashish.samant@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <ge.changwei@h3c.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11mm, oom: fix concurrent munlock and oom reaper unmap, v3David Rientjes
Since exit_mmap() is done without the protection of mm->mmap_sem, it is possible for the oom reaper to concurrently operate on an mm until MMF_OOM_SKIP is set. This allows munlock_vma_pages_all() to concurrently run while the oom reaper is operating on a vma. Since munlock_vma_pages_range() depends on clearing VM_LOCKED from vm_flags before actually doing the munlock to determine if any other vmas are locking the same memory, the check for VM_LOCKED in the oom reaper is racy. This is especially noticeable on architectures such as powerpc where clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd is zapped by the oom reaper during follow_page_mask() after the check for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a kernel oops. Fix this by manually freeing all possible memory from the mm before doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not run on the mm anymore so the munlock is safe to do in exit_mmap(). It also matches the logic that the oom reaper currently uses for determining when to set MMF_OOM_SKIP itself, so there's no new risk of excessive oom killing. This issue fixes CVE-2018-1000200. Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1804241526320.238665@chino.kir.corp.google.com Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: David Rientjes <rientjes@google.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: <stable@vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11mm: migrate: fix double call of radix_tree_replace_slot()Naoya Horiguchi
radix_tree_replace_slot() is called twice for head page, it's obviously a bug. Let's fix it. Link: http://lkml.kernel.org/r/20180423072101.GA12157@hori1.linux.bs1.fc.nec.co.jp Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp") Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reported-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Zi Yan <zi.yan@sent.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11proc/kcore: don't bounds check against address 0Laura Abbott
The existing kcore code checks for bad addresses against __va(0) with the assumption that this is the lowest address on the system. This may not hold true on some systems (e.g. arm64) and produce overflows and crashes. Switch to using other functions to validate the address range. It's currently only seen on arm64 and it's not clear if anyone wants to use that particular combination on a stable release. So this is not urgent for stable. Link: http://lkml.kernel.org/r/20180501201143.15121-1-labbott@redhat.com Signed-off-by: Laura Abbott <labbott@redhat.com> Tested-by: Dave Anderson <anderson@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexey Dobriyan <adobriyan@gmail.com>a Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11mm: don't show nr_indirectly_reclaimable in /proc/vmstatRoman Gushchin
Don't show nr_indirectly_reclaimable in /proc/vmstat, because there is no need to export this vm counter to userspace, and some changes are expected in reclaimable object accounting, which can alter this counter. Link: http://lkml.kernel.org/r/20180425191422.9159-1-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11mm: sections are not offlined during memory hotremovePavel Tatashin
Memory hotplug and hotremove operate with per-block granularity. If the machine has a large amount of memory (more than 64G), the size of a memory block can span multiple sections. By mistake, during hotremove we set only the first section to offline state. The bug was discovered because kernel selftest started to fail: https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop After commit, "mm/memory_hotplug: optimize probe routine". But, the bug is older than this commit. In this optimization we also added a check for sections to be in a proper state during hotplug operation. Link: http://lkml.kernel.org/r/20180427145257.15222-1-pasha.tatashin@oracle.com Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes") Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11z3fold: fix reclaim lock-upsVitaly Wool
Do not try to optimize in-page object layout while the page is under reclaim. This fixes lock-ups on reclaim and improves reclaim performance at the same time. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com Signed-off-by: Vitaly Wool <vitaly.vul@sony.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Cc: <Oleksiy.Avramchenko@sony.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11init: fix false positives in W+X checkingJeffrey Hugo
load_module() creates W+X mappings via __vmalloc_node_range() (from layout_and_allocate()->move_module()->module_alloc()) by using PAGE_KERNEL_EXEC. These mappings are later cleaned up via "call_rcu_sched(&freeinit->rcu, do_free_init)" from do_init_module(). This is a problem because call_rcu_sched() queues work, which can be run after debug_checkwx() is run, resulting in a race condition. If hit, the race results in a nasty splat about insecure W+X mappings, which results in a poor user experience as these are not the mappings that debug_checkwx() is intended to catch. This issue is observed on multiple arm64 platforms, and has been artificially triggered on an x86 platform. Address the race by flushing the queued work before running the arch-defined mark_rodata_ro() which then calls debug_checkwx(). Link: http://lkml.kernel.org/r/1525103946-29526-1-git-send-email-jhugo@codeaurora.org Fixes: e1a58320a38d ("x86/mm: Warn on W^X mappings") Signed-off-by: Jeffrey Hugo <jhugo@codeaurora.org> Reported-by: Timur Tabi <timur@codeaurora.org> Reported-by: Jan Glauber <jan.glauber@caviumnetworks.com> Acked-by: Kees Cook <keescook@chromium.org> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-11lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit()Yury Norov
test_find_first_bit() is intentionally sub-optimal, and may cause soft lockup due to long time of run on some systems. So decrease length of bitmap to traverse to avoid lockup. With the change below, time of test execution doesn't exceed 0.2 seconds on my testing system. Link: http://lkml.kernel.org/r/20180420171949.15710-1-ynorov@caviumnetworks.com Fixes: 4441fca0a27f5 ("lib: test module for find_*_bit() functions") Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>