summaryrefslogtreecommitdiff
path: root/drivers/idle
AgeCommit message (Collapse)Author
2018-10-23Merge branch 'perf-core-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main updates in this cycle were: - Lots of perf tooling changes too voluminous to list (big perf trace and perf stat improvements, lots of libtraceevent reorganization, etc.), so I'll list the authors and refer to the changelog for details: Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa. ... with the bulk of the changes written by Jiri Olsa, Tzvetomir Stoyanov and Arnaldo Carvalho de Melo. - Continued intel_rdt work with a focus on playing well with perf events. This also imported some non-perf RDT work due to dependencies. (Reinette Chatre) - Implement counter freezing for Arch Perfmon v4 (Skylake and newer). This allows to speed up the PMI handler by avoiding unnecessary MSR writes and make it more accurate. (Andi Kleen) - kprobes cleanups and simplification (Masami Hiramatsu) - Intel Goldmont PMU updates (Kan Liang) - ... plus misc other fixes and updates" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits) kprobes/x86: Use preempt_enable() in optimized_callback() x86/intel_rdt: Prevent pseudo-locking from using stale pointers kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack perf/x86/intel: Export mem events only if there's PEBS support x86/cpu: Drop pointless static qualifier in punit_dev_state_show() x86/intel_rdt: Fix initial allocation to consider CDP x86/intel_rdt: CBM overlap should also check for overlap with CDP peer x86/intel_rdt: Introduce utility to obtain CDP peer tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file tools lib traceevent: Separate out tep_strerror() for strerror_r() issues perf python: More portable way to make CFLAGS work with clang perf python: Make clang_has_option() work on Python 3 perf tools: Free temporary 'sys' string in read_event_files() perf tools: Avoid double free in read_event_file() perf tools: Free 'printk' string in parse_ftrace_printk() perf tools: Cleanup trace-event-info 'tdata' leak perf strbuf: Match va_{add,copy} with va_end perf test: S390 does not support watchpoints in test 22 perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG tools include: Adopt linux/bits.h ...
2018-10-02x86/cpu: Sanitize FAM6_ATOM namingPeter Zijlstra
Going primarily by: https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors with additional information gleaned from other related pages; notably: - Bonnell shrink was called Saltwell - Moorefield is the Merriefield refresh which makes it Airmont The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE for i in `git grep -l FAM6_ATOM` ; do sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \ -e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \ -e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \ -e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \ -e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \ -e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \ -e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \ -e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \ -e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \ -e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \ -e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i} done Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: dave.hansen@linux.intel.com Cc: len.brown@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-09-10intel_idle: Get rid of custom ICPU() macroAndy Shevchenko
Replace custom grown macro with generic INTEL_CPU_FAM6() one. No functional change intended. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-13Merge branch 'pm-cpuidle'Rafael J. Wysocki
* pm-cpuidle: intel_idle: Graceful probe failure when MWAIT is disabled cpuidle: Avoid assignment in if () argument cpuidle: Clean up cpuidle_enable_device() error handling a bit cpuidle: ladder: Add per CPU PM QoS resume latency support ARM: cpuidle: Refactor rollback operations if init fails ARM: cpuidle: Correct driver unregistration if init fails intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT) cpuidle: fix broadcast control when broadcast can not be entered Conflicts: drivers/idle/intel_idle.c
2017-11-09intel_idle: Graceful probe failure when MWAIT is disabledLen Brown
When MWAIT is disabled, intel_idle refuses to probe. But it may mis-lead the user by blaming this on the model number: intel_idle: does not run on family 6 modesl 79 So defer the check for MWAIT until after the model# white-list check succeeds, and if the MWAIT check fails, tell the user how to fix it: intel_idle: Please enable MWAIT in BIOS SETUP Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-11-04Revert "x86/mm: Stop calling leave_mm() in idle code"Andy Lutomirski
This reverts commit 43858b4f25cf0adc5c2ca9cf5ce5fdf2532941e5. The reason I removed the leave_mm() calls in question is because the heuristic wasn't needed after that patch. With the original version of my PCID series, we never flushed a "lazy cpu" (i.e. a CPU running kernel thread) due a flush on the loaded mm. Unfortunately, that caused architectural issues, so now I've reinstated these flushes on non-PCID systems in: commit b956575bed91 ("x86/mm: Flush more aggressively in lazy TLB mode"). That, in turn, gives us a power management and occasionally performance regression as compared to old kernels: a process that goes into a deep idle state on a given CPU and gets its mm flushed due to activity on a different CPU will wake the idle CPU. Reinstate the old ugly heuristic: if a CPU goes into ACPI C3 or an intel_idle state that is likely to cause a TLB flush gets its mm switched to init_mm before going idle. FWIW, this heuristic is lousy. Whether we should change CR3 before idle isn't a good hint except insofar as the performance hit is a bit lower if the TLB is getting flushed by the idle code anyway. What we really want to know is whether we anticipate being idle long enough that the mm is likely to be flushed before we wake up. This is more a matter of the expected latency than the idle state that gets chosen. This heuristic also completely fails on systems that don't know whether the TLB will be flushed (e.g. AMD systems?). OTOH it may be a bit obsolete anyway -- PCID systems don't presently benefit from this heuristic at all. We also shouldn't do this callback from innermost bit of the idle code due to the RCU nastiness it causes. All the information need is available before rcu_idle_enter() needs to happen. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 43858b4f25cf "x86/mm: Stop calling leave_mm() in idle code" Link: http://lkml.kernel.org/r/c513bbd4e653747213e05bc7062de000bf0202a5.1509793738.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-11intel_idle: replace conditionals with static_cpu_has(X86_FEATURE_ARAT)Jason Baron
If the 'arat' cpu flag is set, then the conditionals in intel_idle() that guard calling tick_broadcast_enter()/exit() will never be true. Use static_cpu_has(X86_FEATURE_ARAT) to create a fast path to replace the conditional. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-09-05Merge tag 'pm-4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-08-30cpuidle: Make drivers initialize polling stateRafael J. Wysocki
Make the drivers that want to include the polling state into their states table initialize it explicitly and drop the initialization of it (which in fact is conditional, but that is not obvious from the code) from the core. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2017-08-11PM / s2idle: Rename ->enter_freeze to ->enter_s2idleRafael J. Wysocki
Rename the ->enter_freeze cpuidle driver callback to ->enter_s2idle to make it clear that it is used for entering suspend-to-idle and rename the related functions, variables and so on accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-18Merge branch 'x86/boot' into x86/mm, to pick up interacting changesIngo Molnar
The SME patches we are about to apply add some E820 logic, so merge in pending E820 code changes first, to have a single code base. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-05x86/mm: Stop calling leave_mm() in idle codeAndy Lutomirski
Now that lazy TLB suppresses all flush IPIs (as opposed to all but the first), there's no need to leave_mm() when going idle. This means we can get rid of the rcuidle hack in switch_mm_irqs_off() and we can unexport leave_mm(). This also removes acpi_unlazy_tlb() from the x86 and ia64 headers, since it has no callers any more. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/03c699cfd6021e467be650d6b73deaccfe4b4bd7.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-29intel_idle: Use more common logging styleJoe Perches
Remove #define PREFIX and add #define pr_fmt to use more common logging. Miscellanea: o Add missing newline to format o Convert a single printk without KERN_<LEVEL> to pr_info Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-05-01x86/intel_idle: add Gemini Lake supportDavid E. Box
Gemini Lake uses the same C-states as Broxton and also uses the IRTL MSR's to determine maximum C-state latency. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Acked-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-03-02Merge tag 'pm-turbostat-4.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull turbostat utility updates from Rafael Wysocki: "Power management turbostat utility updates. These update turbostat significantly and in particular: - default output is now verbose, --debug is no longer required to get all counters. As a result, some options have been added to specify exactly what output is wanted. - added --quiet to skip system configuration output - added --list, --show and --hide parameters - added --cpu parameter - enhanced Baytrail SoC support - added Gemini Lake SoC support - added sysfs C-state columns Also the symbol definitions in arch/x86/include/asm/intel-family.h and arch/x86/include/asm/msr-index.h are updated and the intel_idle and intel_pstate drivers are modified to use the updated symbols. Credits to Len Brown for all of these changes" * tag 'pm-turbostat-4.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits) tools/power turbostat: version 17.02.24 tools/power turbostat: bugfix: --add u32 was printed as u64 tools/power turbostat: show error on exec tools/power turbostat: dump p-state software config tools/power turbostat: show package number, even without --debug tools/power turbostat: support "--hide C1" etc. tools/power turbostat: move --Package and --processor into the --cpu option tools/power turbostat: turbostat.8 update tools/power turbostat: update --list feature tools/power turbostat: use wide columns to display large numbers tools/power turbostat: Add --list option to show available header names tools/power turbostat: fix zero IRQ count shown in one-shot command mode tools/power turbostat: add --cpu parameter tools/power turbostat: print sysfs C-state stats tools/power turbostat: extend --add option to accept /sys path tools/power turbostat: skip unused counters on BDX tools/power turbostat: fix decoding for GLM, DNV, SKX turbo-ratio limits tools/power turbostat: skip unused counters on SKX tools/power turbostat: Denverton: use HW CC1 counter, skip C3, C7 tools/power turbostat: initial Gemini Lake SOC support ...
2017-03-01intel_idle: use new name for MSR_PKG_CST_CONFIG_CONTROLLen Brown
previously known as MSR_NHM_SNB_PKG_CST_CFG_CTL Signed-off-by: Len Brown <len.brown@intel.com>
2017-03-01intel_idle: stop exposing platform acronyms in sysfsLen Brown
Cosmetic only -- no functional change in this patch. sysfs before: state4/desc:MWAIT 0x20 state4/name:C6-HSW sysfs after: state4/desc:MWAIT 0x20 state4/name:C6 We remove the platform acronyms from the end of the state name (-HSW in this case) for three reasonse. 1. more consistency with acpi_idle, which prints C1, C2, C3 etc. 2. users know what platform they are on already an acronym for the processor code name here seems to cause more confusion than clarity. 3. less clutter in "cpupower monitor" output, which truncates the names to 4 columns. The precise definition of the state continues to be available in "desc". Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-13Merge tag 'pm-4.10-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Again, cpufreq gets more changes than the other parts this time (one new driver, one old driver less, a bunch of enhancements of the existing code, new CPU IDs, fixes, cleanups) There also are some changes in cpuidle (idle injection rework, a couple of new CPU IDs, online/offline rework in intel_idle, fixes and cleanups), in the generic power domains framework (mostly related to supporting power domains containing CPUs), and in the Operating Performance Points (OPP) library (mostly related to supporting devices with multiple voltage regulators) In addition to that, the system sleep state selection interface is modified to make it easier for distributions with unchanged user space to support suspend-to-idle as the default system suspend method, some issues are fixed in the PM core, the latency tolerance PM QoS framework is improved a bit, the Intel RAPL power capping driver is cleaned up and there are some fixes and cleanups in the devfreq subsystem Specifics: - New cpufreq driver for Broadcom STB SoCs and a Device Tree binding for it (Markus Mayer) - Support for ARM Integrator/AP and Integrator/CP in the generic DT cpufreq driver and elimination of the old Integrator cpufreq driver (Linus Walleij) - Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier, and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie, Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik) - cpufreq core fix to eliminate races that may lead to using inactive policy objects and related cleanups (Rafael Wysocki) - cpufreq schedutil governor update to make it use SCHED_FIFO kernel threads (instead of regular workqueues) for doing delayed work (to reduce the response latency in some cases) and related cleanups (Viresh Kumar) - New cpufreq sysfs attribute for resetting statistics (Markus Mayer) - cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis, Viresh Kumar) - Support for using generic cpufreq governors in the intel_pstate driver (Rafael Wysocki) - Support for per-logical-CPU P-state limits and the EPP/EPB (Energy Performance Preference/Energy Performance Bias) knobs in the intel_pstate driver (Srinivas Pandruvada) - New CPU ID for Knights Mill in intel_pstate (Piotr Luc) - intel_pstate driver modification to use the P-state selection algorithm based on CPU load on platforms with the system profile in the ACPI tables set to "mobile" (Srinivas Pandruvada) - intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki, Srinivas Pandruvada) - cpufreq powernv driver updates including fast switching support (for the schedutil governor), fixes and cleanus (Akshay Adiga, Andrew Donnellan, Denis Kirjanov) - acpi-cpufreq driver rework to switch it over to the new CPU offline/online state machine (Sebastian Andrzej Siewior) - Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth Prakash) - Idle injection rework (to make it use the regular idle path instead of a home-grown custom one) and related powerclamp thermal driver updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej Siewior) - New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy Shevchenko, Piotr Luc) - intel_idle driver cleanups and switch over to using the new CPU offline/online state machine (Anna-Maria Gleixner, Sebastian Andrzej Siewior) - cpuidle DT driver update to support suspend-to-idle properly (Sudeep Holla) - cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian, Rafael Wysocki) - Preliminary support for power domains including CPUs in the generic power domains (genpd) framework and related DT bindings (Lina Iyer) - Assorted fixes and cleanups in the generic power domains (genpd) framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven) - Preliminary support for devices with multiple voltage regulators and related fixes and cleanups in the Operating Performance Points (OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd) - System sleep state selection interface rework to make it easier to support suspend-to-idle as the default system suspend method (Rafael Wysocki) - PM core fixes and cleanups, mostly related to the interactions between the system suspend and runtime PM frameworks (Ulf Hansson, Sahitya Tummala, Tony Lindgren) - Latency tolerance PM QoS framework imorovements (Andrew Lutomirski) - New Knights Mill CPU ID for the Intel RAPL power capping driver (Piotr Luc) - Intel RAPL power capping driver fixes, cleanups and switch over to using the new CPU offline/online state machine (Jacob Pan, Thomas Gleixner, Sebastian Andrzej Siewior) - Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc, rockchip-dfi devfreq drivers and the devfreq core (Axel Lin, Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar) - Fix for false-positive KASAN warnings during resume from ACPI S3 (suspend-to-RAM) on x86 (Josh Poimboeuf) - Memory map verification during resume from hibernation on x86 to ensure a consistent address space layout (Chen Yu) - Wakeup sources debugging enhancement (Xing Wei) - rockchip-io AVS driver cleanup (Shawn Lin)" * tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits) devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks devfreq: rk3399_dmc: Remove dangling rcu_read_unlock() devfreq: exynos: Don't use OPP structures outside of RCU locks Documentation: intel_pstate: Document HWP energy/performance hints cpufreq: intel_pstate: Support for energy performance hints with HWP cpufreq: intel_pstate: Add locking around HWP requests PM / sleep: Print active wakeup sources when blocking on wakeup_count reads PM / core: Fix bug in the error handling of async suspend PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend PM / Domains: Fix compatible for domain idle state PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators() PM / OPP: Allow platform specific custom set_opp() callbacks PM / OPP: Separate out _generic_set_opp() PM / OPP: Add infrastructure to manage multiple regulators PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage() PM / OPP: Manage supply's voltage/current in a separate structure PM / OPP: Don't use OPP structure outside of rcu protected section PM / OPP: Reword binding supporting multiple regulators per device PM / OPP: Fix incorrect cpu-supply property in binding cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state() ..
2016-12-01intel_idle: Convert to hotplug state machineSebastian Andrzej Siewior
Install the callbacks via the state machine and let the core invoke the callbacks on the already online CPUs. The two smp_call_function_single() invocations in intel_idle_cpu_init() have been removed because intel_idle_cpu_init() is now invoked via the hotplug callback which runs on the target CPU. The IRQ-off calling convention for auto_demotion_disable() and c1e_promotion_disable() has not been preserved because only those two modify the MSR during CPU intialization. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-01intel_idle: Remove superfluous SMP fuction callAnna-Maria Gleixner
Since commit 1cf4f629d9d2 ("cpu/hotplug: Move online calls to hotplugged cpu") the CPU_ONLINE and CPU_DOWN_PREPARE notifiers are always run on the hot plugged CPU, and as of commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out in __cpu_disable()") the CPU_DOWN_FAILED notifier also runs on the hot plugged CPU. This patch converts the SMP functional calls into direct calls. smp_function_call_single() executes the function with interrupts disabled. This calling convention is not preserved, because tick_broadcast_enable() and tick_braodcast_disable() handle interrupts themselves. Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-01x86/intel_idle: Add Knights Mill CPUIDPiotr Luc
Add Knights Mill (KNM) to the list of CPUIDs supported by intel_idle. Signed-off-by: Piotr Luc <piotr.luc@intel.com> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-12-01x86/intel_idle: Add CPU model 0x4a (Atom Z34xx series)Andy Shevchenko
Add CPU ID for Atom Z34xx processors. Datasheets indicate support for this, detailed information about potential quirks or limitations are missing, though. So we just reuse the definition from official BSP code. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-11-18i7300_idle: Remove this driverLen Brown
In preparation for removing the idle_notifier, remove its only user, the i7300_idle driver. i7300_idle was deployed in 2008 to reduce idle memory power on systems using the i7300 chipset. The driver worked by throttling the fully-buffered DIMMs during idle periods using the IOAT DMA engine. The driver ran only on the i7300 chip-set, and no other hardware has used this mechanism. The driver no longer has a maintainer. Removing this driver will increase idle power on i7300 systems when they run the new kernel without the driver. Signed-off-by: Len Brown <len.brown@intel.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/ad6a044e57cc75f44cc8621abe846e58f7882243.1479449716.git.len.brown@intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-10-07nmi_backtrace: generate one-line reports for idle cpusChris Metcalf
When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-30Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpufeature updates from Thomas Gleixner: - a workaround for the MONITOR instruction erratum of Goldmont CPUs - small fixes and cleanups here and there * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add workaround for MONITOR instruction erratum on Goldmont based CPUs x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G" x86/amd_nb: Clean up init path x86/cpufeature: Add helper macro for mask check macros x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated x86/cpufeature: Update cpufeaure macros
2016-07-09intel_idle: correct BXT supportJan Beulich
Commit 5dcef69486 ("intel_idle: add BXT support") added an 8-element lookup array with just a 2-bit value used for lookups. As per the SDM that bit field is really 3 bits wide. While this is supposedly benign here, future re-use of the code for other CPUs might expose the issue. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-09intel_idle: re-work bxt_idle_state_table_update() and its helperJan Beulich
Since irtl_ns_units[] has itself zero entries, make sure the caller recognized those cases along with the MSR read returning zero, as zero is not a valid value for exit_latency and target_residency. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-07-01x86/cpu: Rename "WESTMERE2" family to "NEHALEM_G"Dave Hansen
Len Brown noticed something was amiss in our INTEL_FAM6_* definitions. It seems like model 0x1F was a Nehalem part, marketed as "Intel Core i7 and i5 Processors" (according to the SDM). But, although it was a Nehalem 0x1F had some uncore events which were shared with Westmere. Len also mentioned he thought it was called "Havendale", which Wikipedia says was graphics-oriented and canceled: https://en.wikipedia.org/wiki/Nehalem_(microarchitecture) So either way, it's probably not imporant what we call it, but call it Nehalem to be accurate, and add a "G" since it seems graphics-related. If it were canceled that would be a good reason why it's so sparsely and inconsistently referred to in the code. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Hansen <dave@sr71.net> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20160629192737.949C41A8@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-06-23idle_intel: Add DenvertonJacob Pan
Denverton is an Intel Atom based micro server which shares the same Goldmont architecture as Broxton. The available C-states on Denverton is a subset of Broxton with only C1, C1e, and C6. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-23drivers/idle: make intel_idle.c driver more explicitly non-modularPaul Gortmaker
The Kconfig for this driver is currently declared with: config INTEL_IDLE bool "Cpuidle Driver for Intel Processors" ...meaning that it currently is not being built as a module by anyone. This was done in commit 6ce9cd8669fa1195fdc21643370e34523c7ac988 ("intel_idle: disable module support") since "...the module capability is cauing more trouble than it is worth." This was done over 5y ago, and Daniel adds that: ...the modular support has been removed from almost all the cpuidle drivers and the cpuidle framework is no longer assuming driver could be unloaded. Removing the modular dead code in the driver makes sense as this what have been done in the others drivers. So lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. At a later date we might want to consider whether subsys_init or another init category seems more appropriate than device_init. We replace module.h with moduleparam.h since the file does declare some module parameters, and leaving them as such is currently the easiest way to remain compatible with existing boot arg use cases. Note that MODULE_DEVICE_TABLE is a no-op for non-modular code. Also note that we can't remove intel_idle_cpuidle_devices_uninit() as that is still used for unwind purposes if the init fails. We also delete the MODULE_LICENSE tag etc. since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-08x86/intel_idle: Use Intel family macros for intel_idleDave Hansen
Use the new INTEL_FAM6_* macros for intel_idle.c. Also fix up some of the macros to be consistent with how some of the intel_idle code refers to the model. There's on oddity here: model 0x1F is uniquely referred to here and nowhere else that I could find. 0x1E/0x1F are just spelled out as "Intel Core i7 and i5 Processors" in the SDM or as "Intel processors based on the Nehalem, Westmere microarchitectures" in the RDPMC section. Comments between tables 19-19 and 19-20 in the SDM seem to point to 0x1F being some kind of Westmere, so let's call it "WESTMERE2". Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave@sr71.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <lenb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jacob.jun.pan@intel.com Cc: linux-pm@vger.kernel.org Link: http://lkml.kernel.org/r/20160603001932.EE978EB9@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-09intel_idle: add BXT supportLen Brown
Broxton has all the HSW C-states, except C3. BXT C-state timing is slightly different. Here we trust the IRTL MSRs as authority on maximum C-state latency, and override the driver's tables with the values found in the associated IRTL MSRs. Further we set the target_residency to 1x maximum latency, trusting the hardware demotion logic. Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Add KBL supportLen Brown
KBL is similar to SKL Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Add SKX supportLen Brown
SKX is similar to BDX Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Clean up all registered devices on exit.Richard Cochran
This driver registers cpuidle devices when a CPU comes online, but it leaves the registrations in place when a CPU goes offline. The module exit code only unregisters the currently online CPUs, leaving the devices for offline CPUs dangling. This patch changes the driver to clean up all registrations on exit, even those from CPUs that are offline. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Propagate hot plug errors.Richard Cochran
If a cpuidle registration error occurs during the hot plug notifier callback, we should really inform the hot plug machinery instead of just ignoring the error. This patch changes the callback to properly return on error. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Don't overreact to a cpuidle registration failure.Richard Cochran
The helper function, intel_idle_cpu_init, registers one new device with the cpuidle layer. If the registration should fail, that function immediately calls intel_idle_cpuidle_devices_uninit() to unregister every last CPU's device. However, it makes no sense to do so, when called from the hot plug notifier callback. This patch moves the call to intel_idle_cpuidle_devices_uninit() outside of the helper function to the one call site that actually needs to perform the de-registrations. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Setup the timer broadcast only on successful driver load.Richard Cochran
This driver sets the broadcast tick quite early on during probe and does not clean up again in cast of failure. This patch moves the setup call after the registration, placing the on_each_cpu() calls within the global CPU lock region. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Avoid a double free of the per-CPU data.Richard Cochran
The helper function, intel_idle_cpuidle_devices_uninit, frees the globally allocated per-CPU data. However, this function is invoked from the hot plug notifier callback at a time when freeing that data is not safe. If the call to cpuidle_register_driver() should fail (say, due to lack of memory), then the driver will free its per-CPU region. On the *next* CPU_ONLINE event, the driver will happily use the region again and even free it again if the failure repeats. This patch fixes the issue by moving the call to free_percpu() outside of the helper function at the two call sites that actually need to free the per-CPU data. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix dangling registration on error path.Richard Cochran
In the module_init() method, if the per-CPU allocation fails, then the active cpuidle registration is not cleaned up. This patch fixes the issue by attempting the allocation before registration, and then cleaning it up again on registration failure. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix deallocation order on the driver exit path.Richard Cochran
In the module_exit() method, this driver first frees its per-CPU pointer, then unregisters a callback making use of the pointer. Furthermore, the function, intel_idle_cpuidle_devices_uninit, is racy against CPU hot plugging as it calls for_each_online_cpu(). This patch corrects the issues by unregistering first on the exit path while holding the hot plug lock. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Remove redundant initialization calls.Richard Cochran
The function, intel_idle_cpuidle_driver_init, makes calls on each CPU to auto_demotion_disable() and c1e_promotion_disable(). These calls are redundant, as intel_idle_cpu_init() does the same calls just a bit later on. They are also premature, as the driver registration may yet fail. This patch removes the redundant code. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: Fix a helper function's return value.Richard Cochran
The function, intel_idle_cpuidle_driver_init, delivers no error codes at all. This patch changes the function to return 'void' instead of returning zero. Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-04-07intel_idle: remove useless return from void function.Richard Cochran
Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-23intel_idle: Support for Intel Xeon Phi Processor x200 Product FamilyDasaratharaman Chandramouli
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support, formerly code-named KNL. It is based on modified Intel Atom Silvermont microarchitecture. Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> [micah.barany@intel.com: adjusted values of residency and latency] Signed-off-by: Micah Barany <micah.barany@intel.com> [hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag] Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com> Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2016-03-23intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabledLen Brown
Some SKL-H configurations require "intel_idle.max_cstate=7" to boot. While that is an effective workaround, it disables C10. This patch detects the problematic configuration, and disables C8 and C9, keeping C10 enabled. Note that enabling SGX in BIOS SETUP can also prevent this issue, if the system BIOS provides that option. https://bugzilla.kernel.org/show_bug.cgi?id=109081 "Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7" Signed-off-by: Len Brown <len.brown@intel.com> Cc: stable@vger.kernel.org
2015-09-10intel_idle: Skylake Client Support - updatedLen Brown
Addition of PC9 state, and minor tweaks to existing PC6 and PC8 states. Signed-off-by: Len Brown <len.brown@intel.com>
2015-08-15intel_idle: Skylake Client SupportLen Brown
Skylake Client CPU idle Power states (C-states) are similar to the previous generation, Broadwell. However, Skylake does get its own table with updated worst-case latency and average energy-break-even residency values. Signed-off-by: Len Brown <len.brown@intel.com>
2015-07-26intel_idle: allow idle states to be freeze-mode specificLen Brown
intel_idle uses a NULL "enter" field in a cpuidle state to recognize the invalid entry terminating a variable-length array. Linux-4.0 added support for the system-wide "freeze" state in cpuidle drivers via the new "enter_freeze" field. The natural way to expose a deep idle state for freeze, but not for run-time idle is to supply "enter_freeze" without "enter"; so we update the driver to accept such states. Signed-off-by: Len Brown <len.brown@intel.com>