summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2015-05-07nohz: Fix !HIGH_RES_TIMERS hangThomas Gleixner
Simon Horman reported this crash on a system with high-res timers disabled but nohz enabled: > ------------[ cut here ]------------ > kernel BUG at kernel/irq_work.c:135! BUG_ON(!irqs_disabled()); So something enabled interrupts in the periodic tick handling machinery, and that code path indeed has a local_irq_disable()/enable pair in tick_nohz_switch_to_nohz() which causes havoc. Fix it. This patch also fixes a +nohz -hrtimers hang reported by Ingo Molnar. Reported-by: Simon Horman <horms@verge.net.au> Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Simon Horman <horms@verge.net.au> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: LAK <linux-arm-kernel@lists.infradead.org> Cc: Magnus Damm <magnus.damm@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1505071425520.4225@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-05tick: hrtimer-broadcast: Prevent endless restarting when broadcast device is ↵Andreas Sandberg
unused The hrtimer callback in the hrtimer's tick broadcast code sometimes incorrectly ends up scheduling events at the current tick causing the kernel to hang servicing the same hrtimer forever. This typically happens when a device is swapped out by tick_install_broadcast_device(), which replaces the event handler with clock_events_handle_noop() and sets the device mode to CLOCK_EVT_MODE_UNUSED. If the timer is scheduled when this happens, the next_event field will not be updated and the hrtimer ends up being restarted at the current tick. To prevent this from happening, only try to restart the hrtimer if the broadcast clock event device is in one of the active modes and try to cancel the timer when entering the CLOCK_EVT_MODE_UNUSED mode. Signed-off-by: Andreas Sandberg <andreas.sandberg@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1429880765-5558-1-git-send-email-andreas.sandberg@arm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05timer: Use timer->base for flag checksJoonwoo Park
At present, internal_add_timer() examines flags with 'base' which doesn't contain flags. Examine with 'timer->base' to avoid unnecessary waking up of nohz CPU when timer base has TIMER_DEFERRABLE set. Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org> Cc: sboyd@codeaurora.org Cc: skannan@codeaurora.org Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1430187709-21087-1-git-send-email-joonwoop@codeaurora.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05tick-broadcast: Fix the printing of broadcast masksPreeti U Murthy
Today the number of bits of the broadcast masks that is output into /proc/timer_list is sizeof(unsigned long). This means that on machines with a larger number of CPUs, the bitmasks of CPUs beyond this range do not appear. Fix this by using bitmap printing through "%*pb" instead, so as to output the broadcast masks for the range of nr_cpu_ids into /proc/timer_list. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: peterz@infradead.org Cc: linuxppc-dev@ozlabs.org Cc: john.stultz@linaro.org Link: http://lkml.kernel.org/r/20150428084520.3314.62668.stgit@preeti.in.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05tick: broadcast: Simplify oneshot logic and shorten lock regionThomas Gleixner
Simplify the oneshot logic by avoiding the reprogramming loops. That also allows to call the cpu local handler outside of the broadcast_lock held region. Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-05tick: broadcast: Prevent livelock from event handlerThomas Gleixner
With the removal of the hrtimer softirq the switch to highres/nohz mode happens in the tick interrupt. That leads to a livelock when the per cpu event handler is directly called from the broadcast handler under broadcast lock because broadcast lock needs to be taken for the highres/nohz switch as well. Solve this by calling the cpu local handler outside the broadcast_lock held region. Fixes: c6eb3f70d448 "hrtimer: Get rid of hrtimer softirq" Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-05-04perf: Remove unused function perf_mux_hrtimer_cancel()Thomas Gleixner
Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-23sched: debug: Remove the cfs bandwidth timer_active printoutThomas Gleixner
The struct member is gone. Reported-by: fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
2015-04-22perf: perf_mux_hrtimer_cancel() can be statickbuild test robot
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: kbuild-all@01.org Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150422200000.GA122603@lkp-sb04 Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22perf: Fix mux_interval hrtimer wreckagePeter Zijlstra
Thomas stumbled over the hrtimer_forward_now() in perf_event_mux_interval_ms_store() and noticed its broken-ness. You cannot just change the expiry time of an active timer, it will destroy the red-black tree order and cause havoc. Change it to (re)start the timer instead, (re)starting a timer will dequeue and enqueue a timer and therefore preserve rb-tree order. Since we cannot enqueue remotely, wrap the thing in cpu_function_call(), this however mandates that we restrict ourselves to online cpus. Also serialize the entire setting so we don't get multiple concurrent threads trying to update to different values. Also fix a problem in perf_mux_hrtimer_restart(), checking against hrtimer_active() can actually loose us the timer when timer->state == HRTIMER_STATE_CALLBACK and the callback has already decided NORESTART. Furthermore it doesn't make any sense to test hrtimer_callback_running() when we already tested hrtimer_active(), but with the above change, we explicitly must call it when callback_running. Lastly, rename a few functions: s/perf_cpu_hrtimer_/perf_mux_hrtimer_/ -- because I could not find the mux timer function s/\<hr\>/timer/ -- because that's the normal way of calling things. Fixes: 62b856397927 ("perf: Add sysfs entry to adjust multiplexing interval per PMU") Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150415095011.863052571@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22sched: Cleanup bandwidth timersPeter Zijlstra
Roman reported a 3 cpu lockup scenario involving __start_cfs_bandwidth(). The more I look at that code the more I'm convinced its crack, that entire __start_cfs_bandwidth() thing is brain melting, we don't need to cancel a timer before starting it, *hrtimer_start*() will happily remove the timer for you if its still enqueued. Removing that, removes a big part of the problem, no more ugly cancel loop to get stuck in. So now, if I understand things right, the entire reason you have this cfs_b->lock guarded ->timer_active nonsense is to make sure we don't accidentally lose the timer. It appears to me that it should be possible to guarantee that same by unconditionally (re)starting the timer when !queued. Because regardless what hrtimer::function will return, if we beat it to (re)enqueue the timer, it doesn't matter. Now, because hrtimers don't come with any serialization guarantees we must ensure both handler and (re)start loop serialize their access to the hrtimer to avoid both trying to forward the timer at the same time. Update the rt bandwidth timer to match. This effectively reverts: 09dc4ab03936 ("sched/fair: Fix tg_set_cfs_bandwidth() deadlock on rq->lock"). Reported-by: Roman Gushchin <klamm@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ben Segall <bsegall@google.com> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20150415095011.804589208@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Allow concurrent hrtimer_start() for self restarting timersPeter Zijlstra
Because we drop cpu_base->lock around calling hrtimer::function, it is possible for hrtimer_start() to come in between and enqueue the timer. If hrtimer::function then returns HRTIMER_RESTART we'll hit the BUG_ON because HRTIMER_STATE_ENQUEUED will be set. Since the above is a perfectly valid scenario, remove the BUG_ON and make the enqueue_hrtimer() call conditional on the timer not being enqueued already. NOTE: in that concurrent scenario its entirely common for both sites to want to modify the hrtimer, since hrtimers don't provide serialization themselves be sure to provide some such that the hrtimer::function and the hrtimer_start() caller don't both try and fudge the expiration state at the same time. To that effect, add a WARN when someone tries to forward an already enqueued timer, the most common way to change the expiry of self restarting timers. Ideally we'd put the WARN in everything modifying the expiry but most of that is inlines and we don't need the bloat. Fixes: 2d44ae4d7135 ("hrtimer: clean up cpu->base locking tricks") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ben Segall <bsegall@google.com> Cc: Roman Gushchin <klamm@yandex-team.ru> Cc: Paul Turner <pjt@google.com> Link: http://lkml.kernel.org/r/20150415113105.GT5029@twins.programming.kicks-ass.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22timer: Put usleep_range into the __sched sectionThomas Gleixner
do_usleep_range() and schedule_hrtimeout_range() are __sched as well. So it makes no sense to have the exported function in a different section. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.833709502@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22timer: Remove pointless return value of do_usleep_range()Thomas Gleixner
The only user ignores it anyway and rightfully so. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.756060258@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Avoid locking in hrtimer_cancel() if timer not activeThomas Gleixner
We can do a lockless check for hrtimer_active before actually taking the lock in hrtimer[_try_to]_cancel. This is useful for hotpath users like nanosleep as they avoid the lock dance when the timer has expired. This is safe because active is true when the timer is enqueued or the callback is running. Taking the hrtimer base lock does not protect against concurrent hrtimer_start calls, the callsite has to do the proper serialization itself. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.580273114@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Remove hrtimer_start() return valueThomas Gleixner
No user was ever interested whether the timer was active or not when it was started. All abusers of the return value are gone, so get rid of it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.483556394@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22tick: broadcast-hrtimer: Remove overly clever return value abuseThomas Gleixner
The assignment of bc_moved in the conditional construct relies on the fact that in the case of hrtimer_start() invocation the return value is always 0. It took me a while to understand it. We want to get rid of the hrtimer_start() return value. Open code the logic which makes it readable as well. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.404751457@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22alarmtimer: Get rid of unused return valueThomas Gleixner
We want to get rid of the hrtimer_start() return value and the alarm timer return value is nowhere used. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20150414203503.243910615@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22rtmutex: Remove bogus hrtimer_active() checkThomas Gleixner
The check for hrtimer_active() after starting the timer is pointless. If the timer is inactive it has expired already and therefor the task pointer is already NULL. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203503.081830481@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22futex: Remove bogus hrtimer_active() checkThomas Gleixner
The check for hrtimer_active() after starting the timer is pointless. If the timer is inactive it has expired already and therefor the task pointer is already NULL. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.985825453@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Remove bogus hrtimer_active() checkThomas Gleixner
The check for hrtimer_active() after starting the timer is pointless. If the timer is inactive it has expired already and therefor the task pointer is already NULL. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.907149271@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Make hrtimer_start() a inline wrapperThomas Gleixner
No point for an extra export just to set the extra argument of hrtimer_start_range_ns() to 0. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.808544539@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Get rid of __hrtimer_start_range_ns()Thomas Gleixner
No more callers. Remove the leftovers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.707871492@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22sched: deadline: Use hrtimer_start()Thomas Gleixner
hrtimer_start() does not longer defer already expired timers to the softirq. Get rid of the __hrtimer_start_range_ns() invocation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.627353666@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22sched: core: Use hrtimer_start[_expires]()Thomas Gleixner
hrtimer_start() now enforces a timer interrupt when an already expired timer is enqueued. Get rid of the __hrtimer_start_range_ns() invocations and the loops around it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.531131739@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22perf: core: Use hrtimer_start()Thomas Gleixner
hrtimer_start() does not longer defer already expired timers to the softirq. Get rid of the __hrtimer_start_range_ns() invocation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203502.452104213@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22tick: Nohz: Rework next timer evaluationThomas Gleixner
The evaluation of the next timer in the nohz code is based on jiffies while all the tick internals are nano seconds based. We have also to convert hrtimer nanoseconds to jiffies in the !highres case. That's just wrong and introduces interesting corner cases. Turn it around and convert the next timer wheel timer expiry and the rcu event to clock monotonic and base all calculations on nanoseconds. That identifies the case where no timer is pending clearly with an absolute expiry value of KTIME_MAX. Makes the code more readable and gets rid of the jiffies magic in the nohz code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Link: http://lkml.kernel.org/r/20150414203502.184198593@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22tick: Sched: Restructure codeThomas Gleixner
Get rid of one indentation level. Preparatory patch for a major rework. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Link: http://lkml.kernel.org/r/20150414203502.101563235@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22tick: sched: Force tick interrupt and get rid of softirq magicThomas Gleixner
We already got rid of the hrtimer reprogramming loops and hoops as hrtimer now enforces an interrupt if the enqueued time is in the past. Do the same for the nohz non highres mode. That gets rid of the need to raise the softirq which only serves the purpose of getting the machine out of the inner idle loop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Link: http://lkml.kernel.org/r/20150414203502.023464878@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22tick: sched: Remove hrtimer_active() checksThomas Gleixner
hrtimer_start() enforces a timer interrupt if the timer is already expired. Get rid of the checks and the forward loop. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Link: http://lkml.kernel.org/r/20150414203501.943658239@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Get rid of hrtimer softirqThomas Gleixner
hrtimer softirq is a leftover from the initial implementation and serves only the purpose to handle the enqueueing of already expired timers in the high resolution timer mode. We discussed whether we change the return value and force all start sites to handle that the timer is already expired, but that would be a Herculean task and I'm not sure whether its a good idea to enforce that handling on everyone. A simpler solution is to enforce a timer interrupt instead of raising and scheduling a softirq. Just use the existing infrastructure to do so and remove all the softirq leftovers. The HRTIMER softirq enum is now unused, but kept around because trace parsers rely on the existing numbering. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.840834708@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Keep pointer to first timer and simplify __remove_hrtimer()Thomas Gleixner
__remove_hrtimer() needs to evaluate the expiry time to figure out whether the timer which is removed is eventually the first expiring timer on the cpu. Keep a pointer to it, which is lazily updated, so we can avoid the evaluation dance and retrieve the information from there. Generates slightly better code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.752838019@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Make use of timerqueue_add/del return valuesThomas Gleixner
Use the return value instead of reevaluating the information. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.658152945@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Use cpu_base->active_base for hotpath iteratorsThomas Gleixner
The active_bases field is guaranteed to be in sync with the timerqueue of the corresponding clock base. So we can use it for iterating over the clock bases. This allows to break out early if no more active clock bases are available and avoids touching the cache lines of inactive clock bases. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.322887675@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Use bits for various boolean indicatorsThomas Gleixner
No point in wasting 12 byte storage space. Generates better code as well. Text size reduction: x8664 -64, i386 -16, ARM -132, ARM64 -0, power64 -48 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.227955358@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Make offset update smarterThomas Gleixner
On every tick/hrtimer interrupt we update the offset variables of the clock bases. That's silly because these offsets change very seldom. Add a sequence counter to the time keeping code which keeps track of the offset updates (clock_was_set()). Have a sequence cache in the hrtimer cpu bases to evaluate whether the offsets must be updated or not. This allows us later to avoid pointless cacheline pollution. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/20150414203501.132820245@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: John Stultz <john.stultz@linaro.org>
2015-04-22hrtimer: Get rid of softirq timeThomas Gleixner
The softirq time field in the clock bases is an optimization from the early days of hrtimers. It provides a coarse "jiffies" like time mostly for self rearming timers. But that comes with a price: - Larger code size - Extra storage space - Duplicated functions with really small differences The benefit of this is optimization is marginal for contemporary systems. Consolidate everything on the high resolution timer implementation. This makes further optimizations possible. Text size reduction: x8664 -95, i386 -356, ARM -148, ARM64 -40, power64 -16 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203501.039977424@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Make the statistics fields smallerThomas Gleixner
No point in having usigned long for /proc/timer_list statistics. Make them unsigned int. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203500.959773467@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Get rid of hrtimer_get_res()Thomas Gleixner
The resolution is directly accessible now. So its simpler just to fill in the values of the timespec and be done with it. Text size reduction (combined with "hrtimer: Get rid of the resolution field in hrtimer_clock_base"): x8664 -61, i386 -221, ARM -60, power64 -48 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203500.879888080@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Get rid of the resolution field in hrtimer_clock_baseThomas Gleixner
The field has no value because all clock bases have the same resolution. The resolution only changes when we switch to high resolution timer mode. We can evaluate that from a single static variable as well. In the !HIGHRES case its simply a constant. Export the variable, so we can simplify the usage sites. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20150414203500.645454122@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Update active_bases before calling hrtimer_force_reprogram()Viresh Kumar
'active_bases' indicates which clock-base have active timer. The intention of this bit field was to avoid evaluating inactive bases. It was introduced with the introduction of the BOOTTIME and TAI clock bases, but it was never brought into full use. We want to use it now, but in __remove_hrtimer() the update happens after the calling hrtimer_force_reprogram() which has to evaluate all clock bases for the next expiring timer. So in case the last timer of a clock base got removed we still see the active bit and therefor evaluate the clock base for no value. There are further optimizations possible when active_bases is updated in the right place. Move the update before the call to hrtimer_force_reprogram() [ tglx: Massaged changelog ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: linaro-kernel@lists.linaro.org Link: http://lkml.kernel.org/r/20150414203500.533438642@linutronix.de Link: http://lkml.kernel.org/r/c7c8ebcd9ed88bb09d76059c745a1fafb48314e7.1428039899.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22hrtimer: Document hrtimer_forward[_now]() properThomas Gleixner
Document the calling context conditions. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20150413210035.178751779@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22timekeeping: Remove stale function prototypeThomas Gleixner
commit 61edec81d260 "timekeeping: Simplify timekeeping_clocktai()" implemented timekeeping_clocktai() as an inline function, but left the old extern prototype in the header file. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-22timer_list: Reduce SEQ_printf footprintJoe Perches
This macro can be converted to a static function to reduce object size. (x86-64 defconfig) $ size kernel/time/timer_list.o* text data bss dec hex filename 6583 8 0 6591 19bf kernel/time/timer_list.o.old 4647 8 0 4655 122f kernel/time/timer_list.o.new Signed-off-by: Joe Perches <joe@perches.com> Cc: John Stultz <john.stultz@linaro.org> Link: http://lkml.kernel.org/r/1429295958.2850.104.camel@perches.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-04-19smp: Fix error case handling in smp_call_function_*()Linus Torvalds
Commit 8053871d0f7f ("smp: Fix smp_call_function_single_async() locking") fixed the locking for the asynchronous smp-call case, but in the process of moving the lock handling around, one of the error cases ended up not unlocking the call data at all. This went unnoticed on x86, because this is a "caller is buggy" case, where the caller is trying to call a non-existent CPU. But apparently ARM does that (at least under qemu-arm). Bindly doing cross-cpu calls to random CPU's that aren't even online seems a bit fishy, but the error handling was clearly not correct. Simply add the missing "csd_unlock()" to the error path. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Analyzed-by: Rabin Vincent <rabin@rab.in> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-18Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two fixes: an smp-call fix and a lockdep fix" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Fix smp_call_function_single_async() locking lockdep: Make print_lock() robust against concurrent release
2015-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix verifier memory corruption and other bugs in BPF layer, from Alexei Starovoitov. 2) Add a conservative fix for doing BPF properly in the BPF classifier of the packet scheduler on ingress. Also from Alexei. 3) The SKB scrubber should not clear out the packet MARK and security label, from Herbert Xu. 4) Fix oops on rmmod in stmmac driver, from Bryan O'Donoghue. 5) Pause handling is not correct in the stmmac driver because it doesn't take into consideration the RX and TX fifo sizes. From Vince Bridgers. 6) Failure path missing unlock in FOU driver, from Wang Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) net: dsa: use DEVICE_ATTR_RW to declare temp1_max netns: remove BUG_ONs from net_generic() IB/ipoib: Fix ndo_get_iflink sfc: Fix memcpy() with const destination compiler warning. altera tse: Fix network-delays and -retransmissions after high throughput. net: remove unused 'dev' argument from netif_needs_gso() act_mirred: Fix bogus header when redirecting from VLAN inet_diag: fix access to tcp cc information tcp: tcp_get_info() should fetch socket fields once net: dsa: mv88e6xxx: Add missing initialization in mv88e6xxx_set_port_state() skbuff: Do not scrub skb mark within the same name space Revert "net: Reset secmark when scrubbing packet" bpf: fix two bugs in verification logic when accessing 'ctx' pointer bpf: fix bpf helpers to use skb->mac_header relative offsets stmmac: Configure Flow Control to work correctly based on rxfifo size stmmac: Enable unicast pause frame detect in GMAC Register 6 stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree stmmac: Add defines and documentation for enabling flow control stmmac: Add properties for transmit and receive fifo sizes stmmac: fix oops on rmmod after assigning ip addr ...
2015-04-17oprofile: reduce mmap_sem hold for mm->exe_fileDavidlohr Bueso
sync_buffer() needs the mmap_sem for two distinct operations, both only occurring upon user context switch handling: 1) Dealing with the exe_file. 2) Adding the dcookie data as we need to lookup the vma that backs it. This is done via add_sample() and add_data(). This patch isolates 1), for it will no longer need the mmap_sem for serialization. However, for now, make of the more standard get_mm_exe_file(), requiring only holding the mmap_sem to read the value, and relying on reference counting to make sure that the exe file won't dissappear underneath us while doing the get dcookie. As a consequence, for 2) we move the mmap_sem locking into where we really need it, in lookup_dcookie(). The benefits are twofold: reduce mmap_sem hold times, and cleaner code. [akpm@linux-foundation.org: export get_mm_exe_file for arch/x86/oprofile/oprofile.ko] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Cc: Robert Richter <rric@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17gcov: fix softlockupsAndrey Ryabinin
gcov profiling if enabled with other heavy compile-time instrumentation like KASan could trigger following softlockups: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:1] Modules linked in: irq event stamp: 22823276 hardirqs last enabled at (22823275): [<ffffffff86e8d10d>] mutex_lock_nested+0x7d9/0x930 hardirqs last disabled at (22823276): [<ffffffff86e9521d>] apic_timer_interrupt+0x6d/0x80 softirqs last enabled at (22823172): [<ffffffff811ed969>] __do_softirq+0x4db/0x729 softirqs last disabled at (22823167): [<ffffffff811edfcf>] irq_exit+0x7d/0x15b CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.19.0-05245-gbb33326-dirty #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014 task: ffff88006cba8000 ti: ffff88006cbb0000 task.ti: ffff88006cbb0000 RIP: kasan_mem_to_shadow+0x1e/0x1f Call Trace: strcmp+0x28/0x70 get_node_by_name+0x66/0x99 gcov_event+0x4f/0x69e gcov_enable_events+0x54/0x7b gcov_fs_init+0xf8/0x134 do_one_initcall+0x1b2/0x288 kernel_init_freeable+0x467/0x580 kernel_init+0x15/0x18b ret_from_fork+0x7c/0xb0 Kernel panic - not syncing: softlockup: hung tasks Fix this by sticking cond_resched() in gcov_enable_events(). Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-04-17kernel/sysctl.c: detect overflows when converting to intHeinrich Schuchardt
When converting unsigned long to int overflows may occur. These currently are not detected when writing to the sysctl file system. E.g. on a system where int has 32 bits and long has 64 bits echo 0x800001234 > /proc/sys/kernel/threads-max has the same effect as echo 0x1234 > /proc/sys/kernel/threads-max The patch adds the missing check in do_proc_dointvec_conv. With the patch an overflow will result in an error EINVAL when writing to the the sysctl file system. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>