From 2800486ee34825d954f64c6f98037daea328f121 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra
Date: Thu, 7 Sep 2017 17:03:50 +0200
Subject: sched/fair: Avoid newidle balance for !active CPUs

On CPU hot unplug, when parking the last kthread we'll try and
schedule into idle to kill the CPU. This last schedule can (and does)
trigger newidle balance because at this point the sched domains are
still up because of commit:

  77d1dfda0e79 ("sched/topology, cpuset: Avoid spurious/wrong domain rebuilds")

Obviously pulling tasks to an already offline CPU is a bad idea, and
all balancing operations _should_ be subject to cpu_active_mask, make
it so.

Reported-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Fixes: 77d1dfda0e79 ("sched/topology, cpuset: Avoid spurious/wrong domain rebuilds")
Link: http://lkml.kernel.org/r/20170907150613.994135806@infradead.org
Signed-off-by: Ingo Molnar
---
 kernel/sched/fair.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8415d1ec2b84..3bcea40d3029 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8447,6 +8447,12 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 	 */
 	this_rq->idle_stamp = rq_clock(this_rq);
 
+	/*
+	 * Do not pull tasks towards !active CPUs...
+	 */
+	if (!cpu_active(this_cpu))
+		return 0;
+
 	/*
 	 * This is OK, because current is on_cpu, which avoids it being picked
 	 * for load-balance and preemption/IRQs are still disabled avoiding
--
cgit v1.2.3-70-g09d2

From edd8e41d2e3cbd6ebe13ead30eb1adc6f48cbb33 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra
Date: Thu, 7 Sep 2017 17:03:51 +0200
Subject: sched/fair: Plug hole between hotplug and active_load_balance()

The load balancer applies cpu_active_mask to whatever sched_domains it
finds, however in the case of active_balance there is a hole between
setting rq->{active_balance,push_cpu} and running the stop_machine work
doing the actual migration.

The @push_cpu can go offline in this window, which would result in us
moving a task onto a dead cpu, which is a fairly bad thing.

Double check the active mask before the stop work does the migration.

	CPU0					CPU1

	stop_machine(takedown_cpu)
	load_balance()				cpu_stopper_thread()
	...					  work = multi_cpu_stop
	  stop_one_cpu_nowait(			    /* wait for CPU0 */
	    .func = active_load_balance_cpu_stop
	  );

	cpu_stopper_thread()
	  work = multi_cpu_stop
	    /* sync with CPU1 */
	  take_cpu_down()
	    play_dead();

						work = active_load_balance_cpu_stop
						  set_task_cpu(p, CPU1); /* oops!! */

Reported-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20170907150614.044460912@infradead.org
Signed-off-by: Ingo Molnar
---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3bcea40d3029..efeebed935ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8560,6 +8560,13 @@ static int active_load_balance_cpu_stop(void *data)
 	struct rq_flags rf;
 
 	rq_lock_irq(busiest_rq, &rf);
+	/*
+	 * Between queueing the stop-work and running it is a hole in which
+	 * CPUs can become inactive. We should not move tasks from or to
+	 * inactive CPUs.
+	 */
+	if (!cpu_active(busiest_cpu) || !cpu_active(target_cpu))
+		goto out_unlock;
 
 	/* make sure the requested cpu hasn't gone down in the meantime */
 	if (unlikely(busiest_cpu != smp_processor_id() ||
--
cgit v1.2.3-70-g09d2
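
Both patches apply the same idea: a decision made before a hotplug step must be
re-validated against cpu_active() at the point where the migration actually
happens, because the CPU can go inactive in the window in between. The
standalone user-space sketch below is not part of either patch and uses no
kernel APIs; fake_cpu, stop_work() and hot_unplug() are made-up names used only
to illustrate the "double check under the lock inside the deferred work"
pattern with plain pthreads. Build with: cc -pthread sketch.c

/*
 * Illustrative analogue only: a "stop work" is queued towards a "CPU" that a
 * concurrent "hot unplug" may deactivate before the work runs, so the work
 * re-checks the active flag under the lock before doing its "migration".
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_cpu {
	pthread_mutex_t lock;
	bool active;		/* analogue of cpu_active() */
	int nr_tasks;
};

static struct fake_cpu cpus[2] = {
	{ PTHREAD_MUTEX_INITIALIZER, true, 4 },
	{ PTHREAD_MUTEX_INITIALIZER, true, 0 },
};

/* Analogue of active_load_balance_cpu_stop(): runs some time after queueing. */
static void *stop_work(void *arg)
{
	int dst = (int)(long)arg;

	pthread_mutex_lock(&cpus[dst].lock);
	/*
	 * Between queueing this work and running it there is a hole in which
	 * the destination can become inactive; bail out instead of moving a
	 * task onto a dead "CPU".
	 */
	if (!cpus[dst].active) {
		pthread_mutex_unlock(&cpus[dst].lock);
		return NULL;
	}
	cpus[dst].nr_tasks++;	/* the actual "migration" */
	pthread_mutex_unlock(&cpus[dst].lock);
	return NULL;
}

/* Analogue of takedown_cpu(): may run between queueing and stop_work(). */
static void *hot_unplug(void *arg)
{
	int cpu = (int)(long)arg;

	pthread_mutex_lock(&cpus[cpu].lock);
	cpus[cpu].active = false;
	pthread_mutex_unlock(&cpus[cpu].lock);
	return NULL;
}

int main(void)
{
	pthread_t balancer, unplug;

	/* Queue the "stop work" towards cpu 1, then race an unplug of cpu 1. */
	pthread_create(&balancer, NULL, stop_work, (void *)1L);
	pthread_create(&unplug, NULL, hot_unplug, (void *)1L);
	pthread_join(balancer, NULL);
	pthread_join(unplug, NULL);

	printf("cpu1 active=%d nr_tasks=%d\n", cpus[1].active, cpus[1].nr_tasks);
	return 0;
}

Whichever thread wins the race, a task is never placed on a deactivated "CPU",
which is the property the two kernel patches above restore for newidle balance
and active_load_balance_cpu_stop().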