Commit 687f21a0 authored by Zhu Kaiqian

BACKPORT: FROMGIT: hrtimers: Force migrate away hrtimers queued after CPUHP_AP_HRTIMERS_DYING



hrtimers are migrated away from the dying CPU to any online target at
the CPUHP_AP_HRTIMERS_DYING stage in order not to delay bandwidth timers
handling tasks involved in the CPU hotplug forward progress.

However wakeups can still be performed by the outgoing CPU after
CPUHP_AP_HRTIMERS_DYING. Those can result again in bandwidth timers being
armed. Depending on several considerations (crystal ball power management
based election, earliest timer already enqueued, timer migration enabled or
not), the target may eventually be the current CPU even if offline. If that
happens, the timer is eventually ignored.

The most notable example is RCU which had to deal with each and every one of
those wake-ups by deferring them to an online CPU, along with related
workarounds:

_ e787644c (rcu: Defer RCU kthreads wakeup when CPU is dying)
_ 9139f932 (rcu/nocb: Fix RT throttling hrtimer armed from offline CPU)
_ f7345ccc (rcu/nocb: Fix rcuog wake-up from offline softirq)

The problem isn't confined to RCU though as the stop machine kthread
(which runs CPUHP_AP_HRTIMERS_DYING) reports its completion at the end
of its work through cpu_stop_signal_done() and performs a wake up that
eventually arms the deadline server timer:

   WARNING: CPU: 94 PID: 588 at kernel/time/hrtimer.c:1086 hrtimer_start_range_ns+0x289/0x2d0
   CPU: 94 UID: 0 PID: 588 Comm: migration/94 Not tainted
   Stopper: multi_cpu_stop+0x0/0x120 <- stop_machine_cpuslocked+0x66/0xc0
   RIP: 0010:hrtimer_start_range_ns+0x289/0x2d0
   Call Trace:
   <TASK>
     start_dl_timer
     enqueue_dl_entity
     dl_server_start
     enqueue_task_fair
     enqueue_task
     ttwu_do_activate
     try_to_wake_up
     complete
     cpu_stopper_thread

Instead of providing yet another bandaid to work around the situation, fix
it in the hrtimers infrastructure instead: always migrate away a timer to
an online target whenever it is enqueued from an offline CPU.

This will also allow reverting all of the above disgraceful RCU hacks.
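
The rule described above can be illustrated with a small userspace model. This
is only a sketch, not the kernel change itself: struct cpu_base, bases[],
NR_CPUS, init_bases(), first_online_base(), cpu_timers_dying(),
pick_target_base() and enqueue_timer() are all hypothetical names invented for
the illustration, and picking the first online CPU is an assumption made to
keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical size for this model */

/* Simplified stand-in for a per-CPU timer base (hypothetical layout). */
struct cpu_base {
	int cpu;
	bool online;	/* cleared once the CPU has passed its "timers dying" step */
	int nr_timers;	/* number of timers currently queued on this CPU */
};

static struct cpu_base bases[NR_CPUS];

/* Bring every modelled CPU online with an empty timer queue. */
static void init_bases(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		bases[i].cpu = i;
		bases[i].online = true;
		bases[i].nr_timers = 0;
	}
}

/* Return the first base still marked online; at least one must exist. */
static struct cpu_base *first_online_base(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		if (bases[i].online)
			return &bases[i];
	}
	assert(!"no online CPU left");
	return NULL;
}

/* Model of the dying step: mark the base offline, move its timers away. */
static void cpu_timers_dying(int cpu)
{
	struct cpu_base *old = &bases[cpu];
	struct cpu_base *target;

	old->online = false;
	target = first_online_base();
	target->nr_timers += old->nr_timers;
	old->nr_timers = 0;
}

/* The rule itself: an offline base is never chosen as a target. */
static struct cpu_base *pick_target_base(struct cpu_base *this_base)
{
	if (!this_base->online)
		return first_online_base();
	return this_base;
}

/* Queue one timer from this_cpu; return the CPU that ends up holding it. */
static int enqueue_timer(int this_cpu)
{
	struct cpu_base *target = pick_target_base(&bases[this_cpu]);

	target->nr_timers++;
	return target->cpu;
}
```

In this model, a timer enqueued from CPU 2 before cpu_timers_dying(2) stays on
CPU 2, while one enqueued afterwards lands on CPU 0. The model leaves out
reprogramming the target CPU's timer hardware, which a real implementation
would also have to handle.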

Fixes: 5c0930cc ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Reported-by: Vlad Poenaru <vlad.wing@gmail.com>
Reported-by: Usama Arif <usamaarif642@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Bug: 392352128
(cherry picked from commit 53dac345 https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/urgent)
[kaiqian: Resolved the symbol declaration of hrtimer_cpu_base struct in hrtimer.h]
Change-Id: I3f6104ec1b67c8929edd81c7f2471a202c09493c
Signed-off-by: Zhu Kaiqian <zhukaiqian@xiaomi.com>
parent 1d28a506