Commit 19dd4101 authored by Thomas Gleixner, committed by Treehugger Robot

UPSTREAM: x86/smp: Cure kexec() vs. mwait_play_dead() breakage



commit d7893093 upstream.

TLDR: It's a mess.

When kexec() is executed on a system with offline CPUs which are parked in
mwait_play_dead(), it can end up in a triple fault during the bootup of the
kexec kernel or cause hard-to-diagnose data corruption.

The reason is that kexec() eventually overwrites the previous kernel's text,
page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernel's data.

Cure this by bringing the offlined CPUs out of MWAIT into HLT.

Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offlined CPUs to issue
HLT, which does not have the MWAIT problem.
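
The handshake described above can be modelled in plain user-space C. This is
only a rough sketch and not the kernel code: struct parked_cpu, the CTRL_*
values and all function names are hypothetical, and the MONITOR/MWAIT wakeup
is simulated by calling parked_cpu_wake() directly after the store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control-word values; the real kernel constants may differ. */
#define CTRL_MWAIT_WAIT 0xDEADBEEFu
#define CTRL_KEXEC_HLT  0x4A17DEADu

enum park_state { PARKED_IN_MWAIT, PARKED_IN_HLT };

/* Model of one offlined CPU and the cache line it monitors. */
struct parked_cpu {
    uint32_t control;      /* stands in for the monitored cache line */
    uint32_t status;       /* acknowledgement written by the parked CPU */
    enum park_state state;
};

/* Offline path: arm the monitor and park the CPU in MWAIT. */
static void park_cpu(struct parked_cpu *cpu)
{
    assert(cpu != NULL);
    cpu->control = CTRL_MWAIT_WAIT;
    cpu->status = 0;
    cpu->state = PARKED_IN_MWAIT;
}

/* What the parked CPU does when a write to the monitored line wakes it. */
static void parked_cpu_wake(struct parked_cpu *cpu)
{
    if (cpu->state != PARKED_IN_MWAIT)
        return;
    if (cpu->control == CTRL_KEXEC_HLT) {
        cpu->status = CTRL_KEXEC_HLT;   /* acknowledge the request */
        cpu->state = PARKED_IN_HLT;     /* a real CPU would loop on HLT here */
    }
    /* Any other value is treated as a spurious wakeup: stay in MWAIT. */
}

/*
 * kexec path: write the control word of every CPU still parked in MWAIT.
 * Returns the number of CPUs that acknowledged the move to HLT.
 */
static size_t kick_parked_cpus(struct parked_cpu *cpus, size_t n)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        if (cpus[i].state != PARKED_IN_MWAIT)
            continue;
        cpus[i].control = CTRL_KEXEC_HLT;
        parked_cpu_wake(&cpus[i]);      /* on hardware the store itself wakes the CPU */
        if (cpus[i].status == CTRL_KEXEC_HLT)
            moved++;
    }
    return moved;
}
```

In this model a stray write with any other value leaves the CPU parked in
MWAIT, and only the dedicated control word moves it to HLT, so the kexec path
can tell from the status field which CPUs have left MWAIT.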

That does not help if a stray NMI, MCE or SMI hits the offlined CPUs, as
those make them come out of HLT.

A follow up change will put them into INIT, which protects at least against
NMI and SMI.

Fixes: ea530692 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230615193330.492257119@linutronix.de


Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Change-Id: I80035e671b55732ac3d56c71dc53364e82238fe2
(cherry-picked from commit 0af4750e)
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
parent 26260c4b