Commit 082735fb authored by Shuai Xue's avatar Shuai Xue Committed by Greg Kroah-Hartman
Browse files

ACPI: APEI: send SIGBUS to current task if synchronous memory error not recovered



[ Upstream commit 79a5ae3c ]

If a synchronous error is detected as a result of user-space process
triggering a 2-bit uncorrected error, the CPU will take a synchronous
error exception such as Synchronous External Abort (SEA) on Arm64. The
kernel will queue a memory_failure() work which poisons the related
page, unmaps the page, and then sends a SIGBUS to the process, so that
a system wide panic can be avoided.

However, no memory_failure() work will be queued when abnormal
synchronous errors occur. These errors can include situations like
invalid PA, unexpected severity, no memory failure config support,
invalid GUID section, etc. In such a case, the user-space process will
trigger SEA again.  This loop can potentially exceed the platform
firmware threshold or even trigger a kernel hard lockup, leading to a
system reboot.

Fix it by performing a force kill if no memory_failure() work is queued
for synchronous errors.

Signed-off-by: default avatarShuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: default avatarJarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: default avatarJonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: default avatarYazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: default avatarJane Chu <jane.chu@oracle.com>
Reviewed-by: default avatarHanjun Guo <guohanjun@huawei.com>
Link: https://patch.msgid.link/20250714114212.31660-2-xueshuai@linux.alibaba.com


[ rjw: Changelog edits ]
Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
parent 84518ec7
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment