Commit 8fa99863, authored by John Stultz, committed by Treehugger Robot

ANDROID: sched: Reapply reverted portions of "sched/core: Prevent race condition between cpuset and __sched_setscheduler()"

This reverts commit 44ee6786

The original change, commit 710da3c8 ("sched/core: Prevent race
condition between cpuset and __sched_setscheduler()") added potential
rwsem locking inside __sched_setscheduler() and moved the call
to __sched_setscheduler() out of the rcu read lock section in
do_sched_setscheduler(). However, there was a complication with
binder calling sched_setscheduler_nocheck() while holding the node
spin lock as well as potentially the thread->prio_lock.

So in commit 44ee6786 this was reverted in the Android tree,
undoing the rwsem additions and moving __sched_setscheduler() back
under the rcu read lock.

Later, upstream commit 111cd11b ("sched/cpuset: Bring back
cpuset_mutex"), backported via 6.1-stable as commit 9bcfe152,
reverted the original rwsem locking in __sched_setscheduler(),
replacing it with mutex locking used only in the SCHED_DEADLINE case.

This resulted in the Android tree having do_sched_setscheduler()
code paths take an rcu_read_lock() and then eventually call into a
mutex_lock(), triggering the following warning:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:293
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 13352, name: <test>
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
Call trace:
 dump_backtrace+0xf8/0x148
 show_stack+0x18/0x24
 dump_stack_lvl+0x60/0x7c
 dump_stack+0x18/0x38
 __might_resched+0x1f0/0x2e8
 __might_sleep+0x48/0x7c
 mutex_lock+0x24/0xfc
 cpuset_lock+0x18/0x28
 __sched_setscheduler+0x2ec/0xb38
 do_sched_setscheduler+0x180/0x1fc
 __arm64_sys_sched_setscheduler+0x20/0x3c
 invoke_syscall+0x58/0x118
 el0_svc_common+0xb4/0xf4
 do_el0_svc+0x24/0x80
 el0_svc+0x2c/0x90
 el0t_64_sync_handler+0x68/0xb4
 el0t_64_sync+0x1a4/0x1a8

In the android-mainline tree, it was noted that the original issue with
binder had been resolved in 6.5-rc1, so the original revert was
undone by commit 4fb867ee ("Revert "Revert "sched/core: Prevent
race condition between cpuset and __sched_setscheduler()"").

However, binder is still calling sched_setscheduler_nocheck()
potentially holding spinlocks (see: b/275379975), but as we don't
see major issues (as __sched_setscheduler already may *currently*
sleep), it seems there may be logical restrictions that prevent it
from actually occurring (seemingly due to binder not running as
deadline).

The binder call path however does not use do_sched_setscheduler(),
so revert the remaining portion of commit 44ee6786 ("Revert "sched/core:
Prevent race condition between cpuset and __sched_setscheduler()""),
moving the call to __sched_setscheduler() outside the rcu critical
section. This will address the reported issue above, while not changing
the current situation with binder calling __sched_setscheduler().
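
As a rough illustration of why moving the call matters, here is a
userspace sketch, not the kernel code. Every model_* name is a
hypothetical stand-in for the kernel primitive it is named after, and
the comments about pinning the task with get_task_struct() and
put_task_struct() are an assumption about how the lookup stays safe
once the RCU section ends. The sketch counts a violation whenever a
sleeping lock is taken at a nonzero RCU nest depth, mirroring the
"RCU nest depth: 1, expected: 0" line in the warning above.

```c
#include <assert.h>

/* Userspace stand-ins for kernel primitives; hypothetical, illustration only. */
static int rcu_depth;        /* models "RCU nest depth" from the warning */
static int sleep_violations; /* sleeping calls made inside an RCU section */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { rcu_depth--; }

/* Models __might_sleep(): sleeping is invalid while rcu_depth != 0. */
static void model_might_sleep(void)
{
	if (rcu_depth != 0)
		sleep_violations++;
}

/* Models cpuset_lock(), which takes a mutex and therefore may sleep. */
static void model_cpuset_lock(void)
{
	model_might_sleep();
}

/* Models __sched_setscheduler() on a path that takes cpuset_lock(). */
static int model_sched_setscheduler(void)
{
	model_cpuset_lock();
	return 0;
}

/* Before this change: the call is made inside the RCU read-side section. */
static int setscheduler_inside_rcu(void)
{
	int ret;

	model_rcu_read_lock();
	ret = model_sched_setscheduler();
	model_rcu_read_unlock();
	return ret;
}

/* After this change: only the task lookup happens under RCU. */
static int setscheduler_outside_rcu(void)
{
	int ret;

	model_rcu_read_lock();
	/* Look up the task and pin it (assumed: get_task_struct()). */
	model_rcu_read_unlock();

	ret = model_sched_setscheduler();
	/* Drop the reference (assumed: put_task_struct()). */
	return ret;
}

/* Runs fn once from a clean state and returns the violations it caused. */
static int count_violations(int (*fn)(void))
{
	rcu_depth = 0;
	sleep_violations = 0;
	fn();
	return sleep_violations;
}
```

Calling count_violations(setscheduler_inside_rcu) records one violation,
while count_violations(setscheduler_outside_rcu) records none.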

Bug: 408888661
Bug: 414473894
Fixes: 44ee6786 ("Revert "sched/core: Prevent race condition between cpuset and __sched_setscheduler()"")
Change-Id: Ibebf364586cc3dda3993e7d685b5fee3566ec806
Signed-off-by: John Stultz <jstultz@google.com>
(cherry picked from commit 6bd3b482)
parent 4cadbfbb