ANDROID: sched: Reapply reverted portions of "sched/core: Prevent race condition between cpuset and __sched_setscheduler()"

This reverts commit 44ee6786.

The original change, commit 710da3c8 ("sched/core: Prevent race
condition between cpuset and __sched_setscheduler()"), added potential
rwsem locking inside __sched_setscheduler() and moved the call to
__sched_setscheduler() out of the rcu read lock section in
do_sched_setscheduler().

However, there was a complication with binder calling
sched_setscheduler_nocheck() while holding the node spin lock as well
as potentially the thread->prio_lock. So in commit 44ee6786 this was
reverted in the Android tree, undoing the rwsem additions and moving
__sched_setscheduler() back under the rcu read lock.

Later, upstream commit 111cd11b ("sched/cpuset: Bring back
cpuset_mutex"), backported via 6.1-stable as commit 9bcfe152, replaced
the original rwsem locking in __sched_setscheduler() with mutex
locking, used only in the SCHED_DEADLINE case.
This resulted in the android tree having do_sched_setscheduler() code
paths take an rcu_read_lock() and then eventually call into a
mutex_lock(), triggering the following warning:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:293
  in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 13352, name: <test>
  preempt_count: 0, expected: 0
  RCU nest depth: 1, expected: 0
  Call trace:
   dump_backtrace+0xf8/0x148
   show_stack+0x18/0x24
   dump_stack_lvl+0x60/0x7c
   dump_stack+0x18/0x38
   __might_resched+0x1f0/0x2e8
   __might_sleep+0x48/0x7c
   mutex_lock+0x24/0xfc
   cpuset_lock+0x18/0x28
   __sched_setscheduler+0x2ec/0xb38
   do_sched_setscheduler+0x180/0x1fc
   __arm64_sys_sched_setscheduler+0x20/0x3c
   invoke_syscall+0x58/0x118
   el0_svc_common+0xb4/0xf4
   do_el0_svc+0x24/0x80
   el0_svc+0x2c/0x90
   el0t_64_sync_handler+0x68/0xb4
   el0t_64_sync+0x1a4/0x1a8

In the android-mainline tree, it was noted that the original issue
with binder had been resolved in 6.5-rc1, so the original revert was
undone by commit 4fb867ee ("Revert "Revert "sched/core: Prevent race
condition between cpuset and __sched_setscheduler()"").

However, binder is still calling sched_setscheduler_nocheck() while
potentially holding spinlocks (see: b/275379975). As we don't see
major issues (even though __sched_setscheduler() may already sleep
*currently*), it seems there may be logical restrictions that prevent
this from actually occurring (seemingly due to binder not running as
deadline).

The binder call path, however, does not use do_sched_setscheduler(),
so revert the remaining portion of commit 44ee6786 ("Revert
"sched/core: Prevent race condition between cpuset and
__sched_setscheduler()""), moving the call to __sched_setscheduler()
outside the rcu critical section. This will address the reported issue
above, while not changing the current situation with binder calling
__sched_setscheduler().
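The difference between the two call orderings can be sketched with a
small userspace model. This is an illustration only, not the kernel
code: every model_* helper, struct task, and both set_* functions are
hypothetical stand-ins, and the model takes the cpuset lock
unconditionally, whereas the description above says the real code only
does so in the SCHED_DEADLINE case. It assumes the fixed path pins the
task with a reference before leaving the rcu read section.

```c
#include <assert.h>

static int rcu_depth;        /* models "RCU nest depth" in the warning */
static int sleep_violations; /* counts modelled "sleeping function" BUGs */

struct task { int refcount; int policy; };   /* hypothetical stand-in */

static void model_rcu_read_lock(void)
{
	rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Models mutex_lock(): it may sleep, which is a bug inside an rcu read section. */
static void model_cpuset_lock(void)
{
	if (rcu_depth > 0)
		sleep_violations++;
}

/* Models __sched_setscheduler() reaching cpuset_lock(). */
static int model_setscheduler(struct task *p, int policy)
{
	model_cpuset_lock();
	p->policy = policy;
	return 0;
}

/* Ordering left behind by the Android revert: call made under rcu. */
static int set_under_rcu(struct task *p, int policy)
{
	int ret;

	model_rcu_read_lock();
	ret = model_setscheduler(p, policy);
	model_rcu_read_unlock();
	return ret;
}

/* Ordering restored here: pin the task, leave rcu, then make the call. */
static int set_outside_rcu(struct task *p, int policy)
{
	int ret;

	model_rcu_read_lock();
	p->refcount++;			/* models taking a task reference */
	model_rcu_read_unlock();

	ret = model_setscheduler(p, policy);
	p->refcount--;			/* models dropping the reference */
	return ret;
}
```

In this model, set_under_rcu() increments sleep_violations because the
lock is taken at rcu depth 1, matching the "RCU nest depth: 1,
expected: 0" line of the warning, while set_outside_rcu() leaves the
counter unchanged.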
Bug: 408888661
Fixes: 44ee6786 ("Revert "sched/core: Prevent race condition between cpuset and __sched_setscheduler()"")
Change-Id: Ibebf364586cc3dda3993e7d685b5fee3566ec806
Signed-off-by: John Stultz <jstultz@google.com>