ANDROID: sched: Handle blocked-waiter migration (and return migration)
Add logic to handle migrating a blocked waiter to a remote
cpu where the lock owner is runnable.
Additionally, as the blocked task may not be able to run
on the remote cpu, add logic to handle return migration once
the waiting task is given the mutex.
Because tasks may get migrated to where they cannot run, also
modify the scheduling classes to avoid sched class migrations on
mutex blocked tasks, leaving find_proxy_task() and related logic
to do the migrations and return migrations.
This was split out from the larger proxy patch, and
significantly reworked.
Credits for the original patch go to:
Peter Zijlstra (Intel) <peterz@infradead.org>
Juri Lelli <juri.lelli@redhat.com>
Valentin Schneider <valentin.schneider@arm.com>
Connor O'Brien <connoro@google.com>
NOTE: With this patch I've hit a few cases where we seem to miss
a BO_WAKING->BO_RUNNING transition (and return migration) that
I'd expect to happen in ttwu(). So I have logic in
find_proxy_task() to detect and to handle the return migration
later. However I'm quite not happy with that as it shouldn't be
necessary, and am still trying to understand where I'm losing
the wakeup & return migration.
Cc: Joel Fernandes <joelaf@google.com>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Zimuzo Ezeozue <zezeozue@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: kernel-team@android.com
Change-Id: I689ba1a5611acbe87fcb41e883d3d8bf400a3db5
Signed-off-by:
John Stultz <jstultz@google.com>
Bug: 306081722
---
v6:
* Integrated sched_proxy_exec() check in proxy_return_migration()
* Minor cleanups to diff
* Unpin the rq before calling __balance_callbacks()
* Tweak proxy migrate to migrate deeper task in chain, to avoid
tasks pingponging between rqs
v7:
* Fixup for unused function arguments
* Switch from that_rq -> target_rq, other minor tweaks, and typo
fixes suggested by Metin Kaya
* Switch back to doing return migration in the ttwu path, which
avoids nasty lock juggling and performance issues
* Fixes for UP builds
v8:
* More simplifications from Metin Kaya
* Fixes for null owner case, including doing return migration
* Cleanup proxy_needs_return logic
v9:
* Narrow logic in ttwu that sets BO_RUNNABLE, to avoid missed
return migrations
* Switch to using zap_balance_callbacks rathern then running
them when we are dropping rq locks for proxy_migration.
* Drop task_is_blocked check in sched_submit_work as suggested
by Metin (may re-add later if this causes trouble)
* Do return migration when we're not on wake_cpu. This avoids
bad task placement caused by proxy migrations raised by
Xuewen Yan
* Fix to call set_next_task(rq->curr) prior to dropping rq lock
to avoid rq->curr getting migrated before we have actually
switched from it
* Cleanup to re-use proxy_resched_idle() instead of open coding
it in proxy_migrate_task()
* Fix return migration not to use DEQUEUE_SLEEP, so that we
properly see the task as task_on_rq_migrating() after it is
dequeued but before set_task_cpu() has been called on it
* Fix to broaden find_proxy_task() checks to avoid race where
a task is dequeued off the rq due to return migration, but
set_task_cpu() and the enqueue on another rq happened after
we checked task_cpu(owner). This ensures we don't proxy
using a task that is not actually on our runqueue.
* Cleanup to avoid the locked BO_WAKING->BO_RUNNABLE transition
in try_to_wake_up() if proxy execution isn't enabled.
* Cleanup to improve comment in proxy_migrate_task() explaining
the set_next_task(rq->curr) logic
* Cleanup deadline.c change to stylistically match rt.c change
* Numerous cleanups suggested by Metin
v10:
* Drop WARN_ON(task_is_blocked(p)) in ttwu current case
v11:
* Include proxy_set_task_cpu from later in the series to this
change so we can use it, rather then reworking logic later
in the series.
* Fix problem with return migration, where affinity was changed
and wake_cpu was left outside the affinity mask.
* Avoid reading the owner's cpu twice (as it might change inbetween)
to avoid occasional migration-to-same-cpu edge cases
* Add extra WARN_ON checks for wake_cpu and return migration
edge cases.
* Typo fix from Metin
v13:
* As we set ret, return it, not just NULL (pulling this change
in from later patch)
* Avoid deadlock between try_to_wake_up() and find_proxy_task() when
blocked_on cycle with ww_mutex is trying a mid-chain wakeup.
* Tweaks to use new __set_blocked_on_runnable() helper
* Potential fix for incorrectly updated task->dl_server issues
* Minor comment improvements
* Add logic to handle missed wakeups, in that case doing return
migration from the find_proxy_task() path
* Minor cleanups
v14:
* Improve edge cases where we wouldn't set the task as BO_RUNNABLE
Loading
Please sign in to comment