Commit 9bad78b6, authored by Connor O'Brien, committed by John Stultz

ANDROID: sched: Fix rt/dl load balancing via chain level balance

RT/DL balancing is supposed to guarantee that, with N CPUs available and
CPU affinity permitting, the top N RT/DL tasks are spread across the
CPUs and all get to run. Proxy exec greatly complicates this, as
blocked tasks remain on the rq but cannot be usefully migrated away
from their lock-owning tasks. This has two major consequences:
1. In order to get the desired properties, we need to migrate a blocked
   task, its would-be proxy, and everything in between, all together -
   i.e., we need to push/pull "blocked chains" rather than individual
   tasks.
2. Tasks that are part of rq->curr's "blocked tree" therefore should
   not be pushed or pulled. Options for enforcing this seem to include:
   a) Create some sort of complex data structure for tracking
      pushability, updating it whenever the blocked tree for rq->curr
      changes (e.g. on mutex handoffs, migrations, etc.) as well as on
      context switches.
   b) Give up on O(1) pushability checks, and search through the
      pushable list on every push/pull until we find a pushable "chain".
   c) Extend option "b" with some sort of caching to avoid repeated
      work.

For the sake of simplicity & separating the "chain level balancing"
concerns from complicated optimizations, this patch focuses on trying
to implement option "b" correctly. This can then hopefully provide a
baseline for "correct load balancing behavior" that optimizations can
try to implement more efficiently.
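
As a rough illustration of option "b", here is a minimal userspace
sketch. It is not the code in this patch: struct toy_task and the
helpers chain_owner(), chain_is_pushable(), pick_pushable_chain() and
push_chain() are hypothetical names, locking is omitted, and every task
in a chain is assumed to sit on the same CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the kernel's types. */
struct toy_task {
	int prio;                       /* lower value = higher priority */
	int cpu;                        /* CPU whose runqueue holds the task */
	struct toy_task *blocked_on;    /* owner of the lock we wait on, or NULL */
	struct toy_task *next_pushable; /* next entry in the pushable list */
};

/* Follow blocked_on links to the runnable task at the end of the chain. */
static struct toy_task *chain_owner(struct toy_task *p)
{
	while (p->blocked_on)
		p = p->blocked_on;
	return p;
}

/* A chain ending in the running task belongs to curr's blocked tree. */
static bool chain_is_pushable(struct toy_task *p, const struct toy_task *curr)
{
	return chain_owner(p) != curr;
}

/* Option "b": linear scan for the first task whose chain may be moved. */
static struct toy_task *pick_pushable_chain(struct toy_task *head,
					    const struct toy_task *curr)
{
	for (struct toy_task *p = head; p; p = p->next_pushable)
		if (chain_is_pushable(p, curr))
			return p;
	return NULL;
}

/*
 * Move p, its would-be proxy, and everything in between to dst_cpu.
 * Returns the number of tasks moved.
 */
static int push_chain(struct toy_task *p, int dst_cpu)
{
	int moved = 0;

	for (; p; p = p->blocked_on) {
		p->cpu = dst_cpu;
		moved++;
	}
	return moved;
}
```

The scan costs O(list length x chain length) per push/pull, which is
the price option "b" accepts in exchange for not maintaining extra
pushability state.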

Note:
The inability to atomically check "is task enqueued on a specific rq"
creates 2 possible races when following a blocked chain:
- If we check task_rq() first on a task that is dequeued from its rq,
  it can be woken and enqueued on another rq before the call to
  task_on_rq_queued().
- If we call task_on_rq_queued() first on a task that is on another
  rq, it can be dequeued (since we don't hold its rq's lock) and then
  have its rq set to the current rq before we check task_rq().

Maybe there's a more elegant solution that would work, but for now,
just sandwich the task_rq() check between two task_on_rq_queued()
checks, all separated by smp_rmb() calls. Since we hold rq's lock,
the task can't be enqueued on or dequeued from rq, so neither race
should be possible.
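
The shape of that sandwich can be sketched in userspace C11 as below.
This is only a model under stated assumptions, not the kernel code:
struct model_rq, struct model_task and task_queued_on_rq() are
hypothetical names, an acquire fence stands in for smp_rmb(), and the
caller is assumed to hold the lock of the rq being tested.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's rq and task_struct. */
struct model_rq {
	int cpu;
};

struct model_task {
	_Atomic int on_rq;             /* nonzero while queued on some rq */
	_Atomic(struct model_rq *) rq; /* rq the task was last assigned to */
};

/*
 * Returns true only if p is observed queued, then observed on rq, then
 * observed queued again. The caller is assumed to hold rq's lock, so p
 * cannot be enqueued on or dequeued from rq while we look.
 */
static bool task_queued_on_rq(struct model_task *p, struct model_rq *rq)
{
	if (!atomic_load_explicit(&p->on_rq, memory_order_relaxed))
		return false;

	/* Stand-in for smp_rmb(): order the loads around the fence. */
	atomic_thread_fence(memory_order_acquire);

	if (atomic_load_explicit(&p->rq, memory_order_relaxed) != rq)
		return false;

	/* Second stand-in for smp_rmb(). */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&p->on_rq, memory_order_relaxed) != 0;
}
```

Either ordering of just two checks leaves one of the races above open;
bracketing the rq comparison with a queued check on both sides is meant
to close both.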

Extensive comments on various pitfalls, races, etc. included inline.

This patch was broken out from a larger chain migration
patch originally by Connor O'Brien.

Cc: Joel Fernandes <joelaf@google.com>
Cc: Qais Yousef <qyousef@layalina.io>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Zimuzo Ezeozue <zezeozue@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: kernel-team@android.com
Change-Id: I96fe29e8566512016bb044130fe2855bf79299e2
Signed-off-by: Connor O'Brien <connoro@google.com>
[jstultz: split out from larger chain migration patch,
 majorly refactored for runtime conditionalization]
Signed-off-by: John Stultz <jstultz@google.com>
Bug: 306081722
---
v7:
* Split out from larger chain-migration patch in earlier
  versions of this series
* Larger rework to allow proper conditionalization of the
  logic when running with CONFIG_SCHED_PROXY_EXEC
v8:
* Smallish logic simplifications suggested by Metin Kaya
* Rename push_task_chain -> do_push_task
* BUG_ONs converted to WARN_ONs
v9:
* Fix improper conditionalization that was causing trouble when
  running with sched_proxy_exec=false, as pointed out by Metin
* Re-justify comment in __rt_revalidate_rq_state(), suggested by
  Metin
* Improve loop logic in pick_next_pushable_dl_task to use a
  while loop instead of gotos, also suggested by Metin
* A few minor checkpatch fixups
v10:
* Name change from do_push_task to move_queued_task_locked
* Cleanups suggested by Metin
v11:
* Nit cleanups suggested by Metin
* Moved task_is_pushable tristate logic to here from earlier
  in the series, as suggested by Metin and others.
* More logic cleanups suggested by Metin
parent 493a9170