UPSTREAM: sched/psi: Optimize psi_group_change() cpu_clock() usage

Dietmar reported that commit 3840cbe2 ("sched: psi: fix bogus
pressure spikes from aggregation race") caused a regression for him on
a high context switch rate benchmark (schbench) due to the now
repeating cpu_clock() calls.

In particular the problem is that get_recent_times() will extrapolate
the current state to 'now'. But if an update uses a timestamp from
before the start of the update, it is possible to get two reads
with inconsistent results. It is effectively back-dating an update.

(note that this all hard-relies on the clock being synchronized across
CPUs -- if this is not the case, all bets are off).

Combine this problem with the fact that there are per-group-per-cpu
seqcounts, the commit in question pushed the clock read into the group
iteration, causing tree-depth cpu_clock() calls. On architectures
where cpu_clock() has appreciable overhead, this hurts.

Instead move to a per-cpu seqcount, which allows us to have a single
clock read for all group updates, increasing internal consistency and
lowering update overhead. This comes at the cost of a longer update
side (proportional to the tree depth) which can cause the read side to
retry more often.

Fixes: 3840cbe2 ("sched: psi: fix bogus pressure spikes from aggregation race")
Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Change-Id: I4f1537ed7233ba46ee28250237d48551ffcbd9d8
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/20250522084844.GC31726@noisy.programming.kicks-ass.net
(cherry picked from commit 570c8efd)
Bug: 437846539
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
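
For illustration only, the per-cpu seqcount idea described above can be
sketched in userspace C. This is not the kernel implementation: every name
here (cpu_state, group_cpu, read_clock(), cpu_change(), recent_total(),
fake_clock, MAX_DEPTH) is a hypothetical stand-in, the clock is faked, and
the example is single-threaded, so the memory barriers and racy-read
handling of a real seqcount are left out. It only shows one clock read and
one write section covering every group in the chain, with the reader
extrapolating to 'now'.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_DEPTH 4  /* hypothetical cgroup tree depth */

/* Hypothetical stand-in for the per-CPU state of one psi group. */
struct group_cpu {
    uint64_t state_start;  /* timestamp of the last state change */
    uint64_t total;        /* time accumulated while stalled */
    int stalled;           /* nonzero while the state is active */
};

/* One sequence counter per CPU, covering every group on that CPU. */
struct cpu_state {
    atomic_uint seq;
    struct group_cpu groups[MAX_DEPTH];
};

static uint64_t fake_clock;   /* hypothetical stand-in for cpu_clock() */
static unsigned clock_reads;  /* counts how often the clock was read */

static uint64_t read_clock(void)
{
    clock_reads++;
    return fake_clock;
}

/* Writer: a single clock read and a single write section for all groups. */
static void cpu_change(struct cpu_state *cs, int depth, int stalled)
{
    uint64_t now = read_clock();

    atomic_fetch_add(&cs->seq, 1);  /* odd: update in progress */
    for (int i = 0; i < depth; i++) {
        struct group_cpu *g = &cs->groups[i];

        if (g->stalled)
            g->total += now - g->state_start;
        g->state_start = now;
        g->stalled = stalled;
    }
    atomic_fetch_add(&cs->seq, 1);  /* even: update complete */
}

/* Reader: retry until the snapshot is stable, extrapolating to 'now'. */
static uint64_t recent_total(struct cpu_state *cs, int idx, uint64_t now)
{
    unsigned start;
    uint64_t total;

    do {
        start = atomic_load(&cs->seq);
        total = cs->groups[idx].total;
        if (cs->groups[idx].stalled)
            total += now - cs->groups[idx].state_start;
    } while ((start & 1) || atomic_load(&cs->seq) != start);

    return total;
}
```

In this sketch the write section grows with the depth passed to
cpu_change(), which mirrors the trade-off noted above: fewer clock reads on
the update side, but a longer window in which a concurrent reader would
have to retry.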