Commit 216714e4 authored by Lynn Boger's avatar Lynn Boger
Browse files

math/big: improve performance of mulAddVWW on ppc64x

This changes the assembly implementation on ppc64x
to improve performance by reordering some instructions.
It also eliminates an unnecessary move by changing an
ADDZE to use the correct target register.

Improvement on power9:

MulAddVWW/1         6.89ns ± 0%    7.30ns ± 0%   +5.95%  (p=1.000 n=1+1)
MulAddVWW/2         8.04ns ± 0%    8.06ns ± 0%   +0.25%  (p=1.000 n=1+1)
MulAddVWW/3         9.39ns ± 0%    9.39ns ± 0%     ~     (all equal)
MulAddVWW/4         9.76ns ± 0%    9.48ns ± 0%   -2.87%  (p=1.000 n=1+1)
MulAddVWW/5         10.5ns ± 0%    10.3ns ± 0%   -1.90%  (p=1.000 n=1+1)
MulAddVWW/10        15.4ns ± 0%    14.9ns ± 0%   -3.25%  (p=1.000 n=1+1)
MulAddVWW/100        149ns ± 0%     125ns ± 0%  -16.11%  (p=1.000 n=1+1)
MulAddVWW/1000      1.42µs ± 0%    1.28µs ± 0%   -9.74%  (p=1.000 n=1+1)
MulAddVWW/10000     14.2µs ± 0%    12.8µs ± 0%   -9.73%  (p=1.000 n=1+1)
MulAddVWW/100000     144µs ± 0%     129µs ± 0%  -10.10%  (p=1.000 n=1+1)

Change-Id: I0ae7002a69783ca19d7a4e3e42042ae75dc60069
Reviewed-on: https://go-review.googlesource.com/c/go/+/248721


Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: default avatarCarlos Eduardo Seo <cseo@linux.vnet.ibm.com>
Reviewed-by: default avatarPaul Murphy <murp@ibm.com>
parent a4017185
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please to comment