Commit 96dfc17e authored Aug 08, 2018 by Adhemerval Zanella Committed by George Fang Oct 10, 2018

Use double-precision routines from arm-optimized-routines

This patch uses exp, exp2, log, log2, and pow from
arm-optimized-routines.  For pow on x86_64, although it is slightly
slower, it simplifies the code required in both bionic and
arm-optimized-routines (so there is no need to select and export
the symbol based on architecture).

Performance-wise the improvements are:

  x86_64    throughput    latency
  exp            1.16x      1.16x
  log            1.08x      0.95x
  exp2           1.27x      1.55x
  log2           1.40x      1.43x
  pow            0.77x      0.89x *

  * I tried to check whether AVX2/FMA would help, but without success.

  aarch64   throughput    latency
  exp            2.33x      2.16x
  exp2           1.99x      1.50x
  log            1.79x      1.43x
  log2           2.15x      1.80x
  pow            3.81x      3.07x
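
The commit does not say how these numbers were measured. As a rough
sketch only, the throughput/latency distinction could be reproduced
along the following lines. All names here (math_fn, bench_throughput,
bench_latency, speedup) are hypothetical and are not part of bionic or
arm-optimized-routines. The ratio is assumed to be old time divided by
new time, so values above 1.00x are improvements, which is consistent
with pow at 0.77x being described as slower.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical type for a double-precision routine such as exp or pow
 * with a fixed second argument. */
typedef double (*math_fn)(double);

/* CPU time in nanoseconds, using only the standard clock() call. */
static double now_ns(void) {
    return (double)clock() * (1e9 / (double)CLOCKS_PER_SEC);
}

/* Throughput: every call is independent of the previous one, so the
 * CPU is free to overlap consecutive calls. Returns ns per call. */
double bench_throughput(math_fn fn, const double *in, size_t n) {
    volatile double sink = 0.0;  /* keeps the calls from being dropped */
    double t0 = now_ns();
    for (size_t i = 0; i < n; i++)
        sink = fn(in[i]);
    double t1 = now_ns();
    (void)sink;
    return (t1 - t0) / (double)n;
}

/* Latency: each input depends on the previous result, so calls cannot
 * overlap. The volatile zero stops the compiler from folding the
 * dependency away; this assumes fn returns finite values. */
double bench_latency(math_fn fn, const double *in, size_t n) {
    volatile double zero = 0.0;
    double prev = 0.0;
    double t0 = now_ns();
    for (size_t i = 0; i < n; i++)
        prev = fn(in[i] + zero * prev);
    double t1 = now_ns();
    volatile double sink = prev;
    (void)sink;
    return (t1 - t0) / (double)n;
}

/* Assumed definition of the "Nx" figures: old time over new time. */
double speedup(double old_ns, double new_ns) {
    return old_ns / new_ns;
}
```

To compare two implementations, one would time each with the same
input array (for example bench_throughput(exp, inputs, n) after
including <math.h> and linking with -lm) and pass the two per-call
times to speedup().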

Test: ran bionic tests in static mode.
CRs-Fixed: 2328588
Change-Id: Ib16bf3280c5329fd257a3b3f0b6c4f2f3cb34deb
(cherry picked from commit f6b101d3)

parent e9e90eb4
