Use double-precision routines from arm-optimized-routines
This patch uses exp, exp2, log, log2, and pow from arm-optimized-routines.
For pow on x86_64, although it is slightly slower, using it simplifies the
code required on both bionic and arm-optimized-routines (so there is no
need to select and export the symbol based on architecture).

Performance-wise, the improvements are:

x86_64
          throughput   latency
exp       1.16x        1.16x
log       1.08x        0.95x
exp2      1.27x        1.55x
log2      1.40x        1.43x
pow       0.77x        0.89x   *

* I tried to check whether AVX2/FMA would help, but without success.

aarch64
          throughput   latency
exp       2.33x        2.16x
exp2      1.99x        1.50x
log       1.79x        1.43x
log2      2.15x        1.80x
pow       3.81x        3.07x

Test: ran bionic tests in static mode.

CRs-Fixed: 2328588
Change-Id: Ib16bf3280c5329fd257a3b3f0b6c4f2f3cb34deb
(cherry picked from commit f6b101d3)