Commit 7b37024e authored Jan 31, 2019 by Justin Lebar Committed by TensorFlower Gardener Jan 31, 2019

[XLA] Fix erf for f16 inputs, and improve precision in f32.

Switch erf and erfc to the f32 implementations.  Previously we were using the
f64 implementation with all of its constants truncated down to f32.

Part of the problem is that the old implementation computed

  erf(x) = x * polynomial1(x^2) / polynomial2(x^2)

and assumed there was enough range that neither polynomial would overflow.  On
f16, polynomial2 overflowed to inf even for quite reasonable values of x (e.g.
0.79).  Dividing a finite numerator by inf gives 0, so the result was
incorrectly 0.
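
As a rough illustration of that failure mode, the sketch below evaluates a
rational function of the same shape in NumPy, rounding to the target type
after every step.  The coefficients in NUM and DEN and the helper names
horner and rational are made up for this example; they are not the
coefficients XLA uses, and were chosen only so that the denominator exceeds
the f16 maximum (65504) at x = 0.79.

```python
import numpy as np

# Hypothetical coefficients, highest degree first. NOT the XLA constants.
NUM = [1.0, 2.0, 3.0]
DEN = [4.0e4, 4.0e4, 4.0e4]


def horner(x2, coeffs, dtype):
    """Evaluate a polynomial in x2 by Horner's rule, rounding to dtype each step."""
    acc = dtype(0)
    for c in coeffs:
        acc = dtype(acc * x2 + dtype(c))
    return acc


def rational(x, dtype):
    """Return (x * p1(x^2) / p2(x^2), p2(x^2)), computed entirely in dtype."""
    x = dtype(x)
    x2 = dtype(x * x)
    # Silence NumPy's overflow warning; the overflow is the point of the demo.
    with np.errstate(over="ignore"):
        p1 = horner(x2, NUM, dtype)
        p2 = horner(x2, DEN, dtype)
        return dtype(x * p1 / p2), p2


r16, den16 = rational(0.79, np.float16)
r32, den32 = rational(0.79, np.float32)
print("f16:", r16, "denominator:", den16)
print("f32:", r32, "denominator:", den32)
```

In f16 the denominator saturates to inf and the quotient collapses to 0,
while the same arithmetic in f32 stays finite and gives a nonzero result.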

This may reduce precision for f64, but that's a trade-off we're willing to make
(especially since we were using f32 coefficients *anyway*).

erf and erfc also now call each other when their inputs are out of range.
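
A minimal sketch of that mutual delegation, in Python rather than XLA's C++.
It assumes the erf approximation is only trusted for small |x| and the erfc
approximation only for large |x|; the cutoff THRESHOLD = 1.0 and the helper
names _erf_small and _erfc_large are assumptions for illustration, with
math.erf and math.erfc standing in for the range-limited approximations.

```python
import math

THRESHOLD = 1.0  # assumed cutoff between the two approximations


def _erf_small(x):
    """Stand-in for an erf approximation that is only valid for |x| < THRESHOLD."""
    return math.erf(x)


def _erfc_large(x):
    """Stand-in for an erfc approximation that is only valid for |x| >= THRESHOLD."""
    return math.erfc(x)


def erf(x):
    # Out of range for the erf approximation: use erf(x) = 1 - erfc(x).
    if abs(x) < THRESHOLD:
        return _erf_small(x)
    return 1.0 - _erfc_large(x)


def erfc(x):
    # Out of range for the erfc approximation: use erfc(x) = 1 - erf(x).
    if abs(x) >= THRESHOLD:
        return _erfc_large(x)
    return 1.0 - _erf_small(x)


for x in (0.5, 2.0):
    print(x, erf(x), erfc(x))
```

Each function uses its own approximation only inside the range where it is
accurate and otherwise falls back to the identity erf(x) + erfc(x) = 1.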

Relevant to #25052.

PiperOrigin-RevId: 231896451

parent ca2a1cb4
