Commit 3d86d8ce authored by Phil, committed by Rasmus Munk Larsen

Add unsortedsegment(prod/min/max/sqrt_n/mean). (#15858)

* Add unsortedsegment(prod/min/max/sqrt_n/mean).

This commit adds CPU/GPU implementations for the prod/min/max
ops and Python implementations for mean/sqrt_n. It also adapts and unifies the
corresponding tests of all unsorted reductions.
Note: The new gradient of unsorted_segment_max fixes the crash that occurred
when negative indices were used on CPU.

* update golden API

* Fix compilation of atomicAdd for cuda_arch < 600.
  This commit moves the std::complex specialization of atomicAdd below the double specialization of atomicAdd for cuda_arch 600.

* Enable bfloat16, change inline to EIGEN_STRONG_INLINE.

* fix includes of cuda_device_functions; fix typo
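
For readers unfamiliar with the reductions named in the first item, the following is a minimal pure-Python sketch of what prod/min/max/mean/sqrt_n compute over unsorted segment ids. It is not the code from this commit: the function name `unsorted_segment_reduce` and its `kind` parameter are hypothetical, it handles only flat lists, and the handling of empty segments (1 for prod, infinities for min/max, 0 for mean/sqrt_n) and the dropping of negative ids are assumptions made for illustration.

```python
import math


def unsorted_segment_reduce(data, segment_ids, num_segments, kind):
    """Hypothetical reference sketch of an unsorted segment reduction.

    data and segment_ids are flat lists of equal length; the ids need not
    be sorted. Returns a list with one reduced value per segment.
    """
    if len(data) != len(segment_ids):
        raise ValueError("data and segment_ids must have the same length")

    # Group the values by segment id. Negative ids are skipped
    # (assumed behaviour for this sketch).
    buckets = [[] for _ in range(num_segments)]
    for value, seg in zip(data, segment_ids):
        if seg < 0:
            continue
        buckets[seg].append(value)

    # Empty-segment results below are illustrative assumptions.
    reducers = {
        "sum": lambda b: sum(b),
        "prod": lambda b: math.prod(b),  # empty product is 1
        "min": lambda b: min(b) if b else math.inf,
        "max": lambda b: max(b) if b else -math.inf,
        # mean divides the segment sum by the element count N,
        # sqrt_n divides it by sqrt(N); N is clamped to 1 for empty segments.
        "mean": lambda b: sum(b) / max(len(b), 1),
        "sqrt_n": lambda b: sum(b) / math.sqrt(max(len(b), 1)),
    }
    if kind not in reducers:
        raise ValueError("unknown reduction: " + kind)
    return [reducers[kind](bucket) for bucket in buckets]


data = [1.0, 2.0, 3.0, 4.0, 5.0]
segment_ids = [0, 2, 0, -1, 2]  # unsorted; the value 4.0 has a negative id

prod_result = unsorted_segment_reduce(data, segment_ids, 4, "prod")
max_result = unsorted_segment_reduce(data, segment_ids, 4, "max")
mean_result = unsorted_segment_reduce(data, segment_ids, 4, "mean")
sqrt_n_result = unsorted_segment_reduce(data, segment_ids, 4, "sqrt_n")
```

In this example segment 0 receives the values 1.0 and 3.0, segment 2 receives 2.0 and 5.0, and segments 1 and 3 stay empty. Using infinities for empty min/max segments is a simplification; a typed implementation would more likely use the largest or smallest representable value of the element type.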
parent 8aa14cd6