Commit 0f8be44d authored May 17, 2018 by Benjamin Kramer; committed by TensorFlower Gardener May 17, 2018

[XLA:GPU] Unroll multi-output loop fusions

This is easier than I thought because we can assume that all tuple members have
the same number of elements. LLVM doesn't do a great job of vectorizing the
resulting stores, but otherwise this is working fine.
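As a rough illustration of the idea, and not the actual XLA emitter code, the sketch below shows a CPU-side analogy of an unrolled multi-output loop in C++. The names `FusedAddMul` and `kUnroll`, the unroll factor of 4, and the choice of add/multiply as the two fused outputs are all assumptions made for the example. Because every output has the same number of elements, a single index drives the stores to all outputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical unroll factor; the value used by XLA is not stated in the
// commit message.
constexpr std::size_t kUnroll = 4;

// Hypothetical multi-output "fusion": computes (a + b, a * b) element-wise.
// All inputs and outputs are assumed to have the same element count, which
// is what lets one loop index serve every output.
void FusedAddMul(const std::vector<float>& a, const std::vector<float>& b,
                 std::vector<float>& sum, std::vector<float>& prod) {
  const std::size_t n = a.size();
  assert(b.size() == n && sum.size() == n && prod.size() == n);

  std::size_t i = 0;
  // Main loop: each iteration handles kUnroll consecutive elements and
  // writes them to both outputs.
  for (; i + kUnroll <= n; i += kUnroll) {
    for (std::size_t j = 0; j < kUnroll; ++j) {
      const float x = a[i + j];
      const float y = b[i + j];
      sum[i + j] = x + y;
      prod[i + j] = x * y;
    }
  }
  // Tail loop: handles the remaining n % kUnroll elements.
  for (; i < n; ++i) {
    sum[i] = a[i] + b[i];
    prod[i] = a[i] * b[i];
  }
}
```

In this sketch the inner fixed-trip-count loop is what a compiler would flatten into straight-line code; whether the adjacent stores to each output are then merged into vector stores is up to the backend.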

PiperOrigin-RevId: 197019718

parent 18cb26be
