[TF:XLA] Use XLA CPU runtime functions to speed up R2 dot in the HLO evaluator.
This CL adds a fast path for R2 (rank-2) dot in the HLO evaluator. For now, the fast implementation has two limitations:

1. It only supports operands with the default layout.
2. It only supports the float element type.

It uses XLA's CPU runtime functions, which invoke Eigen.

PiperOrigin-RevId: 225372611
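
The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code in this CL: every name here (`Matrix`, `CanUseFastPath`, `FastMatMulF32`, `Dot`) is hypothetical, row-major stands in for "default layout", and a plain loop stands in for the Eigen-backed runtime function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names come from XLA.
enum class ElemType { kF32, kF64 };
enum class Layout { kRowMajor, kColMajor };  // kRowMajor plays the "default layout"

struct Matrix {
  ElemType type;
  Layout layout;
  std::size_t rows, cols;
  std::vector<float> data;  // the sketch stores float only, to stay short
};

// The two limitations from the description: float type and default layout.
bool CanUseFastPath(const Matrix& lhs, const Matrix& rhs) {
  return lhs.type == ElemType::kF32 && rhs.type == ElemType::kF32 &&
         lhs.layout == Layout::kRowMajor && rhs.layout == Layout::kRowMajor;
}

// Layout-aware element access, used only by the generic fallback.
float At(const Matrix& m, std::size_t r, std::size_t c) {
  return m.layout == Layout::kRowMajor ? m.data[r * m.cols + c]
                                       : m.data[c * m.rows + r];
}

// Stand-in for an optimized runtime matmul over dense row-major buffers.
// `out` must be zero-initialized by the caller.
void FastMatMulF32(const float* lhs, const float* rhs, float* out,
                   std::size_t m, std::size_t k, std::size_t n) {
  for (std::size_t i = 0; i < m; ++i)
    for (std::size_t p = 0; p < k; ++p)
      for (std::size_t j = 0; j < n; ++j)
        out[i * n + j] += lhs[i * k + p] * rhs[p * n + j];
}

// Computes lhs x rhs, taking the fast path when both operands qualify and
// falling back to a generic element-by-element loop otherwise.
Matrix Dot(const Matrix& lhs, const Matrix& rhs, bool* used_fast_path = nullptr) {
  assert(lhs.cols == rhs.rows);
  Matrix out{ElemType::kF32, Layout::kRowMajor, lhs.rows, rhs.cols,
             std::vector<float>(lhs.rows * rhs.cols, 0.0f)};
  const bool fast = CanUseFastPath(lhs, rhs);
  if (used_fast_path != nullptr) *used_fast_path = fast;
  if (fast) {
    FastMatMulF32(lhs.data.data(), rhs.data.data(), out.data.data(),
                  lhs.rows, lhs.cols, rhs.cols);
    return out;
  }
  for (std::size_t i = 0; i < lhs.rows; ++i)
    for (std::size_t j = 0; j < rhs.cols; ++j)
      for (std::size_t p = 0; p < lhs.cols; ++p)
        out.data[i * rhs.cols + j] += At(lhs, i, p) * At(rhs, p, j);
  return out;
}
```

Keeping the eligibility check in one predicate means a later change that lifts either limitation only has to touch `CanUseFastPath`, while the generic loop remains the reference behavior.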