Split strided slice GPU code into multiple files

This file was a bottleneck during compilation, often taking many minutes to compile. In local testing this change reduces the wall-clock build time for the affected kernels from >3 mins to approx 1 min.

PiperOrigin-RevId: 230431569