Remove heuristic caps on parallelism that should now be handled by the cost model.
Adjust the cost model for FloatToBFloat16 and BFloat16ToFloat: they do not take 100 cycles per element. This CL is a companion to cl/122779011, which makes the caps effective again, even with the nonblocking threadpool.
Change: 123144919
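For context on why a 100-cycles-per-element estimate is too high, here is a minimal sketch of a truncation-style float/bfloat16 conversion. It assumes the conversion only keeps (or restores) the upper 16 bits of the IEEE-754 float32 representation. The helper names `FloatToBF16Bits` and `BF16BitsToFloat` are hypothetical and are not the TensorFlow functions touched by this change.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not TensorFlow's API.

// Keep the upper 16 bits of the float32 bit pattern (sign, exponent,
// and the top 7 mantissa bits). Assumes truncation, with no rounding.
inline uint16_t FloatToBF16Bits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined type pun
  return static_cast<uint16_t>(bits >> 16);
}

// Restore a float32 by placing the 16 bits in the upper half and
// zero-filling the lower 16 mantissa bits.
inline float BF16BitsToFloat(uint16_t b) {
  uint32_t bits = static_cast<uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}
```

Under this assumption each element costs a load, a shift, and a store, so a per-element estimate of a few cycles is more plausible than 100.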