Optimize CuboidConvolutionBwdInput.
~25-30% speedup when compiled with AVX.

* collapse inner dims before contraction
* eval kernel tensor before contraction

PiperOrigin-RevId: 211651030
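
As a rough illustration of the "collapse inner dims before contraction" item, the sketch below shows why merging contiguous contracted dimensions is safe: on row-major buffers, summing over two adjacent inner dims of sizes p and q gives the same result as summing over one dim of size p * q, so the contraction reduces to a plain matrix product. This is not the Eigen code touched by this change; the function names, shapes, and row-major layout are assumptions made for the example.

```python
from itertools import product


def contract_two_dims(a, b, m, p, q, n):
    """out[i][j] = sum over (x, y) of a[i, x, y] * b[x, y, j].

    a and b are flat row-major buffers of shapes (m, p, q) and (p, q, n).
    """
    out = [[0] * n for _ in range(m)]
    for i, j in product(range(m), range(n)):
        acc = 0
        for x in range(p):
            for y in range(q):
                acc += a[(i * p + x) * q + y] * b[(x * q + y) * n + j]
        out[i][j] = acc
    return out


def contract_collapsed(a, b, m, k, n):
    """Same buffers viewed as (m, k) and (k, n): a single-dim contraction."""
    out = [[0] * n for _ in range(m)]
    for i, j in product(range(m), range(n)):
        out[i][j] = sum(a[i * k + t] * b[t * n + j] for t in range(k))
    return out


# Hypothetical sizes chosen only for the demonstration.
m, p, q, n = 2, 3, 4, 5
a = list(range(m * p * q))
b = list(range(p * q * n))

# Collapsing (p, q) into k = p * q leaves the result unchanged.
same = contract_two_dims(a, b, m, p, q, n) == contract_collapsed(a, b, m, p * q, n)
```

The collapsed form needs no data movement, only a different view of the same buffer, which is why it can be done before handing the operands to a matrix-product style contraction.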