Commit 418c7258 authored by Eugene Zhulenev, committed by TensorFlower Gardener

Optimize Spatial and Cuboid backward kernel convolutions.

Without the shuffle, TensorExecutor uses the optimized (specialized) gemm_pack_rhs to pack memory before contraction. The custom rhs packer is much faster than contracting by the inner dimension with the default packer.

  1. CuboidConvolutionBwdKernel: ~10x-25x speedup
  2. SpatialConvolutionBwdKernel: ~2x-10x speedup
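
For readers unfamiliar with rhs packing, the idea can be sketched in plain C++. This is only an illustration of packing the right-hand side into contiguous memory before a contraction, not Eigen's gemm_pack_rhs and not the code in this commit. The names Matrix, PackRhs and ContractPacked are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-major matrix used only for this sketch; not an Eigen type.
struct Matrix {
  std::size_t rows, cols;
  std::vector<float> data;
  float& at(std::size_t r, std::size_t c) { return data[r * cols + c]; }
  float at(std::size_t r, std::size_t c) const { return data[r * cols + c]; }
};

// Copy the rhs into column-contiguous storage so that the inner loop of the
// contraction reads each rhs column with unit stride.
std::vector<float> PackRhs(const Matrix& rhs) {
  std::vector<float> packed(rhs.rows * rhs.cols);
  for (std::size_t c = 0; c < rhs.cols; ++c) {
    for (std::size_t k = 0; k < rhs.rows; ++k) {
      packed[c * rhs.rows + k] = rhs.at(k, c);
    }
  }
  return packed;
}

// Compute out = lhs * rhs, reading the rhs from the packed buffer.
Matrix ContractPacked(const Matrix& lhs, const Matrix& rhs) {
  const std::vector<float> packed = PackRhs(rhs);
  Matrix out{lhs.rows, rhs.cols,
             std::vector<float>(lhs.rows * rhs.cols, 0.0f)};
  for (std::size_t r = 0; r < lhs.rows; ++r) {
    for (std::size_t c = 0; c < rhs.cols; ++c) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < lhs.cols; ++k) {
        acc += lhs.at(r, k) * packed[c * rhs.rows + k];
      }
      out.at(r, c) = acc;
    }
  }
  return out;
}
```

A specialized packer can exploit knowledge of the rhs expression (for example, its memory layout) while copying, which a generic packer cannot assume. The sketch above shows only the generic copy-then-contract structure.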

PiperOrigin-RevId: 212506483
parent dad6912b