Commit 9ff9c1f6 authored by A. Unique TensorFlower, committed by TensorFlower Gardener

Parallelize inner matrix multiplications of BatchMatMul on CPU when appropriate.

* Uses simple heuristics to choose between parallelizing outer (batch), inner (matmul) or both.
* Adds benchmarks for BatchMatMul.
* Switches the matmul benchmark to use real time, so that the reported GFlops are relative to wall time and reflect the effect of multi-threading.
* Fixes bug in cost_per_unit calculation. The old code calculated B*M*N instead of M*N*K.
Change: 134025273
parent d738806a
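
To illustrate the kind of heuristic the message describes, here is a minimal sketch. It is not the code from this commit: the names `ParallelMode`, `CostPerUnit` and `ChooseParallelMode`, the decision order, and the `min_inner_cost` threshold are all hypothetical. The only detail taken from the message is that the cost of one inner product is counted as M*N*K rather than B*M*N.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not TensorFlow's actual API.
enum class ParallelMode { kOuterBatch, kInnerMatMul, kBoth };

// Approximate multiply-add count for one (M x K) * (K x N) product.
// The commit message states the cost should be M*N*K, not B*M*N.
std::int64_t CostPerUnit(std::int64_t m, std::int64_t n, std::int64_t k) {
  return m * n * k;
}

// Picks a parallelization strategy for a batch of `batch` matmuls.
// `min_inner_cost` is an assumed threshold below which splitting a
// single matmul across threads is presumed not to pay off.
ParallelMode ChooseParallelMode(std::int64_t batch, std::int64_t m,
                                std::int64_t n, std::int64_t k,
                                int num_threads,
                                std::int64_t min_inner_cost = 1 << 18) {
  assert(batch > 0 && num_threads > 0);
  const std::int64_t cost = CostPerUnit(m, n, k);
  // Small matrices: only the batch dimension is worth splitting.
  if (cost < min_inner_cost) return ParallelMode::kOuterBatch;
  // A single large matmul: only the inner product can be split.
  if (batch == 1) return ParallelMode::kInnerMatMul;
  // Enough batch entries to keep every thread busy on its own matmul.
  if (batch >= num_threads) return ParallelMode::kOuterBatch;
  // Few large matmuls: split across the batch and inside each matmul.
  return ParallelMode::kBoth;
}
```

Under these assumed thresholds, a batch of 4 products of 1024x1024 matrices on 8 threads selects `kBoth`, while a batch of 64 products of 8x8 matrices selects `kOuterBatch`.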