Commit c82a933f authored Sep 14, 2017 by Sanjoy Das Committed by TensorFlower Gardener Sep 14, 2017

Lower vector-matrix dot to LLVM IR if the RHS of the dot can be made column major.

The naive dot lowering to LLVM IR (already present in XLA today) is
cache efficient if the dot has an LHS of shape [1,K]{1,0} and an RHS of
shape [K,N]{0,1}, because the reduction over K then walks both operands
contiguously in memory.  This change teaches the layout assignment pass
to exploit this property by converting a constant RHS matrix to a
column major layout when possible.
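
To illustrate why the column major RHS helps, here is a minimal standalone sketch of a naive vector-matrix dot. It is not the LLVM IR that XLA emits; the function name VectorMatrixDotColumnMajor and the flat std::vector storage are assumptions made for this example. Element (k, n) of the RHS is assumed to live at index n * K + k, which corresponds to the {0,1} layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not XLA code: computes out[n] = sum_k lhs[k] * rhs(k, n)
// for a K x N matrix `rhs` stored column major, so rhs(k, n) == rhs[n * K + k].
std::vector<float> VectorMatrixDotColumnMajor(const std::vector<float>& lhs,
                                              const std::vector<float>& rhs,
                                              std::size_t K, std::size_t N) {
  assert(lhs.size() == K);
  assert(rhs.size() == K * N);
  std::vector<float> out(N, 0.0f);
  for (std::size_t n = 0; n < N; ++n) {
    // Column n occupies K consecutive floats, so the inner loop below reads
    // both `lhs` and `column` with unit stride.
    const float* column = rhs.data() + n * K;
    float acc = 0.0f;
    for (std::size_t k = 0; k < K; ++k) {
      acc += lhs[k] * column[k];
    }
    out[n] = acc;
  }
  return out;
}
```

With a row major RHS the same inner loop would instead step through memory with a stride of N elements, which is the access pattern this change tries to avoid for constant operands.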

A couple of related things I had to touch in this change:

 - In LayoutAssignmentTest.TupleLayout we used to generate a kCopy to satisfy
   the conflicting constraints between the result and the constant shapes, but
   with this change we change the layout of the constants themselves.  So the
   EXPECT_FALSE is now an EXPECT_TRUE.

 - The extra instruction layout constraints added at the end of
   CpuLayoutAssignment::AddBackendConstraints seemed redundant.  The layout
   assignment pass already tries to make all unconstrained buffers have the
   default row-major layout.  Moreover, they were blocking this optimization in
   some cases by introducing conflicting constraints.

 - The changes to literal_util.h are needed to handle the
   Literal::Relayout calls we now get on literals of various types.
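
As a rough picture of what relaying out a constant to column major involves for different element types, here is a small templated sketch. It is not the Literal::Relayout implementation; the name RowMajorToColumnMajor and the flat std::vector representation are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not XLA's Literal::Relayout: copies a rows x cols
// matrix from row major storage into column major storage. The template
// parameter stands in for the various literal element types.
template <typename T>
std::vector<T> RowMajorToColumnMajor(const std::vector<T>& src,
                                     std::size_t rows, std::size_t cols) {
  assert(src.size() == rows * cols);
  std::vector<T> dst(src.size());
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      // Row major index is r * cols + c; column major index is c * rows + r.
      dst[c * rows + r] = src[r * cols + c];
    }
  }
  return dst;
}
```

Because the copy is done once on a constant at compile time, its cost is not paid on every execution of the dot.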

PiperOrigin-RevId: 168761204

parent dd22dbc7
