Commit 23f82627 authored Jul 26, 2018 by Cao Zongyan

A faster BatchSelectFunctor for tf.where on CPU.

'tf.where(c, t, e)' accepts N-D tensors 't' and 'e' together with a
1-D tensor 'c', and in that case calls BatchSelectFunctor to compute
the result. The basic implementation broadcasts 'c' to the shape of
't' and 'e', which is inefficient on CPU for large tensors. This
change adopts a loop-based implementation to make the operation
faster on CPU.
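
The loop-based idea can be sketched roughly as follows. This is an
illustration only, not the code in this commit: the function name
BatchSelectSketch, the use of std::vector, and the row-major
[batch, inner] layout are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, NOT TensorFlow's BatchSelectFunctor.
// Assumes 't' and 'e' are row-major buffers of shape [batch, inner]
// and 'cond' holds one flag per batch row.
std::vector<float> BatchSelectSketch(const std::vector<bool>& cond,
                                     const std::vector<float>& t,
                                     const std::vector<float>& e,
                                     std::size_t inner) {
  std::vector<float> out(t.size());
  for (std::size_t i = 0; i < cond.size(); ++i) {
    // Pick the source row once per batch entry instead of broadcasting
    // 'cond' to the full shape and selecting element by element.
    const std::vector<float>& src = cond[i] ? t : e;
    std::copy(src.begin() + i * inner, src.begin() + (i + 1) * inner,
              out.begin() + i * inner);
  }
  return out;
}
```

Each row is copied as one contiguous block, so no broadcast copy of
'c' is materialized. Whether this is faster in practice depends on
tensor shapes and the hardware.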

parent 15b155e9
