Commit 0f063cab, authored Feb 16, 2018 by A. Unique TensorFlower; committed by TensorFlower Gardener Feb 16, 2018

Optimized quantized LSTM cell runtime NEON implementation.

Note: unlike many NEON paths in this optimized_ops.h file, which are
also enabled on x86 by means of arm_neon_sse.h (#ifdef USE_NEON), this
one is enabled only on real NEON (#ifdef GEMMLOWP_NEON). The reason is
that gemmlowp's FixedPoint class is templated on the underlying raw
integer/register type, e.g. here int16x8_t, whereas SSE has a single
__m128i type for all integer vectors (covering both int16x8_t and
int32x4_t). That makes it non-trivial to support this path on SSE
without making the NEON code contrived.
PiperOrigin-RevId: 186031054
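
The type-dispatch problem described above can be illustrated with a small, platform-independent sketch. It does not use real NEON or SSE headers, and it is not gemmlowp's actual code: `FakeInt16x8`, `FakeInt32x4`, `FakeM128i`, `LaneTraits`, and `LaneBits` are hypothetical stand-ins invented here. The point is only that a template keyed on the register type can tell two distinct NEON-style types apart, while two SSE-style aliases of one type are indistinguishable to it.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for vector register types (assumption: these only
// mimic the shape of the real types, they are not the real intrinsics).
struct FakeInt16x8 { std::int16_t lanes[8]; };   // plays the role of int16x8_t
struct FakeInt32x4 { std::int32_t lanes[4]; };   // plays the role of int32x4_t
struct FakeM128i   { unsigned char bytes[16]; }; // plays the role of __m128i

// Hypothetical traits class: lane layout is selected by the raw register type.
template <typename RawType>
struct LaneTraits;  // intentionally undefined for unknown types

template <>
struct LaneTraits<FakeInt16x8> {
  static constexpr int kBits = 16;
  static constexpr int kLanes = 8;
};

template <>
struct LaneTraits<FakeInt32x4> {
  static constexpr int kBits = 32;
  static constexpr int kLanes = 4;
};

template <typename RawType>
constexpr int LaneBits() {
  return LaneTraits<RawType>::kBits;
}

// NEON-like situation: two distinct types, so the template can dispatch.
using NeonLikeInt16 = FakeInt16x8;
using NeonLikeInt32 = FakeInt32x4;
static_assert(!std::is_same<NeonLikeInt16, NeonLikeInt32>::value,
              "distinct register types can carry distinct traits");

// SSE-like situation: both lane layouts share one type, so a single
// LaneTraits<FakeM128i> specialization could not describe both of them.
using SseLikeInt16 = FakeM128i;
using SseLikeInt32 = FakeM128i;
static_assert(std::is_same<SseLikeInt16, SseLikeInt32>::value,
              "one register type cannot encode the lane width");
```

Supporting the SSE-like case would need some extra tag or wrapper type to carry the lane width, which is presumably the kind of contrivance the commit message wants to avoid.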

parent 8dfaa05d
