Optimized quantized LSTM cell runtime NEON implementation.
Note: unlike many NEON paths in this optimized_ops.h file, which are also enabled on x86 by means of arm_neon_sse.h (#ifdef USE_NEON), this one is enabled only on real NEON (#ifdef GEMMLOWP_NEON). The reason is that gemmlowp's FixedPoint class is templated on the underlying raw integer/register type, e.g. here int16x8_t, whereas on SSE there is only a single __m128i type for all integer vector types (standing in for both int16x8_t and int32x4_t). That makes it non-trivial to support this path on SSE without making the NEON code more contrived.

PiperOrigin-RevId: 186031054
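The type issue described above can be illustrated with a minimal, portable sketch. It uses no NEON or SSE intrinsics and is not gemmlowp's actual code: Int16x8, Int32x4, M128i, RawTypeTraits and FixedPointSketch are hypothetical stand-ins, introduced only to show why a class templated on the raw register type needs distinct types per lane width.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for NEON registers: each lane width is its own C++ type.
struct Int16x8 { std::int16_t lanes[8]; };
struct Int32x4 { std::int32_t lanes[4]; };

// Hypothetical stand-in for SSE's single integer register type: one type,
// whatever the lane width, so both aliases below name the same type.
struct M128i { alignas(16) unsigned char bytes[16]; };
using SseInt16x8 = M128i;
using SseInt32x4 = M128i;

// Traits keyed on the raw type (assumed design, for illustration only).
template <typename RawType> struct RawTypeTraits;
template <> struct RawTypeTraits<Int16x8> {
  using Scalar = std::int16_t;
  static constexpr int kLanes = 8;
};
template <> struct RawTypeTraits<Int32x4> {
  using Scalar = std::int32_t;
  static constexpr int kLanes = 4;
};
// A specialization for M128i could only describe ONE lane width: there is no
// way to give SseInt16x8 and SseInt32x4 different traits.

// Sketch of a fixed-point wrapper templated on the raw register type.
template <typename RawType, int IntegerBits>
struct FixedPointSketch {
  using Scalar = typename RawTypeTraits<RawType>::Scalar;
  static constexpr int kTotalBits = 8 * static_cast<int>(sizeof(Scalar));
  // One sign bit, IntegerBits integer bits, the rest fractional.
  static constexpr int kFractionalBits = kTotalBits - 1 - IntegerBits;
  RawType raw;
};

// The NEON-like types are distinct; the SSE-like aliases are not.
constexpr bool kNeonTypesDistinct = !std::is_same<Int16x8, Int32x4>::value;
constexpr bool kSseTypesDistinct = !std::is_same<SseInt16x8, SseInt32x4>::value;
```

With distinct raw types, FixedPointSketch<Int16x8, 3> and FixedPointSketch<Int32x4, 3> get different scalar widths for free. With a single register type, the lane width would have to be carried some other way (for example wrapper structs or an extra template parameter), which is the kind of change to the NEON-oriented code that this commit avoids.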