libeth: xdp, xsk: access adjacent u32s as u64 where applicable
On 64-bit systems, writing/reading one u64 is faster than two u32s even when they're are adjacent in a struct. The compilers won't guarantee they will combine those; I observed both successful and unsuccessful attempts with both GCC and Clang, and it's not easy to say what it depends on. There's a few places in libeth_xdp winning up to several percent from combined access (both performance and object code size, especially when unrolling). Add __LIBETH_WORD_ACCESS and use it there on LE. Drivers are free to optimize HW-specific callbacks under the same definition. Signed-off-by:Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by:
Tony Nguyen <anthony.l.nguyen@intel.com>
Loading
Please sign in to comment