FROMLIST: crypto: x86/sha256-ni - add support for finup_mb

Add an implementation of finup_mb to sha256-ni, using an interleaving
factor of 2. It interleaves a finup operation for two equal-length
messages that share a common prefix. dm-verity and fs-verity will take
advantage of this for greatly improved performance on capable CPUs.
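
As a rough illustration of what a two-way finup computes (not of how the
assembly interleaves it), the following Python sketch models the result only:
two equal-length messages are finished from one shared prefix state. The
function name `finup_2x`, the 32-byte salt used as the common prefix, and the
buffer contents are all hypothetical; the real code issues the SHA-NI
instructions for both messages in an interleaved fashion, whereas this sketch
processes them one after the other.

```python
import hashlib

def finup_2x(prefix_state, data1, data2):
    """Model of a 2-way finup: finish two equal-length messages that
    continue from the same already-hashed prefix (hypothetical helper)."""
    if len(data1) != len(data2):
        raise ValueError("both messages must have the same length")
    digests = []
    for data in (data1, data2):
        h = prefix_state.copy()  # each message continues from the shared prefix state
        h.update(data)           # hash the remaining data
        digests.append(h.digest())
    return digests

# Hypothetical inputs: a common 32-byte prefix and two 4096-byte blocks.
salt = b"\x00" * 32
state = hashlib.sha256(salt)
block_a = b"\xaa" * 4096
block_b = b"\xbb" * 4096
d1, d2 = finup_2x(state, block_a, block_b)
```

Each digest equals the plain SHA-256 of the prefix followed by the
corresponding block, which is the property any interleaved implementation has
to preserve.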

This increases the throughput of SHA-256 hashing 4096-byte messages by
the following amounts on the following CPUs:

    AMD Zen 1:                84%
    AMD Zen 4:                98%
    Intel Ice Lake:            4%
    Intel Sapphire Rapids:    20%
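
To relate the throughput gains above to speedup factors and to time saved per
message, the arithmetic can be sketched as follows. The helper names are
hypothetical, and the only inputs are the percentages listed above. A
throughput gain of g percent corresponds to a speedup of 1 + g/100 and to a
fraction 1 - 1/(1 + g/100) less time per message.

```python
# Throughput gains (percent) as listed in the commit message.
gains = {
    "AMD Zen 1": 84,
    "AMD Zen 4": 98,
    "Intel Ice Lake": 4,
    "Intel Sapphire Rapids": 20,
}

def speedup(gain_percent):
    """Throughput ratio of new over old, e.g. 98% gain -> 1.98x."""
    return 1 + gain_percent / 100

def time_saved(gain_percent):
    """Fraction of per-message hashing time eliminated by the gain."""
    return 1 - 1 / speedup(gain_percent)

for cpu, gain in gains.items():
    print(f"{cpu}: {speedup(gain):.2f}x throughput, "
          f"{time_saved(gain):.1%} less time per message")
```

Seen this way, the Zen 4 result is close to a doubling of throughput, while
the Ice Lake result changes the time per message only slightly.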

For now, this benefits AMD much more than Intel. This seems to be
because current AMD CPUs can execute the SHA-NI instructions
concurrently, whereas current Intel CPUs cannot, except for the
sha256msg2 instruction. Hopefully future Intel CPUs will support SHA-NI
on more execution ports. Zen 1 supports 2 concurrent sha256rnds2
instructions and Zen 4 supports 4, which suggests that even better
performance may be achievable on Zen 4 by interleaving more than two
hashes; however, doing so involves a number of trade-offs.

It's been reported that the method that achieves the highest SHA-256
throughput on Intel CPUs is actually computing 16 hashes simultaneously
using AVX512. That method would be quite different from the SHA-NI
method used in this patch. However, such a high interleaving factor
isn't practical for the use cases being targeted in the kernel.

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Bug: 330611177
Link: https://lore.kernel.org/r/20240621165922.77672-5-ebiggers@kernel.org
Change-Id: I67204992677a80826c61e29ee3ca3c8be477d2f3
Signed-off-by: Eric Biggers <ebiggers@google.com>