Add hardware accelerated version of POLYVAL for x86-64 CPUs with PCLMULQDQ support. This implementation is accelerated using PCLMULQDQ instructions to perform the finite field computations. For added efficiency, 8 blocks of the message are processed simultaneously by precomputing the first 8 powers of the key. Schoolbook multiplication is used instead of Karatsuba multiplication because it was found to be slightly faster on x86-64 machines. Montgomery reduction must be used instead of Barrett reduction due to the difference in modulus between POLYVAL's field and other finite fields. More information on POLYVAL can be found in the HCTR2 paper: "Length-preserving encryption with HCTR2": https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Bug: 233652475 Link: https://lore.kernel.org/linux-arm-kernel/20220520181501.2159644-4-nhuck@google.com/T/ (cherry picked from commit 34f7f6c3011276313383099156be287ac745bcea) Change-Id: I18a532b8338e62ca7c85d8106ef22eeeab7b3355 Signed-off-by: Nathan Huckleberry <nhuck@google.com>
6.6 KiB
6.6 KiB