Commit ede9622

ebiggers authored and herbertx committed
crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS
Add an ARM NEON-accelerated implementation of Speck-XTS. It operates on 128-byte chunks at a time, i.e. 8 blocks for Speck128 or 16 blocks for Speck64. Each 128-byte chunk goes through XTS preprocessing, then is encrypted/decrypted (doing one cipher round for all the blocks, then the next round, etc.), then goes through XTS postprocessing.

The performance depends on the processor but can be about 3 times faster than the generic code. For example, on an ARMv7 processor we observe the following performance with Speck128/256-XTS:

    xts-speck128-neon:     Encryption 107.9 MB/s, Decryption 108.1 MB/s
    xts(speck128-generic): Encryption  32.1 MB/s, Decryption  36.6 MB/s

In comparison to AES-256-XTS without the Cryptography Extensions:

    xts-aes-neonbs:        Encryption  41.2 MB/s, Decryption  36.7 MB/s
    xts(aes-asm):          Encryption  31.7 MB/s, Decryption  30.8 MB/s
    xts(aes-generic):      Encryption  21.2 MB/s, Decryption  20.9 MB/s

Speck64/128-XTS is even faster:

    xts-speck64-neon:      Encryption 138.6 MB/s, Decryption 139.1 MB/s

Note that as with the generic code, only the Speck128 and Speck64 variants are supported. Also, for now only the XTS mode of operation is supported, to target the disk and file encryption use cases. The NEON code also only handles the portion of the data that is evenly divisible into 128-byte chunks, with any remainder handled by a C fallback. Of course, other modes of operation could be added later if needed, and/or the NEON code could be updated to handle other buffer sizes.

The XTS specification is only defined for AES which has a 128-bit block size, so for the GF(2^64) math needed for Speck64-XTS we use the reducing polynomial 'x^64 + x^4 + x^3 + x + 1' given by the original XEX paper. Of course, when possible users should use Speck128-XTS, but even that may be too slow on some processors; Speck64-XTS can be faster.

Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
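
The GF(2^64) tweak arithmetic and the 128-byte chunking described in the commit message can be sketched in C as follows. This is a minimal illustration, not the code added by this commit: the names `gf64_mul_x`, `neon_bytes`, `NEON_CHUNK_SIZE`, and `self_check` are hypothetical, and the sketch assumes the Speck64 tweak is held in a `uint64_t` whose bit i is the coefficient of x^i.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per NEON pass, as stated in the commit message. */
#define NEON_CHUNK_SIZE 128u

/*
 * Hypothetical helper: multiply a 64-bit XTS tweak by x in GF(2^64),
 * reducing modulo x^64 + x^4 + x^3 + x + 1. The low-order terms
 * x^4 + x^3 + x + 1 correspond to the constant 0x1b.
 */
uint64_t gf64_mul_x(uint64_t t)
{
    uint64_t carry = t >> 63;          /* coefficient of x^63 before the shift */

    return (t << 1) ^ (carry * 0x1b);  /* replace x^64 by x^4 + x^3 + x + 1 */
}

/*
 * Hypothetical helper: number of leading bytes that divide evenly into
 * 128-byte chunks. Only this portion would go to the NEON path; the
 * remaining (nbytes - neon_bytes(nbytes)) bytes would go to a C fallback.
 */
size_t neon_bytes(size_t nbytes)
{
    return nbytes - (nbytes % NEON_CHUNK_SIZE);
}

/* Small usage example exercising both helpers. */
void self_check(void)
{
    assert(gf64_mul_x(1) == 2);                       /* no reduction needed */
    assert(gf64_mul_x(0x8000000000000000ULL) == 0x1b); /* x^63 * x reduces */
    assert(neon_bytes(300) == 256);                   /* 44 bytes left for the fallback */
}
```

Each successive block in a chunk uses the previous tweak multiplied by x, so a chunk of 16 Speck64 blocks would apply `gf64_mul_x` repeatedly starting from the chunk's first tweak. The Speck128 case follows the same pattern over GF(2^128) with the standard XTS polynomial.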
1 parent c8c3641 commit ede9622

4 files changed: +728 -0 lines changed

arch/arm/crypto/Kconfig

Lines changed: 6 additions & 0 deletions
@@ -121,4 +121,10 @@ config CRYPTO_CHACHA20_NEON
 	select CRYPTO_BLKCIPHER
 	select CRYPTO_CHACHA20
 
+config CRYPTO_SPECK_NEON
+	tristate "NEON accelerated Speck cipher algorithms"
+	depends on KERNEL_MODE_NEON
+	select CRYPTO_BLKCIPHER
+	select CRYPTO_SPECK
+
 endif

arch/arm/crypto/Makefile

Lines changed: 2 additions & 0 deletions
@@ -10,6 +10,7 @@ obj-$(CONFIG_CRYPTO_SHA1_ARM_NEON) += sha1-arm-neon.o
 obj-$(CONFIG_CRYPTO_SHA256_ARM) += sha256-arm.o
 obj-$(CONFIG_CRYPTO_SHA512_ARM) += sha512-arm.o
 obj-$(CONFIG_CRYPTO_CHACHA20_NEON) += chacha20-neon.o
+obj-$(CONFIG_CRYPTO_SPECK_NEON) += speck-neon.o
 
 ce-obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
 ce-obj-$(CONFIG_CRYPTO_SHA1_ARM_CE) += sha1-arm-ce.o
@@ -53,6 +54,7 @@ ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
 crct10dif-arm-ce-y := crct10dif-ce-core.o crct10dif-ce-glue.o
 crc32-arm-ce-y:= crc32-ce-core.o crc32-ce-glue.o
 chacha20-neon-y := chacha20-neon-core.o chacha20-neon-glue.o
+speck-neon-y := speck-neon-core.o speck-neon-glue.o
 
 quiet_cmd_perl = PERL $@
 cmd_perl = $(PERL) $(<) > $(@)