DOI: 10.1145/3197507.3197511
research-article

SoK: A Performance Evaluation of Cryptographic Instruction Sets on Modern Architectures

Published: 23 May 2018

Abstract

Recent processors include instruction set extensions tailored to speed up the execution of cryptographic algorithms. Just as the AES New Instructions (AES-NI) target the AES encryption algorithm, the SHA New Instructions (SHA-NI), designed to support the SHA-256 hash function, open a new scenario for optimizing cryptographic software. In this work, we present a performance evaluation of several cryptographic algorithms, covering hash-based signatures and data encryption, on platforms that support AES-NI and/or SHA-NI. In particular, we revisit several optimization techniques targeting multiple-message hashing and, by means of a pipelined SHA-NI implementation, reduce the running time of this task by 21%. In public-key cryptography, multiple-message hashing is one of the critical operations of the XMSS and XMSS^MT post-quantum hash-based digital signature schemes. Using the SHA-NI extensions, signatures are computed 4x faster; our pipelined SHA-NI implementation increases this speedup factor to 4.3x. For symmetric cryptography, we revisit the implementation of AES modes of operation and reduce the running time of CBC decryption and CTR encryption by 12% and 7%, respectively.
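
The abstract does not spell out how the modes of operation were restructured. As background, a minimal sketch of why CBC decryption lends itself to pipelining: each plaintext block is P_i = D_K(C_i) XOR C_{i-1}, so the block-cipher calls do not depend on one another and can be issued in groups. Everything below is an illustrative assumption rather than the authors' implementation: the function names, the `width` parameter, and in particular the toy block function, which stands in for AES and is not secure.

```python
BLOCK = 16  # AES block size in bytes


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a block cipher (NOT AES, NOT secure): XOR, then rotate left."""
    x = xor(block, key)
    return x[1:] + x[:1]


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right, then XOR."""
    x = block[-1:] + block[:-1]
    return xor(x, key)


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC encryption is inherently serial: block i needs ciphertext block i-1."""
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_batched(key: bytes, iv: bytes, ciphertext: bytes, width: int = 4) -> bytes:
    """CBC decryption in groups of `width` blocks.

    All chaining values (the IV and the ciphertext blocks) are known up front,
    so the block decryptions inside a group are mutually independent. On real
    hardware this independence is what lets several AES-NI decryptions overlap.
    """
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    prevs = [iv] + blocks[:-1]
    out = []
    for start in range(0, len(blocks), width):
        group = blocks[start:start + width]
        decrypted = [toy_decrypt_block(key, c) for c in group]  # independent calls
        out.extend(xor(d, p) for d, p in zip(decrypted, prevs[start:start + width]))
    return b"".join(out)
```

The same independence holds for CTR encryption, where each keystream block depends only on the counter value. In this Python sketch the grouping changes only the order of operations, not the running time; any actual speedup comes from the hardware overlapping the independent instructions.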

Published In

APKC '18: Proceedings of the 5th ACM on ASIA Public-Key Cryptography Workshop
May 2018
66 pages
ISBN:9781450357562
DOI:10.1145/3197507
Program Chairs: Keita Emura, Jae Hong Seo, Yohei Watanabe
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. aes-ni
  2. data encryption
  3. hash-based digital signatures
  4. sha-ni
  5. vector instructions

Qualifiers

  • Research-article

Conference

ASIA CCS '18

Acceptance Rates

APKC '18 Paper Acceptance Rate 7 of 20 submissions, 35%;
Overall Acceptance Rate 36 of 103 submissions, 35%

Article Metrics

  • Downloads (Last 12 months): 22
  • Downloads (Last 6 weeks): 5
Reflects downloads up to 16 Feb 2025

Cited By

  • (2024) An Example of Parallel Merkle Tree Traversal: Post-Quantum Leighton-Micali Signature on the GPU. ACM Transactions on Architecture and Code Optimization 21:3 (1-25). DOI: 10.1145/3659209. Online publication date: 16-Apr-2024
  • (2024) CUSPX: Efficient GPU Implementations of Post-Quantum Signature SPHINCS+. IEEE Transactions on Computers (1-14). DOI: 10.1109/TC.2024.3457736. Online publication date: 2024
  • (2024) Fast Batched Asynchronous Distributed Key Generation. Advances in Cryptology – EUROCRYPT 2024 (370-400). DOI: 10.1007/978-3-031-58740-5_13. Online publication date: 29-Apr-2024
  • (2023) Efficient GPU Implementations of Post-Quantum Signature XMSS. IEEE Transactions on Parallel and Distributed Systems 34:3 (938-954). DOI: 10.1109/TPDS.2022.3233348. Online publication date: 1-Mar-2023
  • (2023) Algorithm for simplifying the SHA-256 operations tree. 2023 IEEE International Conference on Cyber Security and Resilience (CSR) (592-597). DOI: 10.1109/CSR57506.2023.10224939. Online publication date: 31-Jul-2023
  • (2020) Efficient and Secure Multiparty Computation from Fixed-Key Block Ciphers. 2020 IEEE Symposium on Security and Privacy (SP) (825-841). DOI: 10.1109/SP40000.2020.00016. Online publication date: May-2020
