Single-Cycle Bit Permutations with MOMR Execution

  • Special Section on Advanced Computer Systems Architecture
  • Published:
Journal of Computer Science and Technology

Abstract

Secure computing paradigms impose new architectural challenges for general-purpose processors. Cryptographic processing is needed for secure communications, storage, and computations. We identify two categories of operations in symmetric-key and public-key cryptographic algorithms that are not common in previous general-purpose workloads: advanced bit operations within a word and multi-word operations. We define MOMR (Multiple Operands Multiple Results) execution or datarich execution as a unified solution to both challenges. It allows arbitrary n-bit permutations to be achieved in one or two cycles, rather than O(n) cycles as in existing RISC processors. It also enables significant acceleration of multi-word multiplications needed by public-key ciphers. We propose two implementations of MOMR: one employs only hardware changes while the other uses Instruction Set Architecture (ISA) support. We show that MOMR execution leverages available resources in typical multi-issue processors with minimal additional cost. Multi-issue processors enhanced with MOMR units provide additional speedup over standard multi-issue processors with the same datapath. MOMR is a general architectural solution for word-oriented processor architectures to incorporate datarich operations.
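The two operation categories named in the abstract can be illustrated with a small software sketch. It is not the MOMR hardware or ISA proposed in the paper. The first function is a plain word-oriented baseline that moves one bit per loop iteration, which shows where the O(n) cost of an arbitrary n-bit permutation comes from. The second is a multiply that takes two words and returns two result words. The function names `permute_bits` and `mul_wide`, the `perm` encoding (result bit i comes from source bit `perm[i]`), and the word-size parameters are assumptions made for illustration only.

```python
def permute_bits(x: int, perm: list[int], n: int = 64) -> int:
    """Baseline permutation: result bit i is bit perm[i] of x.

    One extract and one deposit per bit, so the work grows linearly with n.
    """
    if sorted(perm) != list(range(n)):
        raise ValueError("perm must be a permutation of 0..n-1")
    result = 0
    for dst, src in enumerate(perm):
        bit = (x >> src) & 1      # extract the source bit
        result |= bit << dst      # deposit it at the destination position
    return result


def mul_wide(a: int, b: int, w: int = 64) -> tuple[int, int]:
    """Multiply two w-bit words and return the (high, low) w-bit result words."""
    mask = (1 << w) - 1
    product = (a & mask) * (b & mask)
    return product >> w, product & mask


# Usage: reverse the bits of a 4-bit value, then form a 128-bit product.
reversed_bits = permute_bits(0b0001, [3, 2, 1, 0], n=4)
high, low = mul_wide(2**64 - 1, 2**64 - 1)
```

Both functions need more than the two-operand, one-result shape of a typical word-oriented instruction: the permutation needs the data word plus a description of the permutation, and the multiply produces a result twice as wide as either operand.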

Author information

Corresponding author

Correspondence to Ruby B. Lee.

Additional information

This work was supported in part by the National Science Foundation, U.S.A., under Grant Nos. CCR-0326372 and CCR-0105677.

Ruby B. Lee is the Forrest G. Hamrick Professor of Engineering and Professor of Electrical Engineering at Princeton University, with an affiliated appointment in the Computer Science Department. She is the director of the Princeton Architecture Laboratory for Multimedia and Security (PALMS). Her current research is in designing security and new media support into core computer architecture and designing architectures resilient to Internet-scale epidemics. She is a Fellow of the ACM and the IEEE, Associate Editor-in-Chief of IEEE Micro and Editorial Board member of IEEE Security and Privacy. Prior to joining the Princeton faculty in 1998, Dr. Lee served as chief architect at Hewlett-Packard, responsible at different times for processor architecture, multimedia architecture and security architecture. She was a key architect of PA-RISC used in HP workstations and servers, and of multimedia instructions for microprocessors. She was Consulting Professor of Electrical Engineering at Stanford University. She received the Ph.D. degree in electrical engineering and the M.S. degree in computer science, both from Stanford University, and an A.B. with distinction from Cornell University, where she was a College Scholar. She has been granted over 115 United States and international patents.

Xiao Yang is a Ph.D. candidate in the Department of Electrical Engineering at Princeton University. His research area is computer architecture, with a special focus on high-performance, scalable architectures for 3D graphics processing. He received the M.S. degree in physics from Northwestern University and the B.S. degree in physics from Peking University, P.R. China.

Zhi-Jie Jerry Shi is an assistant professor in the Department of Computer Science and Engineering at the University of Connecticut. He received his Ph.D. degree in electrical engineering from Princeton University in 2004, and this work was done while he was a student at Princeton. He received his B.S. and M.S. degrees from the Computer Science and Technology Department at Tsinghua University, China, in 1992 and 1994, respectively. Dr. Shi is a member of the ACM and the IEEE. His research areas are in computer architecture, cryptography, and high performance, secure computer systems.

About this article

Cite this article

Lee, R.B., Yang, X. & Shi, ZJ.J. Single-Cycle Bit Permutations with MOMR Execution. J Comput Sci Technol 20, 577–585 (2005). https://doi.org/10.1007/s11390-005-0577-0

  • Revised:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-005-0577-0
