An Efficient Scalable RNS Architecture for Large Dynamic Ranges

Matutino, Pedro Miguens; Chaves, Ricardo; Sousa, Leonel

doi:10.1007/s11265-014-0875-2

Published: 13 March 2014

Volume 77, pages 191–205, (2014)

Journal of Signal Processing Systems

Pedro Miguens Matutino¹,
Ricardo Chaves² &
Leonel Sousa²

Abstract

This paper proposes an efficient, scalable Residue Number System (RNS) architecture that supports moduli sets with an arbitrary number of channels, enabling a larger dynamic range and a higher level of parallelism. The proposed architecture performs both forward and reverse RNS conversion by reusing the arithmetic channel units. The arithmetic operations supported at the channel level are addition, subtraction, and multiplication with accumulation capability. Two algorithms are considered for the reverse conversion, one based on the Chinese Remainder Theorem (CRT) and the other on Mixed-Radix Conversion (MRC), leading to implementations optimized for delay and for required circuit area. The proposed architecture yields a complete and compact RNS platform. Experimental results suggest a 17% reduction in the delay of the arithmetic operations and a 23% reduction in area relative to the RNS state of the art. Compared with a binary system, the proposed architecture performs the same computation 20 times faster while using only 10% of the circuit area.
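To make the operations named in the abstract concrete, the following is a minimal software sketch of forward conversion, a per-channel multiply-accumulate, and reverse conversion by both CRT and MRC. It is not the hardware architecture proposed in the paper. The moduli set {15, 16, 17} (the form {2ⁿ − 1, 2ⁿ, 2ⁿ + 1} with n = 4) and all function names are assumptions chosen for illustration only.

```python
from math import prod

# Illustrative moduli set (assumption): {2^n - 1, 2^n, 2^n + 1} with n = 4.
# The moduli are pairwise coprime, so the dynamic range is M = 15 * 16 * 17 = 4080.
MODULI = (15, 16, 17)


def to_rns(x, moduli=MODULI):
    """Forward conversion: one residue per channel."""
    return tuple(x % m for m in moduli)


def rns_mac(acc, a, b, moduli=MODULI):
    """Multiply-accumulate, computed independently in every channel."""
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))


def from_rns_crt(residues, moduli=MODULI):
    """Reverse conversion with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                      # product of all other moduli
        total += r * Mi * pow(Mi, -1, m) # pow(.., -1, m) is the modular inverse
    return total % M


def from_rns_mrc(residues, moduli=MODULI):
    """Reverse conversion with Mixed-Radix Conversion."""
    rs = list(residues)
    digits = []
    for i, mi in enumerate(moduli):
        d = rs[i] % mi
        digits.append(d)
        # Remove the digit and divide by mi in every remaining channel.
        for j in range(i + 1, len(moduli)):
            mj = moduli[j]
            rs[j] = ((rs[j] - d) * pow(mi, -1, mj)) % mj
    # Recombine: x = d0 + d1*m0 + d2*m0*m1 + ...
    x, weight = 0, 1
    for d, m in zip(digits, moduli):
        x += d * weight
        weight *= m
    return x


acc = rns_mac(to_rns(0), to_rns(1234), to_rns(3))
print(from_rns_crt(acc), from_rns_mrc(acc))
```

In this sketch the CRT path needs one large reduction modulo M, whereas the MRC path uses only small per-channel modular operations applied sequentially. This is the general trade-off between the two algorithms, and it is consistent with the abstract's remark that they lead to implementations optimized for delay and for area. Python 3.8 or later is assumed for `math.prod` and the three-argument `pow` with a negative exponent.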

Author information

Authors and Affiliations

ISEL / INESC-ID / IST, Universidade de Lisboa, Lisbon, Portugal
Pedro Miguens Matutino
INESC-ID / IST, Universidade de Lisboa, Lisbon, Portugal
Ricardo Chaves & Leonel Sousa

Corresponding author

Correspondence to Pedro Miguens Matutino.

Additional information

This work was partially supported by national funds through Fundação para a Ciência e a Tecnologia (FCT) under project PEst-OE/EEI/LA0021/2013, project “FARNuSyC - Framework for Automatic RNS-Based Computation” (reference number EXPL/EEI-ELC/1572/2013), and by the PROTEC Program funds under the research grant SFRH/PROTEC/49763/2009.

About this article

Cite this article

Matutino, P.M., Chaves, R. & Sousa, L. An Efficient Scalable RNS Architecture for Large Dynamic Ranges. J Sign Process Syst 77, 191–205 (2014). https://doi.org/10.1007/s11265-014-0875-2

Received: 09 September 2013
Revised: 16 January 2014
Accepted: 06 February 2014
Published: 13 March 2014
Issue Date: October 2014
DOI: https://doi.org/10.1007/s11265-014-0875-2
