Abstract
Gram–Schmidt (GS) orthogonalization is a fundamental technique of linear algebra with widespread applications in state-of-the-art and next-generation signal processing and communication technologies, including Blind Source Separation, Independent Component Analysis, MIMO technology, Orthogonal Frequency Division Multiplexing, and QR Decomposition. The Coordinate Rotation Digital Computer (CORDIC), in turn, is a technique used extensively for the efficient implementation of complex arithmetic operations in various signal processing and communication modules. In the aforementioned applications, including FastICA and QR decomposition, CORDIC is widely used for every module except GS, which still relies on costly multiplier, divider, square root, and addition operations. This motivated us to investigate a CORDIC-based design for GS that yields a low-power, low-complexity architecture for the entire design. In this paper, we propose a CORDIC-based low-complexity, low-power architecture design methodology for the n-dimensional GS algorithm, in which a single CORDIC unit can be reused to implement several processing and communication modules on-chip. By using CORDIC recursively, the proposed architecture eliminates the need for additional arithmetic units to perform costly operations and thus significantly reduces hardware complexity. The proposed architecture reduces the power consumption by 74–86\(\%\) and the area by 12–40\(\%\) for 3D to 6D GS, respectively, compared with the conventional approach.
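To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the architecture proposed in the paper. It first shows a textbook classical GS routine, whose inner products, square root, and divisions are the costly operations the abstract refers to, and then a floating-point model of vectoring-mode CORDIC that estimates a 2D vector magnitude using only shift-like scalings and additions. The function names `gram_schmidt` and `cordic_magnitude` and the iteration count are illustrative assumptions.

```python
import math

def gram_schmidt(vectors):
    """Textbook classical Gram-Schmidt: orthonormalize linearly independent vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # Projection coefficient: one inner product (multiplies and adds).
            coeff = sum(vi * qi for vi, qi in zip(v, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        # Normalization: one square root and one division per component.
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis

def cordic_magnitude(x, y, iterations=16):
    """Floating-point model of vectoring-mode CORDIC: estimate sqrt(x^2 + y^2).

    Each micro-rotation drives y toward zero; in fixed-point hardware the
    scaling by 2**-i is a right shift, so only shifts and adds are needed.
    """
    x, y = abs(x), abs(y)  # fold into the first quadrant; magnitude is unchanged
    gain = 1.0
    for i in range(iterations):
        step = 2.0 ** -i
        if y >= 0:
            x, y = x + y * step, y - x * step
        else:
            x, y = x - y * step, y + x * step
        # Every micro-rotation stretches the vector by sqrt(1 + 2^(-2i)).
        gain *= math.sqrt(1.0 + step * step)
    # The accumulated gain depends only on the iteration count, so hardware
    # would typically apply a precomputed constant instead of this division.
    return x / gain

if __name__ == "__main__":
    for q in gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]):
        print([round(c, 4) for c in q])
    print(round(cordic_magnitude(3.0, 4.0), 4))
```

The magnitude of a longer vector can be built up by applying the 2D routine to successive pairs of values, which gives a rough sense of how a single rotation unit can be reused; how the paper actually schedules its CORDIC unit is not described in the abstract.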
Availability of Data and Material
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Acknowledgements
This work is partially supported by the “Indigenous Intelligent and Scalable Neuromorphic Multi Chip for AI Training and Inference Solutions” project, dated March 2021 and funded by the Ministry of Electronics and Information Technology (Govt. of India). CAD tools are supported under the MeitY SMDPC2S Program, Government of India.
Cite this article
Bhardwaj, S., Raghuraman, S., Yerrapragada, J.B. et al. Low-Complex and Low-Power n-dimensional Gram–Schmidt Orthogonalization Architecture Design Methodology. Circuits Syst Signal Process 41, 1633–1659 (2022). https://doi.org/10.1007/s00034-021-01852-0
DOI: https://doi.org/10.1007/s00034-021-01852-0