Low-Complex and Low-Power n-dimensional Gram–Schmidt Orthogonalization Architecture Design Methodology

Circuits, Systems, and Signal Processing

Abstract

Gram–Schmidt (GS) orthogonalization is a fundamental technique of linear algebra with widespread applications in state-of-the-art and next-generation signal processing and communication technologies, including Blind Source Separation, Independent Component Analysis, MIMO technology, Orthogonal Frequency Division Multiplexing, and QR Decomposition. The Coordinate Rotation Digital Computer (CORDIC), in turn, is used extensively for the efficient implementation of complex arithmetic operations in various signal processing and communication modules. In the aforementioned applications, including FastICA and QR decomposition, CORDIC is widely used for every module except GS, which still relies on costly multipliers, dividers, square-root units, and adders. This motivated us to investigate a CORDIC-based design for GS, yielding a low-power, low-complexity architecture for the entire design. In this paper, we propose a CORDIC-based low-complexity, low-power architecture design methodology for the n-dimensional GS algorithm, in which a single CORDIC unit can be reused to implement several processing and communication modules on-chip. By using CORDIC recursively, the proposed architecture precludes the need for additional arithmetic units to perform costly operations and thus significantly reduces hardware complexity. Over the conventional approach, the proposed architecture reduces power consumption by 74–86% and area by 12–40% for 3D to 6D GS, respectively.
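
To make the idea of recursive CORDIC use concrete, the following is a minimal software sketch, not the architecture proposed in the paper (the full text is not reproduced here). It assumes the standard circular CORDIC vectoring mode, in which shift-and-add micro-rotations drive the y component to zero so that x converges to the scaled magnitude, and it chains that 2-D operation to obtain an n-dimensional norm without a square-root unit. The names `cordic_magnitude`, `cordic_norm`, `gram_schmidt`, and the iteration count `N_ITER` are illustrative assumptions. For brevity the projections and the final division use ordinary floating-point arithmetic, whereas a hardware design of the kind the abstract describes would also map those onto CORDIC.

```python
import math

N_ITER = 24  # assumed number of CORDIC micro-rotations
# Constant gain accumulated by N_ITER circular CORDIC micro-rotations.
GAIN = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(N_ITER))


def cordic_magnitude(x, y):
    """Return sqrt(x^2 + y^2) using CORDIC vectoring (shift-and-add) steps."""
    x = abs(x)  # keep the vector in the right half-plane so the iteration converges
    for i in range(N_ITER):
        shift = 2.0 ** -i  # a plain bit shift in fixed-point hardware
        if y >= 0:
            x, y = x + y * shift, y - x * shift  # rotate clockwise
        else:
            x, y = x - y * shift, y + x * shift  # rotate counter-clockwise
    return x / GAIN  # undo the constant CORDIC gain


def cordic_norm(v):
    """n-D Euclidean norm obtained by chaining 2-D CORDIC magnitudes."""
    r = 0.0
    for component in v:
        r = cordic_magnitude(r, component)
    return r


def gram_schmidt(vectors):
    """Modified Gram-Schmidt; only the norm is computed with CORDIC here."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            proj = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - proj * qi for wi, qi in zip(w, q)]  # remove the q component
        norm = cordic_norm(w)
        basis.append([wi / norm for wi in w])  # assumes linearly independent inputs
    return basis


if __name__ == "__main__":
    for q in gram_schmidt([[3.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 1.0, 1.0]]):
        print([round(c, 6) for c in q])
```

The chaining in `cordic_norm` relies on the identity that the norm of (v1, ..., vk) equals the magnitude of the 2-D vector formed by the norm of (v1, ..., vk-1) and vk, so one 2-D unit can be reused for any dimension.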

Availability of Data and Material

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgements

This work is partially supported by the project “Indigenous Intelligent and Scalable Neuromorphic Multi Chip for AI Training and Inference Solutions”, dated March 2021 and funded by the Ministry of Electronics and Information Technology (Government of India). CAD tools are supported under the MeitY SMDP-C2S Program, Government of India.

Author information

Corresponding author

Correspondence to Amit Acharyya.

About this article

Cite this article

Bhardwaj, S., Raghuraman, S., Yerrapragada, J.B. et al. Low-Complex and Low-Power n-dimensional Gram–Schmidt Orthogonalization Architecture Design Methodology. Circuits Syst Signal Process 41, 1633–1659 (2022). https://doi.org/10.1007/s00034-021-01852-0

  • DOI: https://doi.org/10.1007/s00034-021-01852-0
