Abstract
Field-programmable gate arrays (FPGAs) are an attractive hardware platform for computing the eigenvalue decomposition of low-dimensional symmetric matrices. A popular method for this task is the parallel Jacobi algorithm based on the coordinate rotation digital computer (CORDIC). We present a novel, efficient FPGA architecture for computing the eigenvalue decomposition. Its main idea stems from the fact that the rotation matrices in Jacobi’s method belong to a class of special sparse matrices. Exploiting this property, the matrix multiplications in the parallel Jacobi algorithm can be performed efficiently on an FPGA. In addition, we provide a solution that allows Jacobi’s method to decompose complex Hermitian matrices. Our proposed design is then compared with state-of-the-art designs on a Xilinx XC7V690T FPGA. Finally, because of its stringent real-time requirements, we take subspace-based direction-of-arrival (DOA) estimation in wireless communication as an application example.
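The sparsity property mentioned above can be illustrated in software. The following is a minimal sketch, not the authors' FPGA architecture: it assumes a real symmetric input, uses a sequential cyclic sweep rather than a parallel ordering, and computes the rotation with floating-point trigonometric functions instead of CORDIC. The function names `jacobi_rotate` and `jacobi_eig` are hypothetical. Because a Jacobi rotation matrix differs from the identity in only four entries, each multiplication touches only two rows or two columns.

```python
import math

def jacobi_rotate(A, V, p, q):
    """Zero A[p][q] with one Jacobi rotation J, updating A <- J^T A J and V <- V J.

    J equals the identity except J[p][p] = J[q][q] = c, J[p][q] = s, J[q][p] = -s,
    so only rows/columns p and q of A (and columns p and q of V) are modified.
    """
    if A[p][q] == 0.0:
        return
    # Choose theta with tan(2*theta) = 2*a_pq / (a_qq - a_pp), |theta| <= pi/4.
    x, y = A[q][q] - A[p][p], 2.0 * A[p][q]
    if x < 0.0:
        x, y = -x, -y
    theta = 0.5 * math.atan2(y, x)
    c, s = math.cos(theta), math.sin(theta)
    n = len(A)
    for k in range(n):  # A <- A J: only columns p and q change
        akp, akq = A[k][p], A[k][q]
        A[k][p] = c * akp - s * akq
        A[k][q] = s * akp + c * akq
    for k in range(n):  # A <- J^T A: only rows p and q change
        apk, aqk = A[p][k], A[q][k]
        A[p][k] = c * apk - s * aqk
        A[q][k] = s * apk + c * aqk
    for k in range(n):  # V <- V J: accumulate eigenvectors as columns of V
        vkp, vkq = V[k][p], V[k][q]
        V[k][p] = c * vkp - s * vkq
        V[k][q] = s * vkp + c * vkq

def jacobi_eig(A, sweeps=10):
    """Return (eigenvalues, V) of a real symmetric matrix given as a list of lists."""
    n = len(A)
    A = [list(map(float, row)) for row in A]  # work on a copy
    V = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):  # fixed sweep count is an assumption of this sketch
        for p in range(n - 1):
            for q in range(p + 1, n):
                jacobi_rotate(A, V, p, q)
    return [A[i][i] for i in range(n)], V
```

Each rotation costs O(n) operations instead of the O(n^3) of a dense matrix product. In a parallel ordering, rotations on disjoint index pairs (p, q) modify disjoint rows and columns, which is what makes the method suitable for concurrent hardware.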
Data Availability
We provide a MATLAB demo that illustrates the performance of the method proposed in this paper. The demo is available to readers and can also be obtained from the corresponding author on request.
Acknowledgements
The authors acknowledge the support of the 2020 science research project of the Department of Transport of Shaanxi Province, "Research and application of refined maintenance evaluation and decay model based on 3D pavement" (No. 20-24K).
About this article
Cite this article
Yan, D., Wang, WX. & Zhang, XW. High-Performance Matrix Eigenvalue Decomposition Using the Parallel Jacobi Algorithm on FPGA. Circuits Syst Signal Process 42, 1573–1592 (2023). https://doi.org/10.1007/s00034-022-02180-7
DOI: https://doi.org/10.1007/s00034-022-02180-7