
High-Performance System-on-Chip-Based Accelerator System for Polynomial Matrix Multiplications

Circuits, Systems, and Signal Processing

Abstract

Polynomial matrix computations, such as polynomial matrix multiplication (PMM) and eigenvalue factorization of parahermitian matrices, have come to play an important role in a growing number of applications. However, the computational complexity and cost of such operations severely limit their applicability. In a recent paper, we introduced a systolic array-based parallel architecture for PMM, which was reasonably efficient but limited in its application. In this paper, we propose a second-generation hardware solution that offers greater versatility, efficiency and scalability than our previous design. This is achieved through the design of a highly versatile PMM accelerator, which supports polynomial matrices of any size, as a component of the embedded system developed within the Xilinx Zynq-7000 AP SoC. Experimental results demonstrate the efficiency and effectiveness of our novel SoC-based PMM accelerator in the context of subband coding, where maximum speedups of \(85\times\) and \(33\times\) are achieved, without compromising accuracy, in comparison with two highly optimized, multi-threaded software-only implementations running on a dual-core ARM Cortex-A9 processor and an Intel Core i7-4510U CPU, respectively.
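
For readers unfamiliar with the operation, the following is a minimal software sketch of what PMM computes. It is not the accelerator architecture described in the paper. It assumes each polynomial matrix is stored as a list of coefficient matrices, one per power of \(z^{-1}\), so that the product coefficients are given by the matrix convolution \(C[t] = \sum_k A[k]\,B[t-k]\). The function name `pmm` and the storage layout are illustrative assumptions.

```python
def pmm(A, B):
    """Multiply two polynomial matrices by direct convolution.

    A[k] is the (p x q) coefficient matrix of z^-k in A(z);
    B[k] is the (q x r) coefficient matrix of z^-k in B(z).
    Returns C with C[t] = sum_k A[k] * B[t - k] (hypothetical layout).
    """
    p, q, r = len(A[0]), len(B[0]), len(B[0][0])
    assert all(len(row) == q for Ak in A for row in Ak), "inner dimensions must agree"
    # The product has len(A) + len(B) - 1 coefficient matrices.
    C = [[[0] * r for _ in range(p)] for _ in range(len(A) + len(B) - 1)]
    for i, Ai in enumerate(A):
        for j, Bj in enumerate(B):
            # Ordinary matrix product of two coefficients, accumulated at lag i + j.
            for m in range(p):
                for n in range(r):
                    C[i + j][m][n] += sum(Ai[m][k] * Bj[k][n] for k in range(q))
    return C


# A(z) = I + z^-1 [[0, 1], [0, 0]],  B(z) = [[1, 2], [3, 4]] + z^-1 I
A = [[[1, 0], [0, 1]], [[0, 1], [0, 0]]]
B = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
C = pmm(A, B)
```

The nested loops make the cost visible: every pair of coefficient matrices requires a full matrix product, which is the kind of workload the abstract describes as limiting in software.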




Correspondence to Server Kasap.


Cite this article

Kasap, S., Redif, S. High-Performance System-on-Chip-Based Accelerator System for Polynomial Matrix Multiplications. Circuits Syst Signal Process 38, 5755–5785 (2019). https://doi.org/10.1007/s00034-019-01150-w


  • DOI: https://doi.org/10.1007/s00034-019-01150-w
