Scheduling of Iterative Algorithms with Matrix Operations for Efficient FPGA Design—Implementation of Finite Interval Constant Modulus Algorithm

Šůcha, Přemysl; Hanzálek, Zdeněk; Heřmánek, Antonín; Schier, Jan

doi:10.1007/s11265-006-0004-y

Published: 09 January 2007

Volume 46, pages 35–53, (2007)

The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology

Abstract

This paper deals with the optimization of iterative algorithms with matrix operations or nested loops for hardware implementation in Field Programmable Gate Arrays (FPGAs), using Integer Linear Programming (ILP). The method is demonstrated on an implementation of the Finite Interval Constant Modulus Algorithm, an equalization algorithm suitable for modern communication systems (4G and beyond). For the floating-point calculations required by the algorithm, two arithmetic libraries were used in the FPGA implementation: one based on the logarithmic number system, the other using the floating-point number system in the standard IEEE format. Both libraries use pipelined modules. Traditional approaches to the scheduling of nested loops lead to relatively large code, which is unsuitable for FPGA implementation. This paper presents a new high-level synthesis methodology that models both iterative loops and imperfectly nested loops by means of a system of linear inequalities. Moreover, memory access is considered as an additional resource constraint. Since solving ILP-formulated problems is known to be computationally intensive, an important part of the article is devoted to reducing the problem size.
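To make the idea of modelling an iterative loop by linear inequalities concrete, the following is a minimal sketch and not the formulation used in the article. It assumes a common cyclic-scheduling constraint, s_j - s_i >= latency - period * height, where height counts how many iterations a dependence spans, and a single pipelined unit that accepts one operation per clock cycle. The names `is_feasible` and `shortest_period`, the three-operation graph, and the bounded search window are all hypothetical. An exhaustive search stands in for an ILP solver here.

```python
from itertools import product


def is_feasible(starts, edges, period):
    """Check a candidate cyclic schedule against the assumed constraints.

    starts : start time of each operation within iteration 0
    edges  : (i, j, latency, height) tuples; height is the number of
             iterations the dependence from i to j spans (assumed model)
    period : number of clock cycles between successive iterations
    """
    # Precedence: one linear inequality per edge of the data-flow graph.
    for i, j, latency, height in edges:
        if starts[j] - starts[i] < latency - period * height:
            return False
    # Resource: one pipelined unit issues at most one operation per cycle,
    # so issue slots must be distinct modulo the period.
    slots = [s % period for s in starts]
    return len(set(slots)) == len(slots)


def shortest_period(n_ops, edges, max_period=16):
    """Brute-force the smallest feasible period (stand-in for an ILP solver)."""
    # Assumed search window: start times up to the sum of all latencies.
    horizon = sum(latency for _, _, latency, _ in edges) + 1
    # With one issue per cycle, the period cannot be below the operation count.
    for period in range(n_ops, max_period + 1):
        for starts in product(range(horizon), repeat=n_ops):
            if is_feasible(starts, edges, period):
                return period, starts
    return None


# Hypothetical loop body: op0 -> op1 -> op2, and op2 feeds op0 of the
# next iteration (height 1). Every result takes 2 cycles to appear.
edges = [(0, 1, 2, 0), (1, 2, 2, 0), (2, 0, 2, 1)]
print(shortest_period(3, edges))  # -> (6, (0, 2, 4))
```

In this toy graph the loop-carried cycle has a total latency of 6 and spans one iteration, so no period shorter than 6 can satisfy the inequalities. An ILP formulation would encode the same constraints with integer variables instead of enumerating start times.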

Author information

Authors and Affiliations

Centre for Applied Cybernetics, Department of Control Engineering, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
Přemysl Šůcha & Zdeněk Hanzálek
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, Czech Republic
Antonín Heřmánek & Jan Schier

Corresponding author

Correspondence to Přemysl Šůcha.

About this article

Cite this article

Šůcha, P., Hanzálek, Z., Heřmánek, A. et al. Scheduling of Iterative Algorithms with Matrix Operations for Efficient FPGA Design—Implementation of Finite Interval Constant Modulus Algorithm. J VLSI Sign Process Syst Sign Image Video Technol 46, 35–53 (2007). https://doi.org/10.1007/s11265-006-0004-y

Received: 17 February 2006
Accepted: 15 August 2006
Published: 09 January 2007
Issue Date: January 2007
DOI: https://doi.org/10.1007/s11265-006-0004-y
