
Scheduling of Iterative Algorithms with Matrix Operations for Efficient FPGA Design—Implementation of Finite Interval Constant Modulus Algorithm

The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology

Abstract

This paper deals with the optimization of iterative algorithms with matrix operations or nested loops for hardware implementation in Field Programmable Gate Arrays (FPGA), using Integer Linear Programming (ILP). The method is demonstrated on an implementation of the Finite Interval Constant Modulus Algorithm, an equalization algorithm suitable for modern communication systems (4G and beyond). For the floating-point calculations required by the algorithm, two arithmetic libraries were used in the FPGA implementation: one based on the logarithmic number system, the other using the floating-point number system in the standard IEEE format. Both libraries use pipelined modules. Traditional approaches to the scheduling of nested loops lead to relatively large code, which is unsuitable for FPGA implementation. This paper presents a new high-level synthesis methodology that models both iterative loops and imperfectly nested loops by means of a system of linear inequalities. Moreover, memory access is considered as an additional resource constraint. Since solving ILP-formulated problems is known to be computationally intensive, an important part of the article is devoted to reducing the problem size.
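As an illustration of how an iterative loop can be described by linear inequalities, the sketch below checks a candidate cyclic schedule against precedence and resource constraints. It is a minimal sketch in one form common in the cyclic scheduling literature, not the formulation from the paper itself: the inequality s_j - s_i >= latency - period * distance, the toy graph, the unit assignment, and all function names are assumptions made for this example.

```python
# Hypothetical toy data-flow graph of one loop body (not taken from the paper).
# An edge (i, j, latency, distance) means: operation j in iteration
# k + distance may start no earlier than `latency` cycles after
# operation i of iteration k has started.
EDGES = [
    ("mul", "add", 3, 0),  # same-iteration dependence
    ("add", "mul", 2, 1),  # loop-carried dependence (next iteration)
]


def satisfies_precedence(start, period, edges):
    """Check s_j - s_i >= latency - period * distance for every edge."""
    return all(
        start[j] - start[i] >= latency - period * distance
        for i, j, latency, distance in edges
    )


def resource_conflict_free(start, period, unit_of):
    """Check that no two operations enter the same unit in the same slot.

    Assumption: each pipelined unit accepts one new operation per clock
    cycle, so two operations sharing a unit must differ in their start
    time modulo the period.
    """
    seen = set()
    for op, s in start.items():
        slot = (unit_of[op], s % period)
        if slot in seen:
            return False
        seen.add(slot)
    return True


if __name__ == "__main__":
    schedule = {"mul": 0, "add": 3}
    units = {"mul": "fpu", "add": "fpu"}  # hypothetical shared unit
    for period in (4, 5):
        ok = satisfies_precedence(schedule, period, EDGES) and \
            resource_conflict_free(schedule, period, units)
        print(period, ok)
```

In an ILP model the start times and the unit slots become integer variables and the same inequalities become constraints; here they are only evaluated for a fixed schedule, which keeps the example free of any solver dependency.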




Author information

Correspondence to Přemysl Šůcha.

About this article

Cite this article

Šůcha, P., Hanzálek, Z., Heřmánek, A. et al. Scheduling of Iterative Algorithms with Matrix Operations for Efficient FPGA Design—Implementation of Finite Interval Constant Modulus Algorithm. J VLSI Sign Process Syst Sign Image Video Technol 46, 35–53 (2007). https://doi.org/10.1007/s11265-006-0004-y

  • DOI: https://doi.org/10.1007/s11265-006-0004-y
