Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis

Journal of Signal Processing Systems

Abstract

Multimedia applications such as video and image processing are computation-intensive. In these applications, the word-length of data and instructions varies throughout the application. Generating hardware architectures is therefore not straightforward: it requires a thorough word-length analysis to determine which hardware resources are needed. In this paper we propose an automated design methodology based on high-level synthesis that accounts for data word-length and interconnection resource cost in order to generate area- and power-efficient fixed-point architectures for DSP applications. Both ASIC and FPGA technologies are targeted. Experimental results show that the proposed approach reduces area by 6% to 42% on FPGA technology and by 9% to 48% on ASIC compared to previous approaches. Power savings reach up to 44% on FPGA technology and 36% on ASIC.

Notes

  1. FPGA devices in particular.

  2. http://www.enseirb.fr/~legal/wp_graphlab

  3. At the hardware design level, when a data item is narrower than the resource that processes it, the data must be expanded to the resource width. In practice, the standard VHDL “resize” function (IEEE.NUMERIC_STD package) is used to automate this expansion during the hardware generation step of the design flow (Fig. 7).

  4. In that case, no re-scheduling is performed, so that a solution is obtained in a short run-time. The set of resources is enlarged, and the scheduling of the subsequent operations benefits from this update.

  5. Because area is not always linear in the data width (a multiplier, for example; see Tables 1 and 2), we use polynomial models to compute resource area. These models are derived from our characterization of the target platforms.

  6. In practice, it also reduces the actual interconnection cost.

  7. To allow a meaningful comparison of resource usage on FPGA targets, we force multipliers to be implemented using logic elements rather than DSP blocks.

  8. Depending on the values of the coefficients Cu and Cv.

  9. Considered separately, profiles 1, 2 and 3 require (in the worst case) resources of 24, 32 and 26 bits, respectively.
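
The data expansion described in note 3 can be illustrated with a minimal sketch. It is written in Python rather than VHDL and is not the authors' generated code; the function names `resize_signed`, `resize_unsigned` and `to_signed` are hypothetical helpers introduced here. The sketch assumes the usual behaviour of the NUMERIC_STD "resize" function when a value is widened: signed values are sign-extended and unsigned values are zero-extended. Truncation is deliberately left out.

```python
def resize_unsigned(pattern: int, old_width: int, new_width: int) -> int:
    """Zero-extend an old_width-bit pattern to new_width bits (expansion only)."""
    if new_width < old_width:
        raise ValueError("this sketch only covers expansion")
    # Upper bits are filled with zeros, so the bit pattern is unchanged.
    return pattern & ((1 << old_width) - 1)


def resize_signed(pattern: int, old_width: int, new_width: int) -> int:
    """Sign-extend an old_width-bit two's-complement pattern to new_width bits."""
    if new_width < old_width:
        raise ValueError("this sketch only covers expansion")
    mask_old = (1 << old_width) - 1
    pattern &= mask_old
    sign_bit = pattern >> (old_width - 1)
    if sign_bit:
        # Replicate the sign bit into every newly added upper bit.
        pattern |= ((1 << new_width) - 1) & ~mask_old
    return pattern


def to_signed(pattern: int, width: int) -> int:
    """Interpret a width-bit pattern as a two's-complement integer."""
    return pattern - (1 << width) if pattern >> (width - 1) else pattern


# A 4-bit operand (-5) fed to a hypothetical 8-bit resource.
narrow = 0b1011
wide = resize_signed(narrow, 4, 8)
print(format(wide, "08b"), to_signed(wide, 8))
```

The numerical value is preserved by the expansion, which is what allows a narrow operand to be bound to a wider shared resource without changing the result.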

Author information

Correspondence to Bertrand Le Gal.

About this article

Le Gal, B., Casseau, E. Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis. J Sign Process Syst 62, 341–357 (2011). https://doi.org/10.1007/s11265-010-0467-8

  • DOI: https://doi.org/10.1007/s11265-010-0467-8
