Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis

Le Gal, Bertrand; Casseau, Emmanuel

doi:10.1007/s11265-010-0467-8

Published: 07 April 2010

Volume 62, pages 341–357, (2011)

Journal of Signal Processing Systems

Abstract

Multimedia applications such as video and image processing are often computation-intensive. In these applications, the word-length of data and instructions varies throughout the application. Generating hardware architectures for them is not straightforward, since it requires a thorough word-length analysis to determine which hardware resources are needed. In this paper we propose an automated design methodology based on high-level synthesis that accounts for data word-length and interconnection resource cost in order to generate area- and power-efficient fixed-point architectures for DSP applications. Both ASIC and FPGA technologies are targeted. Experimental results show that the proposed approach reduces area by 6% to 42% on FPGA technology and by 9% to 48% on ASIC compared to previous approaches. Power savings reach up to 44% on FPGA technology and 36% on ASIC.

Notes

FPGA devices in particular.
http://www.enseirb.fr/∼legal/wp_graphlab
At the hardware design level, when a data item is narrower than the resource that processes it, data expansion is required. In practice, the VHDL standard function “resize” (package IEEE.NUMERIC_STD) is used to automate this expansion during the hardware generation step of the design flow (Fig. 7).
In that case, no rescheduling is performed, so that a solution is obtained in a short run-time. The set of resources is enlarged, and the scheduling of the subsequent operations benefits from this update.
Because area is not always linear in the data width (a multiplier, for example; see Tables 1 and 2), we use polynomial models to compute resource area. These models come from our characterization of the platforms we target.
In practice, it also reduces the actual interconnection cost.
For a comprehensive comparison of resource usage, on FPGA targets we force multipliers to be implemented with logic elements rather than DSP blocks.
Depending on the values of the coefficients C_u and C_v.
Considered separately, profiles 1, 2 and 3 require (in the worst case) resources of 24, 32 and 26 bits respectively.
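
Two of the notes above lend themselves to a small illustration: the data expansion applied when an operand is narrower than its resource, and the polynomial area models. The sketch below is a Python stand-in, not the authors' VHDL flow or their characterized models. It assumes signed two's-complement data, covers widening only, and uses made-up coefficients; the names resize_signed, to_int, poly_area, ADDER_COEFFS and MULT_COEFFS are hypothetical.

```python
from typing import Sequence


def resize_signed(pattern: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend a two's-complement bit pattern from from_bits to to_bits.

    Mimics only the widening case of a signed resize; truncation is
    deliberately left out of this sketch.
    """
    if from_bits < 1 or to_bits < from_bits:
        raise ValueError("this sketch only covers expansion")
    low_mask = (1 << from_bits) - 1
    pattern &= low_mask
    if pattern >> (from_bits - 1):
        # Negative value: replicate the sign bit into the new upper bits.
        pattern |= ((1 << to_bits) - 1) & ~low_mask
    return pattern


def to_int(pattern: int, bits: int) -> int:
    """Interpret a bit pattern of the given width as a signed integer."""
    return pattern - (1 << bits) if pattern >> (bits - 1) else pattern


def poly_area(width: int, coeffs: Sequence[float]) -> float:
    """Evaluate c0 + c1*w + c2*w**2 + ... with Horner's rule."""
    area = 0.0
    for c in reversed(coeffs):
        area = area * width + c
    return area


# Hypothetical coefficients (assumption): a purely linear adder model
# and a purely quadratic multiplier model, in arbitrary area units.
ADDER_COEFFS = (0.0, 1.0)
MULT_COEFFS = (0.0, 0.0, 1.0)

if __name__ == "__main__":
    wide = resize_signed(0b1011, 4, 8)  # 4-bit operand on an 8-bit resource
    print(format(wide, "08b"), to_int(wide, 8))
    for w in (8, 16):
        print(w, poly_area(w, ADDER_COEFFS), poly_area(w, MULT_COEFFS))
```

With these illustrative coefficients, doubling the width doubles the adder estimate but quadruples the multiplier estimate, which is the kind of non-linearity a polynomial model is meant to capture.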

Author information

Authors and Affiliations

IMS Laboratory, CNRS - UMR 5218, Bordeaux Polytechnic Institute, University of Bordeaux, Talence, France
Bertrand Le Gal
French National Institute for Research in Computer Science and Control, INRIA/IRISA, Lannion, France
Emmanuel Casseau

Corresponding author

Correspondence to Bertrand Le Gal.

About this article

Cite this article

Le Gal, B., Casseau, E. Word-Length Aware DSP Hardware Design Flow Based on High-Level Synthesis. J Sign Process Syst 62, 341–357 (2011). https://doi.org/10.1007/s11265-010-0467-8

Received: 05 October 2009
Revised: 05 October 2009
Accepted: 01 March 2010
Published: 07 April 2010
Issue Date: March 2011
DOI: https://doi.org/10.1007/s11265-010-0467-8
