Handling Global Constraints in Compiler Strategy

International Journal of Parallel Programming

Abstract

To achieve high performance on processors featuring instruction-level parallelism (ILP), most compilers apply a set of heuristics locally. This can yield high performance on individual code fragments. Unfortunately, most optimizations also increase code size, which may lead to a net performance loss for the program as a whole. In this paper, we propose a Global Constraints-Driven Strategy (GCDS) for guiding code optimization. With GCDS, the final code optimization decision is made according to global rather than local criteria, such as performance, code size, or instruction cache behavior. The trade-off between performance and code size is particularly important for embedded systems. We show how GCDS can be used to control code size while optimizing performance.
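The difference between local and global decisions can be illustrated with a small sketch. It is not the paper's formulation, which this preview does not show. It assumes each code fragment comes with a few candidate variants, each with an estimated cycle count and code size, and it picks one variant per fragment so that total cycles are minimal under a global size budget. The function name `choose_variants`, the variant names, and all numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (variant name, estimated cycles, code size in bytes); all values are made up.
Variant = Tuple[str, int, int]


def choose_variants(
    fragments: Dict[str, List[Variant]], size_budget: int
) -> Optional[Tuple[int, int, Dict[str, str]]]:
    """Pick one variant per fragment, minimizing total estimated cycles
    while keeping the total code size within size_budget.

    Returns (total_cycles, total_size, {fragment: variant}) or None
    when no combination fits the budget.
    """
    # best maps an exact total size to the cheapest (cycles, choices) seen so far.
    best: Dict[int, Tuple[int, Dict[str, str]]] = {0: (0, {})}
    for frag, variants in fragments.items():
        nxt: Dict[int, Tuple[int, Dict[str, str]]] = {}
        for size, (cycles, picks) in best.items():
            for name, v_cycles, v_size in variants:
                new_size = size + v_size
                if new_size > size_budget:
                    continue  # this combination violates the global constraint
                cand = (cycles + v_cycles, {**picks, frag: name})
                if new_size not in nxt or cand[0] < nxt[new_size][0]:
                    nxt[new_size] = cand
        best = nxt
    if not best:
        return None
    # Fewest cycles wins; ties are broken in favor of smaller code.
    size, (cycles, picks) = min(best.items(), key=lambda kv: (kv[1][0], kv[0]))
    return cycles, size, picks


fragments = {
    "loop_a": [("baseline", 1000, 40), ("unroll4", 700, 120)],
    "loop_b": [("baseline", 800, 30), ("unroll4", 650, 100)],
    "loop_c": [("baseline", 500, 20), ("unroll4", 300, 90)],
}
result = choose_variants(fragments, size_budget=250)
print(result)
```

A purely local strategy would unroll all three loops, because each unrolled variant is faster in isolation, and would end up at 310 bytes. Under the 250-byte budget, the global choice unrolls only `loop_a` and `loop_c`, where the cycle savings are largest, and leaves `loop_b` at its baseline.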


Cite this article

Rohou, E., Bodin, F., Eisenbeis, C. et al. Handling Global Constraints in Compiler Strategy. International Journal of Parallel Programming 28, 325–345 (2000). https://doi.org/10.1023/A:1007502921104
