The complementary relationship of interprocedural register allocation and inlining

International Journal of Parallel Programming

Abstract

Inline expansion and interprocedural register allocation are two general approaches to interprocedural optimization. However, certain situations prevent either of them from being applied smoothly to procedure calls. In particular, interactions between inlining and register allocation can cause an inlined version of a program to run more slowly than its noninlined counterpart. This paper describes a method of integrating inlining and interprocedural register allocation to reduce procedure call overhead without this negative effect. We use profile information to identify the heavily called procedure regions, and the register usage information of each code site to optimize the placement of the register save/restore code. This method also takes full advantage of free-use registers at each procedure call site. The average performance improvement is 1.21 compared with previous schemes that performed either optimization independently.
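
The ideas named in the abstract (profile counts, per-site register usage, save/restore placement, free-use registers) can be illustrated with a small sketch. This is not the paper's algorithm, which is not given on this page. It is a minimal model under stated assumptions: a "code site" is treated as a call site, a free-use register is taken to mean one that is not live across the call in the caller, and all names (`CallSite`, `save_set`, `hot_sites`, the register names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallSite:
    """Hypothetical record for one call site (not taken from the paper)."""
    caller: str
    callee: str
    count: int              # profiled execution count of this call site
    live_across: frozenset  # caller registers whose values are live across the call


def free_use_registers(site, all_regs):
    """Registers the callee could clobber at this site without a save/restore."""
    return all_regs - site.live_across


def save_set(site, callee_uses):
    """Registers needing save/restore here: live in the caller AND written by the callee."""
    return site.live_across & callee_uses[site.callee]


def weighted_save_cost(sites, callee_uses):
    """Dynamic number of save/restore pairs, weighted by profile counts."""
    return sum(s.count * len(save_set(s, callee_uses)) for s in sites)


def hot_sites(sites, fraction=0.9):
    """Most frequently executed sites that together cover `fraction` of all calls."""
    total = sum(s.count for s in sites)
    picked, covered = [], 0
    for s in sorted(sites, key=lambda s: s.count, reverse=True):
        if covered >= fraction * total:
            break
        picked.append(s)
        covered += s.count
    return picked


# Illustrative data only; the register file and counts are made up.
ALL_REGS = frozenset({"r1", "r2", "r3", "r4"})
CALLEE_USES = {"f": frozenset({"r2", "r3"}), "g": frozenset({"r3", "r4"})}
SITES = [
    CallSite("main", "f", 1000, frozenset({"r1", "r2"})),
    CallSite("main", "g", 10, frozenset({"r3"})),
]
```

In this toy model, the call to `f` dominates the profile, so it is the site where reducing the save set (for example, by steering the callee toward the free-use registers `r3` and `r4`) would pay off most. How the paper actually combines this with inlining decisions is not described in the abstract.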




Cite this article

Lai, F., Chao, Yk. & Hsieh, CJ. The complementary relationship of interprocedural register allocation and inlining. Int J Parallel Prog 22, 409–434 (1994). https://doi.org/10.1007/BF02577739


  • DOI: https://doi.org/10.1007/BF02577739
