
Data locality optimization of interference graphs based on polyhedral computations

The Journal of Supercomputing

Abstract

Achieving high performance on modern architectures requires effective use of the memory hierarchy. Two kinds of compiler-directed technique transform a program to improve its locality: loop transformations, which are constrained by data dependences, and data layout transformations, which have a global impact on program locality. Because each technique has its own drawback, the two must be unified to obtain the benefits of both. This paper presents a novel unification of these techniques. Using a model based on parameterized polyhedra and introducing new concepts, we propose a data locality optimization algorithm.
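As a toy illustration of the two kinds of transformation named above (not the algorithm proposed in the paper), the sketch below counts how often consecutive accesses to an n x n array land in different cache lines. The function name `line_switches`, the `line_size` parameter, and the simple address model are assumptions made for this example only.

```python
def line_switches(n, order, layout, line_size=8):
    """Count consecutive accesses to an n x n array that fall in different cache lines.

    order:  "ij" means i is the outer loop; "ji" means j is the outer loop.
    layout: "row" for row-major storage; "col" for column-major storage.
    line_size is an assumed number of array elements per cache line.
    """
    def address(i, j):
        # Linearized element offset under the chosen (hypothetical) layout.
        return i * n + j if layout == "row" else j * n + i

    if order == "ij":
        accesses = [(i, j) for i in range(n) for j in range(n)]
    else:
        accesses = [(i, j) for j in range(n) for i in range(n)]

    lines = [address(i, j) // line_size for i, j in accesses]
    return sum(1 for a, b in zip(lines, lines[1:]) if a != b)


# Column-wise traversal of a row-major array: poor spatial locality.
baseline = line_switches(16, "ji", "row")
# Fix 1: loop interchange (legal only if data dependences allow it).
interchanged = line_switches(16, "ij", "row")
# Fix 2: layout change (affects every loop nest that touches the array).
relaid = line_switches(16, "ji", "col")
```

In this model either fix restores unit-stride access for the one loop nest shown. The trade-off the abstract describes appears once several nests share the array: an interchange may be blocked by dependences, while a layout change applies to all nests at once.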

Compared with other approaches, the proposed technique resolves more conflicts and optimizes more references. In particular, it offers a way to optimize incompatible references to the same array in the same loop, as well as references that lie on a cycle in the interference graph. Using parameterized cost functions, our technique estimates the importance of each sub-graph and optimizes data locality. Our experimental results show a significant improvement over prior approaches.
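The abstract does not define the interference graph. The sketch below assumes one common formulation, in which loop nests and arrays are nodes and an edge links a nest to each array it references. Under that assumption it splits the graph into connected sub-graphs and flags those that contain a cycle. The function name `components_with_cycles` and the example nest and array names are hypothetical, and the paper's cost functions are not modeled.

```python
from collections import defaultdict


def components_with_cycles(edges):
    """Split an assumed bipartite interference graph into connected sub-graphs.

    edges: iterable of (nest, array) pairs, one per nest/array reference.
    Returns a list of (sorted_nodes, has_cycle) tuples.
    """
    adj = defaultdict(set)
    for nest, array in set(edges):  # collapse repeated references to one edge
        u, v = ("nest", nest), ("array", array)
        adj[u].add(v)
        adj[v].add(u)

    seen = set()
    result = []
    for start in sorted(adj):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # A connected simple graph has a cycle iff it has at least as many
        # edges as nodes (a tree has exactly nodes - 1 edges).
        edge_count = sum(len(adj[n]) for n in comp) // 2
        result.append((sorted(comp), edge_count >= len(comp)))
    return result


example = components_with_cycles(
    [("L1", "A"), ("L1", "B"), ("L2", "A"), ("L2", "B"), ("L3", "C")]
)
```

Here nests L1 and L2 both reference arrays A and B, so their sub-graph contains a cycle: a layout chosen for A to suit L1 constrains L2, which in turn constrains B. The L3/C sub-graph is a tree and can be optimized on its own.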



Author information

Corresponding author

Correspondence to Hassan Motallebi.

About this article

Cite this article

Motallebi, H., Parsa, S. Data locality optimization of interference graphs based on polyhedral computations. J Supercomput 61, 935–965 (2012). https://doi.org/10.1007/s11227-011-0660-y

  • DOI: https://doi.org/10.1007/s11227-011-0660-y
