A Comparison of Locality Transformations for Irregular Codes

  • Conference paper
Languages, Compilers, and Run-Time Systems for Scalable Computers (LCR 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1915)

Abstract

Researchers have proposed several data and computation transformations to improve locality in irregular scientific codes. We experimentally compare their performance and present gpart, a new technique based on hierarchical clustering. Quality partitions are constructed quickly by clustering multiple neighboring nodes with priority on nodes with high degree, and repeating a few passes. Overhead is kept low by clustering multiple nodes in each pass and considering only edges between partitions. Experimental results show gpart matches the performance of more sophisticated partitioning algorithms to within 6%-8%, with a small fraction of the overhead. It is thus useful for optimizing programs whose running times are not known.

This research was supported in part by NSF CAREER Development Award #ASC9625531 in New Technologies, NSF CISE Institutional Infrastructure Award #CDA9401151, and NSF cooperative agreement ACI-9619020 with NPACI and NCSA.
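
The clustering idea summarized in the abstract can be illustrated with a small Python sketch. It is not the authors' gpart implementation: the function names (`cluster_pass`, `coarsen`, `clustered_order`), the cluster size cap `max_size`, the tie-breaking rule, and the default of three passes are all assumptions made for illustration. The sketch only mirrors the stated ingredients: visit high-degree nodes first, absorb several neighbors per visit, keep only edges between clusters, and repeat for a few passes.

```python
def cluster_pass(adj, size, max_size):
    """One pass: each unclaimed cluster, visited in order of decreasing
    degree, absorbs unclaimed neighbors while the size cap allows.
    Returns a dict mapping every old cluster id to its new cluster id."""
    target = {}
    # Assumption: ties in degree are broken by cluster id for determinism.
    for c in sorted(adj, key=lambda c: (-len(adj[c]), c)):
        if c in target:
            continue
        target[c] = c
        total = size[c]
        for n in sorted(adj[c]):
            if n not in target and total + size[n] <= max_size:
                target[n] = c
                total += size[n]
    return target


def coarsen(adj, size, target):
    """Build the next-level graph, keeping only edges between clusters."""
    new_adj = {t: set() for t in set(target.values())}
    new_size = {t: 0 for t in new_adj}
    for c, t in target.items():
        new_size[t] += size[c]
        for n in adj[c]:
            if target[n] != t:
                new_adj[t].add(target[n])
    return new_adj, new_size


def clustered_order(edges, num_nodes, max_size, passes=3):
    """Return (node order, clusters) after a few clustering passes.
    max_size and passes are hypothetical tuning knobs."""
    adj = {v: set() for v in range(num_nodes)}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    size = {v: 1 for v in adj}
    members = {v: [v] for v in adj}
    for _ in range(passes):
        target = cluster_pass(adj, size, max_size)
        new_members = {t: [] for t in set(target.values())}
        for c in sorted(target):
            new_members[target[c]].extend(members[c])
        adj, size = coarsen(adj, size, target)
        members = new_members
    # Nodes in the same cluster become contiguous in the new ordering.
    order = [v for c in sorted(members) for v in members[c]]
    return order, members


# Two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
order, clusters = clustered_order(edges, num_nodes=6, max_size=3)
```

On this toy graph the two triangles end up as separate clusters, so nodes that are accessed together receive adjacent positions in `order`. In a locality transformation of this kind, such an ordering would be used to renumber the data arrays; how the paper does that is not described in the abstract.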

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Han, H., Tseng, C.-W. (2000). A Comparison of Locality Transformations for Irregular Codes. In: Dwarkadas, S. (ed.) Languages, Compilers, and Run-Time Systems for Scalable Computers. LCR 2000. Lecture Notes in Computer Science, vol 1915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-40889-4_6

  • DOI: https://doi.org/10.1007/3-540-40889-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41185-7

  • Online ISBN: 978-3-540-40889-5

  • eBook Packages: Springer Book Archive
