Skip to main content

Smashing: Folding Space to Tile through Time

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 5335))

Abstract

Partial differential equation solvers spend most of their computation time performing nearest neighbor (stencil) computations on grids that model spatial domains. Tiling is an effective performance optimization for improving the data locality and enabling course-grain parallelization for such computations. However, when the domains are periodic, tiling through time is not directly applicable due to wrap-around dependencies. It is possible to tile within the spatial domain, but tiling across time (i.e. time skewing) is not legal since no constant skewing can render all loops fully permutable. We introduce a technique called smashing that maps a periodic domain to computer memory without creating any wrap-around dependencies. For a periodic cylinder domain where time skewing improves performance, the performance of smashing is comparable to another method, circular skewing, which also handles the periodicity of a cylinder. Unlike circular skewing, smashing can remove wrap-around dependencies for an icosahedron model of earth’s atmosphere.

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Adcroft, A., Campin, J.-M., Hill, C., Marshall, J.: Implementation of an atmosphere-ocean general circulation model on the expanded spherical cube. Monthly Weather Review, 2845–2863 (2004)

    Google Scholar 

  2. Ahmed, N., Mateev, N., Pingali, K.: Synthesizing transformations for locality enhancement of imperfectly-nested loop nests. In: Conference Proceedings of the 2000 International Conference on Supercomputing, Santa Fe, New Mexico, May 2000, pp. 141–152 (2000)

    Google Scholar 

  3. Bassetti, F., Davis, K., Quinlan, D.: Optimizing transformations of stencil operations for parallel object-oriented scientific frameworks on cache-based architectures. In: Caromel, D., Oldehoeft, R.R., Tholburn, M. (eds.) ISCOPE 1998. LNCS, vol. 1505, pp. 107–118. Springer, Heidelberg (1998)

    Chapter  Google Scholar 

  4. Douglas, C.C., Hu, J., Kowarschik, M., Rüde, U., Weiß, C.: Cache Optimization for Structured and Unstructured Grid Multigrid. Electronic Transaction on Numerical Analysis, 21–40 (February 2000)

    Google Scholar 

  5. Frigo, M., Strumpen, V.: Cache oblivious stencil computations. In: Proceedings of the 19th Annual International Conference on Supercomputing (ICS), pp. 361–366. ACM, New York (2005)

    Chapter  Google Scholar 

  6. Irigoin, F., Triolet, R.: Supernode partitioning. In: Proceedings of the 15th Annual ACM SIGPLAN Symposium on Priniciples of Programming Languages, pp. 319–329 (1988)

    Google Scholar 

  7. Jin, G., Mellor-Crummey, J., Fowler, R.: Increasing temporal locality with skewing and recursive blocking. In: High Performance Networking and Computing (SC), Denver, Colorodo, November 2001. ACM Press and IEEE Computer Society Press (2001)

    Google Scholar 

  8. Kamil, S., Datta, K., Williams, S., Oliker, L., Shalf, J., Yelick, K.: Implicit and explict optimizations for stencil computations. In: Memory Systems Performance and Correctness (2006)

    Google Scholar 

  9. Kamil, S., Husbands, P., Oliker, L., Shalf, J., Yelick, K.: Impact of modern memory subsystems on cache optimizations for stencil computations. In: Proceedings of the Workshop on Memory System Performance, pp. 36–43. ACM Press, New York (2005)

    Google Scholar 

  10. Krishnamoorthy, S., Baskaran, M., Bondhugula, U., Ramanujam, J., Rountev, A.: Effective automatic parallelization of stencil computations. In: Proceedings of Programming Languages Design and Implementation (PLDI), pp. 235–244. ACM, New York (2007)

    Google Scholar 

  11. Randall, D.A., Ringler, T.D., Heikes, R.P., Jones, P., Baumgardner, J.: Climate modeling with spherical geodesic grids. Computing in Science and Engineering 4(5), 32–41 (2002)

    Article  Google Scholar 

  12. Schreiber, R., Dongarra, J.J.: Automatic blocking of nested loops. Technical Report UT-CS-90-108, Department of Computer Science, University of Tennessee (1990)

    Google Scholar 

  13. Sellappa, S., Chatterjee, S.: Cache-efficient multigrid algorithms. In: Proceedings of the 2001 International Conference on Computational Science, San Francisco, CA, USA, May 28-30, 2001. LNCS. Springer, Heidelberg (2001)

    Google Scholar 

  14. Song, Y., Li, Z.: New tiling techniques to improve cache temporal locality. ACM SIGPLAN Notices (PLDI) 34(5), 215–228 (1999)

    Article  Google Scholar 

  15. Wolf, M.E., Lam, M.S.: A data locality optimizing algorithm. In: Programming Language Design and Implementation (1991)

    Google Scholar 

  16. Wolfe, M.J.: Iteration space tiling for memory hierachies. In: Proc. of the 3rd SIAM Conf. on Parallel Processing for Scientific Computing, pp. 357–361 (1987)

    Google Scholar 

  17. Wolfe, M.J.: High Performance Compilers for Parallel Computing. Addison-Wesley, Reading (1996)

    MATH  Google Scholar 

  18. Wonnacott, D.: Achieving scalable locality with time skewing. International Journal of Parallel Programming 30(3), 181–221 (2002)

    Article  MATH  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Osheim, N., Strout, M.M., Rostron, D., Rajopadhye, S. (2008). Smashing: Folding Space to Tile through Time. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_6

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-89740-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89739-2

  • Online ISBN: 978-3-540-89740-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics