Automatic Parameter Tuning of Three-Dimensional Tiled FDTD Kernel

  • Conference paper
High Performance Computing for Computational Science -- VECPAR 2014 (VECPAR 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8969)

Abstract

This paper introduces an automatic tuning method for the tiling parameters required in an implementation of the three-dimensional FDTD method based on time-space tiling. In this tuning process, an appropriate range for the tile size is first determined by trial experiments using cubic tiles. The tile shape is then optimized by using the Monte Carlo method. The tiled FDTD kernel was multi-threaded and its performance with the tuned parameters was evaluated on multi-core processors. When compared with a naively implemented kernel, the performance of the tuned FDTD kernel was improved by more than a factor of two.
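The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `tune_tiles`, the `tolerance` rule used to turn the cubic-tile trials into a volume range, the `max_side` bound, the uniform sampling scheme, and the synthetic `toy_cost` function are all assumptions introduced here. In a real setting, `measure` would time the tiled FDTD kernel for a given tile shape, and the temporal tile parameter would also have to be handled.

```python
import random


def tune_tiles(measure, cubic_edges, n_samples=200, tolerance=1.2,
               max_side=64, seed=0):
    """Two-stage search: cubic-tile sweep, then random sampling of tile shapes.

    measure(tx, ty, tz) returns a cost (for example, seconds per run);
    lower is better. All parameter names here are hypothetical.
    """
    rng = random.Random(seed)

    # Stage 1: try cubic tiles and keep the edges whose cost is within
    # `tolerance` times the best cubic cost (an assumed selection rule).
    cubic_cost = {e: measure(e, e, e) for e in cubic_edges}
    best_cubic = min(cubic_cost.values())
    good = [e for e, c in cubic_cost.items() if c <= tolerance * best_cubic]
    lo_vol, hi_vol = min(good) ** 3, max(good) ** 3

    # Start from the best cubic tile so stage 2 can only improve on stage 1.
    best_edge = min(cubic_cost, key=cubic_cost.get)
    best_shape, best_cost = (best_edge,) * 3, cubic_cost[best_edge]

    # Stage 2: Monte Carlo sampling of tile shapes whose volume stays
    # inside the range found in stage 1.
    for _ in range(n_samples):
        tx = rng.randint(1, max_side)
        ty = rng.randint(1, max_side)
        z_lo = max(1, -(-lo_vol // (tx * ty)))  # ceiling division
        z_hi = min(max_side, hi_vol // (tx * ty))
        if z_lo > z_hi:
            continue  # no tz gives an admissible volume for this (tx, ty)
        tz = rng.randint(z_lo, z_hi)
        cost = measure(tx, ty, tz)
        if cost < best_cost:
            best_shape, best_cost = (tx, ty, tz), cost

    return best_shape, best_cost, (lo_vol, hi_vol)


def toy_cost(tx, ty, tz):
    # Synthetic stand-in for a timed kernel run; NOT a model from the paper.
    volume = tx * ty * tz
    surface = 2 * (tx * ty + ty * tz + tz * tx)
    return abs(volume - 4096) / 4096 + surface / volume


if __name__ == "__main__":
    shape, cost, vol_range = tune_tiles(toy_cost, [4, 8, 16, 32])
    print("best tile shape:", shape, "cost:", cost, "volume range:", vol_range)
```

Seeding the search with the best cubic tile means the sampled result is never worse than the outcome of the first stage, which keeps the sketch easy to reason about even when few random samples fall inside the admissible volume range.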

Author information

Corresponding author

Correspondence to Takeshi Iwashita.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Minami, T., Hibino, M., Hiraishi, T., Iwashita, T., Nakashima, H. (2015). Automatic Parameter Tuning of Three-Dimensional Tiled FDTD Kernel. In: Daydé, M., Marques, O., Nakajima, K. (eds) High Performance Computing for Computational Science -- VECPAR 2014. VECPAR 2014. Lecture Notes in Computer Science, vol 8969. Springer, Cham. https://doi.org/10.1007/978-3-319-17353-5_24

  • DOI: https://doi.org/10.1007/978-3-319-17353-5_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17352-8

  • Online ISBN: 978-3-319-17353-5

  • eBook Packages: Computer Science, Computer Science (R0)
