Coarse Grain Task Parallelization of Earthquake Simulator GMS Using OSCAR Compiler on Various Cc-NUMA Servers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9519)

Abstract

This paper proposes coarse grain task parallelization of the Ground Motion Simulator (GMS), an earthquake simulation program that uses the Finite Difference Method (FDM) to solve the wave equations in a 3-D heterogeneous structure, on various cc-NUMA servers using IBM, Intel and Fujitsu multicore processors. The GMS has been developed by the National Research Institute for Earth Science and Disaster Prevention (NIED) in Japan. Earthquake wave propagation simulations are important numerical applications that help save lives by predicting earthquake damage to residential areas. Computing these simulations both precisely and quickly requires parallel processing with strong scaling. The proposed method uses the OSCAR compiler to exploit coarse grain task parallelism efficiently and obtain scalable speed-ups with strong scaling. The OSCAR compiler can analyze data dependence and control dependence among coarse grain tasks, such as subroutines, loops and basic blocks. Moreover, locality optimizations that consider the boundary calculations of FDM and a new static scheduler that enables more efficient task scheduling on cc-NUMA servers are presented. The performance evaluation shows the following speed-ups against sequential execution: 110 times using 128 cores on the Hitachi SR16000 VM1, a POWER7-based 128-core cc-NUMA server; 37.2 times using 64 cores on the BS2000, a Xeon E7-8830-based 64-core cc-NUMA server; 19.8 times using 32 cores on the HA8000/RS440, a Xeon X7560-based 32-core cc-NUMA server; 99.3 times using 128 cores on the Fujitsu M9000, a SPARC64 VII-based 256-core cc-NUMA server; and 9.42 times using 12 cores on the Power System S812L, a POWER8-based 12-core cc-NUMA server.
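To compare the reported results across machines of different sizes, one can divide each speed-up by the number of cores used. The sketch below does this for the figures quoted in the abstract. Parallel efficiency is a standard derived metric and is not itself reported in the abstract; the names `REPORTED`, `parallel_efficiency` and `efficiency_table` are illustrative only.

```python
# Speed-ups quoted in the abstract, as (server, cores used, speed-up vs. sequential).
REPORTED = [
    ("Hitachi SR16000 VM1 (POWER7)", 128, 110.0),
    ("BS2000 (Xeon E7-8830)", 64, 37.2),
    ("HA8000/RS440 (Xeon X7560)", 32, 19.8),
    ("Fujitsu M9000 (SPARC64 VII)", 128, 99.3),
    ("Power System S812L (POWER8)", 12, 9.42),
]


def parallel_efficiency(speedup, cores):
    """Return speed-up divided by the number of cores used (1.0 is ideal)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


def efficiency_table(rows=REPORTED):
    """Return (server, cores, speed-up, efficiency) for each reported result."""
    return [
        (name, cores, speedup, parallel_efficiency(speedup, cores))
        for name, cores, speedup in rows
    ]


if __name__ == "__main__":
    for name, cores, speedup, eff in efficiency_table():
        print(f"{name:32s} {cores:4d} cores  {speedup:6.2f}x  {eff:6.1%}")
```

Under this measure the POWER7 result corresponds to roughly 86% efficiency on 128 cores, while the Xeon E7-8830 result corresponds to roughly 58% on 64 cores.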

Acknowledgment

The authors would like to thank the members of the Hitachi-Waseda collaborative research project, Hitachi, Ltd., and the NIED for their support.

Author information

Corresponding author

Correspondence to Mamoru Shimaoka.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Shimaoka, M., Wada, Y., Kimura, K., Kasahara, H. (2016). Coarse Grain Task Parallelization of Earthquake Simulator GMS Using OSCAR Compiler on Various Cc-NUMA Servers. In: Shen, X., Mueller, F., Tuck, J. (eds) Languages and Compilers for Parallel Computing. LCPC 2015. Lecture Notes in Computer Science, vol 9519. Springer, Cham. https://doi.org/10.1007/978-3-319-29778-1_15

  • DOI: https://doi.org/10.1007/978-3-319-29778-1_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-29777-4

  • Online ISBN: 978-3-319-29778-1

  • eBook Packages: Computer Science, Computer Science (R0)