Parallel OpenMP and CUDA Implementations of the N-Body Problem

  • Conference paper
Computational Science and Its Applications – ICCSA 2019 (ICCSA 2019)

Abstract

In astrophysics, the N-body problem concerns predicting the motion of celestial bodies, such as planets, under their mutual gravitational interactions. This paper develops efficient, high-performance implementations of two versions of the N-body problem. Adaptive tree structures are widely used in N-body simulations, but building and storing the tree and balancing the workload pose significant challenges for high-performance implementations. Our implementations use multiple CPU and GPU cores through efficient workload balancing with data and task parallelization. The contributions include OpenMP and Nvidia CUDA implementations that parallelize force computation and mass distribution and achieve competitive speedup and running time, as shown empirically and in graphs. This work is useful not only as an alternative for complex simulations but also for other big data applications that require workload distribution and computationally expensive procedures.
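To make the idea of parallelizing force computation concrete, the following is a minimal sketch of a direct all-pairs (O(N²)) force loop with an OpenMP work-sharing directive. It is not the authors' code: the `Body` struct, the `compute_forces` function, the 2D setting, and the softening term are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical body record; field names are illustrative, not from the paper.
struct Body {
    double x, y;    // position
    double mass;
    double fx, fy;  // net force, overwritten by compute_forces
};

constexpr double kG = 6.674e-11;  // gravitational constant (SI units)

// Direct all-pairs force computation in 2D.
// `softening` is an assumed small constant added to the squared distance
// so that two very close bodies do not cause a division by zero.
void compute_forces(std::vector<Body>& bodies, double softening = 1e-9) {
    assert(softening > 0.0);
    const int n = static_cast<int>(bodies.size());

    // Each outer iteration writes only to bodies[i], so iterations are
    // independent and can be shared among threads without locking.
    // Without OpenMP enabled, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        double fx = 0.0, fy = 0.0;
        for (int j = 0; j < n; ++j) {
            if (j == i) continue;
            const double dx = bodies[j].x - bodies[i].x;
            const double dy = bodies[j].y - bodies[i].y;
            const double dist2 = dx * dx + dy * dy + softening;
            // F = G * m_i * m_j / r^2, directed along (dx, dy) / r.
            const double f = kG * bodies[i].mass * bodies[j].mass
                             / (dist2 * std::sqrt(dist2));
            fx += f * dx;
            fy += f * dy;
        }
        bodies[i].fx = fx;
        bodies[i].fy = fy;
    }
}
```

A static schedule fits this loop because every body performs the same amount of work. Tree-based variants have uneven work per body, where a dynamic schedule may balance the load better.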

Notes

  1. In this paper, we use a quad-tree to implement the Barnes-Hut algorithm.
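As an illustration of what a quad-tree for Barnes-Hut involves, the sketch below builds a 2D tree by inserting bodies one at a time while keeping each node's total mass and mass-weighted position sums up to date. The `QuadNode` type, its member names, and the assumption that no two bodies share the same position are hypothetical and not taken from the paper.

```cpp
#include <array>
#include <cassert>
#include <memory>

// Hypothetical quad-tree node for a 2D Barnes-Hut sketch.
struct QuadNode {
    double cx, cy, half;                  // square region: center and half-width
    double mass = 0.0;                    // total mass inside this region
    double mx = 0.0, my = 0.0;            // mass-weighted position sums
    int count = 0;                        // number of bodies in this subtree
    double bx = 0.0, by = 0.0, bm = 0.0;  // the single body held by a leaf
    std::array<std::unique_ptr<QuadNode>, 4> child;

    QuadNode(double cx_, double cy_, double half_)
        : cx(cx_), cy(cy_), half(half_) {}

    // Quadrant index of (x, y): bit 0 set = east half, bit 1 set = north half.
    int quadrant(double x, double y) const {
        return (x >= cx ? 1 : 0) | (y >= cy ? 2 : 0);
    }

    // Pass a body down to the matching child, creating the child on demand.
    void insert_child(double x, double y, double m) {
        const int q = quadrant(x, y);
        if (!child[q]) {
            const double h = half / 2.0;
            child[q].reset(new QuadNode(cx + ((q & 1) ? h : -h),
                                        cy + ((q & 2) ? h : -h), h));
        }
        child[q]->insert(x, y, m);
    }

    void insert(double x, double y, double m) {
        if (count == 1) {
            // Assumption: positions are distinct; identical positions
            // would subdivide forever.
            assert(!(x == bx && y == by));
            // The leaf becomes an internal node: move its body down first.
            insert_child(bx, by, bm);
        }
        if (count == 0) {
            bx = x; by = y; bm = m;       // empty node: store the body here
        } else {
            insert_child(x, y, m);
        }
        // Update the aggregate used to approximate distant groups of bodies.
        mass += m; mx += m * x; my += m * y; ++count;
    }

    double com_x() const { return mx / mass; }  // center of mass, x
    double com_y() const { return my / mass; }  // center of mass, y
};
```

With aggregates stored per node, a force pass can treat a sufficiently distant node as a single body located at its center of mass. That approximation is what gives the algorithm its O(N log N) cost.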

Author information

Corresponding author

Correspondence to Tushaar Gangavarapu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Gangavarapu, T., Pal, H., Prakash, P., Hegde, S., Geetha, V. (2019). Parallel OpenMP and CUDA Implementations of the N-Body Problem. In: Misra, S., et al. Computational Science and Its Applications – ICCSA 2019. ICCSA 2019. Lecture Notes in Computer Science, vol 11619. Springer, Cham. https://doi.org/10.1007/978-3-030-24289-3_16

  • DOI: https://doi.org/10.1007/978-3-030-24289-3_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-24288-6

  • Online ISBN: 978-3-030-24289-3

  • eBook Packages: Computer Science, Computer Science (R0)
