
Ultrafast scalable parallel algorithm for the radial distribution function histogramming using MPI maps

The Journal of Supercomputing

Abstract

In the present paper, an ultrafast and scalable parallel program for Radial Pair Function (RPF) histogramming is presented using the pure Message Passing Interface (MPI) paradigm. The parallel code computes the radial distribution function for single-component as well as multi-component systems. For benchmarking purposes, the single-component and multi-component configurations were generated by user-written codes: a C++ code for the graphite structure and an MPI C++ code for hydrogen adsorption in the single-walled carbon nanotube (SWNT) via Grand Canonical Monte Carlo (GCMC), respectively. The speedup and efficiency curves substantiate excellent performance in terms of both computing time and computation size. Additionally, the present MPI implementations are nearly five times (single-component systems) and two times (multi-component systems) faster than the corresponding parallel codes, using a machine with 48 CPUs and an NVIDIA Quadro K5200/PCIe/SSE2. Conclusions and outlooks pertaining to the numerical implementation and to algorithm optimization involving the space decomposition idea are discussed.
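To make the underlying approach concrete, the sketch below is a minimal MPI C++ example of radial distribution function histogramming: each rank accumulates a partial pair-distance histogram over its share of particle pairs under the minimum-image convention, the partial histograms are combined with MPI_Reduce, and the result is normalized by the ideal-gas shell count. This is an illustrative sketch only; the particle number, box sizes and loop decomposition are assumptions, not the authors' actual implementation.

```cpp
// Minimal sketch of MPI-parallel RDF histogramming (illustrative only;
// names, data layout and pair decomposition are assumptions, not the
// authors' code). Compile with an MPI C++ wrapper, e.g. mpicxx.
#include <mpi.h>
#include <cmath>
#include <vector>
#include <cstdio>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank, nproc;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nproc);

  // Assumed single-component configuration: N particles in a periodic box.
  const int    N = 1000;                          // number of particles (assumed)
  const double Lx = 50.0, Ly = 50.0, Lz = 50.0;   // box lengths [Angstrom] (assumed)
  const int    nbins = 200;
  const double rmax  = 0.5 * Lx;                  // histogram cutoff
  const double dr    = rmax / nbins;
  const double pi    = 3.14159265358979323846;

  std::vector<double> x(N), y(N), z(N);           // positions, filled elsewhere
  // ... load or broadcast particle coordinates here ...

  std::vector<long long> hist(nbins, 0), hist_glob(nbins, 0);

  // Cyclic decomposition of the outer pair loop over the MPI ranks.
  for (int i = rank; i < N - 1; i += nproc) {
    for (int j = i + 1; j < N; ++j) {
      double dx = x[i] - x[j], dy = y[i] - y[j], dz = z[i] - z[j];
      dx -= Lx * std::round(dx / Lx);             // minimum-image convention
      dy -= Ly * std::round(dy / Ly);
      dz -= Lz * std::round(dz / Lz);
      const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
      if (r < rmax) ++hist[static_cast<int>(r / dr)];
    }
  }

  // Combine the partial histograms on rank 0.
  MPI_Reduce(hist.data(), hist_glob.data(), nbins, MPI_LONG_LONG,
             MPI_SUM, 0, MPI_COMM_WORLD);

  if (rank == 0) {
    // Normalize by the ideal-gas count per spherical shell:
    // each pair counted once, so the expected count is 0.5*N*rho*V_shell.
    const double rho = N / (Lx * Ly * Lz);
    for (int k = 0; k < nbins; ++k) {
      const double r1 = k * dr, r2 = r1 + dr;
      const double shell = 4.0 / 3.0 * pi * (r2 * r2 * r2 - r1 * r1 * r1);
      const double ideal = 0.5 * N * rho * shell;
      std::printf("%g %g\n", r1 + 0.5 * dr, hist_glob[k] / ideal);  // r, g(r)
    }
  }

  MPI_Finalize();
  return 0;
}
```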


Notes

  1. There are several codes dealing with the RDF, e.g. ISAACS [Interactive Structure Analysis of Amorphous and Crystalline Systems, http://isaacs.sourceforge.net/] [36], CHARMM [Chemistry at HARvard Macromolecular Mechanics, http://www.charmm.org/], POLYANA [a tool for the calculation of molecular radial distribution functions based on Molecular Dynamics trajectories] [37] and VMD [Visual Molecular Dynamics, http://www.ks.uiuc.edu/Research/vmd/] [38]. To the best of the authors’ knowledge, the VMD plug-in is the only parallel code; it takes advantage of Graphics Processing Units (GPUs) and CUDA parallel programming and drastically decreases the computation runtime on NVIDIA GPUs.

  2. Turbostratic carbon is a structure in which the planes of atoms are arranged at different angles and present defects. Consequently, the displacement and rotation of the planes are totally stochastic.

  3. The MPI communication speed is also affected by other applications running on the same cluster. Consequently, the numerical experiments herein were carried out during dedicated time slots or when the cluster was not heavily used. Even so, the single-CPU experiment may last longer owing to its runtime duration (nearly 56 days).

  4. It is well worth mentioning that there are many expressions for the interatomic potential, e.g. the Buckingham potential, the Rydberg potential, the Biswas–Hamann potential and the Morrel–Mottram potential [56–58], as well as forms derived from quantum mechanics [59]. In this paper, the Lennard-Jones potential has been used, which is less stable than those addressed above; its standard 12-6 form is recalled after these notes.

  5. It is essential to indicate that GPU parallel programming is out of the scope of the current contribution. Evidently, GPU parallel programming has several benefits and is widely used [64–72]. The VMD plugin has two major advantages: first, it is user-friendly and easy to use; second, it is freely available and promises low memory consumption and parallel computing on NVIDIA-based machines.
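
For reference, and using the notation of the abbreviations list below (\(\epsilon _{ij}\), \(\sigma _{ij}\), \(r_{ij}\)), the standard 12-6 Lennard-Jones form mentioned in note 4 is commonly written as

$$\begin{aligned} U^{\text {LJ}}_{ij}(r_{ij}) = 4\,\epsilon _{ij}\left[ \left( \frac{\sigma _{ij}}{r_{ij}}\right) ^{12} - \left( \frac{\sigma _{ij}}{r_{ij}}\right) ^{6}\right] \end{aligned}$$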

Abbreviations

g(r):

Radial distribution function or so-called pair distribution function in [−]

r :

Distance between particles in [Å]

N :

Number of particles in [−]

V :

Volume in [Å\(^3\)]

\(\rho \) :

Density of particles in [\(\frac{1}{\AA ^3}\)]

\(\ell _x\) :

Length of the periodic simulation box in x direction in [Å]

\(\ell _y\) :

Length of the periodic simulation box in y direction in [Å]

\(\ell _z\) :

Length of the periodic simulation box in z direction in [Å]

\( U^{\text {LJ}}_{ij}(r)\) :

Lennard–Jones potential in [J]

\(r_{ij}\) :

Distance between the particles in [Å]

\(\epsilon _{ij}\) :

Depth of the potential well in [J]

\(\sigma _{ij}\) :

Finite distance at which the inter-particle potential is zero, in [Å]

\(k_{B}\) :

Boltzmann constant, in [\(\dfrac{J}{K}\)]

T :

Temperature in [K]

B :

Adams’ constant in [−]

\(E_c\) :

Energy in [J]

f :

Fugacity in [Pa]

\(\text {S}_p\) :

Speedup in [−]

\(\text {T}_s\) :

CPU runtime for a single-processor in [s]

\(\text {T}_p\) :

CPU runtime for \(N_p\) processors in [s]

\(\text {E}_p\) :

Efficiency in [−]

\(\text {N}_p\) :

CPU numbers in [−]

References

  1. Proffen TH, Billinge SJL, Egami T, Louca D (2003) Structural analysis of complex materials using the atomic pair distribution function—a practical guide. Z Kristallogr 218

  2. Proffen T, Page KL, McLain SE, Clausen B, Darling TW, TenCate JA, Lee S-Y, Ustundag E (2005) Atomic pair distribution function analysis of materials containing crystalline and amorphous phases. Z Kristallogr 220

  3. Billinge SJL (2003) The atomic pair distribution function: past and present. Department of Physics and Astronomy, Michigan State University, Michigan

  4. Zernike F, Prins JA (1927) Die beugung von röntgenstrahlen in flüssigkeiten als effekt der molekülanordnung. Zeitschrift für Physik 41(2):184–194 (in German)

  5. Ueda S (1961) The pair correlation function of an imperfect electron gas in high densities. Prog Theor Phys 26

  6. Chihara J (1974) Calculation of pair correlations in a degenerate electron liquid. Prog Theor Phys 53(2)

  7. Kambayashi S, Chihara J (1994) Extraction of the bridge function for simple liquids from a molecular dynamics simulation and its application for correcting the pair distribution function. Am Phys Soc

  8. Deublein S, Eckl B, Stoll J, Lishchuk SV, Guevara-Carrion Gabriela, Glass Colin W, Merker Thorsten, Bernreuther Martin, Hasse Hans, Vrabec Jadran (2011) ms2: a molecular simulation tool for thermodynamic properties. Comput Phys Commun 182(11):2350–2367

  9. Glass CW, Reiser S, Rutkai G, Deublein S, Köster Andreas, Guevara-Carrion Gabriela, Wafai Amer, Horsch Martin, Bernreuther Martin, Windmann Thorsten, Hasse Hans, Vrabec Jadran (2014) ms2: a molecular simulation tool for thermodynamic properties, new version release. Comput Phys Commun 185(12):3302–3306

  10. Li K, Li D, Liang J, Ye Y, Liao Y, Liu R, Mo Y (2015) Performance analysis of parallel algorithms in physics simulation for molecular dynamics simulation liquid metals solidification processes. Comput Fluids 110:19–26. ParCFD 2013

  11. Alder BJ, Frankel SP, Lewinson VA (1955) Radial distribution function calculated by the monte carlo method for a hard sphere fluid. J Chem Phys 23(3)

  12. Tanaka S, Nakano M (2013) Classical density functional calculation of radial distribution functions of liquid water. Chem Phys 430

  13. Frenkel D, Smit B (2002) Understanding molecular simulation: from algorithms to applications. Computational science. Academic Press, USA

  14. Liboff RL (1989) Correlation functions in statistical mechanics and astrophysics. Phys Rev A 39:4098–4102

  15. Jerier JF, Imbault D, Donzé FV, Doremus P (2009) A geometric algorithm based on tetrahedral meshes to generate a dense polydisperse sphere packing. Granular Matter 11(1):43–52

  16. Jerier JF (2009) Modélisation de la compression haute densité des poudres métalliques ductiles par la méthode des éléments discrets. PhD thesis, Université Joseph Fourier de Grenoble, Grenoble, France, Novembre (in French)

  17. Jerier JF, Richefeu V, Imbault D, Donzé FV (2010) Packing spherical discrete elements for large scale simulations. Comput Methods Appl Mech Eng 199(25–28):1668–1676

  18. Abrahamsson PJ, Sasic S, Rasmuson A (2016) On continuum modelling of dense inelastic granular flows of relevance for high shear granulation. Powder Technol 294:323–329

  19. Jeong J, Mounanga P, Ramézani H, Bouasker M (2011) A new multi-scale modeling approach based on hygro-Cosserat theory for self-induced stress in hydrating cementitious mortars. Comput Mater Sci 50(7):2063–2074

  20. Ramézani H, Mounanga P, Jeong J, Bouasker M (2013) Role of cement paste composition on the self induced stress in early-age mortars: application of the cosserat size number. Cement Concrete Compos 39:43–59

  21. Jeong J, Ramézani H, Leklou N (2014) Thermo-chemical heterogeneous hydration gradient modeling of concrete and aggregates size effect on ITZ. Thermochim Acta 590:165–180

  22. Jeong J, Ramézani H, Sardini P, Kondo D, Ponson Laurent, Siitari-Kauppi Marja (2015) Porous media modeling and micro-structurally motivated material moduli determination via the micro-dilatation theory. Eur Phys J Spec Topics 224(9):1805–1816

  23. Jeong J, Ramézani H, Leklou N (2016) Why does the modified arrhenius’ law fail to describe the hydration modeling of recycled aggregate? Thermochim Acta 626:13–30

  24. Hosseini SY, Fattahi M, Ahmadi G (2016) CFD study of hydrodynamic and heat transfer in a 2d spouted bed. J Taiwan Inst Chem Eng 58:107–116

  25. Mansoori GA (1993) Radial distribution functions and their role in modeling of mixtures behavior. Fluid Phase Equ 87:1–22

  26. Matteoli E, Mansoori GA (1995) A simple expression for radial distribution functions of pure fluids and mixtures. J Chem Phy

  27. Griebel M, Knapek S, Zumbusch GW (2007) Numerical simulation in molecular dynamics: numerics, algorithms, parallelization, applications. Texts in computational science and engineering. Springer, Berlin

  28. Gerhard N (2004) The physics of colloidal soft matter. Inst Fundam Technol Res

  29. Younge K, Johnston B, Christenson C, Bohara A, Jacobson J, Butler NM, Saulnier P (2006) The use of radial distribution and pair-correlation functions to analyze and describe biological aggregations -. Limnol Oceanogr Methods 4:382–391

  30. Snir M (1998) MPI the complete reference: the MPI Core. MIT Press, USA

  31. Karniadakis G, Kirby RM (2003) Parallel scientific computing in C++ and MPI: a seamless approach to parallel algorithms and their implementation. Cambridge University Press, Cambridge

  32. Teixidó Ivan, Sebé Francesc, Conde Josep, Solsona Francesc (2014) MPI-based implementation of an enhanced algorithm to solve the LPN problem in a memory-constrained environment. Parallel Comput 40(5–6):100–112

  33. Gropp W, Lusk E, Skjellum A (2014) Using MPI: portable parallel programming with the message-passing interface. Scientific and engineering computation. MIT Press, USA

  34. Gropp W, Hoefler T, Lusk E, Thakur R (2014) Using advanced MPI: modern features of the message-passing interface. Computer science and intelligent systems. MIT Press, USA

  35. Nielsen F (2016) Introduction to HPC with MPI for data science. Undergraduate topics in computer science. Springer International Publishing, Berlin

  36. Le Roux S, Petkov V (2010) ISAACS—interactive structure analysis of amorphous and crystalline systems. J Appl Crystallogr 43(1):181–185

  37. Dimitroulis C, Raptis T, Raptis V (2015) POLYANA-A tool for the calculation of molecular radial distribution functions based on molecular dynamics trajectories. Comput Phys Commun 197:220–226

  38. Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14(1):33–38

  39. Barney B (2015) Message passing interface (MPI). Lawrence Livermore National Laboratory

  40. Arabnia HR, Oliver MA (1987) A transputer network for the arbitrary rotation of digitised images. Comput J 30(5):425–432

  41. Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. Comput Graph Forum 8(1):3–11

  42. Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput 10(2):188–192

  43. Arabnia HR, Smith JW (1993) A reconfigurable interconnection network for imaging operations and its implementation using a multi-stage switching box. In: Proceedings of the 7th Annual International High Performance Computing Conference. The 1993 High Performance Computing: New Horizons Supercomputing Symposium, Calgary, Alberta, Canada, pp 349–357

  44. Bhandarkar SM, Arabnia HR (1995) The REFINE multiprocessor—theoretical properties and algorithms. Parallel Comput 21(11):1783–1805

  45. Bhandarkar SM, Arabnia HR (1995) The hough transform on a reconfigurable multi-ring network. J Parallel Distrib Comput 24(1):107–114

  46. Arabnia HR (1996) A distributed stereo-correlation algorithm. In: Selected papers from the Third International Conference on Computer Communications and Networks, vol 19, pp 707–711

  47. Arabnia HR, Bhandarkar SM (1996) Parallel stereocorrelation on a reconfigurable multi-ring network. J Supercomput 10(3):243–269

  48. Bhandarkar SM, Arabnia HR (1997) Parallel computer vision on a reconfigurable multiprocessor network. IEEE Trans Parallel Distrib Syst 8(3):292–309

  49. Arif Wani M, Arabnia HR (2003) Parallel edge-region-based segmentation algorithm targeted at reconfigurable multiring network. J Supercomput 25(1):43–62

  50. Kurniawan Y, Bhatia SK, Rudolph V (2005) Monte carlo simulation of binary mixture adsorption of methane and carbon dioxide in carbon slit pores. Technical report, University of Queensland

  51. Nguyen TX, Cohaut N, Bae J-S, Bhatia SK (2008) New method for atomistic modeling of the microstructure of activated carbons using hybrid reverse monte carlo simulation. Langmuir 24(15):7912–7922

  52. Gotzias A, Heiberg-Andersen H, Kainourgiakis M, Steriotis Th (2010) Grand canonical monte carlo simulations of hydrogen adsorption in carbon cones. Appl Surf Sci 256(17):5226–5231

  53. Konstantakou M, Gotzias A, Kainourgiakis M, Stubos Ak, Steriotis TA (2011) Applications of Monte Carlo method in science and engineering: GCMC simulations of gas adsorption in carbon pore structures. Chapter 26. InTech, pp 653–676

  54. Grama AY, Gupta A, Kumar V (1993) Isoefficiency: measuring the scalability of parallel algorithms and architectures. University of Minnesota, Minnesota

  55. Ramézani H, Kouetcha DN, Cohaut N (2016) Scalable parallel grand canonical monte carlo simulation using MPI maps: micro-pollutant adsorption objective. In: WCCM-APCOM 2016 Congress, 24–29 July

  56. Kantor AL, Long LN, Micci MM (2000) Molecular dynamics simulation of dissociation kinetics. The Pennsylvania State University, Pennsylvania

  57. Lim T-C (2004) Connection among classical interatomic potential functions. J Math Chem 36(3):261–269

  58. Kouetcha DN, Ramézani H, Cohaut N (2015) Lennard-Jones potential determination via the Schrödinger equation. In: Excerpt from the Proceedings of the COMSOL Conference 2015, Grenoble, France, October 14–16

  59. Zettili N (2009) Quantum mechanics: concepts and applications. Wiley, USA

  60. Konstantakou M et al (2011) Applications of Monte Carlo method in science and engineering

  61. Luo T, Lloyd JR (2014) Grand canonical monte carlo simulation of hydrogen adsorption in different carbon nano structures. Michigan State University, Michigan

  62. Kouetcha D, Ramézani H, Cohaut N (2015) Etude structurale et détermination de la fonction de corrélation de paire du graphène et du graphite. In: Colloque Francophone du Carbone GFEC–2015. Karellis, Savoie, France, May (in French)

  63. Ramézani H, Chuta E (2014) Hydrogen adsorption simulation in the single-wall carbon nanotube (SWNT) network. In: Matériaux 2014, Montpellier, France, November Fédération Française des Matériaux (FFM) (in English)

  64. Stone JE, Phillips JC, Freddolino PL, Hardy DJ, Trabuco LG, Schulten K (2007) Accelerating molecular modeling applications with graphics processors. J Comput Chem 28

  65. Meredith JS, Alvarez G, Maier TA, Schulthess TC, Vetter JS (2009) Accuracy and performance of graphics processors: A quantum monte carlo application case study. Parallel Comput 35(3):151–163 (Revolutionary Technologies for Acceleration of Emerging Petascale Applications)

  66. Stone JE, Hardy DJ, Ufimtsev IS, Schulten K (2010) GPU-accelerated molecular modeling coming of age. J Mol Graph Model 29(2):116–125

  67. Levine BG, Stone JE, Kohlmeyer A (2011) Fast analysis of molecular dynamics trajectories with graphics processing units-radial distribution function histogramming. J Comput Phys 230(9):3556–3569

  68. Su C-C, Smith MR, Kuo F-A, Wu J-S, Hsieh C-W, Tseng K-C (2012) Large-scale simulations on multiple graphics processing units (GPUs) for the direct simulation monte carlo method. J Comput Phys 231(23):7932–7958

  69. Hall C, Ji W, Blaisten-Barojas E (2014) The metropolis monte carlo method with CUDA enabled graphic processing units. J Comput Phys 258:871–879

  70. Sergey K, Igor K, Nikita N, Alexander N, Yulia Sagdeeva (2014) Scalable hybrid implementation of the schur complement method for multi-GPU systems. J Supercomput 69(1):81–88

  71. Zuwei X, Zhao H, Zheng C (2015) Accelerating population balance-monte carlo simulation for coagulation dynamics from the markov jump model, stochastic algorithm and GPU parallel computing. J Comput Phys 281:844–863

  72. Menshov I, Pavlukhin P (2016) Highly scalable implementation of an implicit matrix-free solver for gas dynamics on GPU-accelerated clusters. J Supercomput 1–8

  73. Hasanov K, Lastovetsky A (2016) Hierarchical redesign of classic MPI reduction algorithms. J Supercomput 01–13

  74. Marendic P, Lemeire J, Vucinic D, Schelkens P (2016) A novel MPI reduction algorithm resilient to imbalances in process arrival times. J Supercomput 72(5):1973–2013

Acknowledgments

The second author gratefully thanks Prof. Suresh Bhatia of the School of Chemical Engineering at the University of Queensland, Brisbane, Australia (http://www.chemeng.uq.edu.au/bhatia) for his scientific support pertaining to the GCMC simulation algorithm.

The authors thank the CCSC computing center (Centre de Calcul Scientifique en région Centre - CaSciModOT) for the computations carried out in the present paper (http://cascimodot.fdpoisson.fr/ccsc).

Author information

Corresponding author

Correspondence to Hamidréza Ramézani.

Appendix: Scalability, speedup and efficiency of MPI implementations

The speedup expresses how many times faster a parallel program runs than the serial program for the same problem; it is defined by:

$$\begin{aligned} \text {S}_\mathrm{p}=\frac{\text {T}_\mathrm{s}}{\text {T}_\mathrm{p}} \end{aligned}$$
(7)

where \(\text {T}_\mathrm{s}\) and \(\text {T}_\mathrm{p}\) represent the CPU runtimes on a single processor and on \(\text {N}_\mathrm{p}\) processors, respectively. To measure the scalability of the parallel code, it is important to determine the efficiency. This parameter gives information about the performance of the parallel algorithm and is defined as:

$$\begin{aligned} \text {E}_\mathrm{p}=\frac{\text {T}_\mathrm{s}}{\text {T}_\mathrm{p} \times \text {N}_\mathrm{p}} \end{aligned}$$
(8)

where \(\text {N}_\mathrm{p}\) is the number of CPUs.
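
As a purely hypothetical numerical illustration (the runtimes below are assumed values, not measurements reported in this work): if a serial run takes \(\text {T}_\mathrm{s} = 4800\) s and the parallel run on \(\text {N}_\mathrm{p} = 48\) CPUs takes \(\text {T}_\mathrm{p} = 120\) s, then

$$\begin{aligned} \text {S}_\mathrm{p} = \frac{4800}{120} = 40, \qquad \text {E}_\mathrm{p} = \frac{4800}{120 \times 48} = \frac{40}{48} \approx 0.83, \end{aligned}$$

i.e. a 40-fold speedup at roughly 83% parallel efficiency; \(\text {E}_\mathrm{p} = 1\) would correspond to ideal linear scaling.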

About this article

Cite this article

Kouetcha, D.N., Ramézani, H. & Cohaut, N. Ultrafast scalable parallel algorithm for the radial distribution function histogramming using MPI maps. J Supercomput 73, 1629–1653 (2017). https://doi.org/10.1007/s11227-016-1854-0
