ABSTRACT
Physical phenomena such as protein folding require simulations of up to microseconds of physical time, which depends directly on the strong scaling of molecular dynamics (MD) on modern supercomputers. In this paper, we present a highly scalable implementation of the state-of-the-art MD code LAMMPS on Fugaku that exploits the 6D mesh/torus topology of the TofuD network. Based on a detailed analysis of the MD communication pattern, we first adapt a coarse-grained peer-to-peer ghost-region communication scheme to the uTofu interface, then further improve scalability with a fine-grained thread pool. Finally, remote direct memory access (RDMA) primitives are used to avoid buffer overhead. Numerical results show that our optimized code reduces communication time by 77%, improving the performance of baseline LAMMPS by factors of 2.9x and 2.2x for the Lennard-Jones and embedded-atom method potentials, respectively, when scaling to 36,846 computing nodes. Our optimization techniques can also benefit other applications that use stencil or domain decomposition methods.
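To make the idea of peer-to-peer ghost-region communication concrete, the following minimal Python sketch enumerates the subdomains a process would exchange ghost atoms with under a 3D domain decomposition. It is an illustration only, not the paper's implementation: the periodic process grid, the row-major rank layout, and the helper names `rank_of`, `p2p_neighbors`, and `staged_neighbors` are all assumptions introduced here. It contrasts a direct exchange with all 26 surrounding subdomains against a dimension-by-dimension exchange that talks only to the 6 face neighbors and forwards edge and corner data in later stages.

```python
from itertools import product


def rank_of(coord, dims):
    """Row-major rank of a process coordinate on a periodic 3D grid (assumed layout)."""
    x, y, z = (c % d for c, d in zip(coord, dims))
    return (x * dims[1] + y) * dims[2] + z


def p2p_neighbors(coord, dims):
    """Map each of the 26 offsets (faces, edges, corners) to the neighbor's rank."""
    neighbors = {}
    for offset in product((-1, 0, 1), repeat=3):
        if offset == (0, 0, 0):
            continue  # skip the process itself
        shifted = tuple(c + o for c, o in zip(coord, offset))
        neighbors[offset] = rank_of(shifted, dims)
    return neighbors


def staged_neighbors(coord, dims):
    """Map the 6 face offsets used by a dimension-by-dimension exchange to ranks."""
    neighbors = {}
    for axis in range(3):
        for step in (-1, 1):
            offset = tuple(step if a == axis else 0 for a in range(3))
            shifted = tuple(c + o for c, o in zip(coord, offset))
            neighbors[offset] = rank_of(shifted, dims)
    return neighbors


# Hypothetical 4 x 4 x 4 process grid, viewed from the process at the origin.
dims = (4, 4, 4)
me = (0, 0, 0)
p2p = p2p_neighbors(me, dims)
staged = staged_neighbors(me, dims)
print(len(p2p), "direct peers vs", len(staged), "face peers")
```

The direct pattern sends more, smaller messages but removes the dependency between stages, which is the kind of trade-off that matters when each process holds few atoms in a strong-scaling run.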
Index Terms
- Enhance the Strong Scaling of LAMMPS on Fugaku