Regularizing Sparse and Imbalanced Communications for Voxel-based Brain Simulations on Supercomputers

Authors:

Jie Wu

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing

Article No.: 81, Pages 1 - 11

https://doi.org/10.1145/3545008.3545019

Published: 13 January 2023

Abstract

Inter-process communication is a performance bottleneck for large-scale brain simulations. The sparse and imbalanced communication patterns of the human brain make it particularly challenging to design a communication system that supports such simulations. In this paper, we tackle these communication challenges. We design a virtual communication topology with a merge-and-forward algorithm that exploits sparsity to regularize inter-process communication. To balance the communication loads of different processes, we formulate voxel partitioning in brain simulations as a k-way graph partitioning problem and propose a constrained deterministic greedy algorithm to solve it effectively. We conducted extensive simulation experiments to evaluate the proposed communication scheme and found that it may significantly reduce communication overhead and shorten simulation time for large-scale brain models.
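To make the partitioning idea concrete, the following is a minimal sketch of a generic capacity-constrained deterministic greedy heuristic for k-way graph partitioning. It is not the algorithm from the paper, whose details the abstract does not give. The function names (`greedy_kway_partition`, `cut_traffic`), the `slack` parameter, the per-voxel `weights`, and the per-edge traffic values are all assumptions made for illustration.

```python
from collections import defaultdict


def greedy_kway_partition(weights, edges, k, slack=1.1):
    """Assign each vertex (voxel) to one of k parts (processes).

    weights: dict vertex -> load (hypothetical per-voxel cost)
    edges:   dict (u, v) -> traffic between u and v (hypothetical)
    slack:   allowed part load as a multiple of the ideal load (assumption)
    """
    capacity = slack * sum(weights.values()) / k

    # Build a symmetric adjacency map from the undirected edge list.
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = neighbors[u].get(v, 0) + w
        neighbors[v][u] = neighbors[v].get(u, 0) + w

    load = [0.0] * k
    part = {}
    # Deterministic order: heaviest vertices first, ties broken by vertex id.
    for v in sorted(weights, key=lambda x: (-weights[x], x)):
        # Traffic from v to neighbours that are already placed, per part.
        gain = [0.0] * k
        for u, w in neighbors[v].items():
            if u in part:
                gain[part[u]] += w
        # Balance constraint: only parts with enough remaining capacity.
        feasible = [p for p in range(k) if load[p] + weights[v] <= capacity]
        if not feasible:
            feasible = list(range(k))  # fall back if every part is full
        # Prefer the most internal traffic, then the lightest part, then low id.
        best = max(feasible, key=lambda p: (gain[p], -load[p], -p))
        part[v] = best
        load[best] += weights[v]
    return part, load


def cut_traffic(part, edges):
    """Total traffic on edges whose endpoints lie in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


if __name__ == "__main__":
    # Two tightly coupled triangles joined by one weak edge.
    weights = {v: 1 for v in range(6)}
    edges = {(0, 1): 10, (0, 2): 10, (1, 2): 10,
             (3, 4): 10, (3, 5): 10, (4, 5): 10,
             (2, 3): 1}
    part, load = greedy_kway_partition(weights, edges, k=2, slack=1.0)
    print(part, load, cut_traffic(part, edges))
```

In this toy input the two triangles end up in separate parts with equal load, and only the weak bridge edge is cut. The capacity check is what enforces balance; a heuristic without it would tend to pile connected voxels onto one process.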

Cited By

Du X, Wang M, Lu Z, Duan Q, Liu Y, Feng J, Wang H. (2024). HRCM: A Hierarchical Regularizing Mechanism for Sparse and Imbalanced Communication in Whole Human Brain Simulations. IEEE Transactions on Parallel and Distributed Systems 35:6 (1056-1073). Online publication date: Jun-2024.
https://doi.org/10.1109/TPDS.2024.3387720
Lu W, Du X, Wang J, Zeng L, Ye L, Xiang S, Zheng Q, Zhang J, Xu N, Feng J, Bao Y, Chen B, Chen S, Chen Z, Dai F, Ding W, Du X, Feng J, Hou Y, Ji M, Ji P, Li C, Li C, Li X, Liu Y, Lu W, Lv Z, Ma H, Qi Y, Rolls E, Wang H, Wang H, Wang S, Wang Z, Xia Y, Xie C, Xue X, Zeng T, Zhang C, Zhang N, Zhang W, Zhao Y. (2024). Simulation and assimilation of the digital human brain. Nature Computational Science 4:12 (890-898). Online publication date: 19-Dec-2024.
https://doi.org/10.1038/s43588-024-00731-3
Bao Y, Du X, Lu Z, Yang J, Huang S, Feng J, Zheng Q. (2024). Mitigating critical nodes in brain simulations via edge removal. Computer Networks 255 (110860). Online publication date: Dec-2024.
https://doi.org/10.1016/j.comnet.2024.110860

Index Terms

Regularizing Sparse and Imbalanced Communications for Voxel-based Brain Simulations on Supercomputers
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
    2. Robotics

Published In

ICPP '22: Proceedings of the 51st International Conference on Parallel Processing

August 2022

976 pages

ISBN:9781450397339

DOI:10.1145/3545008

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Key Research and Development Program of China
National Natural Science Foundation of China under Grant

Conference

ICPP '22

ICPP '22: 51st International Conference on Parallel Processing

August 29 - September 1, 2022

Bordeaux, France

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
