DOI: 10.1145/3339186.3339196
research-article

Collective Communication for the RISC-V xBGAS ISA Extension

Published: 05 August 2019

Abstract

Parallel programming methodologies differ fundamentally from those of conventional programming, and software developers without the requisite skill set often find it difficult to adapt to them. This is particularly true for parallel programming in a distributed address space, which is necessary for any meaningful degree of scalability. An approach that combines a more intuitive interface with excellent performance within the distributed address space model is therefore desirable. In this work, we present our initial API design and implementation, as well as the underlying algorithms, for a collective communication library built for the Extended Base Global Address Space (xBGAS) extension to the RISC-V instruction set architecture. Our runtime library is designed to implement the Partitioned Global Address Space (PGAS) model in order to alleviate the difficulty associated with traditional distributed address space programming, while the underlying collective implementation is formulated to preserve, and even improve upon, the performance of traditional solutions.

Cited By

  • (2024) Towards Cycle-accurate Simulation of xBGAS. 2024 International Conference on Computing, Networking and Communications (ICNC), 468-472. DOI: 10.1109/ICNC59896.2024.10556078. Online publication date: 19-Feb-2024
  • (2023) Towards xBGAS on CHERI: Supporting a Secure Global Memory. 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 578-581. DOI: 10.1109/IPDPSW59300.2023.00100. Online publication date: May-2023

Published In

ICPP Workshops '19: Workshop Proceedings of the 48th International Conference on Parallel Processing
August 2019
241 pages
ISBN: 9781450371964
DOI: 10.1145/3339186
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Collectives
  2. PGAS
  3. Parallel Programming
  4. RISC-V
  5. Remote Memory Access

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2019
ICPP 2019: Workshops
August 5 - 8, 2019
Kyoto, Japan

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
