Improving parallel system performance by changing the arrangement of the network links

Published: 08 May 2000

ABSTRACT

The Midimew network is a strong candidate for implementing the communication subsystem of a high-performance computer. It is an optimal 2D topology in the sense that no other symmetric direct network of degree 4 has a lower average distance or diameter. In fact, it reduces the diameter of the well-known torus network by a factor of approximately √2. Although the topology was proposed and analyzed a decade ago, the lack of a simple deadlock-avoidance mechanism has prevented its use to date. This study removes that obstacle by applying the Bubble switching mechanism, a low-cost deadlock-avoidance strategy developed by the authors. Moreover, by using routing tables, we can configure our Virtual Cut-Through adaptive router to implement either a torus or a Midimew network. Thus, we can exploit the topological advantages of Midimew networks simply by changing the arrangement of the wrap-around connections of the torus counterpart, without increasing the network implementation cost. To support this claim, we have carried out a thorough evaluation, ranging from the hardware cost of the router to the parallel system performance under real loads.
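
The diameter and average-distance claim can be illustrated with a small breadth-first search. The sketch below is not taken from the paper: it assumes the commonly cited description of a Midimew network with N nodes as a circulant graph whose links connect node i to i ± b and i ± (b − 1) modulo N, with b = ⌈√(N/2)⌉, and it compares that graph with an 8 × 8 torus. All function names are hypothetical.

```python
from collections import deque
from math import ceil, sqrt


def torus_neighbors(node, k):
    """Neighbors of a node in a k x k torus (nodes numbered row-major)."""
    x, y = divmod(node, k)
    return [((x + 1) % k) * k + y, ((x - 1) % k) * k + y,
            x * k + (y + 1) % k, x * k + (y - 1) % k]


def midimew_neighbors(node, n):
    """Neighbors in the assumed circulant form: steps b and b - 1 modulo n."""
    b = ceil(sqrt(n / 2))  # assumption: b = ceil(sqrt(N / 2))
    return [(node + step) % n for step in (b, -b, b - 1, -(b - 1))]


def distances_from(source, n, neighbors):
    """Hop counts from source to every node, computed by BFS."""
    dist = [-1] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_and_average(n, neighbors):
    """Both graphs are vertex-symmetric, so one BFS from node 0 suffices."""
    dist = distances_from(0, n, neighbors)
    return max(dist), sum(dist) / (n - 1)


k = 8
n = k * k
torus = diameter_and_average(n, lambda u: torus_neighbors(u, k))
midimew = diameter_and_average(n, lambda u: midimew_neighbors(u, n))
print("torus   (diameter, average distance):", torus)
print("midimew (diameter, average distance):", midimew)
```

Both graphs use the same number of nodes and degree-4 links, so any difference in the printed figures comes only from how the links are arranged.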


Published in

ICS '00: Proceedings of the 14th international conference on Supercomputing
May 2000
347 pages
ISBN: 1581132700
DOI: 10.1145/335231

Copyright © 2000 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ICS '00 paper acceptance rate: 33 of 122 submissions, 27%. Overall acceptance rate: 584 of 2,055 submissions, 28%.
