Improving parallel system performance by changing the arrangement of the network links

Authors:
V. Puente, University of Cantabria, 39005 Santander, Spain
C. Izu, University of Adelaide, SA 5005 Australia
J. A. Gregorio, University of Cantabria, 39005 Santander, Spain
R. Beivide, University of Cantabria, 39005 Santander, Spain
J. M. Prellezo, University of Cantabria, 39005 Santander, Spain
F. Vallejo, University of Cantabria, 39005 Santander, Spain

ICS '00: Proceedings of the 14th international conference on Supercomputing, May 2000, Pages 44–53
https://doi.org/10.1145/335231.335236

Published: 08 May 2000

ABSTRACT

The Midimew network is an excellent contender for implementing the communication subsystem of a high-performance computer. This network is an optimal 2D topology in the sense that no other symmetric direct network of degree 4 has a lower average distance or diameter. In fact, it reduces the diameter of the well-known torus network by a factor of approximately √2. Although the topology was proposed and analyzed a decade ago, the lack of simple deadlock-avoidance mechanisms has prevented its use until now. This study overcomes that drawback by applying the Bubble switching mechanism, a low-cost deadlock-avoidance strategy developed by the authors. Moreover, by using routing tables, we can configure our Virtual Cut-Through adaptive router to implement either a torus or a Midimew network. Thus, we can exploit the topological advantages of Midimew networks by simply changing the arrangement of the wrap-around connections of the torus counterpart, without increasing the network implementation cost. To support this claim, we have carried out a thorough evaluation, ranging from the hardware cost of the router to the parallel system performance under real loads.
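
The diameter claim can be checked numerically on small networks. The sketch below is an illustration only, not the authors' code. It assumes the usual description of a Midimew network of N nodes as the circulant graph with jumps b-1 and b, where b = ⌈√(N/2)⌉ (the construction attributed to Beivide et al.; the abstract itself does not state it). The helper names `diameter`, `torus_neighbors`, `circulant_neighbors` and `midimew_jumps` are hypothetical.

```python
from collections import deque
from math import isqrt


def diameter(n, neighbors):
    """Largest BFS distance from node 0.

    Both graphs used here are vertex-symmetric, so the eccentricity
    of a single node equals the diameter of the whole network.
    """
    dist = [-1] * n
    dist[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in neighbors(u):
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist)


def torus_neighbors(k):
    """Neighbor function for a k x k torus with nodes numbered row-major."""
    def nb(u):
        x, y = divmod(u, k)
        return [
            ((x + 1) % k) * k + y,
            ((x - 1) % k) * k + y,
            x * k + (y + 1) % k,
            x * k + (y - 1) % k,
        ]
    return nb


def circulant_neighbors(n, jumps):
    """Neighbor function for the circulant graph on n nodes with the given jumps."""
    def nb(u):
        return [(u + sign * j) % n for j in jumps for sign in (1, -1)]
    return nb


def midimew_jumps(n):
    """Assumed Midimew jumps (b - 1, b), with b the smallest integer such that 2*b*b >= n."""
    b = isqrt(n // 2)
    while 2 * b * b < n:
        b += 1
    return (b - 1, b)
```

For N = 64 this sketch gives a diameter of 8 for the 8 × 8 torus and 6 for the circulant graph with jumps (5, 6). The ratio of 1.33 is below √2 at this size; the √2 factor quoted in the abstract is an approximation that the ratio approaches as N grows.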


Published in
ICS '00: Proceedings of the 14th international conference on Supercomputing
May 2000, 347 pages
ISBN: 1581132700
DOI: 10.1145/335231
Chairmen: John Reynders (Los Alamos National Lab, Los Alamos, NM), Alex Veidenbaum (Univ. of California at Irvine, Irvine)
Copyright © 2000 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Acceptance Rates
ICS '00 paper acceptance rate: 33 of 122 submissions (27%). Overall acceptance rate: 584 of 2,055 submissions (28%).