The Impact of Noise on the Scaling of Collectives: The Nearest Neighbor Model [Extended Abstract]

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4873)

Abstract

This paper presents a theoretical study of the impact of noise on the scaling of a cluster when the processors participate in “local” collectives with their nearest neighbors. The model extends the one introduced in [9] for understanding the effect of noise on the scaling of “global” collectives in large clusters. Scaling is studied with respect to three fundamental aspects: (1) the distribution of the noise, i.e., whether it is heavy- or light-tailed; (2) the temporal independence of the noise; and (3) the topology of the cluster. When the noise has a “light” tail and is temporally independent, it is shown that the cluster scales well, i.e., the slowdown per phase is proportional only to the (logarithm of the) maximum degree of the communication topology. This implies that for popular topologies such as grids and toruses, whose maximum degree does not grow with the number of processors, the slowdown per phase is a constant factor that is independent of the number of processors. In the light-tailed case, assuming only a weak temporal independence, a general upper bound is derived in terms of an “expansion” parameter of the communication topology. For grid-like graphs this establishes an exponential speedup compared to what was shown for global collective operations in [9].
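The setting described in the abstract can be explored numerically. The following is a minimal simulation sketch, not the paper's formal model: it assumes that every processor does one unit of work per phase, that the noise is exponentially distributed (one example of a light-tailed distribution) and independent across processors and phases, and that a processor starts a phase only after it and its four torus neighbors have finished the previous one. The function names `torus_neighbors` and `mean_time_per_phase` are hypothetical.

```python
import random


def torus_neighbors(side):
    """Map each node of a side x side torus to its closed neighborhood
    (the node itself plus its four nearest neighbors)."""
    nbrs = {}
    for r in range(side):
        for c in range(side):
            nbrs[(r, c)] = [
                (r, c),
                ((r - 1) % side, c), ((r + 1) % side, c),
                (r, (c - 1) % side), (r, (c + 1) % side),
            ]
    return nbrs


def mean_time_per_phase(side, phases, noise_mean, seed=0):
    """Average wall-clock time per phase under local synchronization.

    Assumed model (for illustration only): each phase costs 1 unit of work
    plus exponential noise with the given mean; a node begins a phase when
    its whole closed neighborhood has finished the previous phase.
    """
    rng = random.Random(seed)
    nbrs = torus_neighbors(side)
    finish = {v: 0.0 for v in nbrs}
    for _ in range(phases):
        # Local collective: wait for the slowest node in the neighborhood.
        start = {v: max(finish[u] for u in nbrs[v]) for v in nbrs}
        finish = {}
        for v in nbrs:
            noise = rng.expovariate(1.0 / noise_mean) if noise_mean > 0 else 0.0
            finish[v] = start[v] + 1.0 + noise
    # Time at which the last node completes, averaged over the phases.
    return max(finish.values()) / phases


if __name__ == "__main__":
    for side in (8, 16, 32):
        print(side * side, mean_time_per_phase(side, phases=200, noise_mean=0.1))
```

Running the script for several torus sizes gives a rough empirical view of how the per-phase time behaves as the number of processors grows; the noise-free time per phase in this sketch is 1, so any excess is slowdown attributable to noise.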

References

  1. Gioiosa, R., Petrini, F., Davis, K., Lebaillif-Delamare, F.: Analysis of System Overhead on Parallel Computers. In: ISSPIT 2004. The 4th IEEE International Symposium on Signal Processing and Information Technology, Rome, Italy (December 2004)

  2. Jones, T.R., Brenner, L.B., Fier, J.M.: Impacts of Operating Systems on the Scalability of Parallel Applications. Tech. Rep. UCRL-MI-202629, Lawrence Livermore National Laboratory (March 2003)

  3. Petrini, F., Kerbyson, D.J., Pakin, S.: The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q. In: SC 2003. ACM/IEEE Conference on Supercomputing, Phoenix, Arizona, USA (November 2003)

  4. Frachtenberg, E., Petrini, F., Fernandez, J., Pakin, S., Coll, S.: STORM: Lightning-Fast Resource Management. In: ACM/IEEE Conference on Supercomputing (SC 2002), Baltimore, Maryland, USA (November 2002)

  5. Hori, A., Tezuka, H., Ishikawa, Y.: Highly Efficient Gang Scheduling Implementation. In: SC 1998. ACM/IEEE Conference on Supercomputing, Orlando, FL, USA (November 1998)

  6. Jones, T., Dawson, S., Neely, R., Tuel, W., Brenner, L., Fier, J., Blackmore, R., Caffrey, P., Maskell, B., Tomlinson, P., Roberts, M.: Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System. In: SC 2003. ACM/IEEE Conference on Supercomputing, Phoenix, Arizona, USA (November 2003)

  7. Frachtenberg, E., Feitelson, D., Petrini, F., Fernández, J.: Flexible Coscheduling: Mitigating Load Imbalance and Improving Utilization of Heterogeneous Resources. In: IPDPS 2003. International Parallel and Distributed Processing Symposium 2003, Nice, France (April 2003)

  8. Agarwal, S., Choi, G.S., Das, C.R., Yoo, A.B., Nagar, S.: Co-ordinated Coscheduling in Time-Sharing Clusters through a Generic Framework. In: CLUSTER 2003. IEEE International Conference on Cluster Computing, Hong Kong (December 2003)

  9. Agarwal, S., Garg, R., Vishnoi, N.: The Impact of Noise on the Scaling of Collectives: A Theoretical Approach. In: HiPC (2005)

  10. Garg, R., De, P.: Impact of Noise on Scaling of Collectives: An Empirical Evaluation. In: HiPC (2006)

  11. Lipman, J., Stout, Q.F.: A performance analysis of local synchronization. In: SPAA. Symp. Parallelism in Algorithms and Architectures (2006)

Editor information

Srinivas Aluru, Manish Parashar, Ramamurthy Badrinath, Viktor K. Prasanna

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vishnoi, N.K. (2007). The Impact of Noise on the Scaling of Collectives: The Nearest Neighbor Model [Extended Abstract]. In: Aluru, S., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing – HiPC 2007. HiPC 2007. Lecture Notes in Computer Science, vol 4873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77220-0_44

  • DOI: https://doi.org/10.1007/978-3-540-77220-0_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77219-4

  • Online ISBN: 978-3-540-77220-0

  • eBook Packages: Computer Science, Computer Science (R0)
