Abstract
This paper presents a theoretical study of the impact of noise on the scaling of a cluster when the processors participate in “local” collectives with their nearest neighbors. The model considered here extends the one introduced in [9] for understanding the effect of noise on the scaling of “global” collectives in large clusters. The scaling is studied with respect to three fundamental aspects: (1) the distribution of the noise, i.e., whether it is heavy-tailed or light-tailed; (2) the temporal independence of the noise; (3) the topology of the cluster. When the noise has a “light” tail and is temporally independent, it is shown that the cluster scales well, i.e., the slowdown per phase is just proportional to the (logarithm of the) maximum degree of the communication topology. This implies that for popular topologies such as grids and tori the slowdown per phase is a constant factor, independent of the number of processors. In the light-tailed case, assuming only a weak form of temporal independence, a general upper bound is derived in terms of an “expansion” parameter of the communication topology. For grid-like graphs this establishes an exponential speedup compared to what was shown for global collective operations in [9].
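To make the nearest-neighbor setting concrete, here is a minimal simulation sketch. It is not the paper's model or proof: the recursion (a processor starts a phase once every neighbor has finished the previous one), the ring topology, the exponential noise, and the names `ring_neighbors` and `simulate` are all assumptions chosen for illustration. The exponential distribution stands in for a light-tailed, temporally independent noise source.

```python
import random


def ring_neighbors(n):
    """Closed neighborhoods (left neighbor, self, right neighbor) on an n-node ring."""
    return [[(i - 1) % n, i, (i + 1) % n] for i in range(n)]


def simulate(neighbors, phases, work=1.0, noise_mean=0.1, seed=0):
    """Return the average per-phase slowdown relative to the noise-free time `work`.

    Assumed model (hypothetical, for illustration only):
      finish[v] = max(previous finish over v's closed neighborhood) + work + noise
    where noise is drawn i.i.d. from an exponential distribution.
    """
    rng = random.Random(seed)
    finish = [0.0] * len(neighbors)
    for _ in range(phases):
        # Local synchronization: wait only for nearest neighbors, not the whole cluster.
        start = [max(finish[u] for u in nbrs) for nbrs in neighbors]
        finish = [s + work + rng.expovariate(1.0 / noise_mean) for s in start]
    return max(finish) / phases - work


if __name__ == "__main__":
    for n in (16, 256, 4096):
        print(n, simulate(ring_neighbors(n), phases=500))
```

Running the script for increasing ring sizes gives an informal way to see whether the per-phase slowdown stays roughly flat as the number of processors grows, which is the behavior the abstract claims for bounded-degree topologies under light-tailed, independent noise. Replacing `ring_neighbors` with a neighborhood list for a grid or torus would be the natural next experiment.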
References
Gioiosa, R., Petrini, F., Davis, K., Lebaillif-Delamare, F.: Analysis of System Overhead on Parallel Computers. In: ISSPIT 2004. The 4th IEEE International Symposium on Signal Processing and Information Technology, Rome, Italy (December 2004)
Jones, T.R., Brenner, L.B., Fier, J.M.: Impacts of Operating Systems on the Scalability of Parallel Applications. Tech. Rep. UCRL-MI-202629, Lawrence Livermore National Laboratory (March 2003)
Petrini, F., Kerbyson, D.J., Pakin, S.: The Case of the Missing Supercomputer Performance: Achieving Optimal Performance on the 8,192 Processors of ASCI Q. In: SC 2003. ACM/IEEE Conference on Supercomputing, Phoenix, Arizona, USA (November 2003)
Frachtenberg, E., Petrini, F., Fernandez, J., Pakin, S., Coll, S.: STORM: Lightning-Fast Resource Management. In: SC 2002. ACM/IEEE Conference on Supercomputing, Baltimore, Maryland, USA (November 2002)
Hori, A., Tezuka, H., Ishikawa, Y.: Highly Efficient Gang Scheduling Implementation. In: SC 1998. ACM/IEEE Conference on Supercomputing, Orlando, FL, USA (November 1998)
Jones, T., Dawson, S., Neely, R., Tuel, W., Brenner, L., Fier, J., Blackmore, R., Caffrey, P., Maskell, B., Tomlinson, P., Roberts, M.: Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System. In: SC 2003. ACM/IEEE Conference on Supercomputing, Phoenix, Arizona, USA (November 2003)
Frachtenberg, E., Feitelson, D., Petrini, F., Fernández, J.: Flexible Coscheduling: Mitigating Load Imbalance and Improving Utilization of Heterogeneous Resources. In: IPDPS 2003. International Parallel and Distributed Processing Symposium 2003, Nice, France (April 2003)
Agarwal, S., Choi, G.S., Das, C.R., Yoo, A.B., Nagar, S.: Co-ordinated Coscheduling in Time-Sharing Clusters through a Generic Framework. In: CLUSTER 2003. IEEE International Conference on Cluster Computing, Hong Kong (December 2003)
Agarwal, S., Garg, R., Vishnoi, N.: The Impact of Noise on the Scaling of Collectives: A Theoretical Approach. In: HiPC (2005)
Garg, R., De, P.: Impact of Noise on Scaling of Collectives: An Empirical Evaluation. In: HiPC (2006)
Lipman, J., Stout, Q.F.: A performance analysis of local synchronization. In: SPAA. Symp. Parallelism in Algorithms and Architectures (2006)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vishnoi, N.K. (2007). The Impact of Noise on the Scaling of Collectives: The Nearest Neighbor Model [Extended Abstract]. In: Aluru, S., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing – HiPC 2007. HiPC 2007. Lecture Notes in Computer Science, vol 4873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77220-0_44
DOI: https://doi.org/10.1007/978-3-540-77220-0_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77219-4
Online ISBN: 978-3-540-77220-0
eBook Packages: Computer Science, Computer Science (R0)