A Network Performance Sensitivity Metric for Parallel Applications

  • Conference paper
Parallel and Distributed Processing and Applications (ISPA 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4742)

Abstract

Excessive run time variability of parallel application codes on commodity clusters is a significant challenge. To gain insight into this problem, our earlier work developed tools to emulate parallel applications (PACE) by simulating computation and using the cluster’s interconnection network for communication, and to further study parallel application run time effects (PARSE). This work expands on our previous efforts by presenting a metric derived from the results of PARSE tests conducted on several widely used parallel benchmarks and application code fragments. The metric suggests that a parallel application’s sensitivity to network performance variation can be quantified relative to its behavior under optimal network performance conditions. Ideas on how this metric can be useful for parallel application development, cluster system performance management, and system administration are also presented.

Editor information

Ivan Stojmenovic, Ruppa K. Thulasiram, Laurence T. Yang, Weijia Jia, Minyi Guo, Rodrigo Fernandes de Mello

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Evans, J.J., Hood, C.S. (2007). A Network Performance Sensitivity Metric for Parallel Applications. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_81

  • DOI: https://doi.org/10.1007/978-3-540-74742-0_81

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74741-3

  • Online ISBN: 978-3-540-74742-0

  • eBook Packages: Computer Science, Computer Science (R0)
