A resource query interface for network-aware applications

Cluster Computing

Abstract

Networked systems provide a cost-effective platform for parallel computing, but the applications have to deal with the changing availability of computation and communication resources. Network-awareness is a recent attempt to bridge the gap between the realities of networks and the demands of applications. Network-aware applications obtain information about their execution environment and dynamically adapt to enhance their performance. Adaptation is especially important for synchronous parallel applications because a single busy communication link can become the bottleneck and degrade overall performance dramatically. This paper presents Remos, a uniform API that allows applications to obtain relevant network information, and reports on the development of parallel applications in this environment. The challenges in defining a uniform interface include network heterogeneity, diversity and variability in network traffic, and resource sharing in the network and even inside an application. The first implementation of the Remos interface uses SNMP to monitor IP-based networks. This paper reports on our methodology for developing adaptive parallel applications for high-speed networks with Remos and presents experimental results using applications generated by the Fx parallelizing compiler. The results highlight the importance and effectiveness of adaptive parallel computing.
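To make the bottleneck argument concrete, the following is a minimal sketch of how a network-aware application might use a resource query to choose its nodes. It is not the Remos API: `query_bandwidth`, `bottleneck`, `select_nodes`, the host names, and the bandwidth figures are all hypothetical and exist only for illustration. The sketch assumes that a synchronous step runs at the speed of the slowest link among the chosen nodes, so it picks the node subset whose slowest link is fastest.

```python
from itertools import combinations

# Hypothetical measurements: available bandwidth in Mbit/s per undirected link.
# The hosts and numbers are invented for illustration only.
AVAILABLE_BANDWIDTH = {
    frozenset({"a", "b"}): 90.0,
    frozenset({"a", "c"}): 15.0,  # a busy link
    frozenset({"a", "d"}): 80.0,
    frozenset({"b", "c"}): 70.0,
    frozenset({"b", "d"}): 85.0,
    frozenset({"c", "d"}): 60.0,
}


def query_bandwidth(src, dst):
    """Stand-in for a resource query: available bandwidth between two hosts."""
    return AVAILABLE_BANDWIDTH[frozenset({src, dst})]


def bottleneck(nodes):
    """Lowest pairwise bandwidth within a node set (assumed to limit a synchronous step)."""
    return min(query_bandwidth(u, v) for u, v in combinations(nodes, 2))


def select_nodes(candidates, k):
    """Return the k-node subset whose slowest link is fastest."""
    return max(combinations(sorted(candidates), k), key=bottleneck)


if __name__ == "__main__":
    chosen = select_nodes(["a", "b", "c", "d"], 3)
    print(chosen, bottleneck(chosen))
```

With these made-up numbers, every subset that contains both `a` and `c` is limited to 15 Mbit/s by the single busy link, so the selection avoids that pair. An adaptive application could repeat the query periodically and remap its work when the answer changes.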




Cite this article

Lowekamp, B., Miller, N., Gross, T. et al. A resource query interface for network-aware applications. Cluster Computing 2, 139–151 (1999). https://doi.org/10.1023/A:1019074608189

  • DOI: https://doi.org/10.1023/A:1019074608189
