Multi-User System Management on SCI Clusters

Chapter in: SCI: Scalable Coherent Interface

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1734)

Abstract

The growing maturity of hardware and software components has tempted researchers to build very large SCI clusters with several hundred processors that are operated as high-performance compute servers in multi-user mode.

In this chapter, we present the Computing Center Software (CCS), a resource management system for user access to and system administration of high-performance compute clusters. It has been in day-to-day use since 1992 on various parallel systems and has recently been adapted to the management of SCI clusters. CCS provides pluggable schedulers, optimal space partitioning for multiple users, reliable user access, and powerful tools for specifying resources and services by means of a specification language and a graphical user interface.

After a brief introduction in the remainder of this section, we describe the CCS system architecture and the characteristics of its resource description facilities.

The work presented in this chapter was done while all three authors were at the Paderborn Center for Parallel Computing, http://www.upb.de/pc2




Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Brune, M., Keller, A., Reinefeld, A. (1999). Multi-User System Management on SCI Clusters. In: Hellwagner, H., Reinefeld, A. (eds) SCI: Scalable Coherent Interface. Lecture Notes in Computer Science, vol 1734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/10704208_34

  • DOI: https://doi.org/10.1007/10704208_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66696-7

  • Online ISBN: 978-3-540-47048-9

  • eBook Packages: Springer Book Archive
