Summary
Parallel operating systems are the interface between parallel computers (or computer systems) and the applications (parallel or not) that are executed on them. They translate the hardware’s capabilities into concepts usable by programming languages.
Great diversity marked the beginning of parallel architectures and their operating systems. This diversity has since narrowed to a small set of dominant configurations: symmetric multiprocessors running commodity applications and operating systems (UNIX and Windows NT), and multicomputers running custom kernels and parallel applications. In addition, some (mostly experimental) work is being done on exploiting the shared-memory paradigm on top of networks of workstations or personal computers.
In this chapter, we discuss the operating system components that are essential to support parallel systems and the central concepts surrounding their operation: scheduling, synchronization, multi-threading, inter-process communication, memory management and fault tolerance.
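As an illustration of the synchronization and multi-threading concepts listed above (a sketch of ours, not material from the chapter), the following Python fragment shows the classic case an operating system's synchronization primitives exist to solve: several threads updating shared state, with a mutex serializing the critical section so that no increments are lost.

```python
import threading

# Shared state updated by several threads; the lock serializes access,
# preventing lost updates from interleaved read-modify-write sequences.
counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:  # acquire/release around the critical section
            counter += 1

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: no increments are lost
```

Without the lock, the final count would typically fall short of 40000, because `counter += 1` is not atomic.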
Currently, symmetric multiprocessors (SMPs) are the most widely used multiprocessors. Users find the model attractive: although such a computer derives its processing power from a set of processors, it requires no changes to applications and only minor changes to the operating system. Furthermore, the most popular parallel programming languages have been ported to SMP architectures, also enabling demanding parallel applications to run on these machines.
However, users who want to exploit parallel processing to the fullest use those same parallel programming languages on top of NORMA (no remote memory access) computers. These multicomputers with fast interconnects are the ideal hardware support for message-passing parallel applications. The surviving commercial NORMA machines are very expensive, and are typically found running computation-intensive applications such as weather forecasting or fluid dynamics modelling.
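The message-passing style used on multicomputers can be sketched in miniature with operating-system processes on a single machine (an illustrative sketch of ours, not an MPI binding or the chapter's code): two processes share no memory and cooperate only through explicit send and receive operations over a channel.

```python
import multiprocessing as mp

def worker(conn):
    # Blocking receive, compute, then reply to the parent process.
    msg = conn.recv()
    conn.send(msg * 2)
    conn.close()

def exchange(value):
    # A pipe stands in for the multicomputer's interconnect: the only
    # way the two processes communicate is by passing messages over it.
    parent, child = mp.Pipe()
    p = mp.Process(target=worker, args=(child,))
    p.start()
    parent.send(value)   # explicit send
    reply = parent.recv()  # explicit receive
    p.join()
    return reply

if __name__ == "__main__":
    print(exchange(21))  # prints 42
```

The same program structure, with the pipe replaced by a fast interconnect and the processes placed on distinct nodes, is the model that message-passing libraries such as PVM and MPI (both in the reference list) present to the programmer.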
We also discuss some of the experiments that have been carried out, both in hardware (DASH, Alewife) and in software (TreadMarks, Shasta), to deal with the scalability issues of maintaining consistency in shared-memory systems and to prove their applicability at large scale.
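The consistency problem those systems address can be illustrated on a single machine (again a hedged sketch of ours, far simpler than a real DSM system such as TreadMarks): several processes update one shared memory location, and an explicit lock plays the role that coherence and consistency protocols play across nodes.

```python
import multiprocessing as mp

def bump(shared, n):
    # Each read-modify-write must appear atomic to the other processes;
    # the lock associated with the shared value enforces that here, much
    # as a DSM consistency protocol must across machines.
    for _ in range(n):
        with shared.get_lock():
            shared.value += 1

def run(workers=4, increments=1000):
    shared = mp.Value("i", 0)  # shared-memory segment holding one int
    procs = [mp.Process(target=bump, args=(shared, increments))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return shared.value

if __name__ == "__main__":
    print(run())  # 4000
```

The hard part, which this sketch hides, is doing the equivalent efficiently when "shared memory" is an illusion maintained over a network, which is precisely where the scalability issues discussed in the chapter arise.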
References
Agarwal, A., Chaiken, D., Johnson, K., Kranz, D., Kubiatowicz, J., Kurihara, K., Lim, B., Maa, G., Nussbaum, D., The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor, Scalable Shared Memory Multiprocessors, Kluwer Academic Publishers, 1991.
Bacon, J., Concurrent Systems: An Integrated Approach to Operating Systems, Database, and Distributed Systems, Addison-Wesley, 1993.
Bershad, B.N., Zekauskas, M.J., Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors, Technical Report CMU-CS-91-170, School of Computer Science, Carnegie-Mellon University, 1991.
Blank, T., The MasPar MP-1 architecture, Proc. 35th IEEE Computer Society International Conference (COMPCON), 1990, 20–24.
Bricker, A., Litzkow, M., Livny, M., Condor technical summary, Technical Report, Computer Sciences Department, University of Wisconsin-Madison, 1992.
Brooks III, E.D., Warren, K.H., A study of performance on SMP and distributed memory architectures using a shared memory programming model, SC97: High Performance Networking and Computing, San Jose, 1997.
Charlesworth, A., Starfire: Extending the SMP envelope, IEEE Micro 18, 1998.
Corbett, P.F., Feitelson, D.G., Prost, J.P., Johnson-Baylor, S.B., Parallel access to files in the Vesta file system, Proc. Supercomputing '93, 1993, 472–481.
Cormen, T.H., Kotz, D., Integrating theory and practice in parallel file systems, Proc. the 1993 DAGS/PC Symposium, 1993, 64–74.
Custer, H., Inside Windows NT, Microsoft Press, Washington, 1993.
Dijkstra, E.W., Guarded commands, nondeterminacy and the formal derivation of programs, Communications of the ACM 18, 1975, 453–457.
Douglis, F., Transparent process migration: Design alternatives and the Sprite implementation, Software Practice and Experience 21, 1991, 757–785.
Elnozahy, E.N., Johnson, D.B., Wang, Y.M., A survey of rollback-recovery protocols in message-passing systems, Technical Report, School of Computer Science, Carnegie Mellon University, 1996.
Flynn, M.J., Some computer organizations and their effectiveness, IEEE Transactions on Computers 21, 1972, 948–960.
Geist, A., Beguelin, A., Dongarra, J., PVM: Parallel Virtual Machine: A Users’ Guide and Tutorial for Networked Parallel Computing, MIT Press, Cambridge, 1994.
Guedes, P., Castro, M., Distributed shared object memory, Proc. Fourth Workshop on Workstation Operating Systems (WWOS-IV), Napa, 1993, 142–149.
Gupta, A., Singh, J.P., Joe, T., Hennessy, J.L., An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors, Proc. Supercomputing '93, 1993, 214–225.
Herdeg, G.A., Design and implementation of the Alpha Server 4100 CPU and memory architecture, Digital Technical Journal 8, 1997, 48–60.
Hoare, C.A.R., Monitors: An operating system structuring concept, Communications of the ACM 17, 1974, 549–557.
Intel Corporation Supercomputer Systems Division, Paragon System User's Guide, 1995.
Jiang, D., Singh, J.P., A methodology and an evaluation of the SGI Origin2000, Proc. SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1998, 171–181.
Johnson, E.E., Completing an MIMD multiprocessor taxonomy, Computer Architecture News 16, 1988, 44–47.
Keleher, P., Dwarkadas, S., Cox, A.L., Zwaenepoel, W., TreadMarks: Distributed shared memory on standard workstations and operating systems, Proc. the Winter 94 Usenix Conference, 1994, 115–131.
Kim, K.H., You, J.H., Abouelnaga, A., A scheme for coordinated execution of independently designed recoverable distributed processes, Proc. IEEE Fault-Tolerant Computing Symposium, 1996, 130–135.
Lamport, L., Time, clocks and the ordering of events in a distributed system, Communications of the ACM 21, 1978, 558–565.
Laudon, J., Lenoski, D., The SGI Origin: A ccNUMA highly scalable server, Technical Report, Silicon Graphics, Inc., 1994.
Lea, R., Cool: System support for distributed programming, Communications of the ACM 36, 1993, 37–46.
Lenoski, D., Laudon, J., Joe, T., Nakahira, D., Stevens, L., Gupta, A., Hennessy, J., The DASH prototype: Logic overhead and performance, IEEE Transactions on Parallel and Distributed Systems 4, 1993, 41–61.
Message Passing Interface Forum, MPI: A message-passing interface standard - version 1.1, 1995.
Moos, H., Verbaeten, P., Object migration in a heterogeneous world - a multi-dimensional affair, Proc. Third International Workshop on Object Orientation in Operating Systems, Asheville, 1993, 62–72.
Nuttall, M., A brief survey of systems providing process or object migration facilities, ACM SIGOPS Operating Systems Review 28, 1994, 64–80.
Pierce, P., A concurrent file system for a highly parallel mass storage system, Proc. the Fourth Conference on Hypercube Concurrent Computers and Applications, 1989, 155–160.
Pratt, T.W., French, J.C., Dickens, P.M., Janet, S.A., Jr., A comparison of the architecture and performance of two parallel file systems, Proc. Fourth Conference on Hypercube Concurrent Computers and Applications, 1989, 161–166.
Reed, D.P., Kanodia, R.K., Synchronization with eventcounts and sequencers, Communications of the ACM 22, 1979, 115–123.
Roy, P.J., Unix file access and caching in a multicomputer environment, Proc. the Usenix Mach III Symposium, 1993, 21–37.
Scales, D.J., Gharachorloo, K., Towards transparent and efficient software distributed shared memory, Proc. 16th ACM Symposium on Operating Systems Principles (SOSP'97), ACM SIGOPS Operating Systems Review 31, 1997, 157–169.
Silberschatz, A., Galvin, P.B., Operating System Concepts, Addison-Wesley, 1994.
Singhal, M., Shivaratri, N.G., Advanced Concepts in Operating Systems, McGraw Hill, Inc., New York, 1994.
Smith, P., Hutchinson, N.C., Heterogeneous process migration: The Tui system, Technical Report, Department of Computer Science, University of British Columbia, 1997.
Thinking Machines Corporation, Connection Machine Model CM-2 Technical Summary, 1990.
Thinking Machines Corporation, Connection Machine Model CM-5 Technical Summary, 1990.
Zayas, E.R., Attacking the process migration bottleneck, Proc. Symposium on Operating Systems Principles, Austin, 1987, 13–22.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Garcia, J., Ferreira, P., Guedes, P. (2000). Parallel Operating Systems. In: Błażewicz, J., Ecker, K., Plateau, B., Trystram, D. (eds) Handbook on Parallel and Distributed Processing. International Handbooks on Information Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04303-5_5
DOI: https://doi.org/10.1007/978-3-662-04303-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-08571-0
Online ISBN: 978-3-662-04303-5