Summary
Parallel operating systems are the interface between parallel computers (or computer systems) and the applications (parallel or not) that are executed on them. They translate the hardware’s capabilities into concepts usable by programming languages.
Great diversity marked the beginning of parallel architectures and their operating systems. This diversity has since narrowed to a small set of dominant configurations: symmetric multiprocessors running commodity applications and operating systems (UNIX and Windows NT), and multicomputers running custom kernels and parallel applications. In addition, some (mostly experimental) work is being done on exploiting the shared-memory paradigm on top of networks of workstations or personal computers.
In this chapter, we discuss the operating system components that are essential to support parallel systems and the central concepts surrounding their operation: scheduling, synchronization, multi-threading, inter-process communication, memory management and fault tolerance.
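As an illustration of the synchronization and multi-threading concepts listed above (a sketch of ours, not material from the chapter), the following Python fragment shows the classic case an operating system's synchronization primitives exist to solve: several threads updating shared state, with a mutex serializing the critical section so that no increments are lost.

```python
import threading

# Shared state updated by several threads; the lock serializes access,
# preventing lost updates from interleaved read-modify-write sequences.
counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:  # acquire/release around the critical section
            counter += 1

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000: no increments are lost
```

Without the lock, the final count would typically fall short of 40000, because `counter += 1` is not atomic.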
Currently, symmetric multiprocessors (SMPs) are the most widely used multiprocessors. Users find the model attractive: although such a computer derives its processing power from a set of processors, it requires no changes to applications and only minor changes to the operating system. Furthermore, the most popular parallel programming languages have been ported to SMP architectures, also enabling demanding parallel applications to run on these machines.
However, users who want to exploit parallel processing to the fullest use those same parallel programming languages on top of NORMA (no remote memory access) computers. These multicomputers with fast interconnects are the ideal hardware support for message-passing parallel applications. The surviving commercial NORMA machines are very expensive, and are typically found running computation-intensive applications such as weather forecasting or fluid dynamics modelling.
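The message-passing style used on multicomputers can be sketched in miniature with operating-system processes on a single machine (an illustrative sketch of ours, not an MPI binding or the chapter's code): two processes share no memory and cooperate only through explicit send and receive operations over a channel.

```python
import multiprocessing as mp

def worker(conn):
    # Blocking receive, compute, then reply to the parent process.
    msg = conn.recv()
    conn.send(msg * 2)
    conn.close()

def exchange(value):
    # A pipe stands in for the multicomputer's interconnect: the only
    # way the two processes communicate is by passing messages over it.
    parent, child = mp.Pipe()
    p = mp.Process(target=worker, args=(child,))
    p.start()
    parent.send(value)   # explicit send
    reply = parent.recv()  # explicit receive
    p.join()
    return reply

if __name__ == "__main__":
    print(exchange(21))  # prints 42
```

The same program structure, with the pipe replaced by a fast interconnect and the processes placed on distinct nodes, is the model that message-passing libraries such as PVM and MPI (both in the reference list) present to the programmer.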
We also discuss some of the experiments that have been carried out, both in hardware (DASH, Alewife) and in software (TreadMarks, Shasta), to deal with the scalability issues of maintaining consistency in shared-memory systems and to prove their applicability at large scale.
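The consistency problem those systems address can be illustrated on a single machine (again a hedged sketch of ours, far simpler than a real DSM system such as TreadMarks): several processes update one shared memory location, and an explicit lock plays the role that coherence and consistency protocols play across nodes.

```python
import multiprocessing as mp

def bump(shared, n):
    # Each read-modify-write must appear atomic to the other processes;
    # the lock associated with the shared value enforces that here, much
    # as a DSM consistency protocol must across machines.
    for _ in range(n):
        with shared.get_lock():
            shared.value += 1

def run(workers=4, increments=1000):
    shared = mp.Value("i", 0)  # shared-memory segment holding one int
    procs = [mp.Process(target=bump, args=(shared, increments))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return shared.value

if __name__ == "__main__":
    print(run())  # 4000
```

The hard part, which this sketch hides, is doing the equivalent efficiently when "shared memory" is an illusion maintained over a network, which is precisely where the scalability issues discussed in the chapter arise.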
References
Agarwal, A., Chaiken, D., Johnson, K., Kranz, D., Kubiatowicz, J., Kurihara, K., Lim, B., Maa, G., Nussbaum, D., The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor, Scalable Shared Memory Multiprocessors, Kluwer Academic Publishers, 1991.
Bacon, J., Concurrent Systems: An Integrated Approach to Operating Systems, Database, and Distributed Systems, Addison-Wesley, 1993.
Bershad, B.N., Zekauskas, M.J., Midway: Shared memory parallel programming with entry consistency for distributed memory multiprocessors, Technical Report CMU-CS-91-170, School of Computer Science, Carnegie-Mellon University, 1991.
Blank, T., The MasPar MP-1 architecture, Proc. 35th IEEE Computer Society International Conference (COMPCON), 1990, 20–24.
Bricker, A., Litzkow, M., Livny, M., Condor technical summary, Technical Report, Computer Sciences Department, University of Wisconsin-Madison, 1992.
Brooks III, E.D., Warren, K.H., A study of performance on SMP and distributed memory architectures using a shared memory programming model, SC97: High Performance Networking and Computing, San Jose, 1997.
Charlesworth, A., Starfire: Extending the SMP envelope, IEEE Micro 18, 1998.
Corbett, P.F., Feitelson, D.G., Prost, J.P., Johnson-Baylor, S.B., Parallel access to files in the Vesta file system, Proc. Supercomputing '93, 1993, 472–481.
Cormen, T.H., Kotz, D., Integrating theory and practice in parallel file systems, Proc. the 1993 DAGS/PC Symposium, 1993, 64–74.
Custer, H., Inside Windows NT, Microsoft Press, Washington, 1993.
Dijkstra, E.W., Guarded commands, nondeterminacy and the formal derivation of programs, Communications of the ACM 18, 1975, 453–457.
Douglis, F., Transparent process migration: Design alternatives and the Sprite implementation, Software Practice and Experience 21, 1991, 757–785.
Elnozahy, E.N., Johnson, D.B., Wang, Y.M., A survey of rollback-recovery protocols in message-passing systems, Technical Report, School of Computer Science, Carnegie Mellon University, 1996.
Flynn, M.J., Some computer organizations and their effectiveness, IEEE Transactions on Computers 21, 1972, 948–960.
Geist, A., Beguelin, A., Dongarra, J., PVM: Parallel Virtual Machine: A Users’ Guide and Tutorial for Networked Parallel Computing, MIT Press, Cambridge, 1994.
Guedes, P., Castro, M., Distributed shared object memory, Proc. Fourth Workshop on Workstation Operating Systems (WWOS-IV), Napa, 1993, 142–149.
Gupta, A., Singh, J.P., Joe, T., Hennessy, J.L., An empirical comparison of the Kendall Square Research KSR-1 and Stanford DASH multiprocessors, Proc. Supercomputing '93, 1993, 214–225.
Herdeg, G.A., Design and implementation of the Alpha Server 4100 CPU and memory architecture, Digital Technical Journal 8, 1997, 48–60.
Hoare, C.A.R., Monitors: An operating system structuring concept, Communications of the ACM 17, 1974, 549–557.
Intel Corporation Supercomputer Systems Division, Paragon System User's Guide, 1995.
Jiang, D., Singh, J.P., A methodology and an evaluation of the SGI Origin2000, Proc. SIGMETRICS Conference on Measurement and Modeling of Computer Systems, 1998, 171–181.
Johnson, E.E., Completing an MIMD multiprocessor taxonomy, Computer Architecture News 16, 1988, 44–47.
Keleher, P., Dwarkadas, S., Cox, A.L., Zwaenepoel, W., TreadMarks: Distributed shared memory on standard workstations and operating systems, Proc. the Winter 94 Usenix Conference, 1994, 115–131.
Kim, K.H., You, J.H., Abouelnaga, A., A scheme for coordinated execution of independently designed recoverable distributed processes, Proc. IEEE Fault-Tolerant Computing Symposium, 1996, 130–135.
Lamport, L., Time, clocks and the ordering of events in a distributed system, Communications of the ACM 21, 1978, 558–565.
Laudon, J., Lenoski, D., The SGI Origin: A ccNUMA highly scalable server, Technical Report, Silicon Graphics, Inc., 1994.
Lea, R., Cool: System support for distributed programming, Communications of the ACM 36, 1993, 37–46.
Lenoski, D., Laudon, J., Joe, T., Nakahira, D., Stevens, L., Gupta, A., Hennessy, J., The DASH prototype: Logic overhead and performance, IEEE Transactions on Parallel and Distributed Systems 4, 1993, 41–61.
Message Passing Interface Forum, MPI: A message-passing interface standard - version 1.1, 1995.
Moos, H., Verbaeten, P., Object migration in a heterogeneous world - a multi-dimensional affair, Proc. Third International Workshop on Object Orientation in Operating Systems, Asheville, 1993, 62–72.
Nuttall, M., A brief survey of systems providing process or object migration facilities, ACM SIGOPS Operating Systems Review 28, 1994, 64–80.
Pierce, P., A concurrent file system for a highly parallel mass storage system, Proc. the Fourth Conference on Hypercube Concurrent Computers and Applications, 1989, 155–160.
Pratt, T.W., French, J.C., Dickens, P.M., Janet, S.A., Jr., A comparison of the architecture and performance of two parallel file systems, Proc. Fourth Conference on Hypercube Concurrent Computers and Applications, 1989, 161–166.
Reed, D.P., Kanodia, R.K., Synchronization with eventcounts and sequencers, Communications of the ACM 22, 1979, 115–123.
Roy, P.J., Unix file access and caching in a multicomputer environment, Proc. the Usenix Mach III Symposium, 1993, 21–37.
Scales, D.J., Gharachorloo, K., Towards transparent and efficient software distributed shared memory, Proc. 16th ACM Symposium on Operating Systems Principles (SOSP'97), ACM SIGOPS Operating Systems Review 31, 1997, 157–169.
Silberschatz, A., Galvin, P.B., Operating System Concepts, Addison-Wesley, 1994.
Singhal, M., Shivaratri, N.G., Advanced Concepts in Operating Systems, McGraw Hill, Inc., New York, 1994.
Smith, P., Hutchinson, N.C., Heterogeneous process migration: The Tui system, Technical Report, Department of Computer Science, University of British Columbia, 1997.
Thinking Machines Corporation, Connection Machine Model CM-2 Technical Summary, 1990.
Thinking Machines Corporation, Connection Machine Model CM-5 Technical Summary, 1990.
Zayas, E.R., Attacking the process migration bottleneck, Proc. Symposium on Operating Systems Principles, Austin, 1987, 13–22.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Garcia, J., Ferreira, P., Guedes, P. (2000). Parallel Operating Systems. In: Błażewicz, J., Ecker, K., Plateau, B., Trystram, D. (eds) Handbook on Parallel and Distributed Processing. International Handbooks on Information Systems. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-04303-5_5
DOI: https://doi.org/10.1007/978-3-662-04303-5_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-08571-0
Online ISBN: 978-3-662-04303-5