Abstract
Write buffering is one of many successful mechanisms that improve the performance and scalability of multiprocessors. However, it leads to more complex memory system behavior, which cannot be described using intuitive consistency models such as Sequential Consistency. It is crucial to provide programmers with a specification of the exact behavior of such complex memories. This article presents a uniform framework for describing systems at different levels of abstraction and proving their equivalence. The framework is used to derive and prove correct simple specifications, in terms of program-level instructions, of the SPARC total store order and partial store order memories. The framework is also used to examine the SPARC relaxed memory order. We show that it is not a memory consistency model that corresponds to any implementation on a multiprocessor that uses write-buffers, even though we suspect that the SPARC version 9 specification of relaxed memory order was intended to capture a general write-buffer architecture. The same technique is used to show that Coherence does not correspond to a write-buffer architecture. A corollary, which follows from the relationship between Coherence and Alpha, is that any implementation of Alpha consistency using write-buffers cannot produce all possible Alpha computations. That is, there are some computations that satisfy the Alpha specification but cannot occur in the given write-buffer implementation.
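To make concrete the claim that write buffering produces behavior outside Sequential Consistency, here is a minimal sketch. It is not the article's framework. It assumes a toy machine in which each of two processors has a FIFO write buffer and a load first checks the processor's own buffer, which loosely resembles total store order. The function name `explore` and the two-instruction test program (the classic store-buffering litmus test) are illustrative choices, not taken from the article.

```python
def explore(use_buffers):
    """Enumerate all final (r0, r1) register outcomes of a two-processor test.

    Processor 0 runs: x = 1; r0 = y
    Processor 1 runs: y = 1; r1 = x
    Assumption: with use_buffers=True, each store enters a per-processor FIFO
    buffer and reaches shared memory only when a separate drain step fires.
    """
    programs = [
        [("store", "x", 1), ("load", "y")],
        [("store", "y", 1), ("load", "x")],
    ]
    outcomes = set()

    def replace(tup, i, value):
        # Return a copy of tuple `tup` with position i set to `value`.
        return tup[:i] + (value,) + tup[i + 1:]

    def step(pcs, mem, bufs, regs):
        progressed = False
        for p in (0, 1):
            # Choice 1: processor p executes its next instruction.
            if pcs[p] < len(programs[p]):
                progressed = True
                instr = programs[p][pcs[p]]
                new_pcs = replace(pcs, p, pcs[p] + 1)
                if instr[0] == "store":
                    _, var, val = instr
                    if use_buffers:
                        new_bufs = replace(bufs, p, bufs[p] + ((var, val),))
                        step(new_pcs, mem, new_bufs, regs)
                    else:
                        step(new_pcs, {**mem, var: val}, bufs, regs)
                else:
                    var = instr[1]
                    val = mem[var]
                    # The newest matching entry in p's own buffer wins.
                    for bvar, bval in bufs[p]:
                        if bvar == var:
                            val = bval
                    step(new_pcs, mem, bufs, replace(regs, p, val))
            # Choice 2: the oldest buffered write of p drains to memory.
            if bufs[p]:
                progressed = True
                var, val = bufs[p][0]
                new_bufs = replace(bufs, p, bufs[p][1:])
                step(pcs, {**mem, var: val}, new_bufs, regs)
        if not progressed:
            # Both programs finished and both buffers are empty.
            outcomes.add(regs)

    step((0, 0), {"x": 0, "y": 0}, ((), ()), (None, None))
    return outcomes


sc_outcomes = explore(use_buffers=False)
buffered_outcomes = explore(use_buffers=True)
print(sorted(buffered_outcomes - sc_outcomes))  # [(0, 0)]
```

In this toy model, the outcome where both loads return 0 is reachable only when stores may wait in a buffer while the following load proceeds. No interleaving of the four instructions against a single shared memory yields it, which is why a more permissive specification is needed for such machines.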