Specifying memory consistency of write buffer multiprocessors

Published: 01 February 2007
Abstract

Write buffering is one of many successful mechanisms that improve the performance and scalability of multiprocessors. However, it leads to more complex memory system behavior, which cannot be described using intuitive consistency models such as Sequential Consistency. It is crucial to provide programmers with a specification of the exact behavior of such complex memories. This article presents a uniform framework for describing systems at different levels of abstraction and proving their equivalence. The framework is used to derive, and prove correct, simple specifications of the SPARC total store order and partial store order memories in terms of program-level instructions. The framework is also used to examine the SPARC relaxed memory order. We show that it is not a memory consistency model that corresponds to any implementation on a multiprocessor that uses write-buffers, even though we suspect that the SPARC version 9 specification of relaxed memory order was intended to capture a general write-buffer architecture. The same technique is used to show that Coherence does not correspond to a write-buffer architecture. A corollary, which follows from the relationship between Coherence and Alpha, is that any implementation of Alpha consistency using write-buffers cannot produce all possible Alpha computations. That is, there are some computations that satisfy the Alpha specification but cannot occur in the given write-buffer implementation.
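To make concrete why write buffering escapes Sequential Consistency, the following Python sketch exhaustively explores a toy two-processor machine. It is an illustration only and not the framework of the article: the operational model (one FIFO write buffer per processor, reads served from the processor's own buffer before shared memory, buffered writes committed at arbitrary later times) is a simplifying assumption in the spirit of total store order, and the names `outcomes`, `store`, and `STORE_BUFFER_TEST` are hypothetical.

```python
def store(items, key, value):
    """Return a new sorted tuple of (key, value) pairs with key set to value."""
    updated = dict(items)
    updated[key] = value
    return tuple(sorted(updated.items()))


def outcomes(programs, buffered):
    """Collect every reachable final register assignment of a toy machine.

    programs: one instruction list per processor; each instruction is
              ("write", variable, value) or ("read", variable, register).
    buffered: if True, writes enter a per-processor FIFO buffer and reach
              shared memory at some later step (assumed TSO-like model);
              if False, writes update shared memory immediately.
    """
    n = len(programs)
    start = (tuple(0 for _ in range(n)), tuple(() for _ in range(n)), (), ())
    seen, stack, finals = {start}, [start], set()
    while stack:
        pcs, bufs, mem, regs = stack.pop()
        successors = []
        for p in range(n):
            # Option 1: processor p executes its next instruction.
            if pcs[p] < len(programs[p]):
                op = programs[p][pcs[p]]
                new_pcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
                if op[0] == "write":
                    _, var, val = op
                    if buffered:
                        new_buf = bufs[p] + ((var, val),)
                        new_bufs = bufs[:p] + (new_buf,) + bufs[p + 1:]
                        successors.append((new_pcs, new_bufs, mem, regs))
                    else:
                        successors.append((new_pcs, bufs, store(mem, var, val), regs))
                else:
                    _, var, reg = op
                    # Newest matching entry in the local buffer wins;
                    # otherwise read shared memory (initial value 0).
                    val = next((v for x, v in reversed(bufs[p]) if x == var),
                               dict(mem).get(var, 0))
                    successors.append((new_pcs, bufs, mem, store(regs, reg, val)))
            # Option 2: the oldest buffered write of p commits to memory.
            if bufs[p]:
                var, val = bufs[p][0]
                new_bufs = bufs[:p] + (bufs[p][1:],) + bufs[p + 1:]
                successors.append((pcs, new_bufs, store(mem, var, val), regs))
        if not successors:
            # All programs finished and all buffers drained.
            finals.add(regs)
        for state in successors:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return finals


# Classic store-buffering test: each processor writes one variable,
# then reads the variable written by the other processor.
STORE_BUFFER_TEST = [
    [("write", "x", 1), ("read", "y", "r0")],
    [("write", "y", 1), ("read", "x", "r1")],
]

both_zero = (("r0", 0), ("r1", 0))
print(both_zero in outcomes(STORE_BUFFER_TEST, buffered=False))
print(both_zero in outcomes(STORE_BUFFER_TEST, buffered=True))
```

In this toy model the outcome in which both reads return 0 is unreachable without buffers but reachable with them, since each processor can read shared memory while its own write is still waiting in its buffer. Partial store order, relaxed memory order, Coherence, and Alpha are not modeled here.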

